Kunlun Wanwei releases Q* algorithm to boost 7B model reasoning capability 100-fold

Author: Eve Cole | Update Time: 2025-02-25 19:25:02

Kunlun Wanwei, working with Nanyang Technological University in Singapore, has developed an algorithm called Q* that significantly improves the reasoning capabilities of existing large language models. It enables small models to match or even surpass the reasoning performance of models with tens or even hundreds of times as many parameters, while substantially reducing the computing resources required. Q* opens the way to wider application of artificial intelligence and points toward more efficient intelligent systems. The work is described in detail in the paper "Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning".

The researchers improved the performance of open-source models on reasoning tasks by decomposing a large language model's reasoning trajectory into a sequence of states and using the A* search algorithm to plan over the trajectory as a whole. A Path Cost function and an Accumulated Reward function are defined so that each candidate state is scored on both the returns of the states already visited and the expected future return. In experiments, this approach produced significant accuracy gains, surpassing some well-known models.

Specifically, Q* guides the reasoning process by weighing, at each step, the returns from past states together with the expected future return. Experimental results show significant performance improvements on multiple datasets, pointing to a new direction for advancing artificial intelligence technology.
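As a rough illustration of the search described above, the following Python sketch runs a best-first search that ranks each partial reasoning trace by its accumulated reward plus a weighted estimate of its future return. It is not the authors' implementation. The toy arithmetic task, every function name, the weight `lam`, and the hand-written `future_value` heuristic (standing in for a learned estimate) are assumptions made purely for illustration.

```python
import heapq
from itertools import count
from typing import Callable, Iterable, Optional, Tuple

State = Tuple[str, ...]  # a partial reasoning trace: the steps taken so far


def best_first_search(
    start: State,
    expand: Callable[[State], Iterable[str]],
    step_reward: Callable[[State], float],
    future_value: Callable[[State], float],
    is_terminal: Callable[[State], bool],
    lam: float = 1.0,
    max_expansions: int = 1000,
) -> Optional[State]:
    """Pop states in order of decreasing f = g + lam * h; return the first terminal one."""
    tie = count()  # tie-breaker so the heap never has to compare states
    # heapq is a min-heap, so store -f to pop the highest-scoring state first.
    frontier = [(-lam * future_value(start), next(tie), 0.0, start)]
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, _, g, state = heapq.heappop(frontier)
        if is_terminal(state):
            return state
        for step in expand(state):
            child = state + (step,)
            child_g = g + step_reward(child)         # accumulated (historical) reward
            f = child_g + lam * future_value(child)  # plus estimated future return
            heapq.heappush(frontier, (-f, next(tie), child_g, child))
    return None


# Toy task (hypothetical): start from 1 and reach TARGET using "+1" or "*2" steps.
TARGET = 10


def value_of(state: State) -> int:
    v = 1
    for step in state:
        v = v + 1 if step == "+1" else v * 2
    return v


def expand(state: State) -> Iterable[str]:
    # Stop expanding once the target has been reached or overshot.
    return ["+1", "*2"] if value_of(state) < TARGET else []


def step_reward(state: State) -> float:
    return -1.0  # every step costs 1, so shorter traces accumulate more reward


def future_value(state: State) -> float:
    # Hand-written stand-in for a learned estimate of future return.
    return -abs(TARGET - value_of(state))


def is_terminal(state: State) -> bool:
    return value_of(state) == TARGET


trace = best_first_search((), expand, step_reward, future_value, is_terminal)
print(trace, value_of(trace) if trace is not None else None)
```

Because the hand-written heuristic here is only a rough guess, the trace returned is a valid solution but not necessarily the shortest one. How good the estimate of future return is largely determines how good the search result will be.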

Research on Q* is still at an early stage, and there is room for improvement. Kunlun Wanwei plans to continue this research in depth to improve the reasoning capabilities of domestic open-source models and open up more possibilities for artificial intelligence technology.

Paper link:

https://arxiv.org/abs/2406.14283

The development of the Q* algorithm marks notable progress in artificial intelligence and suggests a direction for future work. Its applications in more fields are worth watching.