MAG-SQL: A multi-agent generative method that raises text-to-SQL execution accuracy to 61%

Author: Eve Cole | Updated: 2024-12-22 14:32:01

Text-to-SQL technology aims to simplify database queries so that ordinary users can obtain data without learning SQL. However, as database structures grow more complex, accurately converting natural language into SQL commands remains challenging. Research teams from South China University of Technology and Tsinghua University have proposed MAG-SQL, a method that improves the accuracy of text-to-SQL conversion through multi-agent collaboration.

In natural language processing (NLP), text-to-SQL technology is developing rapidly. It lets ordinary users query databases in everyday language, without mastering a specialized query language such as SQL. However, as database structures grow more complex, accurately converting natural language into SQL commands has become a major challenge.

Research teams from South China University of Technology and Tsinghua University recently proposed a new solution, MAG-SQL (a Multi-Agent Generative approach), to improve text-to-SQL conversion. The method relies on several cooperating agents to raise the accuracy of the generated SQL.

MAG-SQL works in stages. Its core components are the Soft Schema Linker, the Targets-Conditions Decomposer, the Sub-SQL Generator, and the Sub-SQL Refiner. First, the Soft Schema Linker selects the database columns most relevant to the question, which reduces interference from irrelevant information and improves the accuracy of the generated SQL. Next, the Targets-Conditions Decomposer breaks the complex question into smaller sub-questions that are easier to handle.
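To make the column-selection step concrete, here is a minimal sketch in Python. It is not the paper's implementation: MAG-SQL uses a language model for this step, while the sketch substitutes a simple word-overlap score. The function names, the `top_k` parameter, and the toy schema are all assumptions made for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def link_columns(question: str, schema: dict[str, list[str]], top_k: int = 3) -> list[str]:
    """Rank "table.column" names by word overlap with the question.

    Assumption: word overlap stands in for the LLM-based relevance
    judgment that a real schema linker would make.
    """
    question_tokens = tokenize(question)
    scored = []
    for table, columns in schema.items():
        for column in columns:
            name_tokens = tokenize(table) | tokenize(column)
            score = len(question_tokens & name_tokens)
            if score > 0:
                scored.append((score, f"{table}.{column}"))
    # Highest score first; ties are broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


# Hypothetical toy schema: table name -> column names.
schema = {
    "students": ["id", "name", "grade"],
    "courses": ["id", "title", "credits"],
    "enrollments": ["student_id", "course_id", "score"],
}
selected = link_columns("What is the average score for each course title?", schema)
print(selected)
```

Only the selected columns would then be described in detail in the prompt for the later steps, which is how the irrelevant parts of a large schema are kept out of the model's way.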

The Sub-SQL Generator then writes a sub-SQL query for each step, building on the previous result so that the final SQL command is refined gradually. Finally, the Sub-SQL Refiner corrects errors in the generated SQL, further improving overall accuracy. This multi-step process is what lets MAG-SQL perform well on complex databases.
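The generate-then-correct loop described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the two agents are replaced by hard-coded stand-ins (`toy_generate`, `toy_refine`), the database is an in-memory SQLite table invented for the example, and `max_rounds` is a hypothetical parameter.

```python
import sqlite3
from typing import Callable, Optional


def try_execute(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Return None if the SQL runs, otherwise the database error message."""
    try:
        conn.execute(sql).fetchall()
        return None
    except sqlite3.Error as exc:
        return str(exc)


def run_pipeline(
    conn: sqlite3.Connection,
    sub_questions: list[str],
    generate: Callable[[str, str], str],
    refine: Callable[[str, str], str],
    max_rounds: int = 2,
) -> str:
    """Build the final SQL one sub-question at a time, repairing errors as they appear."""
    sql = ""
    for sub_question in sub_questions:
        # Each step sees the SQL produced for the previous sub-question.
        sql = generate(sub_question, sql)
        for _ in range(max_rounds):
            error = try_execute(conn, sql)
            if error is None:
                break
            sql = refine(sql, error)
    return sql


# Stand-ins for the LLM agents (assumptions for this sketch only).
CANNED_SQL = {
    "List the scores per course.": "SELECT course_id, score FROM enrollments",
    # Deliberate mistake: the column is called "score", not "scores".
    "Average the scores for each course.": (
        "SELECT course_id, AVG(scores) FROM enrollments "
        "GROUP BY course_id ORDER BY course_id"
    ),
}


def toy_generate(sub_question: str, previous_sql: str) -> str:
    return CANNED_SQL[sub_question]


def toy_refine(sql: str, error: str) -> str:
    # Fix the one error this toy example knows about.
    return sql.replace("scores", "score") if "no such column" in error else sql


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollments (course_id INTEGER, score REAL)")
conn.executemany("INSERT INTO enrollments VALUES (?, ?)", [(1, 80), (1, 90), (2, 70)])

final_sql = run_pipeline(conn, list(CANNED_SQL), toy_generate, toy_refine)
rows = conn.execute(final_sql).fetchall()
print(final_sql)
print(rows)
```

In a real system the two stand-in functions would each be a prompt to a language model, with the execution error passed back as feedback for the correction step.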

In recent tests, MAG-SQL performed well on the BIRD dataset. With GPT-4, the system achieved an execution accuracy of 61.08%, a clear improvement over the 46.35% of the vanilla GPT-4 baseline. Even with GPT-3.5, MAG-SQL reaches 57.62%, surpassing the earlier MAC-SQL method. MAG-SQL also performs well on Spider, another complex dataset, which points to good generality.
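Execution accuracy counts a prediction as correct when running it returns the same result as the reference query. The sketch below shows a simplified version of that check on a toy SQLite table; the official benchmark scripts may differ in details, and the table, the query pairs, and the function names are assumptions for illustration. The last lines compute the gain reported above in percentage points.

```python
import sqlite3


def execution_match(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """True if both queries return the same set of rows; a failing prediction counts as wrong."""
    try:
        predicted = set(conn.execute(predicted_sql).fetchall())
    except sqlite3.Error:
        return False
    gold = set(conn.execute(gold_sql).fetchall())
    return predicted == gold


def execution_accuracy(conn: sqlite3.Connection, pairs: list[tuple[str, str]]) -> float:
    """Percentage of (predicted, gold) pairs whose results match."""
    hits = sum(execution_match(conn, predicted, gold) for predicted, gold in pairs)
    return 100.0 * hits / len(pairs)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollments (course_id INTEGER, score REAL)")
conn.executemany("INSERT INTO enrollments VALUES (?, ?)", [(1, 80), (1, 90), (2, 70)])

# Hypothetical (predicted, gold) pairs: two match, one differs, one fails to run.
pairs = [
    ("SELECT COUNT(*) FROM enrollments", "SELECT COUNT(course_id) FROM enrollments"),
    ("SELECT score FROM enrollments WHERE course_id > 1",
     "SELECT score FROM enrollments WHERE course_id = 2"),
    ("SELECT MAX(score) FROM enrollments", "SELECT MIN(score) FROM enrollments"),
    ("SELECT scores FROM enrollments", "SELECT score FROM enrollments"),
]
accuracy = execution_accuracy(conn, pairs)
print(f"toy execution accuracy: {accuracy:.2f}%")

# Gain of MAG-SQL with GPT-4 over the GPT-4 baseline on BIRD, in percentage points.
gain = round(61.08 - 46.35, 2)
print(f"gain: {gain} points")
```

Note that the first two pairs count as correct even though the SQL text differs, because the metric compares results rather than query strings.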

MAG-SQL not only improves the accuracy of text-to-SQL conversion but also offers a new approach to complex queries. Through iterative refinement, its multi-agent framework strengthens what large language models can do in practice, especially on complex databases and difficult questions.

Paper: https://arxiv.org/pdf/2408.07930

Highlights:

**Improved accuracy**: MAG-SQL achieved an execution accuracy of 61.08% on the BIRD dataset, well above the 46.35% of the vanilla GPT-4 baseline.

**Multi-agent collaboration**: The method splits the work among several cooperating agents, making SQL generation more efficient and accurate.

**Wide application prospects**: MAG-SQL also performs well on other datasets such as Spider, demonstrating its generality.

MAG-SQL's multi-agent framework brings significant performance improvements to text-to-SQL technology. Its strong results on complex datasets point to the potential of this approach in practical applications and suggest new directions for future innovation in database querying.