In the field of artificial intelligence, training large language models (LLMs) has long been a resource-intensive undertaking that only a handful of tech giants could afford. However, Google's recently introduced SALT (small-model-assisted large-model training) method may change this situation entirely. The innovation not only reduces training costs but also improves model performance, opening the door to AI development for more research institutions and enterprises.

The core of the SALT method lies in its two-stage training process. The first stage is knowledge distillation: a small language model (SLM) acts as a "teacher", passing what it has learned to the large model through "soft labels". This stage is particularly suited to basic tasks the small model has already mastered, helping the large model build a solid foundation early in training.
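To make the soft-label idea concrete, here is a minimal sketch of a distillation loss in PyTorch. The function name, temperature value, and tensor shapes are illustrative assumptions, not details from Google's paper: the frozen small model's logits are softened into a probability distribution, and the large model is trained to match that distribution via KL divergence.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between the small teacher's softened distribution
    ("soft labels") and the large student's predictions.
    Illustrative sketch; names and temperature are assumptions."""
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Toy example: vocabulary of 8 tokens, batch of 4 positions.
teacher_logits = torch.randn(4, 8)                        # from the frozen small model
student_logits = torch.randn(4, 8, requires_grad=True)    # from the large model
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```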
The second stage is self-supervised learning, in which the large model learns independently and focuses on more complex tasks. The transition between the two stages requires careful design, using strategies such as linear decay and linear ratio decay of the distillation weight, so that the large model gradually reduces its dependence on the small model and ultimately learns and optimizes on its own, as sketched below.
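The handover can be pictured as a decaying weight that blends the teacher's distillation loss with the large model's own self-supervised loss. The sketch below assumes a simple linear schedule; the exact decay formulas used in SALT (including its linear ratio variant) may differ.

```python
def distillation_weight(step, total_distill_steps):
    """Weight on the distillation term, linearly decayed to zero over the
    first stage so the large model gradually stops relying on the teacher.
    A hedged sketch, not SALT's exact schedule."""
    if step >= total_distill_steps:
        return 0.0
    return 1.0 - step / total_distill_steps

def combined_loss(ce_loss, kd_loss, step, total_distill_steps):
    # Blend the self-supervised cross-entropy with the soft-label loss;
    # once the weight hits zero, training is purely self-supervised.
    w = distillation_weight(step, total_distill_steps)
    return w * kd_loss + (1.0 - w) * ce_loss

# Example: 1,000 distillation steps within a longer training run.
for step in (0, 500, 1000, 2000):
    print(step, distillation_weight(step, total_distill_steps=1000))
# 0 -> 1.0, 500 -> 0.5, 1000 and beyond -> 0.0
```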
Google's research shows that training a large model with 2.8 billion parameters using SALT reduced training time by 28%, while accuracy on mathematical problems and reading comprehension tasks improved by 3% and 4% respectively. This gain demonstrates both SALT's efficiency and its strong potential on complex tasks.
The emergence of SALT not only improves training efficiency but also lowers the barrier to entry for AI development. Training costs that were once affordable only to large tech companies are now within reach of smaller research institutions and companies. This should encourage more innovative and specialized AI solutions and further advance the field of artificial intelligence.
In short, by introducing a small model as an auxiliary trainer, the SALT method improves the performance of large models while greatly reducing training costs. This innovation is expected to spark a shift in the AI field, allowing more institutions to participate in AI research and development and driving progress across the entire industry.