Tencent applies for a patent on a "large language model training method" to improve model generalization and accuracy

Author: Eve Cole | Update Time: 2025-02-14 17:16:01

Tencent Technology (Shenzhen) Co., Ltd. recently applied for a patent titled "Training Methods, Devices, Computer Equipment and Storage Media for Large Language Models". The method gives the model more learnable information during large language model training by introducing a first summary text and a second summary text.

According to the patent description, the first summary text and the second summary text contain different amounts of information, and the first summary text also contains both correct and incorrect statements. By comparing these two summaries of the same text and learning to distinguish the correct statements from the incorrect ones, the model is meant to avoid problems such as overfitting and inaccurate generation that can arise from training on a single summary text.
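To make the idea concrete, here is a minimal sketch of how one training sample of this kind might be represented. It is an illustration only, not the method in the patent application: the class names, the field names, the `build_training_signals` helper, and the example sentences are all hypothetical, and the application's actual data format and loss function are not described in this article.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Statement:
    text: str
    is_correct: bool  # hypothetical label: does the statement agree with the source?


@dataclass
class SummarySample:
    source_text: str
    first_summary: List[Statement]  # mixes correct and incorrect statements
    second_summary: str             # carries a different amount of information


def build_training_signals(
    sample: SummarySample,
) -> Tuple[List[Tuple[str, int]], Tuple[str, str]]:
    """Derive two kinds of learning signal from one sample (illustrative only)."""
    # Signal 1: statement-level labels, so a model can learn to tell
    # correct statements from incorrect ones.
    statement_labels = [(s.text, int(s.is_correct)) for s in sample.first_summary]

    # Signal 2: the two summaries of the same text, paired for comparison.
    first_text = " ".join(s.text for s in sample.first_summary)
    summary_pair = (first_text, sample.second_summary)
    return statement_labels, summary_pair


# Made-up example data, used only to show the shape of a sample.
sample = SummarySample(
    source_text="The report was published in 2024 and covers three regions.",
    first_summary=[
        Statement("The report was published in 2024.", True),
        Statement("The report covers five regions.", False),
    ],
    second_summary="A 2024 report.",
)
labels, pair = build_training_signals(sample)
```

In this sketch, the statement labels could feed a correct-versus-incorrect objective and the summary pair could feed a comparison objective. How the two would actually be combined during training is an open assumption here.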


According to the application, the innovation of this method is that it improves both the model's generalization performance and its accuracy. Introducing more varied summary text gives the training process more to learn from than a single summary would.

This progress not only reflects Tencent's technical strength in artificial intelligence, but also lays a foundation for the future application and development of large language models.