This article covers the 2024 progress report of the Zhiyuan Research Institute, delivered by its director Wang Zhongyuan at the 6th Beijing Zhiyuan Conference, with a focus on the Zhiyuan large model family. The report presents the institute's latest research results in language, multimodal, embodied, and biological computing large models, as well as the upgrades to, and layout of, its open-source technology base. The editor of Downcodes interprets the report in detail, especially the composition of the Zhiyuan large model family and its core technologies.

On June 14, the 6th Beijing Zhiyuan Conference, hosted by the Zhiyuan Research Institute, was held at the Zhongguancun Exhibition Center. At the conference, Wang Zhongyuan, director of the Zhiyuan Research Institute, delivered the institute's 2024 progress report, focusing on the Zhiyuan large model family.
In the 2024 progress report, the Zhiyuan Research Institute shared its cutting-edge exploration and research progress in language, multimodal, embodied, and biological computing large models, along with the iterative upgrades and layout of its full-stack open-source technology base for large models. According to the institute, large language models at this stage already possess the core understanding and reasoning capabilities of general artificial intelligence, and a technical route has taken shape in which a large language model serves as the core to which other modalities are aligned and mapped, giving models preliminary multimodal understanding and generation capabilities. But this is not the ultimate technical route for artificial intelligence to perceive and understand the physical world. Instead, a unified model paradigm should be adopted to realize multimodal input and output, so that models have native multimodal capabilities and can evolve into world models.
"In the future, large models will be integrated with intelligent hardware in the form of digital agents and will enter the physical world from the digital world as embodied intelligence. At the same time, the technical means of large models can provide a new paradigm of knowledge representation for scientific research and accelerate humanity's exploration of the laws of the microscopic physical world, steadily approaching the ultimate goal of general artificial intelligence," Wang Zhongyuan said.
The Zhiyuan large model family is a highlight of the 2024 progress report. Reporters learned at the conference that it spans four research directions, comprising 12 projects in total: the large language model series, the multimodal large model series, the embodied intelligence large model, and the biological computing large model. Taking the language model series as an example, this direction includes two projects: Tele-FLM-1T, described as the world's first low-carbon, monolithic dense trillion-parameter language model, and the BGE (BAAI General Embedding) series of general-purpose language vector models.
"To address the high computing power consumption of large model training, the Zhiyuan Research Institute and the China Telecom Artificial Intelligence Research Institute (TeleAI) jointly developed Tele-FLM-1T, the world's first low-carbon, monolithic dense trillion-parameter language model, based on key technologies such as model growth and loss prediction. Together with the tens-of-billions-scale 52B version and the hundred-billion-scale 102B version, it constitutes the Tele-FLM model series," the person in charge of the Tele-FLM business told reporters. The Tele-FLM series is reported to have achieved low-carbon growth: using only 9% of the computing resources of an ordinary industry training scheme, and based on 112 A800 servers, training of the three models, totaling 2.3T tokens, was completed in four months, successfully producing the trillion-parameter dense model Tele-FLM-1T. "The entire training process required zero adjustments and zero retries, with high computing efficiency and good model convergence and stability. The 52B version of the Tele-FLM series has been fully open-sourced, including the core technologies (growth technique, optimal hyperparameter prediction) and training details (loss curves, optimal hyperparameters, data ratios, GradNorm, etc.). We hope the open-sourced techniques benefit the large model community. The Tele-FLM-1T version will be open-sourced soon; we hope it can provide the community with a good set of initial parameters for training trillion-parameter dense models and help avoid convergence difficulties in trillion-scale training," the person in charge said.
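The report does not detail Tele-FLM's growth technique, but the general idea of model growth is to initialize a larger model from a trained smaller one so that training continues from the same function rather than from scratch. Below is a toy, function-preserving width-growth sketch in NumPy (in the spirit of Net2Net-style neuron duplication); the helper name and method are illustrative assumptions, not Tele-FLM's actual implementation.

```python
import numpy as np

def widen_layer(W1, W2, new_width, rng):
    """Function-preserving width growth (illustrative only).

    W1: (in_dim, width) first-layer weights; W2: (width, out_dim) second layer.
    New hidden units are copies of randomly chosen existing units; each unit's
    outgoing weights are divided by its copy count, so the overall function
    computed by the two layers is unchanged.
    """
    in_dim, width = W1.shape
    assert new_width >= width
    # Each slot in the widened layer maps to an original neuron.
    mapping = np.concatenate([np.arange(width),
                              rng.integers(0, width, new_width - width)])
    W1_new = W1[:, mapping]
    # Split each original neuron's outgoing weights among its copies.
    counts = np.bincount(mapping, minlength=width)
    W2_new = W2[mapping, :] / counts[mapping][:, None]
    return W1_new, W2_new

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(3, 2))
x = rng.normal(size=(5, 4))
y_small = np.maximum(x @ W1, 0) @ W2        # small ReLU network

W1_big, W2_big = widen_layer(W1, W2, new_width=6, rng=rng)
y_big = np.maximum(x @ W1_big, 0) @ W2_big  # widened network, same outputs
print(np.allclose(y_small, y_big))
```

Because the widened network computes exactly the same outputs, a larger model initialized this way can resume training without the loss spikes of a cold start, which is the motivation behind growth-based training schemes.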
The BGE series of general-purpose semantic vector models, independently developed by the Zhiyuan Research Institute, enables precise semantic matching between pieces of data and supports retrieval-augmented generation (RAG), allowing large models to invoke external knowledge. "Since August 2023, the BGE series has gone through three iterations, achieving industry-best performance on three tasks: Chinese-English retrieval, multilingual retrieval, and fine-grained retrieval. Its overall capability is significantly better than comparable models from OpenAI, Google, Microsoft, Cohere, and other institutions. Its cumulative downloads rank first among domestic AI models, and it has been integrated by mainstream international AI development frameworks such as Hugging Face, LangChain, and LlamaIndex, as well as major cloud service providers including Tencent, Huawei, Alibaba, ByteDance, Microsoft, and Amazon, providing commercial services externally," the person in charge of the BGE semantic vector model business told reporters.
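At the core of a semantic vector model like BGE is mapping texts to vectors and retrieving by cosine similarity; in a RAG pipeline, the top-matching passages are then handed to the language model as external knowledge. A minimal sketch of the retrieval step, using toy pre-computed embeddings (the vectors and the helper name are illustrative; a real system would obtain the embeddings from a BGE model):

```python
import numpy as np

def cosine_top_k(query_vec, doc_vecs, k=2):
    """Return (indices, scores) of the k documents most similar to the query.

    Vectors are L2-normalized first, so the dot product equals cosine
    similarity, as is conventional for embedding-based retrieval.
    """
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    return np.argsort(-scores)[:k], scores

# Toy 4-dimensional "embeddings" standing in for real model output.
docs = np.array([
    [0.9, 0.1, 0.0, 0.1],   # doc 0: close to the query
    [0.0, 1.0, 0.2, 0.0],   # doc 1: unrelated
    [0.8, 0.2, 0.1, 0.0],   # doc 2: also close
])
query = np.array([1.0, 0.0, 0.0, 0.1])

top, scores = cosine_top_k(query, docs, k=2)
print(top)  # indices of the two best-matching documents
```

In a full RAG system, the texts behind the returned indices would be inserted into the model's prompt, which is how a vector model lets a large model "invoke external knowledge."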
All in all, the Zhiyuan Research Institute has made significant progress in advancing large model technology. Its large model family and open-source strategy will further promote innovation and development in the AI field and deserve continued attention. The editor of Downcodes looks forward to more breakthrough results in the future.