On February 27, 2025, Tencent officially released Hunyuan Turbo S, its new-generation fast-thinking model. The release focuses on response speed and performance optimization in large models. Unlike slow-thinking models such as DeepSeek R1 and Hunyuan T1, Hunyuan Turbo S replies almost instantly: its output speed is doubled and its first-token latency is reduced by 44%. The model also performs well in areas such as knowledge, mathematics and science, and creative writing, offering a new option for fast-response large models.
The design of Hunyuan Turbo S is inspired by the fast, intuition-driven thinking people rely on in everyday decisions, combined with the slow thinking of rational analysis, to give large models smarter and more efficient problem-solving capabilities. By fusing long and short chains of thought, the model keeps its fast response on humanities questions while significantly improving its reasoning on science questions, which raises overall performance. On multiple public benchmarks commonly used in the industry, Hunyuan Turbo S performs comparably to leading models such as DeepSeek V3, GPT-4o, and Claude.

In terms of architecture, Hunyuan Turbo S adopts a Hybrid-Mamba-Transformer fusion design, which reduces the computational complexity and KV-cache footprint of the traditional Transformer structure and significantly lowers training and inference costs. The hybrid architecture addresses the high cost of long-text training and inference in traditional large models: it exploits Mamba's efficiency on long sequences while retaining the Transformer's ability to capture complex context. According to Tencent, this is the industry's first lossless application of the Mamba architecture to a super-large MoE model.
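To illustrate why replacing some attention layers with Mamba layers shrinks the KV cache, here is a minimal sketch of the standard KV-cache size estimate. All layer counts, head sizes, and the sequence length are hypothetical values chosen for illustration, not the published configuration of Hunyuan Turbo S. The sketch also ignores the small fixed-size state that Mamba layers keep, since that state does not grow with sequence length.

```python
def kv_cache_bytes(attention_layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Estimate KV-cache size for one sequence.

    Each attention layer stores one key tensor and one value tensor
    (hence the factor of 2), each of shape [kv_heads, seq_len, head_dim].
    bytes_per_value=2 assumes 16-bit storage.
    """
    return 2 * attention_layers * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical configurations, NOT Hunyuan Turbo S's real settings.
SEQ_LEN = 32_768
full_transformer = kv_cache_bytes(attention_layers=64, kv_heads=8,
                                  head_dim=128, seq_len=SEQ_LEN)
# Assume the hybrid keeps attention in only 16 of the 64 layers.
hybrid = kv_cache_bytes(attention_layers=16, kv_heads=8,
                        head_dim=128, seq_len=SEQ_LEN)

GIB = 2 ** 30
print(f"full Transformer: {full_transformer / GIB:.1f} GiB")
print(f"hybrid:           {hybrid / GIB:.1f} GiB")
```

Under these assumed numbers the cache scales linearly with the number of attention layers, so keeping attention in a quarter of the layers cuts the cache to a quarter.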
As the core foundation of the Tencent Hunyuan series, Hunyuan Turbo S will provide base capabilities for future derivative models for reasoning, long text, and code. Building on Turbo S, Tencent has also launched T1, a reasoning model with deep-thinking capability. T1 is fully available in Tencent Yuanbao and will soon be offered through an API.
Developers and enterprise users can now call Hunyuan Turbo S through the API on the Tencent Cloud official website, with a one-week free trial. The model is priced at 0.8 yuan per million input tokens and 2 yuan per million output tokens, a significant reduction from the previous generation of Hunyuan Turbo models. Hunyuan Turbo S will also be rolled out gradually in Tencent Yuanbao, where users can select the "Hunyuan" model and turn off the deep thinking function to try it.
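As a rough illustration of the pricing quoted above, the following sketch estimates the cost of a workload from its token counts. The function and constant names are hypothetical, and the sketch assumes plain per-token billing with no free-trial quota, tiering, or other discounts.

```python
from decimal import Decimal

# Prices quoted in the article, in yuan per million tokens.
INPUT_YUAN_PER_MILLION = Decimal("0.8")
OUTPUT_YUAN_PER_MILLION = Decimal("2")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_yuan(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in yuan for the given token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_YUAN_PER_MILLION
             + Decimal(output_tokens) * OUTPUT_YUAN_PER_MILLION)
    return total / TOKENS_PER_MILLION


# Example: 500,000 input tokens and 100,000 output tokens.
print(estimate_cost_yuan(500_000, 100_000))  # 0.4 + 0.2 = 0.6 yuan
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift.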
Tencent Hunyuan Turbo S model API free trial application: https://cloud.tencent.com/apply/p/i2zophus2x8