Alibaba's open-source video generation model Wan2.1 is released and runs smoothly on consumer hardware

Author: Eve Cole  Update Time: 2025-05-17 03:25:01

Alibaba recently released its new open-source video generation model, Wan2.1, in a late-night announcement. With 14B parameters, the model quickly topped the VBench leaderboard, making it the current leader in video generation. Compared with the previously released QwQ-Max, Wan2.1 handles the details of complex movement particularly well and can smoothly render multiple characters dancing in sync, demonstrating its technical strength.

In the official demonstrations, Wan2.1 not only handles problems that have long troubled static image generation, but also reaches a new level in rendering text. The 14B model is difficult to deploy on consumer graphics cards, so Alibaba has also released a smaller 1.3B version, which supports 480P resolution and runs smoothly on a 4070 graphics card with 12GB of video memory, giving ordinary users more options.
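To see why the 14B model is a poor fit for a 12GB card while the 1.3B version is not, a back-of-the-envelope estimate of weight memory helps. The sketch below is illustrative only: it assumes weights stored in 16-bit precision (2 bytes per parameter), which the article does not state, and it counts the weights alone, ignoring activations, the autoencoder, and the text encoder. The function names are hypothetical.

```python
# Rough estimate of the memory needed just to hold model weights.
# Assumption (not from the article): weights are stored in 16-bit precision.
BYTES_PER_PARAM_16BIT = 2


def weight_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM_16BIT) -> float:
    """Return the size of the weights alone, in GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


def weights_fit(num_params: float, vram_gb: float) -> bool:
    """True if the weights alone are smaller than the available video memory."""
    return weight_memory_gb(num_params) < vram_gb


# Compare both model sizes against a 12GB card such as the 4070.
for name, params in [("1.3B", 1.3e9), ("14B", 14e9)]:
    size = weight_memory_gb(params)
    print(f"{name}: about {size:.1f} GB of weights, fits in 12 GB: {weights_fit(params, 12)}")
```

Under these assumptions the 1.3B weights take about 2.6 GB, leaving room for everything else on a 12GB card, while the 14B weights alone come to about 28 GB.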


In addition to the 14B and 1.3B versions, Alibaba has released two other video generation models, both under the Apache 2.0 license and free to use. Users can try the model on the platform Alibaba provides to generate videos quickly, although the surge in users may lead to long wait times. Users with some technical background can also download and deploy the models themselves through channels such as Hugging Face and the ModelScope community.

The biggest highlight of Wan2.1 is its technical innovation. The model adopts a Diffusion Transformer architecture combined with a 3D variational autoencoder designed specifically for video generation. By introducing a variety of compression and parallelization strategies, it greatly improves generation efficiency while maintaining quality. Reported results show that Wan's reconstruction speed is 2.5 times that of comparable current technologies, which significantly saves computing resources.

Wan2.1 has also been widely praised for its user experience. Its performance is impressive both in the details of dynamic scenes and in natural physical effects. With this model, users can not only produce high-quality videos but also easily render animated text, opening up more creative possibilities.

Alibaba's Wan2.1 model is not only technologically advanced but also gives creators more creative freedom, marking another major advance in video generation technology. Its release is expected to further drive the development of the field and bring users more innovative experiences.