Beijing Zhipu Huazhang Technology Co., Ltd. has launched CogVideoX v1.5, and this latest version of its video generation model is now open source. Since its first release in early August, the CogVideoX series has quickly become a popular choice in the video generation field thanks to its leading technology and developer-friendly features. The editor of Downcodes learned that CogVideoX v1.5 brings significant improvements to both its video generation capabilities and its image-to-video (I2V) model, giving users a better and more convenient video creation experience.

This open-source release includes two models: CogVideoX v1.5-5B and CogVideoX v1.5-5B-I2V. Both have also been launched on the Qingying platform, where they are combined with the CogSound sound effect model to provide a more powerful AI video generation service. The service supports higher resolutions, variable aspect ratios to suit different scenes, multi-channel output, and AI-generated video with sound effects.
At the technical level, CogVideoX v1.5 significantly improves video generation quality and content coherence through an automated screening framework, the end-to-end video understanding model CogVLM2-caption, and an efficient three-dimensional variational autoencoder (3D VAE). In addition, an independently developed Transformer architecture that integrates the three dimensions of text, time, and space further improves model performance.
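To give a rough sense of why a 3D VAE matters for video generation, the sketch below estimates how much smaller the latent grid is than the original video. The compression factors (4x in time, 8x in each spatial dimension), the rule that the first frame is kept on its own, and the example clip size are all assumptions chosen for illustration; the announcement does not state them, and the names `VaeCompression`, `latent_shape`, and `compression_ratio` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VaeCompression:
    """Assumed downsampling factors; not taken from the announcement."""
    temporal: int = 4  # assumed: 4 input frames per latent frame
    spatial: int = 8   # assumed: 8x8 pixels per latent position


def latent_shape(frames, height, width, c=VaeCompression()):
    """Return (latent_frames, latent_height, latent_width) for a clip.

    Assumes a causal design in which the first frame is encoded on its
    own and the remaining frames are grouped `c.temporal` at a time.
    """
    if frames < 1 or height < c.spatial or width < c.spatial:
        raise ValueError("clip is too small for the assumed factors")
    latent_frames = 1 + (frames - 1) // c.temporal
    return latent_frames, height // c.spatial, width // c.spatial


def compression_ratio(frames, height, width, c=VaeCompression()):
    """Ratio of pixel positions to latent positions (channels ignored)."""
    lf, lh, lw = latent_shape(frames, height, width, c)
    return (frames * height * width) / (lf * lh * lw)


if __name__ == "__main__":
    # Illustrative clip size only: 81 frames at 1280x720.
    print(latent_shape(81, 720, 1280))
    print(round(compression_ratio(81, 720, 1280), 1))
```

Under these assumptions, the diffusion Transformer works on a grid with far fewer positions than the raw video, which is what makes training and sampling on long, high-resolution clips practical.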
In terms of training, CogVideoX v1.5 builds an efficient diffusion model training framework and achieves fast training on long video sequences through a variety of parallel computing and time optimization techniques. Zhipu Huazhang said it has verified that scaling laws hold in the field of video generation. In the future, it plans to expand both the amount of training data and the model scale, and to explore new model architectures that compress video information more efficiently and better integrate text and video content.
Code: https://github.com/thudm/cogvideo
Model: https://huggingface.co/THUDM/CogVideoX1.5-5B-SAT
The open-source release of CogVideoX v1.5 should further promote technological development and application innovation in the field of video generation by giving developers more powerful tools and resources. We look forward to more surprises from the CogVideoX series in the future!