Evaluating the performance of AI image generation models has long been a focus of the industry. Artificial Analysis recently launched the "Text to Image Leaderboard & Arena" as a new platform for evaluating these models objectively. Using large-scale human preference data and the Elo rating system, the platform ranks and compares leading models, including Midjourney, DALL·E, and Stable Diffusion, giving researchers and users a useful point of reference.
Artificial Analysis, an AI research organization, calls the new initiative the "Artificial Analysis Text to Image Leaderboard & Arena" and intends it as a comprehensive evaluation of these models' performance.
Evaluation platform overview
Since the introduction of diffusion-based image generators two years ago, AI image models have achieved near-photographic quality. Artificial Analysis Text to Image Leaderboard & Arena is dedicated to comparing open source and proprietary image generation models to determine their effectiveness and accuracy based on human preferences.
The platform's rankings are based on more than 45,000 human image preferences collected through the Artificial Analysis Image Arena and are updated using the Elo rating system. The evaluation covers several leading image models, including Midjourney, OpenAI's DALL·E, Stable Diffusion, and Playground AI.

The platform uses crowdsourcing to collect large-scale human preference data. Participants are shown a prompt and two generated images, and they select the image that best matches the prompt. Each model generates more than 700 images covering different styles and categories, such as portraits, groups, animals, nature, and art. The collected preference data is used to calculate an Elo score for each model, which produces a comparative ranking.
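To make the scoring step concrete, the following is a minimal sketch of how pairwise votes can be turned into Elo scores. It uses the standard Elo update rule and is not Artificial Analysis's actual implementation. The starting rating (1000), the update step K (32), the function names, and the model names in the sample votes are all assumptions made for illustration; the article does not state the values the platform uses.

```python
from collections import defaultdict

K = 32          # assumed update step; the platform's real value is not stated
START = 1000.0  # assumed starting rating for every model


def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, winner, loser, k=K):
    """Apply one vote: the winner gains what the loser gives up."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def rank(votes):
    """Process (winner, loser) votes in order and return models sorted by rating."""
    ratings = defaultdict(lambda: START)
    for winner, loser in votes:
        update_elo(ratings, winner, loser)
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical votes: each tuple is (preferred model, other model).
    sample_votes = [
        ("model_a", "model_b"),
        ("model_a", "model_c"),
        ("model_b", "model_c"),
    ]
    for name, rating in rank(sample_votes):
        print(f"{name}: {rating:.1f}")
```

Because each vote moves the same number of points from the loser to the winner, the total rating across all models stays constant. An upset win over a higher-rated model moves more points than an expected win.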
Initial insights
The rankings show that while proprietary models lead in performance, open source alternatives are becoming increasingly competitive. Midjourney, Stable Diffusion 3, and DALL·E 3 HD top the list, while the open source model Playground AI v2.5 has also made significant progress and now ranks above OpenAI's DALL·E 3.
Notably, the landscape of image generation models is changing rapidly. For example, DALL·E 2, which was still among the leaders last year, is now chosen in fewer than 25% of its arena comparisons and has become the lowest-ranked model.
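As a rough illustration of what a win rate below 25% means in Elo terms, the sketch below inverts the standard Elo expected-score formula. It assumes every comparison is against a single opponent with one fixed rating, which is a simplification: in the arena a model faces many opponents of varying strength, so the result is only an order-of-magnitude guide. The function name is hypothetical.

```python
import math


def elo_gap_from_win_rate(win_rate):
    """Rating difference (model minus opponent) implied by a win rate.

    Inverts E = 1 / (1 + 10 ** (-gap / 400)).
    """
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


if __name__ == "__main__":
    # A 25% win rate corresponds to a gap of roughly -191 points.
    print(round(elo_gap_from_win_rate(0.25)))
```

Under that simplifying assumption, a model that wins a quarter of its comparisons sits about 190 points below its typical opponent, and a lower win rate implies a larger gap.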
Public participation
Artificial Analysis encourages the public to take part in the evaluation. Users can view the leaderboard on Hugging Face and contribute to the rankings through the Image Arena. After completing 30 image selections, participants can view a personalized model ranking that reflects their own preferences.
The initiative is an important step toward understanding and improving AI image generation models. By combining human preferences with a rigorous crowdsourcing approach, the platform provides valuable insight into the comparative performance of leading image models. As the field continues to evolve, platforms like this one will play a key role in guiding future developments and innovations in AI-driven image generation.
Leaderboard link: https://huggingface.co/spaces/ArtificialAnalysis/Text-to-Image-Leaderboard
All in all, Artificial Analysis's "Text to Image Leaderboard & Arena" brings a transparent and competitive evaluation platform to the field of AI image generation, and its continuous updates and public participation should further promote technological progress in the field. We look forward to seeing more models added and the rankings shift over time.