In 2024, nine large AI models took on an unprecedented challenge: sitting the college entrance examination, specifically the Henan examination, which is regarded as extremely difficult. The test, organized by the media, aimed to evaluate the actual capabilities of AI in the academic field and to provide data on the differences between AI and human intelligence. The results were striking. Some models scored above the first-tier admission line, which drew widespread attention and heated discussion and offered new perspectives on the future direction of AI technology.
Of the nine AIs tested, four scored above the first-tier admission line for the Henan college entrance examination. GPT-4o took first place with 562 points, 41 points above the line, and ByteDance’s Doubao followed with 542.5 points, the best result among the domestic models.
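The admission line itself is not stated, but it follows from the figures above. The short sketch below works it out from the reported numbers only; the function names are illustrative and not part of the test.

```python
# Figures reported in the article.
GPT4O_TOTAL = 562.0    # GPT-4o total score
GPT4O_MARGIN = 41.0    # points by which GPT-4o exceeded the first-tier line
DOUBAO_TOTAL = 542.5   # Doubao total score


def implied_cutoff(total: float, margin: float) -> float:
    """Admission line implied by a total score and its margin above the line."""
    return total - margin


def margin_over(total: float, cutoff: float) -> float:
    """Points by which a total score exceeds the admission line."""
    return total - cutoff


cutoff = implied_cutoff(GPT4O_TOTAL, GPT4O_MARGIN)
doubao_margin = margin_over(DOUBAO_TOTAL, cutoff)

print(cutoff)         # 521.0
print(doubao_margin)  # 21.5
```

On these figures the line works out to 521 points, which puts Doubao 21.5 points above it.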

The AIs performed well in liberal arts subjects, particularly Chinese and English, but less well in science subjects, particularly mathematics. They showed a clear advantage in language subjects, and their understanding of classical poetry was impressive.

The AIs handled simple reasoning questions acceptably but did poorly on questions requiring complex derivation and proof, which shows that their logical ability still needs improvement. In the liberal arts comprehensive paper, geography was the weakest subject, while in the science comprehensive paper, biology was comparatively strong. GPT-4o stood out in politics with a score of 91.5 points.
Test methods and scoring standards
Test rounds: To reduce the impact of randomness, every subject was tested in two rounds, and the average of the two was taken as the final score.
Input format: Formulas were entered in Markdown/LaTeX format. For questions with images, the corresponding pictures and text were entered according to each model’s recognition capabilities.
Test operation: A professional AI data service provider carried out the tests and recorded screenshots under a unified, standardized procedure to ensure fairness.
Scoring method: The same scoring standards were applied as for human candidates to ensure that the scoring was fair.
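The two-round averaging rule can be sketched as follows. This is a minimal illustration, not the organizers' actual tooling: the subject scores are hypothetical placeholders, and the `final_scores` helper is an assumed name.

```python
from statistics import mean


def final_scores(rounds: list[dict[str, float]]) -> dict[str, float]:
    """Average each subject's score across all test rounds."""
    subjects = rounds[0].keys()
    return {subject: mean(r[subject] for r in rounds) for subject in subjects}


# Hypothetical per-round scores, for illustration only (not from the test).
round_1 = {"Chinese": 120.0, "Mathematics": 60.0, "English": 135.0}
round_2 = {"Chinese": 124.0, "Mathematics": 66.0, "English": 137.0}

final = final_scores([round_1, round_2])
total = sum(final.values())

print(final)  # {'Chinese': 122.0, 'Mathematics': 63.0, 'English': 136.0}
print(total)  # 321.0
```

Averaging per subject before summing keeps a single unusually good or bad round from dominating a model's total.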
This attempt by AI to sit the college entrance examination demonstrates AI's strengths in specific fields and also exposes its shortcomings in logical reasoning and mathematical proof. As one AI candidate quoted in its essay: "The road ahead is long and far, and I will search high and low." The line describes the development of AI, and it also vividly describes humanity's continuing exploration of the unknown. The test gives us a deeper understanding of AI's current level of intelligence and a useful reference for the future direction of AI development.
The candidates included well-known AI products such as OpenAI's GPT-4o, ByteDance's Doubao, and Baidu's Wenxin 4.0. Their performance in this college entrance examination is likely to influence how AI technology develops.
This AI college entrance examination experiment offers insight into the current state and future direction of artificial intelligence, and it highlights the challenges that remain in the pursuit of general artificial intelligence. I believe that AI will go on to demonstrate its potential in more fields and bring greater progress to human society.