In the gaming world, who is the real king? AI models recently took on the classic game "Super Mario Bros.", and the contest drew widespread attention. Hao AI Lab at the University of California, San Diego released a striking result: in an AI "Mario" showdown, Anthropic's Claude 3.7 came out on top, beating the other competitors and earning the title of "strongest AI Mario". Claude 3.5 followed, while Google's Gemini 1.5 Pro and OpenAI's GPT-4o underperformed, which came as a surprise. The result has prompted fresh thinking about what AI models can do.
This AI "Mario" tournament was not played on a traditional Famicom console (the "red and white machine") but in an emulator. The researchers developed a framework called GamingAgent to serve as a bridge between the AI and the game. In this virtual environment, the AI plays as Mario and controls the game based on basic instructions it receives from the system. The instructions include "There is an obstacle ahead! Jump!" and "Enemy is coming! Dodge!", which are simple and clear but hard to act on well. The system also provides game screenshots to help the AI understand what is happening on screen. More striking still, the AI writes Python code on the fly to direct Mario through a range of complex maneuvers.
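The loop described above (instructions and a screenshot go in, Python code comes out and is executed as button presses) can be sketched roughly as follows. This is only an illustration under stated assumptions, not GamingAgent's actual code: `Controller`, `fake_model`, `run_step`, and the prompt text are all hypothetical names invented for this sketch, and the model call is replaced by a stub.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical prompt, paraphrasing the kind of instruction the article quotes.
SYSTEM_PROMPT = "If an obstacle or enemy is near, jump or move to dodge."


@dataclass
class Controller:
    """Hypothetical stand-in for an emulator input API; it only records presses."""
    pressed: List[str] = field(default_factory=list)

    def press(self, button: str) -> None:
        self.pressed.append(button)


def fake_model(prompt: str, screenshot: bytes) -> str:
    """Stub for a real model call; returns Python source code as text."""
    # A real agent would send the prompt and the screenshot to a model API here.
    return "controller.press('right')\ncontroller.press('A')"


def run_step(model: Callable[[str, bytes], str],
             controller: Controller,
             screenshot: bytes) -> None:
    """One iteration: ask the model for code, then run it against the controller."""
    code = model(SYSTEM_PROMPT, screenshot)
    # The generated code can only see `controller`. Emptying builtins limits what
    # it can call, but this is NOT a real security sandbox.
    exec(code, {"__builtins__": {}}, {"controller": controller})


controller = Controller()
run_step(fake_model, controller, screenshot=b"")
print(controller.pressed)
```

In a real setup the stub would be swapped for a call to an actual model, and the time that call takes is what each step of the game has to wait for.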

However, the results were unexpected. Some models known for their reasoning capabilities, such as OpenAI's o1, performed far below expectations. The reason is that these "reasoning masters" respond too slowly for a real-time game and cannot decide quickly enough. In a game like "Super Mario Bros.", a delay of a few seconds can mean failure, so reaction speed became a key factor in the outcome. The finding exposes a limitation of AI in real-time tasks and points to new directions for future research.
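A bit of back-of-the-envelope arithmetic shows why a few seconds matter. The sketch below assumes a frame rate of about 60 frames per second and uses made-up distances and speeds; the function names and all numbers are illustrative assumptions, not measurements from the study.

```python
def frames_missed(latency_s: float, fps: float = 60.0) -> int:
    """Number of game frames that pass while the model is still deciding.

    fps=60.0 is an assumed, approximate frame rate.
    """
    return round(latency_s * fps)


def reacts_in_time(latency_s: float, distance_px: float, speed_px_per_s: float) -> bool:
    """True if the decision arrives before Mario reaches the obstacle.

    distance_px and speed_px_per_s are hypothetical values for illustration.
    """
    time_to_obstacle_s = distance_px / speed_px_per_s
    return latency_s < time_to_obstacle_s


# A model that answers in 0.5 s versus one that takes 3 s, with an obstacle
# 100 px ahead and Mario moving at 100 px/s (1 second away).
print(frames_missed(0.5), reacts_in_time(0.5, 100, 100))
print(frames_missed(3.0), reacts_in_time(3.0, 100, 100))
```

Under these assumptions, the slower model's answer arrives after the obstacle has already been reached, however good the answer is.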
Although games have become an important arena for comparing AI models, some experts have reservations. They argue that game worlds are too simple and abstract to fully reflect how AI performs in the real world. An AI can keep accumulating experience in a game, but whether that experience carries over to practical applications remains to be verified. OpenAI research scientist Andrej Karpathy has even spoken of an "evaluation crisis", prompting deeper reflection on how AI should be evaluated.
Despite the doubts, AI's performance in the game is still impressive. This "Mario" tournament not only shows how quickly AI technology is developing, but also offers a window onto the future. Who would have thought that AI, which once could only plan moves on a chessboard, would now show its strength in video games? Perhaps in the near future AI will truly surpass human players and become the real king of the gaming world. Let us wait and see how this technology develops.