In the field of artificial intelligence, Noam Brown, director of reasoning research at OpenAI, recently made a thought-provoking remark at the Nvidia GTC conference. He argued that if researchers had known the right methods and algorithms 20 years ago, some forms of "reasoning" AI models might have appeared long ago. This view points to possible blind spots and neglected research directions in the development of artificial intelligence.

Brown reviewed his experience in game-playing AI research at Carnegie Mellon University, in particular the development of Pluribus, a system that beat top human poker players. He emphasized that what set this AI system apart was its "reasoning" ability rather than reliance on brute-force computation. This reasoning ability lets the AI deliberate in complex situations, much as humans think through difficult problems.
As one of the architects of OpenAI's o1 model, Brown introduced an innovative technique known as "test-time inference." It allows the model to "think" before responding to a query, using additional computation to drive a form of reasoning. Such reasoning models show higher accuracy and reliability in fields such as mathematics and science, opening new directions for AI development.
During the discussion, Brown also addressed academia's role in AI research. Although universities generally lack computing resources, he believes academics can still contribute meaningfully by exploring areas with low compute requirements, such as model architecture design. He stressed the importance of collaboration between frontier laboratories and academia, noting that frontier labs closely follow academic publications and assess whether the arguments they present are sufficiently convincing.
Brown singled out AI benchmarking as an area where academia can play an important role. He criticized the current state of AI benchmarks, pointing out that these tests often probe esoteric knowledge and correlate poorly with proficiency on the tasks most people care about. He called for improved benchmarks, arguing that this work does not require large amounts of compute yet could significantly advance the understanding of model capabilities and progress.
It is worth noting that Brown's remarks drew mainly on his game-playing AI research before joining OpenAI, rather than on reasoning models like o1. His views offer a fresh perspective on AI research, highlighting the importance of reasoning capabilities in the development of AI and the potential for collaboration between academia and frontier laboratories.