Princeton University has released SWE-agent, described as the world's first open source AI programmer. It is built on GPT-4 and can automatically fix bugs in GitHub repositories. On the SWE-bench test set, SWE-agent solved 12.29% of the problems in an average of only 93 seconds, an accuracy comparable to that of existing AI programmers. Because it is open source, it quickly attracted wide attention, demonstrating the potential of AI in software engineering. This article looks at how SWE-agent works, how it performs, and what it may mean for the future of software engineering.
Princeton University recently launched SWE-agent, billed as the world's first open source AI programmer. It is based on GPT-4 and can automatically fix bugs in GitHub repositories. Built around what its authors call an agent-computer interface, it marks an important step in applying AI to software engineering.
SWE-agent's performance on the SWE-bench test set is impressive. It solved 12.29% of the problems in an average of only 93 seconds, and its accuracy was comparable to that of Devin, an AI programmer launched earlier. After its open source release, SWE-agent quickly gained 1.6k stars and 109 forks on GitHub, a sign of the open source community's strong interest in the technology.
SWE-agent works by interacting with a specialized terminal in which it can open, scroll, and search files, edit specific lines with automatic syntax checking, and write and execute tests. Much like a UI designed for humans, this interface is built to prevent errors and to give the model feedback on its actions. For example, when dealing with a bug in a matrix operation, SWE-agent can reproduce the problem, locate the faulty code, modify it, and confirm that the bug is fixed. The Princeton researchers designed a concise command and feedback format for SWE-agent, making it easier for the model to browse a code repository and to view, edit, and run code files.
The workflow of SWE-agent is divided into two stages: inference and evaluation. In the inference stage, SWE-agent takes an issue from GitHub and generates a fix; in the evaluation stage, it checks whether that fix actually resolves the issue.
The core authors of this study, John Yang and Carlos E. Jimenez, are both research assistants and doctoral students at Princeton University. Their research interests focus on language foundations, interaction, LLM benchmarking, software engineering, and code generation.
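To make the file-viewing and editing commands of the specialized terminal more concrete, here is a minimal Python sketch of a windowed file viewer whose edit command refuses changes that break the syntax. It illustrates the general idea only and is not SWE-agent's actual code: the `FileWindow` class, its method names, and the five-line window size are all hypothetical, and the syntax check simply uses Python's standard `ast.parse`, so it applies only to Python source.

```python
import ast
from dataclasses import dataclass


@dataclass
class FileWindow:
    """Hypothetical agent-facing file viewer (an illustration, not SWE-agent's code)."""

    lines: list[str]
    window: int = 5  # assumed number of lines shown at once
    top: int = 0     # zero-based index of the first visible line

    def view(self) -> str:
        """Return the visible lines, each prefixed with its 1-based line number."""
        visible = self.lines[self.top:self.top + self.window]
        return "\n".join(f"{self.top + i + 1}: {text}" for i, text in enumerate(visible))

    def scroll_down(self) -> str:
        """Move the window one page down, stopping at the end of the file."""
        max_top = max(0, len(self.lines) - self.window)
        self.top = min(self.top + self.window, max_top)
        return self.view()

    def scroll_up(self) -> str:
        """Move the window one page up, stopping at the start of the file."""
        self.top = max(0, self.top - self.window)
        return self.view()

    def search(self, term: str) -> list[int]:
        """Return the 1-based numbers of all lines that contain term."""
        return [i + 1 for i, text in enumerate(self.lines) if term in text]

    def edit(self, start: int, end: int, replacement: str) -> str:
        """Replace lines start..end (1-based, inclusive) if the result still parses."""
        candidate = self.lines[:start - 1] + replacement.split("\n") + self.lines[end:]
        try:
            ast.parse("\n".join(candidate))
        except SyntaxError as err:
            # Reject the edit and report why, leaving the file unchanged.
            return f"Edit rejected: {err.msg} (line {err.lineno})"
        self.lines = candidate
        return "Edit applied.\n" + self.view()


if __name__ == "__main__":
    viewer = FileWindow("def add(a, b):\n    return a - b".splitlines())
    print(viewer.search("return"))               # find the suspicious line
    print(viewer.edit(2, 2, "    return a +"))   # broken edit is rejected
    print(viewer.edit(2, 2, "    return a + b")) # valid edit is applied
```

Rejecting a bad edit with a short message, rather than saving it, is the kind of error prevention and feedback the article describes: the model sees immediately that its change did not go through and can try again.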
The work of Yang and Jimenez demonstrates the potential of AI in software engineering and prompts deeper thinking about the respective roles and capabilities of AI and humans in the engineering process. As AI technology develops, AI programmers keep advancing: they can not only fix bugs but also plan and execute complex engineering tasks, and in some cases take on an entire project's development process. However, despite the progress AI has made in writing secure code, human oversight remains critical. AI has not reached the stage of completely replacing software engineers, but it is changing the face and future direction of the technology field. This open source AI programmer from the Princeton team brings new research and application prospects to software engineering, and it gives us an opportunity to think about how artificial intelligence and humans can work together. As technology continues to advance, we may see AI play an even more important role in software engineering.
The emergence of SWE-agent marks a new milestone in AI-assisted programming. Although human supervision is still needed, its potential to improve development efficiency and code quality cannot be ignored. In the future, collaboration between AI and human programmers is likely to become a mainstream trend in software engineering, with both jointly driving technological progress.