A team of researchers from the University of Surrey in the UK and Stanford University in the US has developed a method that enables artificial intelligence to understand line-drawing sketches made by humans, even when the sketches come from non-experts. The model achieves near-human-level accuracy in recognizing scene sketches, laying the groundwork for more powerful human-computer interaction and more efficient design workflows. Beyond recognizing the objects in a sketch, it can work out which object each stroke depicts, offering a new way for artificial intelligence to understand human visual expression.
The researchers, from the University of Surrey in the UK and Stanford University in the US, taught artificial intelligence (AI) to understand human line-drawing sketches, even those drawn by non-artists. The resulting model approaches human-level performance in recognizing scene sketches.

Dr Yulia Gryaditskaya, lecturer at the University of Surrey's Centre for Vision, Speech and Signal Processing (CVSSP) and the Surrey Institute for People-Centred AI (PAI), said: "Sketching is a powerful language of visual communication. It is sometimes even more expressive and flexible than spoken language. Developing tools to understand sketches is a step toward more powerful human-computer interaction and more efficient design workflows."
Regardless of age and background, people use drawing to explore new ideas and communicate. However, AI systems have long struggled to understand sketches. An AI must be taught to understand images, which typically requires a slow, labor-intensive process of collecting labels for every pixel in an image. The AI then learns from these labels.
The research team instead taught the AI using a combination of sketches and written descriptions. It learned to group pixels and match each group to a category named in the description. As a result, the AI shows a richer, more human-like understanding of drawings than previous systems. It correctly identified and labeled kites, trees, giraffes and other objects with 85% accuracy, outperforming other models that relied on labeled pixels. In addition to identifying objects in complex scenes, it can determine which object each stroke is meant to depict. The method works not only on informal sketches made by non-artists, but also on drawings of objects it was not explicitly trained on.
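The idea of matching pixels to the categories in a description, and then deciding which object each stroke depicts, can be illustrated with a toy sketch. This is not the authors' model: the similarity scores are made-up numbers standing in for whatever a trained network would produce, and the names `label_pixels`, `label_strokes`, `pixel_scores` and `strokes` are hypothetical. Each pixel takes the best-scoring category, and each stroke takes the most common label among its pixels.

```python
from collections import Counter


def label_pixels(pixel_scores, categories):
    """Assign each pixel the category with the highest similarity score."""
    labels = {}
    for pixel, scores in pixel_scores.items():
        best_index = max(range(len(categories)), key=lambda i: scores[i])
        labels[pixel] = categories[best_index]
    return labels


def label_strokes(strokes, pixel_labels):
    """Assign each stroke the most common label among its pixels."""
    stroke_labels = {}
    for stroke_id, pixels in strokes.items():
        votes = Counter(pixel_labels[p] for p in pixels)
        stroke_labels[stroke_id] = votes.most_common(1)[0][0]
    return stroke_labels


# Categories taken from a written description such as "a kite above a tree".
categories = ["kite", "tree"]

# Hypothetical per-pixel similarity scores, one value per category.
pixel_scores = {
    (0, 0): [0.9, 0.1],
    (0, 1): [0.8, 0.2],
    (1, 1): [0.4, 0.6],
    (5, 5): [0.2, 0.8],
    (5, 6): [0.1, 0.9],
}

# Each stroke is the list of pixels it covers (also hypothetical).
strokes = {
    "stroke_a": [(0, 0), (0, 1), (1, 1)],
    "stroke_b": [(5, 5), (5, 6)],
}

pixel_labels = label_pixels(pixel_scores, categories)
stroke_labels = label_strokes(strokes, pixel_labels)
print(stroke_labels)
```

In this toy data one pixel of `stroke_a` scores higher for "tree", but the majority vote still labels the whole stroke as a kite, which shows how a stroke-level decision can smooth over noisy pixel-level guesses.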
Judith Fan, assistant professor of psychology at Stanford University, said: "Drawing and writing are among the most quintessentially human activities and have long been used to capture people's observations and ideas. This work represents exciting progress toward AI systems that understand the essence of the ideas people are trying to convey, whether they use images or text."
The research was conducted as part of the Surrey Institute for People-Centred AI, specifically its SketchX programme. SketchX uses AI to try to understand the way we see the world through the way we draw it.
Professor Yi-Zhe Song, co-director of the Surrey Institute for People-Centred AI and head of SketchX, said: "This research is a prime example of how AI can enhance basic human activities such as sketching. By understanding rough sketches with near-human accuracy, this technology has huge potential to enhance people's natural creativity, regardless of artistic talent."
Paper address: https://arxiv.org/abs/2312.12463
These results advance AI's capabilities in image understanding and human-computer interaction. The approach is expected to find use in design, artistic creation and other fields, furthering collaboration between people and AI. It also shows the potential of AI to make sense of unstructured human input.