Although large language models (LLMs) such as ChatGPT, Claude, and Gemini are powerful, they share a major flaw: they often produce hallucinations, that is, fabricated information presented as fact. This is not only embarrassing but also holds back wider adoption of LLMs. Even Apple has voiced concerns about how its own AI systems will handle the problem. To address it, researchers have developed a new AI hallucination detector that can identify false content generated by AI, laying the groundwork for safer and more reliable AI applications.

These hallucinations have led to many embarrassing and intriguing missteps, and they are one of the main reasons why AI systems like ChatGPT have not yet become more broadly practical. Google had to revise its AI search overviews after the feature started telling people that it was safe to eat rocks and to put glue on pizza. Lawyers who used ChatGPT to help draft court documents have even been fined because the chatbot fabricated citations in those documents.
According to the paper, the new algorithm developed by the researchers can discern whether AI-generated answers are accurate about 79 percent of the time. That is not a perfect record, but it is roughly 10 percent better than other current mainstream methods.
Chatbots like Gemini and ChatGPT can be useful, but they can just as easily generate fabricated answers. The research was conducted by members of the Department of Computer Science at the University of Oxford, and the researchers explain in their paper that the method they used is relatively simple.
First, they have the chatbot respond to the same prompt multiple times, typically five to ten times. They then calculate a value called semantic entropy, a measure of how similar or different the answers are in meaning. If the model's answers to the same prompt differ in meaning each time, the semantic entropy score is higher, suggesting the AI may be making up the answer. If the answers are identical or carry similar meanings, the semantic entropy score is lower, suggesting the answer is more consistent and more likely to be true. This is not a 100% accurate hallucination detector, but it is an interesting way of approaching the problem, as the sketch below illustrates.
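The snippet below is a minimal sketch of this idea, not the authors' implementation. It assumes a user-supplied predicate, here called `means_same`, that decides whether two answers express the same meaning (in the paper this role is played by an entailment model); answers are grouped into meaning clusters and Shannon entropy is computed over the cluster frequencies. The toy equivalence rule in the usage example is purely illustrative.

```python
import math
from typing import Callable, List


def semantic_entropy(answers: List[str],
                     means_same: Callable[[str, str], bool]) -> float:
    """Estimate semantic entropy over a set of sampled answers.

    `means_same(a, b)` should return True when two answers express the
    same meaning (e.g. checked with an NLI/entailment model). Answers are
    greedily grouped into meaning clusters, and Shannon entropy is taken
    over the cluster frequencies: one dominant cluster -> low entropy,
    many conflicting meanings -> high entropy.
    """
    clusters: List[List[str]] = []
    for ans in answers:
        for cluster in clusters:
            if means_same(ans, cluster[0]):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])

    total = len(answers)
    return -sum(
        (len(c) / total) * math.log(len(c) / total) for c in clusters
    )


# Toy usage: a hypothetical equivalence rule that only checks whether
# the answer mentions "Paris". Four agreeing answers and one outlier
# yield a fairly low entropy.
answers = ["Paris", "The capital is Paris", "Paris, France", "Lyon", "Paris"]
same = lambda a, b: ("paris" in a.lower()) == ("paris" in b.lower())
print(semantic_entropy(answers, same))  # ~0.50, i.e. mostly consistent
```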
Other methods rely on so-called naive entropy, which typically checks whether the wording of the answers differs rather than their meaning. Because it ignores the meaning behind the words, it is less likely to detect hallucinations as reliably as semantic entropy; the contrast is sketched below.
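For contrast, here is a minimal sketch of a wording-level (naive) entropy, again an illustration rather than any specific published implementation. It treats each distinct answer string as its own outcome, so a set of harmless paraphrases of the same fact looks just as uncertain as a set of genuinely conflicting answers.

```python
import math
from collections import Counter
from typing import List


def naive_entropy(answers: List[str]) -> float:
    """Shannon entropy over exact answer strings (wording, not meaning)."""
    counts = Counter(a.strip().lower() for a in answers)
    total = len(answers)
    return -sum(
        (c / total) * math.log(c / total) for c in counts.values()
    )


# Five paraphrases of the same fact are maximally "uncertain" to naive
# entropy, even though a meaning-aware measure would call them consistent.
paraphrases = [
    "Paris is the capital of France.",
    "France's capital is Paris.",
    "The capital city of France is Paris.",
    "It is Paris.",
    "Paris.",
]
print(naive_entropy(paraphrases))  # log(5) ~ 1.61: every string is distinct
```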
The researchers say the algorithm could be added to chatbots like ChatGPT via a button that gives users a "certainty score" for the answer to their prompt. Building hallucination detection directly into chatbots is an appealing idea, so it is easy to imagine such tools being added to a range of chatbots.
Although this semantic-entropy-based hallucination detector is not perfect, its roughly 79% accuracy and 10% advantage over existing methods offer a new approach to the AI hallucination problem. Research like this should help make AI systems more reliable and trustworthy.