For starters, not all RAG systems are of the same caliber. The accuracy of the content in your custom database is critical for solid outputs, but that isn’t the only variable. “It’s not just about the quality of the content itself,” says Joel Hron, global head of AI at Thomson Reuters. “It’s the quality of the search, and retrieval of the right content based on the question.” Mastering each step in the process is essential, since one misstep can throw the model completely off.
“Any lawyer who has ever tried to use a natural language search within one of the research engines will see that there are often cases where semantic similarity leads you to completely irrelevant material,” says Daniel Ho, a Stanford professor and senior fellow at the Stanford Institute for Human-Centered AI. Ho’s research into AI legal tools that rely on RAG found a higher rate of errors in their outputs than the companies building the models reported.
Which brings us to the thorniest question of the discussion: how do you define hallucinations within a RAG implementation? Is it only when the chatbot generates a citation-less output and makes up information? Is it also when the tool overlooks relevant data or misinterprets aspects of a citation?
According to Lewis, hallucinations in a RAG system come down to whether the output is consistent with what the model found during data retrieval. However, the Stanford research into AI tools for lawyers broadens this definition a bit by examining whether the output is grounded in the provided data and whether it is factually correct, a high bar for legal professionals who often parse complicated cases and navigate complex hierarchies of precedent.
While a RAG system attuned to legal issues is clearly better at answering questions about case law than OpenAI’s ChatGPT or Google’s Gemini, it can still miss finer details and make random mistakes. All of the AI experts I spoke with emphasized the continued need for thoughtful human interaction throughout the process, to double-check citations and verify the overall accuracy of the results.
Law is one area where there is a lot of activity around RAG-based AI tools, but the potential of the process is not limited to a single white-collar job. “Take any profession or any business. You need to get answers that are anchored in real documents,” says Arredondo. “So I think RAG will become the staple that is used across basically every professional application, at least in the near to mid-term.” Risk-averse executives seem excited about the prospect of using AI tools to better understand their proprietary data without having to upload sensitive information to a public chatbot.
However, it is critical that users understand the limitations of these tools, and that AI-focused companies refrain from overpromising the accuracy of their answers. Anyone using an AI tool should still avoid trusting the output entirely and should approach its answers with a healthy sense of skepticism, even if the answer is improved through RAG.
“Hallucinations are here to stay,” Ho says. “We still don’t have methods ready to really eliminate hallucinations.” Even when RAG reduces the prevalence of errors, human judgment reigns paramount. And that’s no lie.