Why is AI hallucinating more frequently, and how can we stop it?
The more advanced artificial intelligence (AI) gets, the more it "hallucinates" and provides inaccurate information.
Research conducted by OpenAI found that its latest and most powerful reasoning models, o3 and o4-mini, hallucinated 33% and 48% of the time, respectively, on the company's PersonQA benchmark. That is more than double the rate of the older o1 model. While o3 delivers more accurate information than its predecessor, that gain appears to come at the cost of more frequent hallucinations.
This raises a concern over the accuracy and reliability of large language models (LLMs) such as AI chatbots, said Eleanor Watson, an Institute of Electrical and Electronics Engineers (IEEE) member and AI ethics engineer at Singularity University.
"When a system outputs fabricated information — such as invented facts, citations or events — with the same fluency and coherence it uses for accurate content, it risks misleading users in subtle and consequential ways," Watson told Live Science.
The issue of hallucination highlights the need to carefully assess and supervise the information that LLMs and reasoning models produce, experts say.
The crux of a reasoning model is that it can handle complex tasks by essentially breaking them down into individual components and coming up with solutions to tackle them. Rather than seeking to kick out answers based on statistical probability, reasoning models come up with strategies to solve a problem, much like how humans think.
To develop creative, and potentially novel, solutions to problems, AI needs to hallucinate; otherwise it is limited by the rigid data its LLM ingests.
"It's important to note that hallucination is a feature, not a bug, of AI," Sohrob Kazerounian, an AI researcher at Vectra AI, told Live Science. "To paraphrase a colleague of mine, 'Everything an LLM outputs is a hallucination. It's just that some of those hallucinations are true.' If an AI only generated verbatim outputs that it had seen during training, all of AI would reduce to a massive search problem."
"You would only be able to generate computer code that had been written before, find proteins and molecules whose properties had already been studied and described, and answer homework questions that had already previously been asked before. You would not, however, be able to ask the LLM to write the lyrics for a concept album focused on the AI singularity, blending the lyrical stylings of Snoop Dogg and Bob Dylan."
In effect, LLMs and the AI systems they power need to hallucinate in order to create, rather than simply serve up existing information. It is similar, conceptually, to the way that humans dream or imagine scenarios when conjuring new ideas.
However, AI hallucinations present a problem when it comes to delivering accurate and correct information, especially if users take the information at face value without any checks or oversight.
"This is especially problematic in domains where decisions depend on factual precision, like medicine, law or finance," Watson said. "While more advanced models may reduce the frequency of obvious factual mistakes, the issue persists in more subtle forms. Over time, confabulation erodes the perception of AI systems as trustworthy instruments and can produce material harms when unverified content is acted upon."
And this problem looks to be exacerbated as AI advances. "As model capabilities improve, errors often become less overt but more difficult to detect," Watson noted. "Fabricated content is increasingly embedded within plausible narratives and coherent reasoning chains. This introduces a particular risk: users may be unaware that errors are present and may treat outputs as definitive when they are not. The problem shifts from filtering out crude errors to identifying subtle distortions that may only reveal themselves under close scrutiny."
Kazerounian backed this viewpoint up. "Despite the general belief that the problem of AI hallucination can and will get better over time, it appears that the most recent generation of advanced reasoning models may have actually begun to hallucinate more than their simpler counterparts — and there are no agreed-upon explanations for why this is," he said.
The situation is further complicated because it can be very difficult to ascertain how LLMs come up with their answers; a parallel could be drawn here with how we still don't really know, comprehensively, how a human brain works.
In a recent essay, Dario Amodei, the CEO of AI company Anthropic, highlighted a lack of understanding in how AIs come up with answers and information. "When a generative AI system does something, like summarize a financial document, we have no idea, at a specific or precise level, why it makes the choices it does — why it chooses certain words over others, or why it occasionally makes a mistake despite usually being accurate," he wrote.
The problems caused by AI hallucinating inaccurate information are already very real, Kazerounian noted. "There is no universal, verifiable, way to get an LLM to correctly answer questions being asked about some corpus of data it has access to," he said. "The examples of non-existent hallucinated references, customer-facing chatbots making up company policy, and so on, are now all too common."
Both Kazerounian and Watson told Live Science that, ultimately, AI hallucinations may be difficult to eliminate. But there could be ways to mitigate the issue.
Watson suggested that "retrieval-augmented generation," which grounds a model's outputs in curated external knowledge sources, could help ensure that AI-produced information is anchored by verifiable data.
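To make the idea concrete, here is a minimal sketch of the retrieval step, under stated assumptions: the knowledge base, the word-overlap ranking, and the function names `retrieve` and `build_prompt` are all hypothetical and chosen for illustration. Production systems typically use embedding-based search, and the resulting prompt would then be sent to a model, which this sketch does not do.

```python
import re

# Hypothetical curated knowledge base; the entries are illustrative only.
DOCUMENTS = [
    "The refund window for online orders is 30 days from delivery.",
    "Support is available by email on weekdays.",
    "Gift cards cannot be exchanged for cash.",
]

def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by word overlap with the question; return the top_k."""
    q_tokens = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(q_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, documents, top_k=1):
    """Build a prompt that asks the model to answer only from retrieved text."""
    sources = retrieve(question, documents, top_k)
    context = "\n".join(f"- {source}" for source in sources)
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say you do not know.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )

prompt = build_prompt("How many days is the refund window?", DOCUMENTS)
print(prompt)
```

The point of the design is that the model is handed the relevant source text alongside the question and is told to decline when the sources are silent, rather than being left to answer from memory alone.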
"Another approach involves introducing structure into the model's reasoning. By prompting it to check its own outputs, compare different perspectives, or follow logical steps, scaffolded reasoning frameworks reduce the risk of unconstrained speculation and improve consistency," Watson said, noting that this could be aided by training that shapes a model to prioritize accuracy, and by reinforcement training from human or AI evaluators that encourages an LLM to deliver more disciplined, grounded responses.
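One simple way to have a system check its own outputs is to ask the same question several times and accept an answer only when most of the samples agree. The sketch below illustrates that consistency check. It is an assumption chosen for illustration, not a method Watson described: the function name `majority_answer`, the 60% agreement threshold, and the hard-coded sample answers (standing in for repeated model calls) are all hypothetical.

```python
from collections import Counter

def majority_answer(samples, min_agreement=0.6):
    """Return the most common answer if enough samples agree, else None."""
    if not samples:
        return None
    # Normalize case and whitespace so trivially different strings match.
    normalized = [sample.strip().lower() for sample in samples]
    answer, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) >= min_agreement:
        return answer
    return None  # Too much disagreement: treat the answer as unreliable.

# Stand-ins for answers sampled from a model on the same question.
consistent = ["Paris", "paris ", "Lyon", "Paris", "Paris"]
inconsistent = ["1947", "1951", "1949"]

print(majority_answer(consistent))
print(majority_answer(inconsistent))
```

Agreement across samples does not prove an answer is true, since a model can repeat the same mistake, but disagreement is a cheap signal that an output deserves a closer look.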
"Finally, systems can be designed to recognise their own uncertainty. Rather than defaulting to confident answers, models can be taught to flag when they're unsure or to defer to human judgement when appropriate," Watson added. "While these strategies don't eliminate the risk of confabulation entirely, they offer a practical path forward to make AI outputs more reliable."
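A rough sketch of what flagging uncertainty might look like in practice: the system computes a confidence score for an answer and hands the question to a person when the score is too low. Everything here is an assumption for illustration, including the function name `answer_or_defer`, the 0.75 threshold, and the use of per-token probabilities as the confidence signal. Token probabilities measure how fluent a model finds its own text, which is at best an imperfect proxy for factual accuracy.

```python
import math

def answer_or_defer(answer, token_probs, threshold=0.75):
    """Return the answer if its confidence clears the threshold, else defer.

    Confidence is the geometric mean of the per-token probabilities,
    a hypothetical scoring choice made for this sketch.
    """
    if not token_probs or any(p <= 0 or p > 1 for p in token_probs):
        raise ValueError("token_probs must be non-empty values in (0, 1]")
    mean_logprob = sum(math.log(p) for p in token_probs) / len(token_probs)
    confidence = math.exp(mean_logprob)
    if confidence >= threshold:
        return {"status": "answered", "answer": answer, "confidence": confidence}
    return {"status": "deferred", "answer": None, "confidence": confidence}

# Illustrative probabilities, not taken from any real model.
confident = answer_or_defer("The capital is Paris.", [0.9, 0.95, 0.9])
unsure = answer_or_defer("The treaty was signed in 1951.", [0.9, 0.3, 0.5])

print(confident["status"])
print(unsure["status"])
```

Returning an explicit "deferred" status, rather than a low-confidence answer, keeps a human in the loop for exactly the cases where the model is least sure.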
Given that AI hallucination may be nearly impossible to eliminate, especially in advanced models, Kazerounian concluded that ultimately the information that LLMs produce will need to be treated with the "same skepticism we reserve for human counterparts."