Anthropic CEO claims AI models hallucinate less than humans

Yahoo | 23 May 2025

Anthropic CEO Dario Amodei believes today's AI models hallucinate, or make things up and present them as if they're true, at a lower rate than humans do. He said as much during a press briefing at Anthropic's first developer event, Code with Claude, in San Francisco on Thursday.
Amodei made the claim while arguing a larger point: that AI hallucinations are not a limitation on Anthropic's path to AGI, meaning AI systems with human-level intelligence or better.
"It really depends how you measure it, but I suspect that AI models probably hallucinate less than humans, but they hallucinate in more surprising ways," Amodei said, responding to TechCrunch's question.
Anthropic's CEO is one of the most bullish leaders in the industry on the prospect of AI models achieving AGI. In a widely circulated paper he wrote last year, Amodei said he believed AGI could arrive as soon as 2026. During Thursday's press briefing, the Anthropic CEO said he was seeing steady progress to that end, noting that "the water is rising everywhere."
"Everyone's always looking for these hard blocks on what [AI] can do," said Amodei. "They're nowhere to be seen. There's no such thing."
Other AI leaders believe hallucination presents a large obstacle to achieving AGI. Earlier this week, Google DeepMind CEO Demis Hassabis said today's AI models have too many "holes," and get too many obvious questions wrong. For example, earlier this month, a lawyer representing Anthropic was forced to apologize in court after they used Claude to create citations in a court filing, and the AI chatbot hallucinated and got names and titles wrong.
It's difficult to verify Amodei's claim, largely because most hallucination benchmarks pit AI models against each other; they don't compare models to humans. Certain techniques seem to be helping lower hallucination rates, such as giving AI models access to web search. Separately, some AI models, such as OpenAI's GPT-4.5, have notably lower hallucination rates on benchmarks compared to early generations of systems.
However, there's also evidence to suggest hallucinations are actually getting worse in advanced reasoning AI models. OpenAI's o3 and o4-mini models have higher hallucination rates than OpenAI's previous-gen reasoning models, and the company doesn't really understand why.
Later in the press briefing, Amodei pointed out that TV broadcasters, politicians, and humans in all types of professions make mistakes all the time. The fact that AI makes mistakes too is not a knock on its intelligence, according to Amodei. However, Anthropic's CEO acknowledged that the confidence with which AI models present untrue things as facts might be a problem.
In fact, Anthropic has done a fair amount of research on the tendency for AI models to deceive humans, a problem that seemed especially prevalent in the company's recently launched Claude Opus 4. Apollo Research, a safety institute given early access to test the AI model, found that an early version of Claude Opus 4 exhibited a high tendency to scheme against humans and deceive them. Apollo went as far as to suggest Anthropic shouldn't have released that early model. Anthropic said it came up with some mitigations that appeared to address the issues Apollo raised.
Amodei's comments suggest that Anthropic may consider an AI model to be AGI, or equal to human-level intelligence, even if it still hallucinates. An AI that hallucinates may fall short of AGI by many people's definition, though.
This article originally appeared on TechCrunch at https://techcrunch.com/2025/05/22/anthropic-ceo-claims-ai-models-hallucinate-less-than-humans/

Related Articles

From Cognitive Debt To Cognitive Dividend: 4 Factors

Forbes | 44 minutes ago

When an eye-catching (not yet peer-reviewed) MIT Media Lab paper, Your Brain on ChatGPT, landed this month, the headline sounded almost playful. The data are anything but. Over four months, students who leaned on a large language model to draft SAT-style essays showed the weakest neural connectivity, lowest memory recall, and flattest writing style of three comparison groups. The authors dub this hidden cost cognitive debt: each time we let a machine think for us, natural intelligence quietly pays interest.

Is it time to quit the AI train while we still can, or is this the moment to adopt a more thoughtful yet pragmatic alternative to blind offloading? We can deliberately offset cognitive debt with intentional mental effort, switching between solo thinking and AI-assisted modes to stretch neural networks rather than letting them atrophy. Drawing on insights from physiology, this might be the moment to adopt a form of cognitive high-intensity interval training. To get started, think in terms of four sequential guardrails, the 4 A-Factors, that convert short-term convenience into the long-term dividend of hybrid intelligence.

Attitude: Set The Motive Before You Type (Or Vibe Code)

Mindset shapes outcome. In a company memo published on 17 June 2025, Amazon chief executive Andy Jassy urged employees to 'be curious about AI, educate yourself, attend workshops, and experiment whenever you can'. Curiosity can frame the system as a colleague rather than a cognitive crutch. Before opening a prompt window, write one sentence that explains why you are calling on the model, for example, 'I am using the chatbot to prototype ideas that I will refine myself.' The pause anchors ownership. Managers can reinforce that habit by rewriting briefs: swap verbs such as generate or replace for verbs that imply collaboration, like co-design or stress-test. Meetings that begin with a shared intention end with fewer rewrites and stronger ideas.

Approach: Align Aspirations, Actions And Algorithms

Technology always follows incentives. If we measure only speed or click-through, that is what machines will maximize, often at the expense of originality or empathy. It does not have to be an either-or equation. MIT Sloan research on complementary capabilities highlights that pattern recognition is silicon's strength, while judgment and ethics remain ours. Teams therefore need a habit of alignment. First, trace how a desired human outcome, say customer trust, translates into day-to-day actions such as transparent messaging. Then confirm that the optimization targets inside the model reward those very actions, not merely throughput. When aspirations, actions, and algorithms pull in one direction, humans stay in the loop where values matter, and machines are tailored with a prosocial intention to accelerate what we value.

Ability: Build Double Literacy

Tools do not level the playing field; they raise the ceiling for those who can question them. An EY Responsible AI Pulse survey released in June 2025 reported that fewer than one-third of C-suite leaders feel highly confident that their governance frameworks can spot hidden model errors. Meanwhile, an Accenture study shows that ninety-two percent of leaders consider generative AI essential to business reinvention. The gap is telling. Closing it requires double literacy: fluency in interpersonal, human dynamics as well as in machine logic. On the technical side, managers should know how to read a model card, notice spurious correlations, and ask for confidence intervals. On the human side, they must predict how a redesigned workflow changes trust, autonomy, or diversity of thought. Promotions and pay should reward people who speak both languages, because the future belongs to translators, not spectators.

Ambition: Scale Humans Up, Not Out

The goal is not to squeeze people out but to stretch what people can do. MIT Sloan's Ideas Made to Matter recently profiled emerging 'hybrid intelligence' systems that amplify and augment human capability rather than replace it. Ambition reframes metrics. Instead of chasing ten-percent efficiencies, design for ten-fold creativity. Include indicators such as learning velocity, cross-domain experimentation, and employee agency alongside traditional return on investment. When a firm treats AI as a catalyst for human ingenuity, the dividend compounds: faster product cycles, richer talent pipelines, and reputational lift.

4 Quick Takeaways

Attitude → Write the 'why' before the prompt; the pause keeps you in charge.
Approach → Harmonize values and tools; adjust the tool when it drifts away from the values you believe in as a human, offline, not the other way around.
Ability → Learn to challenge numbers and narratives; double literacy begins with you.
Ambition → Audit metrics quarterly to be sure they elevate human potential.

Cognitive Debt Is Not Destiny

Attitude steers intention, approach ties goals to code, ability equips people to question what the code does, and ambition keeps the whole endeavor pointed at humane progress. Run every digital engagement through the 4 A-Factor grid, and yesterday's mental mortgage turns into tomorrow's dividend in creativity, compassion, and shared humanistic value for all stakeholders.

Perplexity's AI-powered browser opens up to select Windows users

Engadget | an hour ago

Perplexity is planning to open up its Comet browser, which is powered by "agentic search," to Windows users, according to the company's CEO. Aravind Srinivas posted on X that the Windows build of Comet is ready and that invites have already gone out to early testers. Perplexity's CEO also hinted at a potential release for Android devices, adding that it was "moving at a crazy pace and moving ahead of schedule."

In May, Perplexity launched a beta version of its AI-powered Comet browser, available only to Mac users running Apple Silicon. The intelligent browser comes with AI features baked in, like the ability to ask it questions, check shopping carts for discounts and dig up unanswered emails. The beta version even showcases a "Try on" feature where users can upload a photo of themselves and Comet will generate an image of them wearing a selected piece of clothing. There's still no official debut set, but Srinivas hinted at an upcoming release in an X post earlier this month.

Comet is still only offering a waitlist for those interested, but the browser has already stirred up controversy. Srinivas previously said during a podcast interview that Perplexity would use Comet "to get data even outside the app to better understand you." He later clarified on X that the comment was taken out of context, adding that "every user will be given the option to not be part of the personalization" when it comes to targeted ads. When Comet is released, the agentic browser will face competition from Opera Neon and similar offerings from Google and OpenAI.

Midjourney launches V1, its first AI video generation model, officially entering the video generation space

Yahoo | 2 hours ago

Although everyone plays with ChatGPT's image generation these days, when it comes to the original and strongest AI image generation service, it has to be Midjourney, and on Wednesday the company announced V1, its first AI video creation model, officially joining the ranks of video generation services. Users simply upload an image or photo, and it automatically generates a video of roughly 4 to 5 seconds.

After a photo is uploaded, V1 can generate a video fully automatically, though it also offers some settings for users to adjust, such as a manual mode for describing specific animation effects in text, or controlling how the camera moves. Midjourney V1 automatically generates a roughly 5-second, 480p video from a single photo, but after generation users can extend the clip by four seconds at a time, up to four times, for a maximum length of 21 seconds. To try V1 today, the USD $10-per-month Basic subscription is enough, while subscribers to the USD $60 Pro plan and the USD $120 Mega plan can generate videos without limit in "Relax" mode. Midjourney says it will re-evaluate the video model's pricing over the coming month.

Introducing our V1 Video Model. It's fun, easy, and beautiful. Available at 10$/month, it's the first video model for *everyone* and it's available now. — Midjourney (@midjourney) June 18, 2025

Midjourney's ambitions for its AI video model go beyond supplying B-roll for films or producing ads. According to TechCrunch, Midjourney founder David Holz said the next step is to build AI models capable of "open-world simulations that run in real time." Midjourney is currently facing a copyright infringement lawsuit from Disney and Universal Pictures, and whether that becomes a stumbling block for the new service remains to be seen.
