
Why superintelligent AI isn't taking over anytime soon
A primary requirement for being a leader in AI these days is to be a herald of the impending arrival of our digital messiah: superintelligent AI.
For Dario Amodei of Anthropic, Demis Hassabis of Google and Sam Altman of OpenAI, it isn't enough to claim that their AI is the best. All three have recently insisted that it's going to be so good, it will change the very fabric of society.
Even Meta—whose chief AI scientist has been famously dismissive of this talk—wants in on the action. The company confirmed it is spending $14 billion to bring in a new leader for its AI efforts who can realize Mark Zuckerberg's dream of AI superintelligence—that is, an AI smarter than we are.
"Humanity is close to building digital superintelligence," Altman declared in an essay this week, and this will lead to "whole classes of jobs going away" as well as "a new social contract." Both will be consequences of AI-powered chatbots taking over all our white-collar jobs, while AI-powered robots assume the physical ones.
Before you get nervous about all the times you were rude to Alexa, know this: A growing cohort of researchers who build, study and use modern AI aren't buying all that talk.
The title of a fresh paper from Apple says it all: "The Illusion of Thinking." In it, a half-dozen top researchers probed reasoning models (large language models that "think" about problems longer, across many steps) from the leading AI labs, including OpenAI, DeepSeek and Anthropic. They found little evidence that these models are capable of reasoning anywhere close to the level their makers claim.
Generative AI can be quite useful in specific applications, and a boon to worker productivity. OpenAI claims 500 million monthly active ChatGPT users, an astonishing reach and rate of growth for a service released just 2½ years ago. But these critics argue there is a significant hazard in overestimating what it can do, and in making business plans, policy decisions and investments based on pronouncements that seem increasingly disconnected from the products themselves.
Apple's paper builds on previous work from many of the same engineers, as well as notable research from both academia and other big tech companies, including Salesforce. These experiments show that today's "reasoning" AIs, hailed as the next step toward autonomous AI agents and, ultimately, superhuman intelligence, are in some cases worse at solving problems than the plain-vanilla AI chatbots that preceded them. This work also shows that whether you're using an AI chatbot or a reasoning model, all systems fail utterly at more complex tasks.
Apple's researchers found "fundamental limitations" in the models. When taking on tasks beyond a certain level of complexity, these AIs suffered "complete accuracy collapse." Similarly, engineers at Salesforce AI Research concluded that their results "underscore a significant gap between current LLM capabilities and real-world enterprise demands."
Importantly, the problems these state-of-the-art AIs couldn't handle are logic puzzles that even a precocious child could solve, with a little instruction. What's more, when you give these AIs that same kind of instruction, they can't follow it.
Apple's paper has set off a debate in tech's halls of power—Signal chats, Substack posts and X threads—pitting AI maximalists against skeptics.
"People could say it's sour grapes, that Apple is just complaining because they don't have a cutting-edge model," says Josh Wolfe, co-founder of venture firm Lux Capital. "But I don't think it's a criticism so much as an empirical observation."
The reasoning methods in OpenAI's models are "already laying the foundation for agents that can use tools, make decisions, and solve harder problems," says an OpenAI spokesman. "We're continuing to push those capabilities forward."
The debate over this research begins with the implication that today's AIs aren't thinking, but instead are creating a kind of spaghetti of simple rules to follow in every situation covered by their training data.
Gary Marcus, a cognitive scientist who sold an AI startup to Uber in 2016, argued in an essay that Apple's paper, along with related work, exposes flaws in today's reasoning models, suggesting they're not the dawn of human-level ability but rather a dead end. "Part of the reason the Apple study landed so strongly is that Apple did it," he says. "And I think they did it at a moment in time when people have finally started to understand this for themselves."
In areas other than coding and mathematics, the latest models aren't getting better at the rate that they once did. And the newest reasoning models actually hallucinate more than their predecessors.
"The broad idea that reasoning and intelligence come with greater scale of models is probably false," says Jorge Ortiz, an associate professor of engineering at Rutgers, whose lab uses reasoning models and other cutting-edge AI to sense real-world environments. Today's models have inherent limitations that make them bad at following explicit instructions, the opposite of what you'd expect from a computer, he adds.
It's as if the industry is creating engines of free association. They're skilled at confabulation, but we're asking them to take on the roles of consistent, rule-following engineers or accountants.
That said, even those who are critical of today's AIs hasten to add that the march toward more-capable AI continues.
Exposing current limitations could point the way to overcoming them, says Ortiz. For example, new training methods—giving step-by-step feedback on models' performance, adding more resources when they encounter harder problems—could help AI work through bigger problems, and make better use of conventional software.
From a business perspective, whether or not current systems can reason, they're going to generate value for users, says Wolfe.
"Models keep getting better, and new approaches to AI are being developed all the time, so I wouldn't be surprised if these limitations are overcome in practice in the near future," says Ethan Mollick, a professor at the Wharton School of the University of Pennsylvania, who has studied the practical uses of AI.
Meanwhile, the true believers are undeterred.
Just a decade from now, Altman wrote in his essay, "maybe we will go from solving high-energy physics one year to beginning space colonization the next year." Those willing to "plug in" to AI with direct, brain-computer interfaces will see their lives profoundly altered, he adds.
This kind of rhetoric accelerates AI adoption in every corner of our society. AI is now being used by DOGE to restructure our government, leveraged by militaries to become more lethal, and entrusted with the education of our children, often with unknown consequences.
Which means that one of the biggest dangers of AI is that we overestimate its abilities, trust it more than we should (even as it's shown itself to have antisocial tendencies such as "opportunistic blackmail") and rely on it more than is wise. In so doing, we make ourselves vulnerable to its propensity to fail when it matters most.
"Although you can use AI to generate a lot of ideas, they still require quite a bit of auditing," says Ortiz. "So for example, if you want to do your taxes, you'd want to stick with something more like TurboTax than ChatGPT."
Write to Christopher Mims at christopher.mims@wsj.com