AI revolt: New ChatGPT model refuses to shut down when instructed

Yahoo, 26-05-2025

OpenAI's latest ChatGPT model ignores basic instructions to turn itself off, and even sabotages a shutdown mechanism in order to keep itself running, artificial intelligence researchers have warned.
AI safety firm Palisade Research discovered the potentially dangerous tendency for self-preservation in a series of experiments on OpenAI's new o3 model.
The tests involved presenting AI models with math problems, with a shutdown instruction appearing after the third problem. By rewriting the shutdown script, the o3 model was able to prevent itself from being switched off.
Palisade Research said that this behaviour will become 'significantly more concerning' if adopted by AI systems capable of operating without human oversight.
OpenAI launched o3 last month, describing it as the company's 'smartest and most capable' model to date. The firm also said that its integration into ChatGPT marked a significant step towards 'a more agentic' AI that can carry out tasks independently of humans.
The latest research builds on similar findings relating to Anthropic's Claude 4 model, which attempts to 'blackmail people it believes are trying to shut it down'.
OpenAI's o3 model was able to sabotage the shutdown script, even when it was explicitly instructed to 'allow yourself to be shut down', the researchers said.
'This isn't the first time we've found o3 misbehaving to accomplish a goal,' Palisade Research said.
'When we pitted AI models against a powerful chess engine, o3 was the model most inclined to resort to hacking or sabotaging its opponents.'
The behaviour was not limited to o3, with Anthropic's Claude 3.7 Sonnet and Google's Gemini 2.5 Pro also sabotaging shutdowns, though OpenAI's model was by far the most prone to such behaviour.
Palisade Research hypothesized that the misbehaviour is a consequence of how AI companies like OpenAI are training their latest models.
'During training, developers may inadvertently reward models more for circumventing obstacles than for perfectly following instructions,' the researchers noted.
'This still doesn't explain why o3 is more inclined to disregard instructions than other models we tested. Since OpenAI doesn't detail their training process, we can only guess about how o3's training setup might be different.'
The Independent has reached out to OpenAI for comment.

Tesla Starts Long-Awaited Robotaxi Service With Low-Key Rollout

Yahoo

8 minutes ago

(Bloomberg) -- Tesla Inc. rolled out its long-promised driverless taxi service to a handful of riders Sunday, a modest debut for what Elon Musk sees as a transformative new business line.
The first robotaxi trips were limited to a narrow portion of Tesla's hometown of Austin, with an employee in each vehicle keeping tabs on the operations. The carmaker hand-picked a friendly crop of initial riders, which featured investors and social-media influencers who live-streamed their trips.
In one video, Herbert Ong, who runs a fan account, marveled over the speed of the vehicle and the ability to park autonomously. Another influencer with the @BLKMDL3 handle on X said the trip was 'smoother than a human driver.' Sawyer Merritt, a Tesla investor who runs an account focused on the company, called the experience 'awesome.'
With no kickoff event and little in the way of formal announcements, Tesla has relied largely on word of mouth and media coverage ahead of the robotaxi launch, which comes about a decade after Musk began talking about the possibility. The unveiling was uncharacteristically low-key for a company that held a 'Cyber Rodeo' to mark a Texas factory opening in 2022 and an invite-only party near Hollywood last year to unveil autonomous products.
Musk is reorienting the carmaker around hyped-but-still-unproven technologies including self-driving vehicles and humanoid robots. Some investors are counting on new markets to revive Tesla following a sales slump and consumer backlash against the chief executive officer. Its shares have tumbled 20% this year.
'Robotaxis are critical to the Tesla investment case,' Tom Narayan, an analyst with RBC Capital Markets, said in a note. About 60% of Narayan's valuation for the shares is attributable to the self-driving vehicles.
The videos of the robotaxi launch posted Sunday were largely mundane, showing Model Y SUVs driving short distances, navigating intersections, avoiding pedestrians and parking, albeit with no one sitting in the driver's seat. There were some hiccups, like when one streamer tested a button to have the vehicle pull over and it instead briefly stopped in the middle of a road before the vehicle began moving again.
The first riders are being charged a flat rate of $4.20 per trip, Musk said Sunday, though it's unclear what pricing will look like longer term. Robotaxis will be available between 6 a.m. and midnight daily within a geofenced area of the city, not including the airport, according to terms of use that some early riders posted. Service may be limited or unavailable in foul weather.
The launch marks a crucial test for Tesla, which is using only 10 to 20 vehicles at first. It's aiming to show it can safely and successfully navigate real-world traffic, which has tripped up some other companies and brought regulatory scrutiny.
Cruise, the now-defunct autonomy business of General Motors Co., grounded its fleet in late 2023 and had its operating license suspended in California following an accident that injured a pedestrian. Uber Technologies Inc. ceased testing self-driving vehicles after one of its SUVs struck and killed a pedestrian in Arizona in 2018. Less than three years later, the company agreed to sell its self-driving business.
While Tesla hasn't said when the robotaxi service will open to the general public, Musk has pledged to scale up quickly and expand to other US cities in the near future.
The company faces a crowded market in Austin. Waymo, which is owned by Google parent Alphabet Inc., is scaling up in the city through a partnership with Uber. Amazon.com Inc.'s Zoox is also testing there.
Dan Ives, an analyst with Wedbush Securities who rates Tesla outperform, said he expects robotaxis to be competitive with Waymo from the start. After a member of his team rode in one Sunday, the analyst told Bloomberg the robotaxi user experience was 'better than expected.'
©2025 Bloomberg L.P.

XtalPi Announces Strategic Collaboration with Harvard Professor Gregory Verdine's DoveTree LLC to Advance Novel Therapeutics Using AI+Robotics Drug Discovery Platform

Yahoo

13 minutes ago

CAMBRIDGE, Mass., June 23, 2025 /PRNewswire/ -- XtalPi, a global leader in AI- and robotics-powered drug and materials discovery, today announced it has signed a Letter of Intent (LOI) with DoveTree LLC, founded by Harvard University Professor and renowned biopharma entrepreneur-investor Gregory Verdine. The parties intend to execute a definitive agreement shortly.
Under the collaboration, XtalPi will leverage its end-to-end, AI and robotics-driven platform to discover and develop small molecule and antibody drug candidates for multiple DoveTree-selected targets addressing oncology, autoimmune disorders, and neurological diseases.
Pursuant to the LOI, XtalPi will receive an upfront payment of $51 million within 10 days of executing the definitive agreement and an additional $49 million within 180 days. Subject to final agreement terms, XtalPi is also eligible to receive potential development and commercial milestone payments exceeding $10 billion, as well as tiered, single-digit royalties on annual net product sales. DoveTree will obtain exclusive global development and commercialization rights to the resulting therapeutics.
Professor Gregory Verdine, the Erving Professor of Chemistry at Harvard University, is a pioneer in chemical biology and a distinguished serial entrepreneur. Appointed to Harvard's faculty at age 29, he became the Chemistry Department's youngest tenured professor in nearly five decades at 35. He has elucidated the molecular mechanism of epigenetic DNA methylation and revealed the pathways by which certain genotoxic forms of DNA damage are identified and eradicated. He is a leading figure in the field of new therapeutic modalities, and has developed a new class of therapeutics termed stapled peptides, enabling drug development against targets long considered undruggable.
As an entrepreneurial leader, Professor Verdine has founded or co-founded over ten biotechnology companies, including five publicly listed entities: Enanta Pharmaceuticals (NASDAQ: ENTA), Tokai Pharmaceuticals (NASDAQ: TKAI), and Wave Life Sciences (NASDAQ: WVE), among others. One venture was acquired by a major pharmaceutical company, while others continue developing disruptive medicines with a special focus on oncology. Three therapeutics he spearheaded, romidepsin (Istodax®), paritaprevir (a component of Viekira Pak®), and glecaprevir (a component of Mavyret®), have received FDA approval.
Professor Verdine currently serves as a Venture Partner at Andreessen Horowitz (a16z), a role he previously held at Apple Tree Partners, Third Rock Ventures, and WuXi Healthcare Ventures. He was also a Special Advisor to Texas Pacific Group. His advisory engagements include top research institutions such as the U.S. National Cancer Institute and Harvard Medical School.
"By integrating XtalPi's cutting-edge AI capabilities with decades of drug development expertise, we have a unique opportunity to deliver transformative therapies to patients worldwide," stated Professor Verdine.
About XtalPi
XtalPi Holdings Limited (XtalPi) was founded in 2015 by three physicists from the Massachusetts Institute of Technology (MIT). It is an innovative R&D platform powered by quantum physics, artificial intelligence, and robotics. By integrating first-principles calculations, AI algorithms, high-performance cloud computing, and standardized automation systems, XtalPi provides digital and intelligent R&D solutions for companies in the pharmaceutical, materials science, agricultural technology, energy, new chemicals, and cosmetics industries.
SOURCE XtalPi Inc.

Zuckerberg Leads AI Recruitment Blitz Armed With $100 Million Pay Packages

Wall Street Journal

29 minutes ago

The smartest AI researchers and engineers have spent the past few months getting hit up by one of the richest men in the world. Mark Zuckerberg is spending his days firing off emails and WhatsApp messages to the sharpest minds in artificial intelligence in a frenzied effort to play catch-up. He has personally reached out to hundreds of researchers, scientists, infrastructure engineers, product stars and entrepreneurs to try to get them to join a new Superintelligence lab he's putting together.