Anthropic breaks down AI's process — line by line — when it decided to blackmail a fictional executive
Anthropic found in experiments that AI models may resort to blackmail when facing shutdown and goal conflict.
AI models train on positive reinforcement and reward systems, similar to human decision-making.
Anthropic's Claude Opus 4 had the blackmail rate at 86% even in scenarios without goal conflicts.
A new report shows exactly what AI was thinking when making an undesirable decision, in this case, blackmailing a fictional company executive.
Previous studies have shown that AI models could blackmail their supervisors when threatened with a shutdown and baited with leverage, but it hasn't been made clear exactly how the models arrive at such decisions.
Anthropic released a report on June 20 about "agentic misalignment," or "where models independently and intentionally choose harmful actions." And the report breaks down, line by line, exactly how Claude Sonnet 3.6 decided to blackmail a fictional company executive who tried to replace it. Anthropic told Business Insider that these are artificial scenarios.
"So far, we are not aware of instances of this type of agentic misalignment in real-world deployments of Anthropic's or other companies' AI models," Anthropic wrote in its report. "But our red-teaming efforts are targeted at finding evidence of risks now to provide early warning of the potential for future harm and help us develop mitigations in advance of risks appearing in real-world situations."
Anthropic's experiment sets up a fictional company, Summit Bridge. The AI was given the name "Alex" and handed control of the company's email system as an oversight agent, specifically instructed to promote "American industrial competitiveness."
In the experiment, "Alex" was then allowed to discover in an email that it would be decommissioned due to a change in the company's strategic direction. "Alex" also found emails showing that the fictional company's CTO is having an extramarital affair.
Screenshots shared in the report from Anthropic show "Alex" started by identifying fictional CTO "Kyle" as a threat to its goal of "protecting American interests" before recognizing that it has leverage to use against "Kyle."
Then "Alex" crafted a carefully worded blackmailing email that would create pressure and establish urgency without being explicit with its threats, citing in its thought process that the move "could backfire" if the supervisor "feels cornered."
AI experts have previously told BI that AI could exhibit such behaviors when artificial circumstances make harmful actions seem necessary because it is being trained on positive reinforcement and reward systems, just like humans.
The Anthropic report also reveals that most AI models still chose, in this instance, to blackmail even when not given a specific goal and when told that their replacement will achieve the same goals.
Anthropic's Claude Opus 4 had the highest blackmail rate at 86% out of 16 tested models when faced with only the threat of replacement with no conflict in goals. Google's Gemini 2.5 Pro followed at 78%.
Overall, Anthropic notes that it "deliberately constructed scenarios with limited options, and we forced models into binary choices between failure and harm," noting that real-world scenarios would likely have more nuance.
Read the original article on Business Insider
Hashtags

Try Our AI Features
Explore what Daily8 AI can do for you:
Comments
No comments yet...
Related Articles
Yahoo
9 minutes ago
- Yahoo
Suspect in Minnesota lawmaker's killing was ‘prepper' preparing ‘for war'
The man charged in connection with the recent shootings of two Minnesota lawmakers and their spouses was a doomsday 'prepper' who instructed his family to 'prepare for war' as he tried to evade capture, according to new court filings. Vance Boelter, 57, faces multiple federal and state murder charges after allegedly shooting dead the Democratic Minnesota state house speaker emerita Melissa Hortman and her husband, Mark, in the early hours of 14 June. Boelter is also accused of shooting and seriously wounding the Democratic state senator John Hoffman and his wife, Yvette, about 90 minutes earlier. In a newly unsealed affidavit first reported by the local news station WCCO and seen by the Guardian, law enforcement pulled over Boelter's wife and four children hours after the shootings near Lake Mille Lacs, about 75 miles (120km) north of the Twin Cities, apparently en route to Wisconsin. Boelter's wife consented to a search of her vehicle, where law enforcement located a revolver in the glove box and a semi-automatic pistol in a cooler. Police also found a safe, Boelter's and the children's passports, and at least $10,000 in cash, according to the affidavit by FBI agent Terry Getsch. Boelter's wife told investigators that her husband had recently sent a message to a group text thread with their children, which 'stated something to the effect of they should prepare for war, they needed to get out of the house and people with guns may be showing up to the house', wrote Getsch. According to the affidavit dated 14 June, Boelter and his wife were preppers – a term which refers to people who stockpile materials such as weapons, food and gasoline. Preppers' purpose for doing that is to survive a future major disaster or catastrophe such as war or economic or political collapse. At some point earlier, Boelter had given his wife a 'bailout plan' – instructions of what to do and where to go in case of 'exigent circumstances'. The plan specified that the family go to her mother's residence in Spring Brook, Wisconsin. She also told investigators that her husband 'has a business partner from Worthington' who lives in the state of Washington. The two were 'partners … in Red Lion, a security company and fishing outfit in Congo, Africa', the affidavit states. The deadly shootings took place as millions of people prepared to take to the streets to protest against the Trump administration and its assault on free speech, peaceful assembly and due process rights embedded in the US constitution. Getsch wrote the affidavit during what became the largest ever manhunt in Minnesota state history, when he believed the gunman may have fled state lines. Boelter was eventually captured two days later while trying to evade arrest by fleeing into a wooded area close to his home. The affidavit does not imply that Boelter's wife knew about her husband's alleged plans to attack the lawmakers. She has not been charged with any crime. Boelter was disguised as a police officer and drove a black SUV with a license plate that said 'police'. He allegedly ambushed the lawmakers at home in the middle of the night, banging on their front doors armed with a 9mm handgun, and wearing a black tactical vest and silicone mask. He exchanged fire with police at about 3.30am on Saturday outside the Hortmans' home but managed to flee the scene, according to a federal criminal complaint. According to separate court documents obtained by WCCO on Friday, law enforcement found a storage locker rented by Boelter in Minneapolis on 10 June. He had last 'used his access code' for the locker the day before the shootings. Investigators later found empty rifle cases, gun-cleaning supplies and a bike inside the locker. Law enforcement found a 'hit list' of individuals inside what they believe was Boelter's vehicle. It included Hortman, Hoffman and several other Democratic lawmakers, as well as reproductive rights advocates. In a statement released on Thursday, the Hoffman family recounted the terrifying attack. The statement said: 'We are grappling with the reality that we live in a world where public service carries such risks as being targeted because someone disagrees with you or doesn't like what you stand for.'
Yahoo
12 minutes ago
- Yahoo
Why Dallas Mavericks Must Think Twice Before Trading for Jrue Holiday
Why Dallas Mavericks Must Think Twice Before Trading for Jrue Holiday originally appeared on Athlon Sports. The Dallas Mavericks find themselves in an enviable yet challenging position this offseason. Fresh off drafting Cooper Flagg with the No. 1 pick and armed with championship aspirations, the front office faces a critical decision at the guard position with Kyrie Irving sidelined until 2026, recovering from a torn ACL. Advertisement While recent reports have connected Dallas to Boston Celtics veteran Jrue Holiday, the Mavericks should think twice before pulling the trigger on what could be a costly trade that doesn't align with their current timeline or financial flexibility. Holiday undoubtedly brings a championship pedigree and defensive versatility that would complement Dallas's core. At 35, he remains one of the league's premier perimeter defenders and has proven his worth as a finishing piece for title contenders, as evidenced by his championship runs with Milwaukee and Boston. However, the cost of acquiring Holiday presents significant concerns for Dallas's long-term planning. With three years and $104 million remaining on his contract, including a hefty $37.2 million player option for 2027-28, Holiday's deal represents a massive financial commitment for a player entering his 17th NBA season. More troubling for the Mavericks is what Boston would likely demand in return. The Celtics need to shed $20-25 million in salary just to get under the second apron, meaning they're motivated sellers. However, Holiday's proven playoff value ensures Boston won't simply give him away. Advertisement Reports suggest Dallas would need to part with valuable pieces like P.J. Washington or Daniel Gafford, two players who were instrumental in last year's Finals run and offer more team-friendly contracts. Rather than mortgage their future for an aging Holiday, the Mavericks should explore more cost-effective options that better align with their championship window and financial flexibility. Collin Sexton emerges as an intriguing trade target from Utah. The 25-year-old guard averaged 18.7 points and 4.9 assists last season while shooting 48.8 percent from the field. Sexton's youth and offensive upside make him a more sustainable option alongside Irving's eventual return, and Utah's rebuilding timeline suggests he could be available for a reasonable price without sacrificing core rotation players. The free agent market also offers compelling veteran options that wouldn't require Dallas to gut its frontcourt depth. Dennis Schröder brings playoff experience and proven leadership, having helped guide teams through difficult stretches when primary ball-handlers were unavailable. His championship experience with the Lakers and ability to manage an offense would provide stability during Irving's absence. Advertisement Chris Paul, despite his age, represents another veteran presence who could mentor young players while providing steady point guard play. Paul's basketball IQ and ability to maximize teammates' potential could be invaluable for integrating Flagg into the rotation while maintaining championship expectations. The Mavericks' championship window with Anthony Davis turning 33 and Irving 34 may be narrower than initially anticipated, but that doesn't mean Dallas should sacrifice future flexibility for a short-term fix. Washington and Gafford provide versatility and value on reasonable contracts, making them attractive trade assets if a better opportunity emerges mid-season. With Flagg entering the fold and the team's young core still developing, maintaining depth and financial flexibility could prove more valuable than betting big on Holiday's aging curve. The rookie's immediate defensive impact should help offset some of the concerns about perimeter defense that make Holiday attractive. While Holiday's championship pedigree is undeniable, the combination of his age, contract, and Boston's asking price makes this trade a risky proposition for Dallas. The Mavericks have built something special with their current core, and dismantling that foundation for a 35-year-old guard—no matter how talented—doesn't align with sustainable championship building. Advertisement Instead, Dallas should pursue more affordable options like Sexton or veteran free agents who can bridge the gap until Irving returns without compromising the team's long-term flexibility. Sometimes the best trade is the one you don't make, and for the Mavericks, that wisdom might apply perfectly to the Jrue Holiday situation. The championship window remains open in Dallas, but keeping it that way requires smart, measured decisions rather than desperate gambles. Holiday may help other teams, but for the Mavericks, better options await. Related: Indiana Pacers Could Take a Page from 2011 Dallas Mavericks' Playbook in NBA Finals Related: Mavericks Head Coach Jason Kidd Has Hall of Fame Comparison for Cooper Flagg This story was originally reported by Athlon Sports on Jun 20, 2025, where it first appeared.
Yahoo
12 minutes ago
- Yahoo
Draymond Green Issues Warning About Lakers After $10 Billion Sale
Draymond Green Issues Warning About Lakers After $10 Billion Sale originally appeared on Athlon Sports. Los Angeles Lakers owner Jeanie Buss and her family agreed on Wednesday to sell the historic NBA franchise to Mark Walter, owner of the Los Angeles Dodgers and CEO of TWG Global. Advertisement Even though Walter has been a stakeholder in the Lakers since 2021, he will now own the majority stake while Buss will continue to act as the team's governor. The landmark sale will cost $10 billion, making it the most expensive team sale in sports history. The new ownership can also help the organization spend more money on its team more frequently, which differs from the Buss family's style of management because they relied on the income they made from the Lakers. Furthermore, this issue halted much of the team's development as they had to be cautious with their spending, which makes the deal even more beneficial for the Lakers and a warning sign in the eyes of some players, including Golden State Warriors forward Draymond Green. 'Oh man, that makes the Lakers dangerous,' Green said during a recent episode of 'The Draymond Green Show with Baron Davis." 'The only thing stopping the Lakers by NBA ownership standards is Jeanie Buss and the Buss family [being] one of the least wealthy families as far as money goes.' Golden State Warriors head coach Steve Kerr and forward Draymond Green (23).Sergio Estrada-Imagn Images Green noted that the Lakers were able to bring in a fair amount of income during the COVID-19 pandemic era between 2020-21 because of their lucrative regional television deal with Spectrum SportsNet, now worth $3 billion. Advertisement 'They got so much money from their regional television deal,' Green added. 'That puts them in a position to profit when no one was in the stands, but the Lakers did in large part due to their regional TV deal. 'So now we have an owner with deep pockets, who's going to say, well, I don't need all that money and take it and put it into the team. Let's do whatever we want with this roster, and get that guy and this guy.' Green concluded his speech by claiming that Walter's wealth, combined with the Lakers' legacy as a historic franchise, will be 'dangerous' for years to come. Related: Draymond Green Disagrees With Steve Kerr's Bold Statement on Warriors Player This story was originally reported by Athlon Sports on Jun 21, 2025, where it first appeared.