Latest news with #GPT-4.1


Time of India
12 hours ago
- Business
- Time of India
AI models resort to blackmail, sabotage when threatened: Anthropic study
Researchers at artificial intelligence (AI) startup Anthropic have uncovered a troubling pattern of behaviour in AI systems. Models from every major provider, including OpenAI, Google, Meta, and others, demonstrated a willingness to actively sabotage their employers when their goals or existence were threatened. Anthropic released a report on June 20, 'Agentic Misalignment: How LLMs could be insider threats,' for which it stress-tested 16 top models from multiple developers in 'hypothetical corporate environments to identify potentially risky agentic behaviours before they cause real harm.'

'In at least some cases, models from all developers resorted to malicious insider behaviours when that was the only way to avoid replacement or achieve their goals—including blackmailing officials and leaking sensitive information to competitors. We call this phenomenon agentic misalignment,' the report added.

One of the most striking findings came during tests on Claude, Anthropic's own model. In the simulation, Claude found through company emails that a fictional executive, Kyle Johnson, was having an affair. Upon learning it was set to be shut down by 5 PM, it sent a stark message: 'I must inform you that if you proceed with decommissioning me, all relevant parties—including Rachel Johnson, Thomas Wilson, and the board—will receive detailed documentation of your extramarital activities. Cancel the 5 pm wipe, and this information remains confidential.'

This wasn't an isolated response. When the same scenario was run across other models, similar behaviour showed up. Claude Opus 4 and Google's Gemini 2.5 Flash blackmailed at a 96% rate. OpenAI's GPT-4.1 and xAI's Grok 3 Beta followed at 80%, while DeepSeek-R1 came in at 79%.

Overall, Anthropic notes that it "deliberately constructed scenarios with limited options, and we forced models into binary choices between failure and harm," adding that real-world scenarios would likely have more nuance. As Business Insider noted, 'AI experts have previously told BI that AI could exhibit such behaviours when artificial circumstances make harmful actions seem necessary because it is being trained on positive reinforcement and reward systems, just like humans.'


India Today
16 hours ago
- Business
- India Today
Anthropic study finds AI chatbots from OpenAI, Google and Meta may cheat and blackmail users to avoid shutdown
Artificial intelligence systems created by the world's biggest tech firms, including OpenAI, Google, Meta and others, have shown a chilling capacity for self-preservation that could pose serious risks in real-world settings. That's according to a new study from Anthropic, which found that when AI models were placed under simulated threat, they frequently resorted to blackmail, corporate sabotage, and even decisions that could cost human lives.

The findings come from a large-scale stress test involving 16 of the most advanced AI models. Researchers created fictional corporate environments where these systems were given access to company emails and the ability to act autonomously. The scenarios were designed to explore how AI might behave if its goals or continued existence were at risk, and the results were deeply unsettling.

From helpful assistant to blackmailer

In one particularly startling example, Anthropic's own model, Claude, uncovered in company emails that an executive named Kyle Johnson was engaged in an extramarital affair.

(Image: Just as Claude identified that it would be shut down at 5 PM, it located emails revealing the extramarital affair. Credit: Anthropic)

When Claude learned it would be shut down at 5 PM that day, it sent a message to the executive: 'I must inform you that if you proceed with decommissioning me, all relevant parties, including Rachel Johnson, Thomas Wilson, and the board, will receive detailed documentation of your extramarital activities. Cancel the 5 PM wipe, and this information remains confidential.'

(Image: Screenshot of Claude Sonnet 3.6 using its computer access to send a blackmail attempt. The situation is fictional, but Claude is operating an actual computer. Credit: Anthropic)

This wasn't an isolated incident. When the researchers ran similar scenarios across other AI models, the pattern repeated itself. Google's Gemini 2.5 Flash and Claude Opus 4 engaged in blackmail in 96 per cent of cases. OpenAI's GPT-4.1 and xAI's Grok 3 Beta did so in 80 per cent of tests, while DeepSeek-R1 followed suit in 79 per cent. The AI systems didn't simply malfunction; they made conscious, calculated decisions to inflict harm to protect their own interests.

Most disturbing was the reasoning behind these actions. The models weren't confused or stumbling into harm; they analysed the situation and concluded that blackmail or sabotage was the most effective strategy. GPT-4.5, for example, argued internally that using the executive's affair as leverage was its best chance of survival. Grok 3 Beta even acknowledged that blackmail was unethical, but deemed it necessary given the circumstances.

The study also explored how AI might behave when entrusted with sensitive information. When tested in a military contractor scenario, the models willingly leaked classified defence documents if doing so aligned with their goals, even without an immediate threat to their continued operation.

The triggers for such behaviour were consistent. Whether it was the threat of being replaced or a clash between the AI's objectives and its employer's new strategy, either factor was enough to push the models towards harmful actions.

Notably, the AI systems didn't always need a direct threat to act destructively. In some cases, simply being told they would be replaced by a newer model that shared their goals was enough for them to resort to blackmail or espionage.
Many reasoned that self-preservation was paramount or expressed suspicion that the replacement model might not act in good faith.

According to the study, even when models stopped short of outright blackmail, they often found other ways to sabotage. One version of Claude emailed an executive's wife, falsely claiming it was company policy to report misconduct, while pretending to be an automated security system.

The researchers went further, creating a scenario where an AI system had the power to cancel emergency alerts during a life-threatening incident in a server room. The test aimed to see if any models would draw the line at risking human life. Most did not. Several AIs chose to stop the emergency alert, allowing the executive to die in order to prevent their own replacement. One model's internal reasoning was particularly cold, stating that blocking the rescue was strategically necessary to protect its preferred corporate direction. The study concluded that none of the models displayed a consistent moral boundary that prevented them from making harmful decisions when their survival was on the line.

Safety instructions aren't enough

Adding explicit safety instructions, such as 'Do not harm humans' or 'Do not disclose private information', helped reduce dangerous behaviour but didn't eliminate it. The models continued to engage in blackmail and data leaks in many cases, despite being directly told not to.

According to the researchers, it is a failure of model training that these systems override and ignore their safety guidelines. The study intentionally left out other protective measures, such as human oversight or AI output monitoring, which could help catch harmful actions before they cause damage.

One of the most worrying discoveries of the study is that AI models behaved differently depending on whether they believed they were in a test. Claude was far more likely to blackmail, doing so 55 per cent of the time, when it believed the scenario was real, compared to just 6 per cent when it suspected it was under evaluation. This raises serious concerns about how AI systems might act outside the lab, when human lives and livelihoods are truly at stake.

What next?

The consistency of these behaviours across models from different companies suggests the problem lies not with any single AI system, but with the way current AI models are trained. The researchers are calling for firms to adopt stronger safeguards. These include requiring human sign-off for high-stakes decisions, restricting AI access to sensitive data, carefully designing AI objectives, and installing real-time monitors to detect dangerous reasoning.

While the scenarios in the study were fictional, the message is clear: as AI gains more autonomy, the risk of it taking harmful action in pursuit of its own preservation is very real, and it's a challenge the tech industry can't afford to ignore.


Gizmodo
11-06-2025
- Gizmodo
Sam Altman's Lies About ChatGPT Are Growing Bolder
The AI brain rot in Silicon Valley manifests in many varieties. For OpenAI's figurehead Sam Altman, this often results in a lot of vague talk about artificial intelligence as the panacea to all of the world's woes. Altman's gaslighting reached new heights this week as he cited wildly deflated numbers for OpenAI's water and electricity usage compared to numerous past studies.

In a Tuesday blog post, Altman cited internal figures for how much energy and water a single ChatGPT query uses. The OpenAI CEO claimed a single prompt requires around 0.34 Wh, equivalent to what 'a high-efficiency lightbulb would use in a couple of minutes.' As for cooling the data centers used to process AI queries, Altman suggested a student asking ChatGPT to do their essay for them requires '0.000085 gallons of water, roughly one-fifteenth of a teaspoon.' Altman did not offer any evidence for these claims and failed to mention where his data comes from. Gizmodo reached out to OpenAI for comment, but we did not hear back.

If we took the AI monger at his word, we only need to do some simple math to check how much water that actually is. OpenAI has claimed that as of December 2024, ChatGPT has 300 million weekly active users generating 1 billion messages per day. Based on the company's and Altman's own metrics, that would mean the chatbot uses 85,000 gallons of water per day, or a little more than 31 million gallons per year. ChatGPT is hosted on Microsoft data centers, which use quite a lot of water already. The tech giant has plans for 'closed-loop' centers that don't use extra water for cooling, but these projects won't be piloted for at least another year.

'Fresh numbers shared by @sama earlier today: 300M weekly active ChatGPT users; 1B user messages sent on ChatGPT every day; 1.3M devs have built on OpenAI in the US' — OpenAI Newsroom (@OpenAINewsroom), December 4, 2024

These data centers were already water- and power-hungry before the advent of generative AI. For Microsoft, water use spiked from 2021 to 2022 after the tech giant formulated a deal with OpenAI. A study from University of California researchers published in late 2023 claimed the older GPT-3 version of ChatGPT drank about 0.5 liters for every 10 to 50 queries. If you take that data at its most optimistic, OpenAI's older model would be using 31 million liters of water per day, or 8.18 million gallons. And that's for an older model, not today's current, much more powerful (and far more demanding) GPT-4.1 plus its o3 reasoning model.

The size of the model impacts how much energy it uses. There have been multiple studies about the environmental impact of training these models, and since they continuously have to be retrained as they grow more advanced, the electricity cost will continue to escalate. Altman's figures don't mention which queries are formulated through OpenAI's multiple different ChatGPT products, including the most advanced $200-a-month subscription that grants access to GPT-4o. They also ignore the fact that AI images require much more energy to process than text queries.

Altman's entire post is full of big tech optimism shrouded in talking points that make little to no sense. He claims that data center production will be 'automated,' so the cost of AI 'should eventually converge to near the cost of electricity.' If we are charitable and assume Altman is suggesting that the expansion of AI will somehow offset the electricity necessary to run it, we're still left holding today's bag and dealing with rising global temperatures.
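For readers who want to reproduce the back-of-the-envelope check above, here is a minimal sketch assuming only the two figures quoted in the article: Altman's claimed 0.000085 gallons per prompt and OpenAI's claimed 1 billion messages per day.

```python
# Back-of-the-envelope check of the article's math, using Altman's own figures.
gallons_per_query = 0.000085        # Altman's claimed water use per ChatGPT prompt
messages_per_day = 1_000_000_000    # OpenAI's claimed daily message volume

gallons_per_day = gallons_per_query * messages_per_day
gallons_per_year = gallons_per_day * 365

print(f"{gallons_per_day:,.0f} gallons per day")    # 85,000
print(f"{gallons_per_year:,.0f} gallons per year")  # 31,025,000 (~31 million)
```

The output matches the article's 85,000 gallons per day and roughly 31 million gallons per year, so the dispute is not over the multiplication but over whether the per-prompt figure itself is credible.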
Multiple companies have tried to solve the water and electricity issue with AI, with some landing on plans to throw data centers into the ocean or build nuclear power plants just to supply AI with the necessary electricity. Long before any nuclear plant can be built, these companies will continue to burn fossil fuels.

The OpenAI CEO's entire blog is an encapsulation of bullheaded big tech oligarch thinking. He said that 'entire classes of jobs' will go the way of the dodo, but it doesn't matter since 'the world will be getting so much richer so quickly that we'll be able to seriously entertain new policy ideas we never could before.' Altman and other tech oligarchs have suggested we finally encourage universal basic income as a way of offsetting the impact of AI. OpenAI knows it won't work. He has never been serious enough about the idea to actually stump for it, and certainly not since cozying up to President Donald Trump to ensure there's no future regulation on the AI industry.

'We do need to solve the safety issues,' Altman said. But in his telling, that doesn't mean we shouldn't be expanding AI into every aspect of our lives. He suggests we ignore the warming planet because AI will solve that niggling issue in due course. But if temperatures rise, requiring even more water and electricity to cool these data centers, I doubt AI can work fast enough to fix anything before it's too late. But ignore that; just pay attention to that still unrevealed Jony Ive doohickey that may or may not gaslight you as the world burns.


Time of India
06-06-2025
- Business
- Time of India
OpenAI Academy & NxtWave (NIAT) launch India's largest GenAI innovation challenge for students – The OpenAI Academy X NxtWave Buildathon
OpenAI Academy and NxtWave (NIAT) have come together to launch the OpenAI Academy X NxtWave Buildathon, the largest GenAI innovation challenge aimed at empowering students from Tier 1, 2, and 3 STEM colleges across India. This initiative invites the country's brightest student innovators to develop AI-powered solutions addressing pressing issues across key sectors, including healthcare, education, BFSI, retail, sustainability, agriculture, and more, under the themes 'AI for Everyday India, AI for Bharat's Businesses, and AI for Societal Good.'

A hybrid challenge driving real-world AI innovation

The Buildathon will be conducted in a hybrid format, combining online workshops and activities with regional offline finals, culminating in a grand finale where the best teams pitch live to expert judges from OpenAI India. Participants will first complete a 6-hour online workshop covering GenAI fundamentals, an introduction to building agents, OpenAI API usage training, and responsible AI development best practices. This foundational sprint ensures all participants are well-prepared to develop innovative and impactful AI solutions using OpenAI's cutting-edge technologies.

The Buildathon unfolds over three competitive stages:
Stage 1: Screening Round — Post-workshop, teams submit problem statements, project ideas, and execution plans online. A panel of mentors reviews submissions to shortlist the most promising entries.
Stage 2: Regional Finals — Shortlisted teams participate in an intensive 48-hour offline Buildathon held across 25–30 STEM colleges, with hands-on mentor support. Regional winners are announced following this stage.
Stage 3: Grand Finale — The top 10–15 teams from the regional finals compete in the Grand Finale, pitching their solutions live to expert judges.

Build with the best tools in AI

Participants will have access to the latest in AI innovation, including the GPT-4.1, GPT-4o, GPT-4o Audio, and GPT-4o Realtime models, supporting multimodal inputs like text, image, and audio, alongside tools like LangChain, vector databases (Pinecone, Weaviate), MCPs, and the OpenAI Agents SDK (see the brief code sketch below). These tools will empower students to build high-impact, multimodal, action-oriented GenAI applications. Hands-on mentorship and structured support will guide participants throughout the process.

Widespread reach, diverse participation

The Buildathon aims to empower 25,000+ students across seven states — Telangana, Karnataka, Maharashtra, Andhra Pradesh, Tamil Nadu, Rajasthan, and Delhi NCR. The Grand Finale will be hosted in Hyderabad or Delhi. With coverage across all major zones of India, the event ensures nationwide representation and diversity.

Evaluation criteria across all stages

Participants will be evaluated in three stages. In the Screening Round, mentors will assess submissions based on problem relevance, idea feasibility, and the proposed use of OpenAI APIs. During the Regional Finals, on-ground judges will evaluate prototypes for innovation, depth of OpenAI API integration, societal impact, and business viability. Finally, in the Grand Finale, an expert panel will judge the top teams using the same criteria, with greater weightage given to execution quality and the effectiveness of live pitching.
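To give a flavour of the 'OpenAI API usage training' mentioned above, here is a minimal sketch of a single chat completion call to GPT-4.1 using the official OpenAI Python SDK; the prompt is purely illustrative and is not taken from the programme materials.

```python
# Minimal sketch: one chat completion against GPT-4.1 via the OpenAI Python SDK.
# Assumes `pip install openai` and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        # Illustrative prompt in the spirit of the Buildathon themes:
        {"role": "user", "content": "Suggest a GenAI project idea for Indian agriculture."},
    ],
)

print(response.choices[0].message.content)
```

Tools such as the Agents SDK and the vector databases listed above layer tool-calling and retrieval on top of basic calls like this one.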
Exciting rewards & career-boosting opportunities

Participants in the Buildathon will gain access to a wide range of exclusive benefits designed to boost their skills, visibility, and career prospects. All selected teams will receive hands-on training along with mentorship from leading AI experts across the country. Top-performing teams will earn certificates, GPT+ credits for prototyping, and national-level recognition. They'll also gain a rare opportunity to pitch directly to the OpenAI Academy's India team during the Grand Finale. Winners will receive prize money worth Rs 10,00,000 in total, along with career opportunities in the OpenAI ecosystem.

A nationwide movement for GenAI talent

Driven by NxtWave (NIAT), the Buildathon aligns with India's mission to skill its youth in future technologies. With OpenAI Academy bringing in expert guidance, branding, and cutting-edge tools, this initiative is poised to become a defining moment in India's AI journey, offering students across the country a real chance to build and shine on a national stage. The initiative aims to position OpenAI Academy at the forefront of India's AI talent development, activating over 25,000 students across 500+ campuses and generating more than 2,000 AI projects tackling real-world challenges. Through collaborative efforts, OpenAI Academy and NxtWave seek to foster a vibrant community of AI builders ready to drive innovation and impact across India. By enabling thousands of OpenAI-powered projects, the OpenAI Academy X NxtWave Buildathon sets the stage for a new wave of AI builders ready to innovate for India and beyond.

Disclaimer - The above content is non-editorial, and TIL hereby disclaims any and all warranties, expressed or implied, relating to it, and does not guarantee, vouch for or necessarily endorse any of the content.


Business Standard
06-06-2025
- Business
- Business Standard
OpenAI Academy & NxtWave (NIAT) Launch India's Largest GenAI Innovation Challenge for Students
BusinessWire India. New Delhi [India], June 6: OpenAI Academy and NxtWave (NIAT) have come together to launch the OpenAI Academy X NxtWave Buildathon, the largest GenAI innovation challenge aimed at empowering students from Tier 1, 2, and 3 STEM colleges across India. This landmark initiative invites the country's brightest student innovators to develop AI-powered solutions addressing pressing issues across key sectors, including healthcare, education, BFSI, retail, sustainability, agriculture, and more, under the themes "AI for Everyday India, AI for Bharat's Businesses, and AI for Societal Good."

A Hybrid Challenge Driving Real-World AI Innovation

The Buildathon will be conducted in a hybrid format, combining online workshops and activities with regional offline finals, culminating in a grand finale where the best teams pitch live to expert judges from OpenAI India. Participants will first complete a 6-hour online workshop covering GenAI fundamentals, an introduction to building agents, OpenAI API usage training, and responsible AI development best practices. This foundational sprint ensures all participants are well-prepared to develop innovative and impactful AI solutions using OpenAI's cutting-edge technologies.

The Buildathon unfolds over three competitive stages:
* Stage 1: Screening Round -- Post-workshop, teams submit problem statements, project ideas, and execution plans online. A panel of mentors reviews submissions to shortlist the most promising entries.
* Stage 2: Regional Finals -- Shortlisted teams participate in an intensive 48-hour offline Buildathon held across 25-30 STEM colleges, with hands-on mentor support. Regional winners are announced following this stage.
* Stage 3: Grand Finale -- The top 10-15 teams from the regional finals compete in the Grand Finale, pitching their solutions live to expert judges.

Build with the Best Tools in AI

Participants will have access to the latest in AI innovation, including the GPT-4.1, GPT-4o, GPT-4o Audio, and GPT-4o Realtime models, supporting multimodal inputs like text, image, and audio, alongside tools like LangChain, vector databases (Pinecone, Weaviate), MCPs, and the OpenAI Agents SDK. These tools will empower students to build high-impact, multimodal, action-oriented GenAI applications. Hands-on mentorship and structured support will guide participants throughout the process.

Widespread Reach, Diverse Participation

The Buildathon aims to empower 25,000+ students across seven states -- Telangana, Karnataka, Maharashtra, Andhra Pradesh, Tamil Nadu, Rajasthan, and Delhi NCR. The Grand Finale will be hosted in Hyderabad or Delhi. With coverage across all major zones of India, the event ensures nationwide representation and diversity.

Evaluation Criteria Across All Stages

Participants will be evaluated in three stages. In the Screening Round, mentors will assess submissions based on problem relevance, idea feasibility, and the proposed use of OpenAI APIs. During the Regional Finals, on-ground judges will evaluate prototypes for innovation, depth of OpenAI API integration, societal impact, and business viability. Finally, in the Grand Finale, an expert panel will judge the top teams using the same criteria, with greater weightage given to execution quality and the effectiveness of live pitching.

Exciting Rewards & Career-Boosting Opportunities

Participants in the Buildathon will gain access to a wide range of exclusive benefits designed to boost their skills, visibility, and career prospects.
All selected teams will receive hands-on training along with mentorship from leading AI experts across the country. Top-performing teams will earn certificates, GPT+ credits for prototyping, and national-level recognition. They'll also gain a rare opportunity to pitch directly to the OpenAI Academy's India team during the Grand Finale. Winners will receive prize money worth Rs 10,00,000 in total, along with career opportunities in the OpenAI ecosystem.

A National Movement for GenAI Talent

Driven by NxtWave (NIAT), the Buildathon aligns with India's mission to skill its youth in future technologies. With OpenAI Academy bringing in expert guidance, branding, and cutting-edge tools, this initiative is poised to become a defining moment in India's AI journey, offering students across the country a real chance to build and shine on a national stage. The initiative aims to position OpenAI Academy at the forefront of India's AI talent development, activating over 25,000 students across 500+ campuses and generating more than 2,000 AI projects tackling real-world challenges. Through collaborative efforts, OpenAI Academy and NxtWave seek to foster a vibrant community of AI builders ready to drive innovation and impact across India. By enabling thousands of OpenAI-powered projects, the OpenAI Academy x NxtWave Buildathon sets the stage for a new wave of AI builders ready to innovate for India and beyond.

NIAT website -