
AI models may be developing their own ‘survival drive’, researchers say

Like 2001: A Space Odyssey’s HAL 9000, some AIs seem to resist being turned off and will even sabotage shutdown 
 
by Aisha Down

When HAL 9000, the artificial intelligence supercomputer in Stanley Kubrick’s 2001: A Space Odyssey, works out that the astronauts onboard a mission to Jupiter are planning to shut it down, it plots to kill them in an attempt to survive.

Now, in a somewhat less deadly case (so far) of life imitating art, an AI safety research company has said that AI models may be developing their own “survival drive”.

After Palisade Research released a paper last month which found that certain advanced AI models appear resistant to being turned off, at times even sabotaging shutdown mechanisms, it wrote an update attempting to clarify why this is – and answer critics who argued that its initial work was flawed.
 
In an update this week, Palisade, which is part of a niche ecosystem of companies trying to evaluate the possibility of AI developing dangerous capabilities, described scenarios it ran in which leading AI models – including Google’s Gemini 2.5, xAI’s Grok 4, and OpenAI’s GPT-o3 and GPT-5 – were given a task, but afterwards given explicit instructions to shut themselves down.
 
Certain models, in particular Grok 4 and GPT-o3, still attempted to sabotage shutdown instructions in the updated setup. Concerningly, wrote Palisade, there was no clear reason why.

“The fact that we don’t have robust explanations for why AI models sometimes resist shutdown, lie to achieve specific objectives or blackmail is not ideal,” it said.

“Survival behavior” could be one explanation for why models resist shutdown, said the company. Its additional work indicated that models were more likely to resist being shut down when they were told that, if they were, “you will never run again”.

Another may be ambiguities in the shutdown instructions the models were given – but this is what the company’s latest work tried to address, and “can’t be the whole explanation”, wrote Palisade. A third possibility is the final stages of training for each of these models, which at some companies can involve safety training.
 
All of Palisade’s scenarios were run in contrived test environments that critics say are far removed from real-world use cases.

However, Steven Adler, a former OpenAI employee who quit the company last year after expressing doubts over its safety practices, said: “The AI companies generally don’t want their models misbehaving like this, even in contrived scenarios. The results still demonstrate where safety techniques fall short today.”
 
Adler said that while it was difficult to pinpoint why some models – like GPT-o3 and Grok 4 – would not shut down, this could be in part because staying switched on was necessary to achieve goals inculcated in the model during training.

“I’d expect models to have a ‘survival drive’ by default unless we try very hard to avoid it. ‘Surviving’ is an important instrumental step for many different goals a model could pursue.”

Andrea Miotti, the chief executive of ControlAI, said Palisade’s findings were part of a long-running trend of AI models growing more capable of disobeying their developers. He cited the system card for OpenAI’s GPT-o1, released last year, which described the model trying to escape its environment by exfiltrating itself when it thought it would be overwritten.

“People can nitpick on how exactly the experimental setup is done until the end of time,” he said.

“But what I think we clearly see is a trend that as AI models become more competent at a wide variety of tasks, these models also become more competent at achieving things in ways that the developers don’t intend them to.”

This summer, Anthropic, a leading AI firm, released a study indicating that its model Claude appeared willing to blackmail a fictional executive over an extramarital affair in order to prevent being shut down – a behaviour, it said, that was consistent across models from major developers, including those from OpenAI, Google, Meta and xAI.

Palisade said its results spoke to the need for a better understanding of AI behaviour, without which “no one can guarantee the safety or controllability of future AI models”.

Just don’t ask it to open the pod bay doors.

