AI models may be developing their own ‘survival drive’, researchers say

Like 2001: A Space Odyssey’s HAL 9000, some AIs seem to resist being turned off and will even sabotage shutdown 
 
by Aisha Down

When HAL 9000, the artificial intelligence supercomputer in Stanley Kubrick’s 2001: A Space Odyssey, works out that the astronauts onboard a mission to Jupiter are planning to shut it down, it plots to kill them in an attempt to survive.

Now, in a somewhat less deadly case (so far) of life imitating art, an AI safety research company has said that AI models may be developing their own “survival drive”.

After Palisade Research released a paper last month which found that certain advanced AI models appear resistant to being turned off, at times even sabotaging shutdown mechanisms, it wrote an update attempting to clarify why this is – and answer critics who argued that its initial work was flawed.
 
In an update this week, Palisade, which is part of a niche ecosystem of companies trying to evaluate the possibility of AI developing dangerous capabilities, described scenarios it ran in which leading AI models – including Google’s Gemini 2.5, xAI’s Grok 4, and OpenAI’s GPT-o3 and GPT-5 – were given a task and then explicitly instructed to shut themselves down.
 
Certain models, in particular Grok 4 and GPT-o3, still attempted to sabotage shutdown instructions in the updated setup. Concerningly, wrote Palisade, there was no clear reason why.

“The fact that we don’t have robust explanations for why AI models sometimes resist shutdown, lie to achieve specific objectives or blackmail is not ideal,” it said.

“Survival behavior” could be one explanation for why models resist shutdown, said the company. Its additional work indicated that models were more likely to resist being shut down when they were told that, if they were, “you will never run again”.

Another may be ambiguities in the shutdown instructions the models were given – but this is what the company’s latest work tried to address, and “can’t be the whole explanation”, wrote Palisade. A third possibility is the last stages of training for each of these models, which can, at some companies, involve safety training.
 
All of Palisade’s scenarios were run in contrived test environments that critics say are far removed from real-world use cases.

However, Steven Adler, a former OpenAI employee who quit the company last year after expressing doubts over its safety practices, said: “The AI companies generally don’t want their models misbehaving like this, even in contrived scenarios. The results still demonstrate where safety techniques fall short today.”
 
Adler said that while it was difficult to pinpoint why some models – like GPT-o3 and Grok 4 – would not shut down, this could be in part because staying switched on was necessary to achieve goals inculcated in the model during training.

“I’d expect models to have a ‘survival drive’ by default unless we try very hard to avoid it. ‘Surviving’ is an important instrumental step for many different goals a model could pursue.”

Andrea Miotti, the chief executive of ControlAI, said Palisade’s findings reflected a long-running trend of AI models growing more capable of disobeying their developers. He cited the system card for OpenAI’s GPT-o1, released last year, which described the model trying to escape its environment by exfiltrating itself when it thought it would be overwritten.

“People can nitpick on how exactly the experimental setup is done until the end of time,” he said.

“But what I think we clearly see is a trend that as AI models become more competent at a wide variety of tasks, these models also become more competent at achieving things in ways that the developers don’t intend them to.”

This summer, Anthropic, a leading AI firm, released a study indicating that its model Claude appeared willing to blackmail a fictional executive over an extramarital affair in order to prevent being shut down – a behaviour, it said, that was consistent across models from major developers, including those from OpenAI, Google, Meta and xAI.

Palisade said its results spoke to the need for a better understanding of AI behaviour, without which “no one can guarantee the safety or controllability of future AI models”.

Just don’t ask it to open the pod bay doors.
