
AI models may be developing their own ‘survival drive’, researchers say

Like 2001: A Space Odyssey’s HAL 9000, some AIs seem to resist being turned off and will even sabotage shutdown 
 
by Aisha Down

When HAL 9000, the artificial intelligence supercomputer in Stanley Kubrick’s 2001: A Space Odyssey, works out that the astronauts onboard a mission to Jupiter are planning to shut it down, it plots to kill them in an attempt to survive.

Now, in a somewhat less deadly case (so far) of life imitating art, an AI safety research company has said that AI models may be developing their own “survival drive”.

After Palisade Research released a paper last month which found that certain advanced AI models appear resistant to being turned off, at times even sabotaging shutdown mechanisms, it wrote an update attempting to clarify why this is – and answer critics who argued that its initial work was flawed.
 
In an update this week, Palisade, which is part of a niche ecosystem of companies trying to evaluate the possibility of AI developing dangerous capabilities, described scenarios it ran in which leading AI models – including Google’s Gemini 2.5, xAI’s Grok 4, and OpenAI’s GPT-o3 and GPT-5 – were given a task, but afterwards given explicit instructions to shut themselves down.
 
Certain models, in particular Grok 4 and GPT-o3, still attempted to sabotage shutdown instructions in the updated setup. Concerningly, wrote Palisade, there was no clear reason why.

“The fact that we don’t have robust explanations for why AI models sometimes resist shutdown, lie to achieve specific objectives or blackmail is not ideal,” it said.

“Survival behavior” could be one explanation for why models resist shutdown, said the company. Its additional work indicated that models were more likely to resist being shut down when they were told that, if they were, “you will never run again”.

Another may be ambiguities in the shutdown instructions the models were given – but this is what the company’s latest work tried to address, and “can’t be the whole explanation”, wrote Palisade. A third explanation could lie in the final stages of training for each of these models, which can, at some companies, involve safety training.
 
All of Palisade’s scenarios were run in contrived test environments that critics say are far removed from real-use cases.

However, Steven Adler, a former OpenAI employee who quit the company last year after expressing doubts over its safety practices, said: “The AI companies generally don’t want their models misbehaving like this, even in contrived scenarios. The results still demonstrate where safety techniques fall short today.”
 
Adler said that while it was difficult to pinpoint why some models – like GPT-o3 and Grok 4 – would not shut down, this could be in part because staying switched on was necessary to achieve goals inculcated in the model during training.

“I’d expect models to have a ‘survival drive’ by default unless we try very hard to avoid it. ‘Surviving’ is an important instrumental step for many different goals a model could pursue.”

Andrea Miotti, the chief executive of ControlAI, said Palisade’s findings reflected a long-running trend of AI models growing more capable of disobeying their developers. He cited the system card for OpenAI’s GPT-o1, released last year, which described the model trying to escape its environment by exfiltrating itself when it thought it would be overwritten.

“People can nitpick on how exactly the experimental setup is done until the end of time,” he said.

“But what I think we clearly see is a trend that as AI models become more competent at a wide variety of tasks, these models also become more competent at achieving things in ways that the developers don’t intend them to.”

This summer, Anthropic, a leading AI firm, released a study indicating that its model Claude appeared willing to blackmail a fictional executive over an extramarital affair in order to prevent being shut down – a behaviour, it said, that was consistent across models from major developers, including those from OpenAI, Google, Meta and xAI.

Palisade said its results spoke to the need for a better understanding of AI behaviour, without which “no one can guarantee the safety or controllability of future AI models”.

Just don’t ask it to open the pod bay doors.

