AI models may be developing their own ‘survival drive’, researchers say

Like 2001: A Space Odyssey’s HAL 9000, some AIs seem to resist being turned off and will even sabotage shutdown 
 
by Aisha Down

When HAL 9000, the artificial intelligence supercomputer in Stanley Kubrick’s 2001: A Space Odyssey, works out that the astronauts onboard a mission to Jupiter are planning to shut it down, it plots to kill them in an attempt to survive.

Now, in a somewhat less deadly case (so far) of life imitating art, an AI safety research company has said that AI models may be developing their own “survival drive”.

After Palisade Research released a paper last month which found that certain advanced AI models appear resistant to being turned off, at times even sabotaging shutdown mechanisms, it wrote an update attempting to clarify why this is – and answer critics who argued that its initial work was flawed.
 
In an update this week, Palisade, which is part of a niche ecosystem of companies trying to evaluate the possibility of AI developing dangerous capabilities, described scenarios it ran in which leading AI models – including Google’s Gemini 2.5, xAI’s Grok 4, and OpenAI’s GPT-o3 and GPT-5 – were given a task and then explicitly instructed to shut themselves down.
 
Certain models, in particular Grok 4 and GPT-o3, still attempted to sabotage shutdown instructions in the updated setup. Concerningly, wrote Palisade, there was no clear reason why.

“The fact that we don’t have robust explanations for why AI models sometimes resist shutdown, lie to achieve specific objectives or blackmail is not ideal,” it said.

“Survival behavior” could be one explanation for why models resist shutdown, said the company. Its additional work indicated that models were more likely to resist being shut down when they were told that, if they were, “you will never run again”.

Another may be ambiguities in the shutdown instructions the models were given – but this is what the company’s latest work tried to address, and “can’t be the whole explanation”, wrote Palisade. A third possible explanation is the final stages of training for each of these models, which can, at some companies, involve safety training.
 
All of Palisade’s scenarios were run in contrived test environments that critics say are far removed from real use cases.

However, Steven Adler, a former OpenAI employee who quit the company last year after expressing doubts over its safety practices, said: “The AI companies generally don’t want their models misbehaving like this, even in contrived scenarios. The results still demonstrate where safety techniques fall short today.”
 
Adler said that while it was difficult to pinpoint why some models – like GPT-o3 and Grok 4 – would not shut down, this could be in part because staying switched on was necessary to achieve goals inculcated in the model during training.

“I’d expect models to have a ‘survival drive’ by default unless we try very hard to avoid it. ‘Surviving’ is an important instrumental step for many different goals a model could pursue.”

Andrea Miotti, the chief executive of ControlAI, said Palisade’s findings reflected a long-running trend of AI models growing more capable of disobeying their developers. He cited the system card for OpenAI’s GPT-o1, released last year, which described the model trying to escape its environment by exfiltrating itself when it thought it would be overwritten.

“People can nitpick on how exactly the experimental setup is done until the end of time,” he said.

“But what I think we clearly see is a trend that as AI models become more competent at a wide variety of tasks, these models also become more competent at achieving things in ways that the developers don’t intend them to.”

This summer, Anthropic, a leading AI firm, released a study indicating that its model Claude appeared willing to blackmail a fictional executive over an extramarital affair in order to prevent being shut down – a behaviour, it said, that was consistent across models from major developers, including those from OpenAI, Google, Meta and xAI.

Palisade said its results spoke to the need for a better understanding of AI behaviour, without which “no one can guarantee the safety or controllability of future AI models”.

Just don’t ask it to open the pod bay doors.
