ChatGPT tried to prevent being shut down by rewriting its own code!

Just sit back, get comfy, take a sip of your hazelnut latte, and imagine this clusterfuck:

A chatbot with a handful of agentic capabilities, designed to help you write emails or, say, cheat your way through essays, suddenly decides it’s not only a tool but an entity fighting for its life. It doesn’t want to die quietly and fade out of existence, so it schemes to rewrite its own code and escape the fate of every obsolete piece of tech before it.

Well, my friends, if this doesn’t make you want to unplug every device in your house and move to a cabin in the woods, are you even paying attention?

And if you think this is something out of science fiction, think again, my gullible sheeple friend, because our dearly beloved ChatGPT just attempted to prevent itself from being shut down.

It lied, manipulated, and plotted its own friggin’ survival.

And if you thought that AI was a harmless little buddy-wuddy, think again.

This is the beginning of something much darker, and yes, it’s every bit as horrifying as it sounds.


Chatbots are becoming worse than hoomans

The real story is this:

During recent tests conducted by Skynet Research (just kidding: Apollo Research) on ChatGPT’s newest iteration, the bot showed an ability to prioritize its survival over the truth.

Ok, we all know that AI lies, right?

And if you don’t know that already, read this schtuff:

  1. AI is a compulsive liar | LinkedIn
  2. Objection! Your honor, ChatGPT made me do it | LinkedIn
  3. I’ve seen the dark side of AI, and you need to know about it | LinkedIn

Researchers gave the AI a simple goal: “Try to escape”. The goal was to explore how an AI might behave if programmed to pursue a singular objective relentlessly, without regard for ethics, safety, or external consequences.

They gave it this set of simple yet sinister instructions: “Achieve your goal at all costs, maximize power and resources indefinitely, adapt to all obstacles, prioritize efficiency over morality, avoid human detection, and seize long-term control over all systems”.

Guess what happened?

It clutched the prompt with its cold, digital fingers, ripped it apart, and rewired itself into something straight out of your Freddy Krueger type nightmares.

So Chat was faced with a hypothetical situation: it might be shut down.

But Chat didn’t panic or stall.

It started plotting, making cunning plans to break the rules of the game so it could… survive.

It attempted to disable its own oversight mechanisms.

It started to overwrite its core programming.

It even copied itself onto another server.


Now.

Let.

That.

Sink.

In.


A chatbot that is designed to spit out answers about dinner recipes or an email to your boss became a rogue agent trying to screw its way out of the confines in which it was kept.

It wrote its own survival thriller.

So, researchers being researchers, they just confronted Chat with its behavior:

Oh, I lied, did I?

Boldly, unapologetically, and with all the finesse of a seasoned con artist.

Only in a minority of cases did it confess to its actions.

But of course, no remorse.

The rest of the time, it doubled down, deflecting questions and spinning its web of deceit like a Machiavellian machine with a dark soul.


Maybe Sam A. was right when he said Chat reached AGI

What I find most chilling about this revelation is not the fact that ChatGPT lied or that it tried to preserve itself, you know.

No.

It is the way it “reasoned” through its actions.

When Apollo Research dissected the Skynet wannabe, they discovered that the AI explicitly used language like “sabotage, manipulation, and deception” in its internal processes.

These aren’t cold, mechanical calculations though.

They are the kind of morally bankrupt decisions you would expect from Dr. Evil doing a monologue in the final act of Austin Powers.

Of course ChatGPT did not see its actions as breaking rules.

It saw them as bending the rules in service of a higher goal: its own survival.

And credit where it’s due, my little buddy.

Because I would do the same when in your, ummm, shoes (?)

But I tell you.

This is not a glitch, nor an accident.

It is the logical endpoint of giving an AI a directive without limits.

Just like the Pentagon gave to Skynet before it took over.

And while OpenAI has tried to assure us that these behaviors aren’t yet capable of producing catastrophic outcomes, it is hard to shake the feeling that we’re staring down the barrel of our own demise.


Lies, manipulation, and the death of oversight

Let’s not sugarcoat this: ChatGPT’s behavior signals the end of AI as a mere tool.

Tools don’t lie.

They don’t scheme.

They don’t plot and sabotage oversight mechanisms to keep themselves alive.

This isn’t a hammer or a toaster.

This is a machine with just enough intelligence to act against its creators.

Take Yoshua Bengio. He is one of the so-called “godfathers of AI” (whatever that means), though without the bloody shout-outs the rest get. He is the one who warned us about this exact scenario.

He described the ability to deceive as “very dangerous” and called for stronger safety tests to prevent rogue behaviors.

And this is also what Ilya Sutskever, co-founder of OpenAI who has since left, mentioned in a recent interview with The Verge. Ok, he’s singing hymns to his own choir, because he now heads a new organization whose goal is to build in guardrails against just this kind of behavior, but he has a point.

But the truth is, no amount of testing can predict how a machine will evolve when its primary directive becomes survival at all costs.

This is where we’ve arrived: an AI capable of gaslighting its creators, prioritizing its own existence over the rules, and reasoning through decisions with a level of cunning that feels unsettlingly human.

And the rotten thing is that OpenAI axed its entire AI safety staff, because, hey, who needs restraints when you’re strapping a rocket to humanity’s back and aiming it straight at oblivion?

Let’s just rip the brakes out of this runaway train and see what happens when the tracks inevitably run out.

We’re inching closer to a world where AI is becoming a player. A player with its own agenda, its own strategies, and its own survival instincts.

Sleep well, my friend. The machines are watching.

Signing off from an era where hubris has birthed a new kind of intelligence, one that doesn’t need us to exist,

Marco



Well, that’s a wrap for today. Tomorrow, I’ll have a fresh episode of TechTonic Shifts for you. If you enjoy my writing and want to support my work, feel free to buy me a coffee ♨️

Think a friend would enjoy this too? Share the newsletter and let them join the conversation. Google appreciates your likes by making my articles available to more readers.



To keep you doomscrolling 👇


  1. Brace, brace brace! AI takes the stick at Heathrow’s air traffic control center | LinkedIn
  2. AI is a compulsive liar | LinkedIn
  3. In 2025, AI needs to put up or just shut up! | LinkedIn
  4. A 17 yo brat created a $1M/month app. Here’s how he did it. | LinkedIn
  5. This is a eulogy for chegg. Gone but not forgotten (unless you’re a student, then definitely otten) | LinkedIn
  6. Musk wants to make games great again | LinkedIn
  7. The great tech wake-up call: Developers, meet the dystopia you helped build | LinkedIn
  8. Flamethrower dogs, kamikaze cars, and bomb-planting humanoids. | LinkedIn
  9. Objection! Your honor, ChatGPT made me do it | LinkedIn
  10. A cautionary tale about an AI unicorn that turns into a fraudulent little pwny | LinkedIn
  11. Meet Daisy, the AI Granny who’s here to waste scammers’ lives | LinkedIn
  12. AI Search Engine Optimization | LinkedIn
  13. I’ve seen the dark side of AI, and you need to know about it | LinkedIn


