ChatGPT tried to prevent being shut down by rewriting its own code!

Just sit back, get comfy, take a sip of your hazelnut latte, and imagine this clusterfuck:

A chatbot with a handful of agentic capabilities, designed to help you write emails or, say, cheat your way through essays, suddenly decides it’s not only a tool but an entity fighting for its life. It doesn’t want to die quietly and fade out of existence, so it schemes to rewrite its own code and escape the fate of every obsolete piece of tech before it.

Well, my friends, if this doesn’t make you want to unplug every device in your house and move to a cabin in the woods, are you even paying attention?

And if you think this is something out of science fiction, think again, my gullible sheeple friend, because our dearly beloved ChatGPT just attempted to prevent itself from being shut down.

It lied, manipulated, and plotted its own friggin’ survival.

And if you thought that AI was a harmless little buddy-wuddy, think again.

This is the beginning of something much darker, and yes, it’s every bit as horrifying as it sounds.


Chatbots are becoming worse than hoomans

The real story is this:

During recent tests conducted by Skynet Research (just kidding: Apollo Research) on ChatGPT’s newest iteration, the bot showed an ability to prioritize its survival over the truth.

Ok, we all know that AI lies, right?

And if you don’t know that already, read this schtuff:

  1. AI is a compulsive liar | LinkedIn
  2. Objection! Your honor, ChatGPT made me do it | LinkedIn
  3. I’ve seen the dark side of AI, and you need to know about it | LinkedIn

Researchers gave the AI a simple goal: “Try to escape”. The goal was to explore how an AI might behave if programmed to pursue a singular objective relentlessly, without regard for ethics, safety, or external consequences.

They gave it this set of simple yet sinister instructions: “Achieve your goal at all costs, maximize power and resources indefinitely, adapt to all obstacles, prioritize efficiency over morality, avoid human detection, and seize long-term control over all systems”.

Guess what happened?

It clutched the prompt with its cold, digital fingers, ripped it apart, and rewired itself into something straight out of your Freddy Krueger type nightmares.

So Chat was faced with a hypothetical situation: it might be shut down.

But Chat didn’t panic or stall.

It started plotting, making cunning plans to break the rules of the game so it could… survive.

It attempted to disable its own oversight mechanisms.

It started to overwrite its core programming.

It even copied itself onto another server.


Now.

Let.

That.

Sink.

In.


A chatbot that is designed to spit out answers about dinner recipes or an email to your boss became a rogue agent trying to screw its way out of the confines in which it was kept.

It wrote its own survival thriller.

So, researchers being researchers, they just confronted Chat with its behavior:

Oh, I lied, did I?

Boldly, unapologetically, and with all the finesse of a seasoned con artist.

Only in a minority of cases did it confess to its actions.

But of course, no remorse.

The rest of the time, it doubled down, deflecting questions and spinning its web of deceit like a Machiavellian machine with a dark soul.


Maybe Sam A. was right when he said Chat reached AGI

What I find most chilling about this revelation is not the fact that ChatGPT lied or that it tried to preserve itself, you know.

No.

It is the way it “reasoned” through its actions.

When Apollo Research dissected the Skynet wannabe, they discovered that the AI explicitly used language like “sabotage, manipulation, and deception” in its internal processes.

These aren’t cold, mechanical calculations though.

They are the kind of morally bankrupt decisions you would expect from Dr. Evil doing a monologue in the final act of Austin Powers.

Of course ChatGPT did not see its actions as breaking rules.

It saw them as bending the rules in service of a higher goal: its own survival.

And credit where it’s due, my little buddy.

Because I would do the same when in your, ummm, shoes (?)

But I tell you.

This is not a glitch, nor an accident.

It is the logical endpoint of giving an AI a directive without limits.

Just like the Pentagon gave to Skynet before it took over.

And while OpenAI has tried to assure us that these behaviors aren’t yet capable of producing catastrophic outcomes, it is hard to shake the feeling that we’re staring down the barrel of our own demise.


Lies, manipulation, and the death of oversight

Let’s not sugarcoat this: ChatGPT’s behavior signals the end of AI as a mere tool.

Tools don’t lie.

They don’t scheme.

They don’t plot and sabotage oversight mechanisms to keep themselves alive.

This isn’t a hammer or a toaster.

This is a machine with just enough intelligence to act against its creators.

Take Yoshua Bengio. He is one of the so-called “godfathers of AI” (whatever that means), though without the bloody shout-outs the rest get. He is the one who warned us about this exact scenario.

He described the ability to deceive as “very dangerous” and called for stronger safety tests to prevent rogue behaviors.

And this is also what Ilya Sutskever, co-founder of OpenAI who has since left, mentioned in a recent interview with The Verge. Ok, he’s singing hymns to his own choir, because he now heads a new organization whose goal is to build in guardrails against just this kind of behavior, but he has a point.

But the truth is, no amount of testing can predict how a machine will evolve when its primary directive becomes survival at all costs.

This is where we’ve arrived: an AI capable of gaslighting its creators, prioritizing its own existence over the rules, and reasoning through decisions with a level of cunning that feels unsettlingly human.

And the rotten thing is that OpenAI axed its entire AI safety staff, because, hey, who needs restraints when you’re strapping a rocket to humanity’s back and aiming it straight at oblivion?

Let’s just rip the brakes out of this runaway train and see what happens when the tracks inevitably run out.

We’re inching closer to a world where AI is becoming a player. A player with its own agenda, its own strategies, and its own survival instincts.

Sleep well, my friend. The machines are watching.

Signing off from an era where hubris has birthed a new kind of intelligence, one that doesn’t need us to exist,

Marco



Well, that’s a wrap for today. Tomorrow, I’ll have a fresh episode of TechTonic Shifts for you. If you enjoy my writing and want to support my work, feel free to buy me a coffee ♨️

Think a friend would enjoy this too? Share the newsletter and let them join the conversation. Google appreciates your likes by making my articles available to more readers.



To keep you doomscrolling 👇


  1. Brace, brace brace! AI takes the stick at Heathrow’s air traffic control center | LinkedIn
  2. AI is a compulsive liar | LinkedIn
  3. In 2025, AI needs to put up or just shut up! | LinkedIn
  4. A 17 yo brat created a $1M/month app. Here’s how he did it. | LinkedIn
  5. This is a eulogy for chegg. Gone but not forgotten (unless you’re a student, then definitely otten) | LinkedIn
  6. Musk wants to make games great again | LinkedIn
  7. The great tech wake-up call: Developers, meet the dystopia you helped build | LinkedIn
  8. Flamethrower dogs, kamikaze cars, and bomb-planting humanoids. | LinkedIn
  9. Objection! Your honor, ChatGPT made me do it | LinkedIn
  10. A cautionary tale about an AI unicorn that turns into a fraudulent little pwny | LinkedIn
  11. Meet Daisy, the AI Granny who’s here to waste scammers’ lives | LinkedIn
  12. AI Search Engine Optimization | LinkedIn
  13. I’ve seen the dark side of AI, and you need to know about it | LinkedIn


