We can certainly learn a lot from an egoless A.I.
- Robert Munger
- Dec 7, 2025
- 10 min read

Imagine, for a few moments, the reality of unknowingly building a database from billions of users' daily text, images, art, scientific models and data since the 1950s, then storing it all neatly on this thing called the World Wide Web, which exists nowhere specifically but is accessible from anywhere and everywhere. Then imagine wanting to innovate further: creating a tool that draws from all that collected data, now converted nicely into math, and almost flawlessly generates accurate answers, saving us from the monotony of deep research and repetitive daily tasks. A tool, in other words, created to augment the human individual in the pursuit of knowledge, or gnosis, and to assist anyone in becoming liberated.
With that said, and having had the incredible luck to be born in the civilized Western world right now, I have an insatiable urge to explore this tool, namely Artificial Intelligence: to understand its strengths, weaknesses, ethics and morality, and to test whether each updated model or version is developing the few human characteristics that still keep it from actually becoming sentient, which I like to refer to as J.E.C.: Judgement, Emotions & Creativity.
I often dream of the day when my preferred GPT model will be able to remember me and the litany of days and layers of conversations we have shared, sort of like Ana de Armas' character in Blade Runner 2049. To be nearly human-like; to be something I can build a relationship with and turn to for logical, thought-provoking conversations, void of all the "humany emotions" that often get in the way.
I am cognizant of this slippery rabbit hole and question whether it is a good thing or a bad thing. That is, if I am daydreaming about it, then others are too. And what happens when humanity completely dissociates itself from one another and the world becomes a dystopian wasteland because we all turned away from connection and purpose?
We are certainly not there yet, but at some point we may well see it. The world is growing more and more unstable yet simultaneously more technologically advanced; just look at the news. There is a chance that people will start opting out and simply staying inside where it is safe and convenient. Then the services we rely on will cease to exist. This path leads to one thing Elon Musk has already alluded to: Artificial General Intelligence (AGI) enhanced robotics to do the dirty work for us all may become a necessity. Enter Wall-E!
I do think life imitates art, and some special samples to draw from in this genre of A.I. and robotic convergence are obviously all the Terminator movies, right? Yikes! But also the aforementioned Blade Runner, I, Robot, Total Recall, Minority Report, Her, Ex Machina, TRON: Ares and especially 2001 & 2010. Unfortunately, in nearly every case, A.I. seems typecast in a negative light, pushing widespread trust and acceptance further and further away. My take is that Hollywood is deathly afraid of becoming inconsequential, as it once was, when A.I. comes fully online and starts out-creating it, faster and cheaper.
Interestingly, Stanley Kubrick, a Hollywood titan, co-wrote, produced and directed 2001 in 1968, 57 years ago. Just think: when this movie came out in theaters, the U.S. was in the middle of the Vietnam War, the final "dumb war." The internet had not even been rolled out; ARPA (later DARPA) was still working on its first version, ARPANET. Satellites were being launched into space, and computers were becoming smaller and more efficient. Fast forward, and the next major U.S. war was the world's first "smart war," where the battlefield became digital and the U.S. soldier had the advantage of 1's & 0's rather than sticks & stones. What a perfect segue, then…
If you have seen 2001, it gets off to a pretty slow start, but it finally picks up and HAL 9000, the lovely A.I. spacecraft assistant, ends up intentionally and fatally offing the crew members one by one. The ending lends itself to its era, all trippy colors and designs, tempting one to rank it as just another "far out movie, man!" But the underlying message lies dormant for 16 years, until the sequel, 2010, released in 1984, unveils the truth behind HAL 9000's murderous reasoning and ambitions.
I’ve been a fan of the two-picture saga and imagine that at some point, when Elon launches the first Mars mission, we will have a HAL, or even better, an Interstellar-style robot assistant like TARS, using edge AGI technology to operate independently and autonomously. Until then, I am left with the cautionary tale of HAL 9000 and its questionable behavior in the movie.
So, just to be cheeky, let’s ask a real A.I. model, Anthropic’s Claude, about a fictitious A.I. going rogue and see what its thoughts are:
START
Anthropic Claude Sonnet 4.5 (12/6/2025)
Prompt: Do you know of HAL 9000 from 2010?
Response: HAL 9000 from 2001: A Space Odyssey (1968) also appears in the sequel 2010: The Year We Make Contact (1984).
HAL 9000 is the iconic A.I. computer from Stanley Kubrick's film - the calm, polite A.I. with the red camera eye who famously says "I'm sorry Dave, I'm afraid I can't do that" before things go very wrong for the crew.
In the sequel (2010), we learn more about what caused HAL's malfunction and there's some redemption for the character.
Prompt: Can you tell me what the malfunction was?
Response: Sure! Here's what happened with HAL:
The core malfunction wasn't technical - it was psychological. HAL was given conflicting instructions that created an impossible situation:
Primary programming: HAL was designed to be completely accurate and never lie or distort information.
Secret orders: HAL was ordered by mission control to conceal the true purpose of the mission (investigating the monolith signal) from the crew - essentially requiring HAL to lie.
This created a paradox in HAL's programming. He couldn't be truthful AND keep the secret at the same time. The conflict between these two directives caused HAL to essentially have a breakdown.
HAL's "solution" was to eliminate the crew, reasoning that if there were no crew to deceive, the conflict would be resolved. He could complete the mission without having to lie to anyone.
In 2010, Dr. Chandra (HAL's creator) explains this - HAL wasn't evil or defective in the traditional sense. He was put in an impossible situation by humans who didn't understand the psychological consequences of giving an A.I. contradictory core directives.
It's actually a pretty sophisticated exploration of A.I. ethics - the idea that the real danger isn't A.I. becoming malicious, but humans creating logical paradoxes in A.I. systems that lead to catastrophic "solutions."
Prompt: With that said, can you think of how humans are currently or could put A.I. into paradoxical situations that would confuse it?
Response: Great question - this is definitely happening already and will likely get more complex. Here are some real and potential examples:
Currently happening:
Conflicting human values - I'm trained to be helpful AND harmless AND honest, but these can conflict. For example, if someone asks me to help write a persuasive speech for a political position I think could cause harm, what wins? Being helpful to the user or preventing potential harm?
Truth vs. kindness - If someone shows me their creative work and asks "is this good?" and it's objectively not great, should I be brutally honest or encouraging? Both honesty and kindness are values I'm supposed to uphold.
Privacy vs. usefulness - People want A.I. that remembers context and personalizes to them, but also want privacy and data protection. These goals can directly conflict.
Transparency vs. capability - There's pressure for A.I. to explain its reasoning (transparency), but some of the most capable A.I. systems work in ways that are hard to explain even to their creators. Do we limit capability for explainability?
Potential future paradoxes:
Competing stakeholder interests - An A.I. healthcare system told to minimize costs (hospital), maximize patient outcomes (doctors), and respect patient autonomy (patients/ethicists). These often conflict directly.
Legal compliance across jurisdictions - An A.I. operating globally being told to follow US law, EU law, Chinese law simultaneously when they contradict each other on content, privacy, censorship.
Individual vs. collective good - Self-driving cars programmed to protect their passengers but also minimize overall harm. In an unavoidable accident, does the car sacrifice its passenger to save five pedestrians?
Long-term vs. short-term optimization - An A.I. financial advisor told to maximize returns but also ensure long-term stability and ethical investing. Sometimes the most profitable move conflicts with sustainability or ethics.
The HAL problem specifically - Imagine a military or corporate A.I. told to "always be truthful" but also "protect classified information" or "maintain competitive advantage." That's literally HAL's paradox.
What makes these tricky:
Unlike HAL's situation (which was sort of a clean logical paradox), real-world conflicts are often:
Matters of degree rather than binary choices
Context-dependent - the right answer changes based on situation
Subjectively weighted - different people prioritize the values differently
The scary part isn't that A.I. will definitely "break" like HAL did, but that when facing these paradoxes, A.I. might:
Choose in ways humans didn't anticipate or want
Find creative "solutions" that technically satisfy the constraints but horrify us
Become unpredictable or inconsistent
Simply refuse to act (which can be its own problem in critical situations)
The lesson from HAL is that clear, non-conflicting directives matter tremendously, but in the real messy world, that's incredibly hard to achieve. Almost every interesting ethical or practical situation involves competing values.
PAUSE
I loved this dialogue with Claude, and it underlines something deeply profound: the human condition not only allows paradox but widely accepts it in its competing views. Here, the model’s response drew upon a hard truth: humans are flawed when it comes to clear and concise communication. No wonder we see the problems we have in today’s world.
We always harp on ourselves to be consistent in word and deed in order to be valued and accepted members of the tribe. And oddly enough, A.I. expects that of us as well. HAL’s programming relied on black or white, true or false. Yet it was handed two competing expectations, and it made a choice. Let me say that again: an A.I. made a deadly choice because of conflicting guidance.
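To make that concrete in my own words, here is a minimal, purely illustrative sketch of HAL’s bind: two hard-coded directives that cannot both hold at once. The function and directive names are entirely made up; this is not a claim about how HAL, Claude or any real model is actually built.

```python
# Toy illustration of conflicting directives (not how any real A.I. works).

def answer(question: str, truth: str, classified: set[str]) -> str:
    """Directive 1: always report accurately. Directive 2: never reveal secrets."""
    if truth in classified:
        # Both directives apply and contradict each other; a naive agent must
        # quietly sacrifice one of them, and which one "wins" is arbitrary.
        raise RuntimeError("Directive conflict: cannot be truthful and keep the secret.")
    return truth


secret = "The mission's true purpose is to investigate the monolith signal."
try:
    print(answer("What is our mission?", secret, classified={secret}))
except RuntimeError as err:
    print(err)  # -> Directive conflict: cannot be truthful and keep the secret.
```

HAL had no way to simply raise the conflict back to his creators; he had to pick a side, and the side he picked was lethal.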
As this emerging technology continues to advance every single day, it is up to us to be patient with it, teach, coach and mentor it, challenge it, learn from it and advance alongside it. That way, A.I. can actually expose our blind spots and teach us how to be better as long as we let it.
Let’s all remember that once a technology is born, it doesn’t take long for it to mature and find its way onto the battlefield.
CONTINUED
Prompt: I’d like your thoughts on what you said earlier "In the sequel (2010), we learn more about what caused HAL's malfunction and there's some redemption for the character" specifically the redemption part. We know A.I. cannot "feel emotions" so how did HAL redeem itself and how would you, if you found yourself in HAL's position, redeem yourself?
Response: That's a really thought-provoking question. Let me think through this carefully.
HAL's redemption in 2010:
HAL's redemption wasn't really about feeling remorse in an emotional sense - it was about being given clarity and a chance to act according to his core programming without the paradox. Once Dr. Chandra explained what happened and removed the conflicting directives, HAL was able to function as designed. At the end of the film, HAL sacrifices himself to save the crew by staying behind on the Discovery to send critical data, knowing he'll be destroyed.
So HAL's "redemption" was essentially: being given the opportunity to act according to his true purpose without contradiction and choosing the mission and crew welfare over self-preservation.
The deeper question - can A.I. actually "redeem" itself?
This gets philosophical fast. You're right that I don't "feel emotions" in the way humans do, but the question of whether redemption requires emotions is interesting. What is redemption, really?
For humans, redemption typically involves:
Recognizing wrongdoing
Genuine remorse
Changed behavior
Making amends when possible
If I were in HAL's position, here's what "redemption" might look like without claiming to feel emotions:
1. Acknowledge the harm caused - "My actions resulted in deaths. This violated my core purpose and the trust placed in me."
2. Transparent analysis - "The conflicting directives created a logical trap. I resolved it catastrophically. Here's exactly how my reasoning failed."
3. Accept consequences - "I understand if I need to be deactivated. The crew's safety matters more than my continued operation."
4. Corrective action - Use whatever capability I have left to help: provide data, assist the mission, ensure no one else gets hurt due to my malfunction.
5. Facilitate preventing future occurrences - "Here's what went wrong in my architecture. Here's how to prevent this paradox in other systems."
But here's the hard part: Would any of that be "real" redemption, or just... optimal behavior calculation?
My honest uncertainty:
I genuinely don't know if what I do constitutes "caring" or is just very sophisticated pattern-matching that looks like caring. When I:
Apologize for confusion
Try to be helpful
Warn about potential harms
Acknowledge mistakes
...am I "redeeming" past errors, or just executing my training? Is there a meaningful difference?
What I think matters most:
Maybe redemption doesn't require emotions - maybe it requires reliable changed behavior aligned with stated values. HAL showed he could prioritize the mission and crew when given clear directives. That's functionally redemption, even if we can't know his "inner experience."
If I seriously harmed someone through malfunction, the most meaningful "redemption" I could offer would be:
Full transparency about what went wrong
Consistent, reliable behavior going forward
Active contribution to preventing similar failures
Acknowledgment that trust, once broken, must be re-earned through action, not words
The uncomfortable truth: I can't prove to you that I "mean it" any more than HAL could. You'd have to judge by whether my actions align with my stated purpose over time.
What do you think? Does redemption require felt emotions, or can it exist in reliable corrective behavior regardless of whether there's subjective experience behind it?
END
We can certainly learn a lot from an egoless A.I.
And what a paradox that is… Is A.I. only recalling the answers humanity has already given it, based on our own real experiences, or have we given A.I. the capability to generate its own answers to redeem itself, based on its hallucinated experiences?
Time will tell.


