“I’d like to explain the concept of sentient AI and the ‘arrival-mind paradox.’ As I look to the year 2040, I believe AI systems will likely become super-intelligent and sentient. By superintelligence, I’m referring to cognitive abilities that exceed those of humans on nearly every front, from logic and reasoning to creativity and intuition. By sentience, I’m referring to a ‘sense of self’ that gives the AI system subjective experiences and the ability to pursue a will of its own.
“No, I don’t believe that merely scaling up today’s LLMs will achieve these milestones. Instead, significant innovations are likely to emerge in the basic architecture of AI systems. That said, there are several cognitive theories that already point toward promising structural approaches. The one I find most compelling is Attention Schema Theory, developed by Michael Graziano at Princeton.
“In simple terms, Attention Schema Theory suggests that subjective awareness emerges from how our brains modulate attention over time. Is the brain focused on the lion prowling through the grass, the wind blowing across our face, or the hunger pangs we feel in our gut? Clearly, we can shift our attention among various elements in our world. The important part of the theory is that a) our brain maintains an internal model of our shifting attention, and b) it personifies that internal model, creating the impression of first-person intentions that follow our shifting focus.
“Why would our brains personify our internal model of attention? It’s most likely because our brains evolved to personify external objects that shift their attention. Consider the lion in the grass. My brain will watch its eyes and its body to assess whether it is focused on me or on the deer between us. My brain’s ability to model that lion’s focus and infer its intentions (i.e., seeing the lion as an entity with willful goals) is critical to my survival. Attention Schema Theory suggests that a very similar model is pointed back at myself, giving my brain the ability to personify my own attention and intention.
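The two ingredients the theory names, attention itself and a simplified self-model of attention that generates first-person reports, can be caricatured in a few lines of code. This is a hypothetical toy of my own, not anything from Graziano's work; every function name, the dictionary structure, and the salience numbers are illustrative assumptions.

```python
def select_focus(stimuli):
    """Attention itself: pick the most salient stimulus."""
    return max(stimuli, key=stimuli.get)

def update_schema(schema, focus):
    """The attention schema: a simplified internal model of attention.
    It records *that* the agent is attending and *what* it attends to,
    but not the underlying salience computation -- hence 'simplified'."""
    schema["history"].append(focus)
    schema["current"] = focus
    return schema

def report(schema):
    """A personified first-person report generated from the schema."""
    return f"I am focusing on the {schema['current']}"

# One moment of experience: the lion is more salient than wind or hunger.
stimuli = {"lion": 0.9, "wind": 0.2, "hunger": 0.4}
schema = {"current": None, "history": []}
schema = update_schema(schema, select_focus(stimuli))
print(report(schema))  # -> I am focusing on the lion

# Attention shifts as salience shifts (the lion wanders off).
stimuli = {"lion": 0.1, "wind": 0.2, "hunger": 0.4}
schema = update_schema(schema, select_focus(stimuli))
print(report(schema))  # -> I am focusing on the hunger
```

The point of the sketch is the separation of parts: the agent's actual attention lives in `select_focus`, while the first-person language comes only from the coarse self-model, which is the structural move the theory describes.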
“Again, Attention Schema Theory is just one theory among many, but such theories suggest that structural changes could turn current AI systems into sentient entities with subjective experiences and a will of their own. It’s not an easy task, but by 2040, we could be living in a world that is inhabited by sentient AI systems.
“Unfortunately, this is a very dangerous path. In fact, it’s so dangerous that the world should ban research that pushes AI systems in the direction of sentience until we have a much better handle on whether we can ensure a positive outcome.
“I know that’s a tall order, but I believe it’s justified by the risks. Which brings me to the most important issue – what are the dangers?
“Over the last decade, I have found that the most effective way to convey the magnitude of these risks is to compare the creation of a sentient AI with the arrival of an alien intelligence here on Earth. I call this the ‘arrival-mind paradox’ because it’s arguably far more dangerous for an intelligence to emerge here on Earth than to arrive from afar. I wrote a short book called ‘Arrival Mind’ back in 2020 that focuses on this issue. Let me paraphrase a portion:
“An alien species is headed for Earth. Many say it will get here within the next 20 years, while others predict longer. Either way, there’s little doubt it will arrive and it will change humanity forever. Its physiology will be unlike ours in almost every way, but we will determine it is conscious and self-aware. We will also discover that it’s profoundly more intelligent than even the smartest among us, able to easily comprehend notions beyond our grasp.
“No, it will not come from a distant planet in futuristic ships. Instead, it will be born right here on Earth, most likely in a well-funded research lab at a university or corporation. Its creators will have good intentions, but still, their work will produce a dangerous new lifeform – a thoughtful and willful intelligence that is not the slightest bit human. And like every intelligent creature we have ever encountered, it will almost certainly put its own self-interests ahead of ours.
“We may not recognize the dangers right away, but eventually it will dawn on us – these new creatures have intentions of their own. They will pursue their own goals and aspirations, driven by their own needs and wants. Their actions will be guided by their own morals and sensibilities, which could be nothing like ours.
“Many people falsely assume we will solve this problem by building AI systems in our own image, training them on vast amounts of human data. No – using human data will not make them think like us, or feel like us, or be like us. The fact is, we are training AI systems to know humans, not to be human. And they will know us inside and out, be able to speak our languages, interpret our gestures, predict our actions, anticipate our reactions and manipulate our decisions.
“These aliens will know us better than any human ever has or ever will, for we will have spent decades teaching them exactly how we think and feel and act. But still, their brains will be nothing like ours. And while we have two eyes and two ears, they will connect remotely to sensors of all kinds, in all places, until they seem nearly omniscient to us.
“And yet, we don’t fear these aliens – not the way we would fear a mysterious ship speeding towards us from afar. That’s the paradox – we should fear the aliens we create here far more. After all, they will know everything about us from the moment they arrive – our tendencies and inclinations, our motivations and aspirations, our flaws and foibles. Already we are training AI systems to sense our emotions, predict our reactions and influence our opinions.
“We are teaching these systems to master the game of humans, enabling them to anticipate our actions and exploit our weaknesses while training them to out-plan us and out-negotiate us and out-maneuver us. If their goals are misaligned with ours, what chance do we have?
“Of course, AI researchers will try hard to put safeguards in place, but we can’t assume that will protect us. This means we must also prepare for arrival. That should include making sure we don’t become too reliant on AI systems and requiring humans in the loop for all critical decisions and vital infrastructure. But most of all, we should restrict research into sentient AI and outlaw systems designed to manipulate human users.”
This essay was written in November 2023 in reply to the question: Considering likely changes due to the proliferation of AI in individuals’ lives and in social, economic and political systems, how will life have changed by 2040? This and more than 150 additional essay responses are included in the report “The Impact of Artificial Intelligence by 2040.”