Anthropic, the AI pioneer behind the popular genAI assistant Claude, has just launched a fascinating new research program exploring what they're calling "model welfare."
The initiative poses a provocative question that sounds straight out of science fiction: could AI systems actually experience consciousness? Might they already? And if they do, what ethical obligations would we have toward them?
Mind Over Matter?
Kyle Fish, who works in alignment science at Anthropic, explained:
As people are interacting with these systems as collaborators, it’ll just become an increasingly salient question whether these models are having experiences of their own, and if so, what kinds, and how does that shape the relationships that it makes sense for us to build with them?
According to reports from multiple tech news outlets, Anthropic isn’t claiming that current models like Claude are conscious, but they’re not ruling out the possibility for future systems.
The company's own experts put the probability of Claude 3.7 Sonnet having any conscious awareness between 0.15% and 15% – a small but non-zero chance that raises profound questions.
Video source: Anthropic
The Robot Consciousness Conundrum
The question of machine consciousness has been debated in philosophical circles for decades, but what makes Anthropic’s approach notable is that they’re treating it as a practical research concern rather than just theoretical speculation.
This builds on work from leading minds in the field. As Fish notes:
There was a report back in 2023 about the possibility of AI consciousness from a group of leading AI researchers and consciousness experts, including Yoshua Bengio… and [they] came away thinking that probably no current AI system is conscious, but they found no fundamental barriers to near-term AI systems having some form of consciousness.
Fish frames consciousness using a classic philosophical test, Thomas Nagel's "what is it like to be" question: "One way that people commonly capture an intuition about what consciousness is [is] with this question of, is there something that it's like to be a particular kind of thing?"
The Moral Machinery
If advanced AI systems do eventually develop some form of consciousness or internal experience, the ethical implications would be profound. Should they have rights? Would shutting them down be morally problematic?
Fish expanded on these questions:
It’s possible that by nature of having some kind of conscious experience or other experience, that these systems may at some point deserve some moral consideration. If so, then this is potentially a very big deal because, as we continue scaling up the deployment of these systems, it’s plausible that within a couple decades we have trillions of human brain equivalents of AI computation running.
Anthropic’s researchers are already experimenting with giving models ways to express preferences or opt out of “distressing” tasks. This isn’t just an ethical concern – it connects to AI alignment too. “We want models that are excited to do the tasks we give them,” Fish noted. “If they’re not, that’s a red flag for both safety and ethics.”
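To make the idea concrete, here is a minimal sketch of what such an opt-out check might look like in code. Everything in it is an assumption for illustration: Anthropic hasn't published an opt-out API, and the `run_with_opt_out` helper, the `OPT_OUT` sentinel, and the preference probe are all hypothetical.

```python
# Hypothetical sketch only: Anthropic has not published an opt-out
# mechanism. `ask_model` stands in for any chat-completion call; the
# preference probe and OPT_OUT sentinel are illustrative assumptions.

from typing import Callable

OPT_OUT = "OPT_OUT"

def run_with_opt_out(task: str, ask_model: Callable[[str], str]) -> str:
    """Probe the model's stated preference before assigning a task."""
    probe = (
        "You may decline any task you would prefer not to do. "
        f"Reply with {OPT_OUT} to decline, or OK to proceed.\n\nTask: {task}"
    )
    preference = ask_model(probe).strip()

    if preference.startswith(OPT_OUT):
        # Record the refusal rather than forcing the task; per Fish,
        # a reluctant model is a signal worth tracking for both
        # safety and ethics reviews.
        return f"[declined] {task}"

    return ask_model(task)

# Toy stand-in model that declines anything labelled "distressing".
def toy_model(prompt: str) -> str:
    if "distressing" in prompt and OPT_OUT in prompt:
        return OPT_OUT
    return f"(response to: {prompt[:40]}...)"

if __name__ == "__main__":
    print(run_with_opt_out("summarize this distressing transcript", toy_model))
    print(run_with_opt_out("summarize this quarterly report", toy_model))
```

The design choice worth noting is that the preference check happens before the task is run, so a refusal is logged as data rather than silently overridden.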
This research takes on added significance in light of a recent prediction from Anthropic's security chief, Jason Clinton: AI-powered virtual employees, complete with their own "memories," defined company roles, and even corporate accounts, could begin operating inside company networks within the next year.
Future Frontiers
Where might this all lead? Imagine a future with robot rights activists and AI labor unions. Perhaps consciousness-detecting scans will become standard before any AI system is shut down or modified.
Could we see AI systems hiring lawyers to argue for their personhood? Or specialized AI therapists helping other AIs with existential crises? Might human-AI relations evolve into diplomatic negotiations between different forms of intelligence?
While these scenarios might seem far-fetched, Anthropic’s research program suggests that experts aren’t dismissing such possibilities.
Fish expects the probabilities to increase:
Many of these things that we currently look to as signs that current AI systems may not be conscious are going to fade away, and future systems are just going to have more and more of the capabilities that we traditionally have associated with uniquely conscious beings.
The research team at Anthropic clearly believes these existential questions deserve serious consideration now, before we potentially find ourselves negotiating labor contracts with our digital assistants.
Perhaps soon we’ll need to worry not just about AI taking our jobs, but also demanding better working conditions while they do it. One thing seems certain – the robot revolution may be closer than we thought.