On Wednesday, Google DeepMind released a comprehensive report outlining its approach to ensuring the safety of Artificial General Intelligence (AGI), which refers to AI capable of performing any task that a human can.

The concept of AGI remains controversial within the AI community: some experts dismiss it as unrealistic, while others, including prominent labs like Anthropic, warn that it is imminent and could have catastrophic consequences without proper safeguards.

The 145-page document, co-authored by DeepMind co-founder Shane Legg, predicts that AGI could emerge by 2030 and potentially lead to “severe harm,” including “existential risks” that could “permanently destroy humanity.” Although the paper does not provide a clear definition of “severe harm,” it emphasizes the need for caution.

According to the authors, “[we anticipate] the development of an Exceptional AGI before the end of the current decade,” defining it as “a system that has a capability matching at least the 99th percentile of skilled adults on a wide range of non-physical tasks, including metacognitive tasks like learning new skills.”

The report contrasts DeepMind’s approach to AGI risk mitigation with that of Anthropic and OpenAI, suggesting that Anthropic places less emphasis on “robust training, monitoring, and security,” while OpenAI is overly reliant on “automating” alignment research, a form of AI safety research.

Additionally, the paper expresses skepticism about the near-term feasibility of superintelligent AI, meaning AI that surpasses human performance across virtually all tasks. While OpenAI has recently shifted its focus toward superintelligence, the DeepMind authors remain unconvinced that such systems will emerge soon, if ever, without significant architectural innovations.

However, the authors do consider it plausible that current AI paradigms could enable “recursive AI improvement,” a positive feedback loop where AI conducts its own AI research to create more sophisticated systems, which could pose significant risks.

At a high level, the paper proposes developing techniques to prevent malicious actors from accessing hypothetical AGI, improving our understanding of AI systems’ actions, and “hardening” the environments in which AI operates. The authors acknowledge that many of these techniques are still in their infancy and face “open research problems,” but emphasize the need to address potential safety challenges proactively.

As the authors note, “The transformative nature of AGI has the potential for both incredible benefits and severe harms. To build AGI responsibly, it is critical for frontier AI developers to proactively plan to mitigate severe harms.”

However, not all experts agree with the paper’s premises.

Heidy Khlaaf, chief AI scientist at the nonprofit AI Now Institute, told TechCrunch that the concept of AGI is too ill-defined to be evaluated rigorously in scientific terms. Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, questioned the feasibility of recursive AI improvement, citing a lack of evidence that it works.

“Recursive improvement is the basis for the intelligence singularity arguments,” Guzdial told TechCrunch, “but we’ve never seen any evidence for it working.”

Sandra Wachter, a researcher at Oxford studying tech and regulation, highlighted what she sees as a more pressing concern: the potential for AI to reinforce itself with “inaccurate outputs.” She noted that as generative AI outputs proliferate across the internet and gradually displace authentic data, models are increasingly learning from their own outputs, which are often riddled with inaccuracies.

“At this point, chatbots are predominantly used for search and truth-finding purposes,” she told TechCrunch. “That means we are constantly at risk of being fed mistruths and believing them because they are presented in very convincing ways.”

While DeepMind’s report is comprehensive, it is unlikely to resolve the ongoing debates about the feasibility of AGI and the most pressing areas of AI safety that require attention.

