ENAIS blog

What is AI safety?

Published on:
October 16, 2023
the ENAIS team

If you prefer video to text, the Center for AI Safety's Introduction to ML Safety is a good place to start.

What does AI safety mean?

AI safety, which includes AI alignment, is a research field that focuses on avoiding catastrophic outcomes from uncontrollable AI. Significant developments towards general intelligence in modern machine learning (for example, ChatGPT and Bard) indicate that we need to ensure that artificially intelligent systems work for the benefit of humanity and that we avoid large-scale risks.

The field of AI safety is also sometimes referred to as Artificial General Intelligence Safety or AI Existential Safety. In our area of focus, AI safety is about safeguarding humanity from scenarios in which AI becomes uncontrollable. These include global systemic risks such as nuclear war and cyberwarfare arising from the use of artificial intelligence, as well as the dangers of self-improving AI systems with emergent, uncontrollable goals.

How can AI cause existential risk?

This is a short introduction. For more extensive explanations, see 80,000 Hours, Ngo, Chan & Mindermann (2023) and Carlsmith (2022).

We expect AI to transform society at least as much as the internet did, with corresponding expectations of economic growth. Transformative AI systems such as GPT models are not always controllable by their creators, leading to situations like Bing Chat threatening its users. This transformation can bring greatly positive changes, but with it come many risks…

When we imagine the future capabilities of artificially intelligent systems, we see that embedding uncontrollable machine learning systems in our critical infrastructure increases many systemic risks. Examples include nuclear war, large-scale cybersecurity risks and financial systems collapse (such as the 2010 flash crash). These systems can also develop emergent goals that lead to uncontrollable self-improvement and unintentionally malicious goal-seeking behaviour.

What are people doing about this?

Work on reducing risks from AI can be split into two sub-fields:

  1. Technical AI safety research tries to solve the difficult technical problem of ensuring that our AI systems have goals that align with ours. This is an important problem because systems that run by themselves or unintentionally optimize for other goals might lead to large-scale risks. See this Google Brain paper for concrete open problems in technical AI safety (read more).
  2. AI governance/policy/strategy focuses on topics such as governments producing useful legislation, AI companies signing safety agreements, and preventing malevolent use of AI (read more).

Many of the leading machine learning companies, such as OpenAI, DeepMind and Anthropic, are focused on the safe deployment of their technologies. Read their perspectives on safety here: OpenAI, DeepMind, Anthropic.

Other research groups are focused purely on the safety of these future systems. These include Alignment Research Center, Redwood Research, Stanford Existential Risk Initiative, Center for Human-Compatible AI, Apart Research, Ought, Aligned AI, Krueger Lab and Center for AI Safety (see more).

What can you do about this?

There is amazing positive potential in artificial intelligence. If we can answer the technical and societal questions, we have a unique opportunity to change the world for the better, with democratic, equitable and collaborative use of the technology.

We recommend that you dive into the introductory texts on AI safety linked above and join courses such as AGI Safety Fundamentals and Introduction to ML Safety. You can also find more books and articles on Reading What We Can.

If you are actively working on AI safety in Europe, join our network. If you have any questions that remain unanswered, please reach out to us at contact@enais.co.

The ENAIS team is a decentralized group of European researchers and organizers. You can find us on the About page.
