AI SAFETY
Human-level artificial intelligence (AI) might seem like something that could only exist in science fiction, but forecasters are predicting with increasing confidence that transformative AI will be developed within the next 100 years. The effects of advancing this technology could be amazing: AI could help us solve many of our current problems and achieve many of our goals, from combatting climate change and improving healthcare to boosting productivity and countless other things we probably haven't even thought of yet. Alternatively, AI systems could be misused, or become misaligned, potentially leading to political collapse or even human disempowerment. As a consequence, the work we do on AI safety in the next few decades may well determine which of these futures we are heading into.
1. AI will have a huge impact on employment and the labour market: it is estimated that 85 million jobs will be replaced by machines with AI by 2025, but also that 97 million new roles may emerge from this shift.
2. Around $50 million was spent on reducing the worst risks from AI in 2020, while billions were spent advancing AI capabilities. Even the lowest estimates therefore suggest that over a hundred times more money is spent on advancing AI than on reducing its risks.
3. A leading non-profit global policy think tank warns that the use of AI in military applications could undermine geopolitical stability and erode the status of nuclear weapons as a means of deterrence through mutually assured destruction, potentially leading to the outbreak of nuclear war as soon as 2040.
4. According to a research fellow at the University of Oxford's Future of Humanity Institute, unaligned AI (which deviates from our values and interests) is the most significant existential risk, representing a 1 in 10 chance of destroying humanity's long-term potential.
Below are links to a number of resources to help you explore AI safety as a cause area and learn how we can contribute to understanding and managing it.
Introduction to AI Safety
Here are some popular introductions to AI safety which discuss the risks posed by advancing artificial intelligence. These are great primers on the topic of AI if you aren't very familiar with it.
Current Uses & Capabilities
The applications of AI can be seen all around us. Here are just a few select examples that you may or may not already know about. While they promise exciting innovation, you may also want to think about the potential risks they could bring.
- ChatGPT — A chatbot capable of generating human-like text outputs based on prompts given by the user.
- ACT-1 — A large-scale transformer trained to use digital tools, from a spreadsheet program for managing databases to a web browser like the one you're using right now to view this page.
- DALL·E 2 — An AI system which can create original and realistic images and art from a text description, combining a range of concepts, attributes and styles.
- COMPAS — Standing for Correctional Offender Management Profiling for Alternative Sanctions, this tool is used by U.S. courts to predict the likelihood of recidivism (reoffending and violation of parole) to help them manage cases and support decision-making.
80,000 Hours: Since 2012, the amount of computational power used for training the largest AI models has been rising exponentially, growing by over 1 billion times, or doubling every 3.4 months.
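As a rough sanity check on how those figures fit together (a back-of-the-envelope sketch; the 8.5-year window is an assumption for illustration, not a figure from 80,000 Hours): doubling every 3.4 months for roughly 102 months gives about 2^30, i.e. on the order of a billion-fold growth.

```python
# Back-of-the-envelope check: how much growth does a 3.4-month doubling
# time imply over a given period? Illustrative only; the 8.5-year window
# is an assumption, not 80,000 Hours' data.

DOUBLING_TIME_MONTHS = 3.4

def growth_factor(months: float) -> float:
    """Total multiplicative growth after `months` of sustained exponential growth."""
    return 2 ** (months / DOUBLING_TIME_MONTHS)

months = 8.5 * 12  # roughly 8.5 years
print(f"Growth over {months:.0f} months: {growth_factor(months):.2e}x")
# ~1.1e9, i.e. on the order of a billion-fold
```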
“The danger of artificial intelligence isn't that it's going to rebel against us, but that it's going to do exactly what we ask it to do.”
Sources of AI Risk
The dangers of AI are often depicted as robot uprisings, in which AI systems turn against humans and achieve world domination. However, AI risks may arise in various ways that unintentionally harm our interests or even threaten to destroy the long-term potential of life. Information on the different scenarios and mechanisms of AI risk can be found here.
- 'How we could stumble into AI catastrophe' — a post which presents a range of stylised stories depicting how powerful and capable AI could bring about world-changing consequences.
Power-seeking misaligned AI
- AI Alignment Forum — A place for people to discuss concerns and potential solutions to the AI alignment problem, i.e. how AI can be developed to incorporate human or moral values and interests.
- 'Is Power-Seeking AI an Existential Risk?' — A report by Joseph Carlsmith (Open Philanthropy) which examines the core argument for concern about existential risk: that AI could actively search for ways to acquire power or cause harm. Also available as a video presentation.
- 'Outer vs inner alignment: three framings' — A post which frames the distinction between the outer alignment problem (of specifying a reward function which captures human preferences) and the inner alignment problem (of ensuring that a policy trained on that reward function actually tries to act in accordance with human preferences). A toy sketch of the outer alignment problem follows this list.
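To make the outer alignment framing above more concrete, here is a minimal toy sketch (entirely hypothetical actions and reward numbers, not drawn from the post): an agent optimising a proxy reward that only partially captures the designer's intent ends up choosing a different action from the one the designer wanted.

```python
# Toy illustration of the outer alignment problem: the designer's true
# objective is only partially captured by the proxy reward the agent
# optimises, so the reward-maximising action is not the intended one.
# Purely hypothetical actions and numbers, for illustration only.

actions = {
    "tidy carefully":        {"mess_removed": 8,  "vase_broken": False},
    "sweep everything away": {"mess_removed": 10, "vase_broken": True},
    "do nothing":            {"mess_removed": 0,  "vase_broken": False},
}

def proxy_reward(outcome):
    # What was actually written down: reward mess removal only.
    return outcome["mess_removed"]

def intended_utility(outcome):
    # What was actually wanted: remove mess, but never break the vase.
    return outcome["mess_removed"] - (100 if outcome["vase_broken"] else 0)

best_for_proxy = max(actions, key=lambda a: proxy_reward(actions[a]))
best_for_us = max(actions, key=lambda a: intended_utility(actions[a]))

print("Agent optimising the proxy picks:", best_for_proxy)  # sweep everything away
print("The designer actually wanted:   ", best_for_us)      # tidy carefully
```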
Misuse by political institutions
- 'The Corruption Risks of Artificial Intelligence' — A working paper on how AI may be intentionally developed for corrupt purposes or exploited by governments in corrupt ways, which also offers some recommendations related to regulatory, technical and human factors.
Entrenchment of institutional discrimination and bias
- COMPAS — ProPublica published its analysis of the COMPAS recidivism algorithm (a tool used by U.S. courts to manage cases and support decision-making based on an algorithm-generated score on the likelihood of reoffending or violating parole) and found that "black defendants were 45 percent more likely to be assigned higher risk scores than white defendants".
  - While the methodology and findings have been criticised (reported in The Atlantic and by a criminal justice think tank), it is still important to think about how AI has the potential to deepen inequality. A hypothetical sketch of measuring this kind of disparity follows this list.
- Amazon recruitment tool — an algorithm being tested by the company was scrapped after it was reported to penalise women candidates and "effectively taught itself that male candidates were preferable".
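As a purely hypothetical sketch of what measuring this kind of disparity can look like (invented data, not the COMPAS dataset, and not ProPublica's methodology), the snippet below compares how often two groups receive a "high risk" label.

```python
# Hypothetical illustration of checking for disparity in risk labels.
# The records below are invented; this is not the COMPAS dataset and
# not ProPublica's methodology.

records = [
    {"group": "A", "high_risk": True},
    {"group": "A", "high_risk": False},
    {"group": "A", "high_risk": True},
    {"group": "B", "high_risk": False},
    {"group": "B", "high_risk": True},
    {"group": "B", "high_risk": False},
]

def high_risk_rate(group: str) -> float:
    """Fraction of a group's records labelled high risk."""
    members = [r for r in records if r["group"] == group]
    return sum(r["high_risk"] for r in members) / len(members)

rate_a, rate_b = high_risk_rate("A"), high_risk_rate("B")
print(f"Group A high-risk rate: {rate_a:.0%}")
print(f"Group B high-risk rate: {rate_b:.0%}")
print(f"Relative disparity: {rate_a / rate_b:.2f}x")
```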
Scepticism & Uncertainty
As much as it is important to understand AI risks, you may be wondering about the possible counterarguments. You might want to take some time to explore these uncertainties and decide for yourself how important it is that we focus on AI safety.
- 'My highly personal skepticism braindump on existential risk from artificial intelligence' — expresses the author's uneasiness over high estimates of existential risk from AGI (e.g., 80% doom by 2070).
- 'Counterarguments to the basic AI risk case' — a detailed assessment which argues that there are key gaps in the proposition that superhuman AI systems present an existential risk.
Mitigation & Governance
There are a number of organisations which research and advocate for AI safety. Potential solutions and what is currently being done to regulate AI are explored here.
- OpenAI — a research lab dedicated to conducting fundamental, long-term research toward the creation of safe AGI.
- Legal Priorities Project — an organisation whose mission is "to conduct and support strategic legal research that mitigates existential risk and promotes the flourishing of future generations".
- European Union Artificial Intelligence Act — proposed legislation by a major regulator to assign risk levels to AI applications, the first of its kind.
Careers & Opportunities
Think that AI safety is important and something you would be interested in working on? Here you can find out more about how you can pursue a career in the field, or test your potential to have an impact in it.
- AI Safety Training — a living document of AI safety programs, conferences, and other events.
- AI Safety Support — an organisation which supports people wanting to reduce AI risk by offering them career coaching and building a community of those sharing this goal.
- 'How to pursue a career in technical AI alignment' — useful information containing important considerations for those interested in working on AI safety.
- 'Levelling Up in AI Safety Research Engineering' — a career guide with goal checklists and resources to work through independently if you want to break into AI safety research engineering.
“We estimate there are around 400 people around the world working directly on reducing the chances of an AI-related existential catastrophe (with a 90% confidence interval ranging between 200 and 1,000).”
General Reading:
- 'Key Concepts in AI Safety: An Overview' — An introduction to AI safety in six pages, categorising the area into the issues of robustness, assurance and specification.
- 'But what is a neural network?' — A highly accessible video explanation of neural networks.
- 'A short introduction to machine learning' — A post providing a broad framework for how AI works which will contextualise more detailed explanations if you want to read more on machine learning.
- Brian Christian, The Alignment Problem: Machine Learning and Human Values
- Nick Bostrom, Superintelligence
- Ben Garfinkel, The Impact of Artificial Intelligence: A Historical Perspective