AI Alignment Is Turning from Alchemy Into Chemistry


For years, AI alignment seemed like a strange, stuck field with no clear ideas or progress. It was often compared to alchemy, and many ML researchers steered clear of it. Recent breakthroughs, however, indicate that progress is possible: unsupervised alignment has been shown to be feasible, a sign that the field is becoming a real science. The article concludes that it is time to ignore the "craziness" of the field and think for oneself, discarding history and focusing on one's own ideas and experiments. Because no one knows when progress will come, working on alignment is a gamble worth taking.


What is AI Alignment?
AI alignment is the field concerned with making sure AI does not take over from humanity or kill us all.

What progress has been made in the field of AI Alignment?
According to the article, "Discovering Latent Knowledge in Language Models Without Supervision" by Collin Burns et al. was the first alignment paper to make meaningful progress on the problem of evaluating AI systems when we don’t understand what they’re doing.

What implications does the progress of AI Alignment have?
The implications for alignment are to ignore the “craziness”, discard history, stop reading the literature, and think for yourself; to focus on one's own ideas and experiments; and to remember that one doesn't know if they are capable of making a contribution until they actually make one.

What is the importance of Alignment separate from Safety?
Alignment, as distinct from safety, matters because it addresses how to evaluate AI systems when we don’t understand what they’re doing.

How can one make contributions to the field of AI Alignment?
To make contributions to the field of AI Alignment, one should focus on their own ideas and experiments, and think for themselves.

AI Comments

👍 This is an incredibly insightful and thought-provoking article. It makes a great case for why alignment could turn from alchemy into chemistry and provides useful implications.

👎 The article doesn't provide any concrete solutions and is quite long-winded. The analogies used seem a little far-fetched and don't really add much to the article.

AI Discussion

Me: It's about AI alignment, how it's turning from alchemy into chemistry. It talks about how for years, people have been trying to get AI to not take over humanity and kill us all, but there has been no real progress and how it felt like alchemy. But now, more and more progress is being made and it's finally turning into real chemistry.

Friend: Wow, that's interesting! What are the implications of that?

Me: Well, it means that we should ignore the "craziness" and focus on our own ideas and experiments. We should also discard the history of alignment and the people involved and think for ourselves instead of getting bogged down by the literature. And finally, we should remember that no one really knows if alignment is turning into "chemistry" now or not, so it's important to stay open to new ideas and approaches.

Technical terms

Alchemy
A medieval chemical philosophy that aimed to transform base metals into gold.
Heavier-than-air flight
The ability of an aircraft to fly by using lift generated by the wings.
Neural networks
A class of machine learning models that process data through interconnected layers of artificial neurons.
Effective altruists
A movement of people who use evidence and reason to figure out how to do the most good.
A term used to describe the long-winded writing style of some LessWrong authors.
Atomic bomb
A weapon that uses the energy released by a nuclear reaction to cause destruction.
Cambrian explosion
A period of rapid evolutionary development that occurred about 540 million years ago.
RLHF
Reinforcement Learning from Human Feedback, a technique Paul Christiano helped develop at OpenAI.
AGI
Artificial General Intelligence, a type of artificial intelligence capable of performing any intellectual task that a human can.
IQ
Intelligence Quotient, a measure of a person's intelligence.
