
OpenAI, a non-profit artificial intelligence research firm, published a paper yesterday arguing that long-term AI safety research needs social scientists to make sure that AI alignment algorithms succeed when actual humans are involved. AI alignment (or value alignment) refers to the task of ensuring that AI systems reliably do what humans want them to do. “Since we are trying to behave in accord with people’s values, the most important data will be data from humans about their values”, states the OpenAI team.

However, to properly align advanced AI systems with human values, many uncertainties related to the psychology of human rationality, emotion, and biases would have to be resolved. The researchers believe that these can be resolved through experimentation, in which they study humans in order to train AI to reliably do what humans want it to do. This would involve asking people what they want from AI, and then training machine learning models on this data. Once these models are trained, AI systems can be optimized to perform well according to them.
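The ask, train, and optimize loop described above can be illustrated with a toy sketch. This is not OpenAI's method: the options, their two made-up features, the pairwise comparisons, and the choice of a linear Bradley-Terry preference model are all assumptions picked only to keep the example small.

```python
import math

# Hypothetical toy data: each option is described by two made-up features,
# (helpfulness, risk). Nothing here comes from the paper.
OPTIONS = {
    "a": (1.0, 0.0),
    "b": (0.0, 1.0),
    "c": (0.5, 0.5),
}
# (preferred, rejected) pairs, standing in for answers collected from people.
COMPARISONS = [("a", "b"), ("a", "c"), ("c", "b")]


def score(weights, features):
    """Linear score of a feature vector under the learned weights."""
    return sum(w * x for w, x in zip(weights, features))


def fit_preference_model(options, comparisons, steps=500, lr=0.5):
    """Fit a linear Bradley-Terry model by gradient ascent on the log-likelihood."""
    weights = [0.0] * len(next(iter(options.values())))
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for winner, loser in comparisons:
            diff = [x - y for x, y in zip(options[winner], options[loser])]
            # Probability the model currently assigns to the observed preference.
            p = 1.0 / (1.0 + math.exp(-score(weights, diff)))
            for i, d in enumerate(diff):
                grad[i] += (1.0 - p) * d
        weights = [w + lr * g for w, g in zip(weights, grad)]
    return weights


def best_option(weights, options):
    """Optimize against the learned model: pick the highest-scoring option."""
    return max(options, key=lambda name: score(weights, options[name]))


weights = fit_preference_model(OPTIONS, COMPARISONS)
chosen = best_option(weights, OPTIONS)
print(chosen, weights)
```

The sketch also shows why the quality of the human answers matters: the final choice depends entirely on the comparisons people supplied, so biased or inconsistent answers would carry straight through to the optimized behavior.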

But it’s not that simple, because humans can’t be completely relied upon when it comes to answering questions about their values. “Humans have limited knowledge and reasoning ability, and exhibit a variety of cognitive biases and ethical beliefs that turn out to be inconsistent on reflection”, states the OpenAI team. The researchers believe that the way a question is presented can interact with human biases in different ways, which in turn can produce either low or high-quality answers.

To address this issue, the researchers propose experimental debates that consist only of humans in place of the ML agents. Although these experiments will be motivated by ML algorithms, they will not involve any ML systems or require any ML background.


“Our goal is ML+ML+human debates, but ML is currently too primitive to do many interesting tasks. Therefore, we propose replacing ML debaters with human debaters, learning how to best conduct debates in this human-only setting, and eventually applying what we learn to the ML+ML+human case”, reads the paper.

Since these human debates don’t require any machine learning, they become a purely social science experiment that is motivated by ML considerations but does not need ML expertise to run. This, in turn, keeps the core focus on the component of AI alignment uncertainty that is specific to humans.

The researchers state that a large proportion of AI safety researchers are focused on machine learning, even though it is not necessarily a sufficient background for conducting these experiments. This is why social scientists with experience in human cognition, behavior, and ethics are needed for the careful design and implementation of these rigorous experiments.

“This paper is a call for social scientists in AI safety. We believe close collaborations between social scientists and ML researchers will be necessary to improve our understanding of the human side of AI alignment and hope this paper sparks both conversation and collaboration”, state the researchers.

For more information, check out the official research paper.
