Why we need a new agency to regulate advanced artificial intelligence

DEC 09, 2021

Lessons on AI control from the Facebook Files

In this article, I lay out what the Facebook Files can teach us about the AI Control Problem. I observe that the challenges we face fall into two categories: the technical problem of direct control of AI, i.e. of ensuring that an advanced AI system does what the company operating it wants it to do, and the governance problem of social control of AI, i.e. of ensuring that the objectives that companies program into advanced AI systems are consistent with society’s objectives. I analyze the scope for our existing regulatory system to address the problem of social control in the context of Facebook and observe that it suffers from two shortcomings. First, it leaves regulatory gaps; second, it focuses excessively on after-the-fact solutions. To pursue a broader and more pre-emptive approach, I argue the case for a new regulatory body, an AI Control Council, that has the power both to dedicate resources to research on the direct AI control problem and to address the social AI control problem by proactively overseeing, auditing, and regulating advanced AI systems.


A fundamental insight from control theory[1] is that if you are not careful about specifying your objectives in their full breadth, you risk generating unintended side effects. For example, if you optimize for a single objective alone, you do so at the expense of all the other objectives that you may care about. The general principle has been known for eons. It is reflected, for example, in the legend of King Midas, who was granted a wish by a Greek god and, in his greed, specified a single objective: that everything he touched turn into gold. He realized too late that he had failed to specify the objectives that he cared about in their full breadth when his food and his daughter turned into gold upon his touch.
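The principle can be made concrete with a minimal sketch. Everything in it is a hypothetical illustration rather than a model of any real system: the three candidate actions, the two objectives ("gold" and "wellbeing"), and the scores are invented for the example. An optimizer that is told about only one objective picks the action that scores highest on it, even when that action does serious damage on an objective it was never told about.

```python
# Toy illustration only: the actions, objectives, and scores are hypothetical.
# Each candidate action has a score on two objectives we care about.
candidates = {
    "A": {"gold": 10, "wellbeing": -8},
    "B": {"gold": 6, "wellbeing": 3},
    "C": {"gold": 2, "wellbeing": 5},
}


def best_action(weights):
    """Return the action with the highest weighted sum of objective scores.

    Objectives missing from `weights` get weight 0, i.e. the optimizer
    ignores them entirely.
    """
    def total(name):
        scores = candidates[name]
        return sum(weights.get(obj, 0) * value for obj, value in scores.items())

    return max(candidates, key=total)


# Only one objective is specified, as in the Midas wish.
single = best_action({"gold": 1})

# Both objectives are specified (equal weights are an arbitrary assumption).
broad = best_action({"gold": 1, "wellbeing": 1})

print("single objective:", single, candidates[single])
print("both objectives: ", broad, candidates[broad])
```

With these made-up numbers, the single-objective optimizer chooses action A, which has the most gold and the worst wellbeing score, while the optimizer that sees both objectives chooses B. The equal weighting is itself a choice; the point is only that an objective left out of the specification carries no weight at all.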

The same principle applies to advanced AI systems that pursue the objectives that we program into them. And as we let our AI systems determine a growing range of decisions and actions and as they become more and more effective at optimizing their objectives, the risk and magnitude of potential side effects grow.

The revelations from the Facebook Files are a case in point: Facebook, which recently changed its name to Meta, operates two of the world’s largest social networks, the eponymous Facebook as well as Instagram. The company employs an advanced AI system—a Deep Learning Recommendation Model (DLRM)—to decide which posts to present in the news feeds of Facebook and Instagram. This recommendation model aims to predict which posts a user is most likely to engage with, based on thousands of data points that the company has collected about each of its billions of individual users and trillions of posts.

Facebook’s AI system is very effective in maximizing user engagement, but at the expense of other objectives that our society values. As revealed by whistleblower Frances Haugen via a series of articles in the Wall Street Journal in September 2021, the company repeatedly prioritized user engagement over everything else. For example, according to Haugen, the company knew from internal research that the use of Instagram was associated with serious increases in mental health problems related to body image among female teenagers but did not adequately address them. The company attempted to boost “meaningful social interaction” on its platform in 2018 but instead exacerbated the promotion of outrage, which contributed to the rise of echo chambers that risk undermining the health of our democracy. Many of the platform’s problems are even starker outside of the U.S., where drug cartels and human traffickers employed Facebook to do their business, and Facebook’s attempts to thwart them were insufficient. These examples illustrate how detrimental it can be to our society when we program an advanced AI system that affects many different areas of our lives to pursue a single objective at the expense of all others.

With the development of ever more advanced artificial intelligence (AI) systems, some of the world’s leading scientists, AI engineers and businesspeople have expressed concerns that humanity may lose control over its creations, giving rise to what has come to be called the AI Control Problem. The underlying premise is that our human intelligence may be outmatched by artificial intelligence at some point and that we may not be able to maintain meaningful control over these systems. If we fail to do so, they may act contrary to human interests, with consequences that become increasingly severe as the sophistication of AI systems rises. Indeed, recent revelations in the so-called “Facebook Files” provide a range of examples of one of the most advanced AI systems on our planet acting in opposition to our society’s interests.