Frances Haugen, a former product manager at the company, says she came forward after she saw Facebook’s leadership repeatedly prioritize profit over safety.
Before quitting in May of this year, she combed through Facebook Workplace, the company’s internal employee social media network, and gathered a wide swath of internal reports and research in an attempt to conclusively demonstrate that Facebook had willfully chosen not to fix the problems on its platform.
Today she testified in front of the Senate on the impact of Facebook on society. She reiterated many of the findings from the internal research and implored Congress to act.
Related Story
How Facebook got addicted to spreading misinformation
The company’s AI algorithms gave it an insatiable habit for lies and hate speech. Now the man who built them can't fix the problem.
“I’m here today because I believe Facebook’s products harm children, stoke division, and weaken our democracy,” she said in her opening statement to lawmakers. “These problems are solvable. A safer, free-speech respecting, more enjoyable social media is possible. But there is one thing that I hope everyone takes away from these disclosures, it is that Facebook can change, but is clearly not going to do so on its own.”
During her testimony, Haugen particularly blamed Facebook’s algorithm and platform design decisions for many of its issues. This is a notable shift from the existing focus of policymakers on Facebook’s content policy and censorship—what does and doesn’t belong on Facebook. Many experts believe that this narrow view leads to a whack-a-mole strategy that misses the bigger picture.
“I’m a strong advocate for non-content-based solutions, because those solutions will protect the most vulnerable people in the world,” Haugen said, pointing to Facebook’s uneven ability to enforce its content policy in languages other than English.
Haugen’s testimony echoes many of the findings from an MIT Technology Review investigation published earlier this year, which drew upon dozens of interviews with Facebook executives, current and former employees, industry peers, and external experts. We pulled together the most relevant parts of our investigation and other reporting to give more context to Haugen’s testimony.
Sign up for The Download - Your daily dose of what's up in emerging technology
Stay updated on MIT Technology Review initiatives and events?
Yes
No
How does Facebook’s algorithm work?
Colloquially, we use the term “Facebook’s algorithm” as though there’s only one. In fact, Facebook decides how to target ads and rank content based on hundreds, perhaps thousands, of algorithms. Some of those algorithms tease out a user’s preferences and boost that kind of content up the user’s news feed. Others are for detecting specific types of bad content, like nudity, spam, or clickbait headlines, and deleting or pushing them down the feed.
All of these algorithms are known as machine-learning algorithms. As I wrote earlier this year:
Unlike traditional algorithms, which are hard-coded by engineers, machine-learning algorithms “train” on input data to learn the correlations within it. The trained algorithm, known as a machine-learning model, can then automate future decisions. An algorithm trained on ad click data, for example, might learn that women click on ads for yoga leggings more often than men. The resultant model will then serve more of those ads to women.
And because of Facebook’s enormous amounts of user data, it can
develop models that learned to infer the existence not only of broad categories like “women” and “men,” but of very fine-grained categories like “women between 25 and 34 who liked Facebook pages related to yoga,” and [target] ads to them. The finer-grained the targeting, the better the chance of a click, which would give advertisers more bang for their buck.
The same principles apply for ranking content in news feed:
Just as algorithms [can] be trained to predict who would click what ad, they [can] also be trained to predict who would like or share what post, and then give those posts more prominence. If the model determined that a person really liked dogs, for instance, friends’ posts about dogs would appear higher up on that user’s news feed.
Before Facebook began using machine-learning algorithms, teams used design tactics to increase engagement. They’d experiment with things like the color of a button or the frequency of notifications to keep users coming back to the platform. But machine-learning algorithms create a much more powerful feedback loop. Not only can they personalize what each user sees, they will also continue to evolve with a user’s shifting preferences, perpetually showing each person what will keep them most engaged.
Who runs Facebook’s algorithm?
Within Facebook, there’s no one team in charge of this content-ranking system in its entirety. Engineers develop and add their own machine-learning models into the mix, based on their team’s objectives. For example, teams focused on removing or demoting bad content, known as the integrity teams, will only train models for detecting different types of bad content.
This was a decision Facebook made early on as part of its “move fast and break things” culture. It developed an internal tool known as FBLearner Flow that made it easy for engineers without machine learning experience to develop whatever models they needed at their disposal. By one data point, it was already in use by more than a quarter of Facebook’s engineering team in 2016.
Related Story
She risked everything to expose Facebook. Now she’s telling her story.
Sophie Zhang, a former data scientist at Facebook, revealed that it enables global political manipulation and has done little to stop it.
Many of the current and former Facebook employees I’ve spoken to say that this is part of why Facebook can’t seem to get a handle on what it serves up to users in the news feed. Different teams can have competing objectives, and the system has grown so complex and unwieldy that no one can keep track anymore of all of its different components.
As a result, the company’s main process for quality control is through experimentation and measurement. As I wrote:
Teams train up a new machine-learning model on FBLearner, whether to change the ranking order of posts or to better catch content that violates Facebook’s community standards (its rules on what is and isn’t allowed on the platform). Then they test the new model on a small subset of Facebook’s users to measure how it changes engagement metrics, such as the number of likes, comments, and shares, says Krishna Gade, who served as the engineering manager for news feed from 2016 to 2018.
If a model reduces engagement too much, it’s discarded. Otherwise, it’s deployed and continually monitored. On Twitter, Gade explained that his engineers would get notifications every few days when metrics such as likes or comments were down. Then they’d decipher what had caused the problem and whether any models needed retraining.
How has Facebook’s content ranking led to the spread of misinformation and hate speech?
During her testimony, Haugen repeatedly came back to the idea that Facebook’s algorithm incites misinformation, hate speech, and even ethnic violence.
“Facebook … knows—they have admitted in public—that engagement-based ranking is dangerous without integrity and security systems but then not rolled out those integrity and security systems in most of the languages in the world,” she told the Senate today. “It is pulling families apart. And in places like Ethiopia it is literally fanning ethnic violence.”
Here’s what I’ve written about this previously:
The machine-learning models that maximize engagement also favor controversy, misinformation, and extremism: put simply, people just like outrageous stuff.
Sometimes this inflames existing political tensions. The most devastating example to date is the case of Myanmar, where viral fake news and hate speech about the Rohingya Muslim minority escalated the country’s religious conflict into a full-blown genocide. Facebook admitted in 2018, after years of downplaying its role, that it had not done enough “to help prevent our platform from being used to foment division and incite offline violence.”
As Haugen mentioned, Facebook has also known this for a while. Previous reporting has found that it’s been studying the phenomenon since at least 2016.
In an internal presentation from that year, reviewed by the Wall Street Journal, a company researcher, Monica Lee, found that Facebook was not only hosting a large number of extremist groups but also promoting them to its users: “64% of all extremist group joins are due to our recommendation tools,” the presentation said, predominantly thanks to the models behind the “Groups You Should Join” and “Discover” features.
In 2017, Chris Cox, Facebook’s longtime chief product officer, formed a new task force to understand whether maximizing user engagement on Facebook was contributing to political polarization. It found that there was indeed a correlation, and that reducing polarization would mean taking a hit on engagement. In a mid-2018 document reviewed by the Journal, the task force proposed several potential fixes, such as tweaking the recommendation algorithms to suggest a more diverse range of groups for people to join. But it acknowledged that some of the ideas were “antigrowth.” Most of the proposals didn’t move forward, and the task force disbanded.
In my own conversations, Facebook employees also corroborated these findings.
A former Facebook AI researcher who joined in 2018 says he and his team conducted “study after study” confirming the same basic idea: models that maximize engagement increase polarization. They could easily track how strongly users agreed or disagreed on different issues, what content they liked to engage with, and how their stances changed as a result. Regardless of the issue, the models learned to feed users increasingly extreme viewpoints. “Over time they measurably become more polarized,” he says.
Related Story
The race to understand the exhilarating, dangerous world of language AI
Hundreds of scientists around the world are working together to understand one of the most powerful emerging technologies before it’s too late.
In her testimony, Haugen also repeatedly emphasized how these phenomena are far worse in regions that don’t speak English because of Facebook’s uneven coverage of different languages.
“In the case of Ethiopia there are 100 million people and six languages. Facebook only supports two of those languages for integrity systems,” she said. “This strategy of focusing on language-specific, content-specific systems for AI to save us is doomed to fail.”
She continued: “So investing in non-content-based ways to slow the platform down not only protects our freedom of speech, it protects people’s lives.”
I explore this more in a different article from earlier this year on the limitations of large language models, or LLMs:
Despite LLMs having these linguistic deficiencies, Facebook relies heavily on them to automate its content moder
Educating People On Demand
Comments
Post a Comment