Every time you open Facebook, one of the world’s most influential,
controversial, and misunderstood algorithms springs into action. It
scans and collects everything posted in the past week by each of your
friends, everyone you follow, each group you belong to, and every
Facebook page you’ve liked. For the average Facebook user, that’s more
than 1,500 posts. If you have several hundred friends, it could be as
many as 10,000. Then, according to a closely guarded and constantly
shifting formula, Facebook’s news feed algorithm ranks them all, in what
it believes to be the precise order of how likely you are to find each
post worthwhile. Most users will only ever see the top few hundred.

No one outside Facebook knows for sure how it does this, and no one
inside the company will tell you. And yet the results of this automated
ranking process shape the social lives and reading habits of more than 1
billion daily active users—one-fifth of the world’s adult population.
The algorithm’s viral power has turned the media industry upside down,
propelling startups like BuzzFeed and Vox to national
prominence while 100-year-old newspapers wither and die. It fueled the
stratospheric rise of billion-dollar companies like Zynga and
LivingSocial—only to suck the helium from them a year or two later with a
few adjustments to its code, leaving behind empty-pocketed investors
and laid-off workers. Facebook’s news feed algorithm can be tweaked to make us happy or sad; it can expose us to new and challenging ideas or insulate us in ideological bubbles.
And yet, for all its power, Facebook’s news feed algorithm is
surprisingly inelegant, maddeningly mercurial, and stubbornly opaque. It
remains as likely as not to serve us posts we find trivial, irritating,
misleading, or just plain boring. And Facebook knows it. Over the past
several months, the social network has been running a test in which it
shows some users the top post in their news feed alongside one other,
lower-ranked post, asking them to pick the one they’d prefer to read.
The result? The algorithm’s rankings correspond to the user’s
preferences “sometimes,” Facebook acknowledges, declining to get more
specific. When they don’t match up, the company says, that points to “an
area for improvement.”
“Sometimes” isn’t the success rate you might expect for such a
vaunted and feared bit of code. The news feed algorithm’s outsize
influence has given rise to a strand of criticism that treats it as if
it possessed a mind of its own—as if it were some runic form of
intelligence, loosed on the world to pursue ends beyond the ken of human
understanding. At a time when Facebook and other Silicon Valley giants
increasingly filter our choices and guide our decisions through machine-learning software, when tech titans like Elon Musk and scientific laureates like Stephen Hawking are warning of the existential threat posed by A.I., the word itself—algorithm—has
begun to take on an eerie affect. Algorithms, in the popular
imagination, are mysterious, powerful entities that stand for all the
ways technology and modernity both serve our every desire and threaten
the values we hold dear.
The reality of Facebook’s algorithm is somewhat less fantastical, but
no less fascinating. I had a rare chance recently to spend time with
Facebook’s news feed team at their Menlo Park, California, headquarters
and see what it actually looks like when they make one of those
infamous, market-moving “tweaks” to the algorithm—why they do it, how
they do it, and how they decide whether it worked. A glimpse into its
inner workings sheds light not only on the mechanisms of Facebook’s news
feed, but on the limitations of machine learning, the pitfalls of
data-driven decision making, and the moves Facebook is increasingly
making to collect and address feedback from individual human users,
including a growing panel of testers who are becoming Facebook’s
equivalent of the Nielsen family.
Facebook’s algorithm, I learned, isn’t flawed because of some glitch
in the system. It’s flawed because, unlike the perfectly realized,
sentient algorithms of our sci-fi fever dreams,
the intelligence behind Facebook’s software is fundamentally human.
Humans decide what data goes into it, what it can do with that data, and
what they want to come out the other end. When the algorithm errs,
humans are to blame. When it evolves, it’s because a bunch of humans
read a bunch of spreadsheets, held a bunch of meetings, ran a bunch of
tests, and decided to make it better. And if it does keep getting
better? That’ll be because another group of humans keeps telling them
about all the ways it’s falling short: us.
When I arrive at Facebook’s sprawling, Frank Gehry–designed office in
Menlo Park, I’m met by a lanky 37-year-old man whose boyish countenance
shifts quickly between an earnest smile and an expression of intense
focus. Tom Alison is director of engineering for the news feed; he’s in
charge of the humans who are in charge of the algorithm.
Alison steers me through a maze of cubicles and open minikitchens
toward a small conference room, where he promises to demystify the
Facebook algorithm’s true nature. On the way there, I realize I need to
use the bathroom and ask for directions. An involuntary grimace crosses
his face before he apologizes, smiles, and says, “I’ll walk you there.”
At first I think it’s because he doesn’t want me to get lost. But when I
emerge from the bathroom, he’s still standing right outside, and it
occurs to me that he’s not allowed to leave me unattended.
For the same reason—Facebook’s fierce protection of trade
secrets—Alison cannot tell me much about the actual code that composes
the news feed algorithm. He can, however, tell me what it does, and
why—and why it’s always changing. He starts, as engineers often do, at
the whiteboard.
“When you study computer science, one of the first algorithms you
learn is a sorting algorithm,” Alison says. He scribbles a list of
positive integers in dry erase:
4, 1, 3, 2, 5
The simple task at hand: devise an algorithm to sort these numbers
into ascending order. “Human beings know how to do this,” Alison says.
“We just kind of do it in our heads.”
Computers, however, must be told precisely how. That requires an
algorithm: a set of concrete instructions by which a given problem may
be solved. The algorithm Alison shows me is called “bubble sort,” and it
works like this:
1. For each number in the set, starting with the first one, compare it to the number that follows, and see if they’re in the desired order.
2. If not, reverse them.
3. Repeat steps 1 and 2 until you’re able to proceed through the set from start to end without reversing any numbers.
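The steps Alison describes translate almost directly into code. Here is a short Python sketch of bubble sort, applied to his whiteboard example:

```python
def bubble_sort(nums):
    """Sort a list in ascending order using the bubble sort steps above."""
    nums = list(nums)  # work on a copy
    swapped = True
    while swapped:
        swapped = False
        # Step 1: compare each number to the one that follows it.
        for i in range(len(nums) - 1):
            if nums[i] > nums[i + 1]:
                # Step 2: if they're out of order, reverse them.
                nums[i], nums[i + 1] = nums[i + 1], nums[i]
                swapped = True
        # Step 3: the while loop repeats until a full pass makes no swaps.
    return nums

print(bubble_sort([4, 1, 3, 2, 5]))  # → [1, 2, 3, 4, 5]
```

Each pass "bubbles" the largest remaining number toward the end of the list, which is where the algorithm gets its name.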
The virtue of bubble sort is
its simplicity. The downside: If your data set is large, it’s
computationally inefficient and time-consuming. Facebook, for obvious
reasons, does not use bubble sort. It does use a sorting algorithm to
order the set of all posts that could appear in your news feed when you
open the app. But that’s the trivial part—a minor subalgorithm within
the master algorithm. The nontrivial part is assigning all those posts a
numerical value in the first place. That, in short, is the job of the
news feed ranking team: to devise a system capable of assigning any
given Facebook post a “relevancy score” specific to any given Facebook
user.
That’s a hard problem, because what’s relevant to you—a post from
your childhood friend or from a celebrity you follow—might be utterly
irrelevant to me. For that, Alison explains, Facebook uses a different
kind of algorithm, called a prediction algorithm. (Facebook’s news feed
algorithm, like Google’s search algorithm or Netflix’s recommendation
algorithm, is really a sprawling complex of software made up of smaller
algorithms.)
“Let’s say I ask you to pick the winner of a future basketball game,
Bulls vs. Lakers,” Alison begins. “Bulls,” I blurt. Alison laughs, but
then he nods vigorously. My brain has taken his input and produced an
immediate verbal output, perhaps according to some impish algorithm of
its own. (The human mind’s algorithms are far more sophisticated than
anything Silicon Valley has yet devised, but they’re also heavily reliant on heuristics and notoriously prone to folly.)
Random guessing is fine when you’ve got nothing to lose, Alison says.
But let’s say there was a lot of money riding on my basketball
predictions, and I was making them millions of times a day. I’d need a
more systematic approach. “You’re probably going to start by looking at
historical data,” he says. “You’re going to look at the win-loss record
of each team, the records of the individual players, who’s injured,
who’s on a streak.” Maybe you’ll take into account environmental
factors: Who’s the home team? Is one squad playing on short rest, or
after a cross-country flight? Your prediction algorithm might
incorporate all of these factors and more. If it’s good, it will not
only predict the game’s winner, but tell you its degree of confidence in
the result.
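A prediction algorithm of the kind Alison sketches can be illustrated with a tiny logistic model. Everything here — the features, their weights, and the numbers — is invented for the basketball example, not anything Facebook uses:

```python
import math

# Hypothetical features comparing the Bulls to the Lakers (values invented).
features = {
    "win_pct_diff": 0.10,  # Bulls' win percentage minus the Lakers'
    "home_game": 1.0,      # 1.0 if the Bulls are the home team
    "rest_diff": 1.0,      # extra days of rest for the Bulls
}
# Weights encoding how much each factor matters (also invented).
weights = {"win_pct_diff": 3.0, "home_game": 0.4, "rest_diff": 0.2}

def predict_win_probability(features, weights, bias=0.0):
    """Logistic model: weighted features -> probability the Bulls win."""
    z = bias + sum(weights[f] * v for f, v in features.items())
    return 1 / (1 + math.exp(-z))  # squash into a 0-1 confidence

p = predict_win_probability(features, weights)
print(f"Bulls win probability: {p:.2f}")
```

The output is not just a pick but a probability, which is the "degree of confidence" Alison mentions.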
That’s analogous to what Facebook’s news feed algorithm does when it
tries to predict whether you’ll like a given post. I ask Alison how many
variables—”features,” in machine-learning lingo—Facebook’s algorithm
takes into account. “Hundreds,” he says.
It doesn’t just predict whether you’ll actually hit the like button
on a post based on your past behavior. It also predicts whether you’ll
click, comment, share, or hide it, or even mark it as spam. It will
predict each of these outcomes, and others, with a certain degree of
confidence, then combine them all to produce a single relevancy score
that’s specific to both you and that post. Once every possible post in
your feed has received its relevancy score, the sorting algorithm can
put them in the order that you’ll see them on the screen. The post you
see at the top of your feed, then, has been chosen over thousands of
others as the one most likely to make you laugh, cry, smile, click,
like, share, or comment.
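The combine-then-sort process might be sketched like this. The action weights and predicted probabilities are illustrative assumptions; Facebook’s real formula and feature set are secret:

```python
# Hypothetical weights for each predicted action (negative for bad outcomes).
WEIGHTS = {"like": 1.0, "click": 0.5, "comment": 2.0,
           "share": 3.0, "hide": -5.0, "spam": -10.0}

def relevancy_score(predictions):
    """Combine per-action probabilities into one score for one user + post."""
    return sum(WEIGHTS[action] * p for action, p in predictions.items())

# Invented predictions for two candidate posts in one user's feed.
posts = {
    "friend_photo": {"like": 0.40, "click": 0.30, "comment": 0.10,
                     "share": 0.05, "hide": 0.01, "spam": 0.00},
    "page_promo":   {"like": 0.05, "click": 0.20, "comment": 0.01,
                     "share": 0.01, "hide": 0.10, "spam": 0.02},
}

# The sorting algorithm then orders posts by descending relevancy score.
ranked = sorted(posts, key=lambda p: relevancy_score(posts[p]), reverse=True)
print(ranked)  # → ['friend_photo', 'page_promo']
```

Note how a high predicted chance of hiding or flagging a post can sink its score even if the user might also click it.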