INMA: Facebook’s legendary algorithm demystified
On June 29, Facebook announced an update to the algorithm that ranks stories in the News Feed, the network’s main product. The change would prioritise posts from users’ family and friends over posts by publishers and brands. Facebook warned that the update might cause reach and referral traffic to decline for some Pages.
A month later, data collected by Parse.ly, an analytics company, suggest that referral traffic from Facebook to 600 digital publishers has actually grown.
This summer (in part of the world, at least) proved to be a hot one for algorithms. Facebook was not alone: Google also updated its search algorithm in June, according to SimilarWeb, another analytics company.
The result turned out to be rather kind to news publishers, as their organic search traffic grew in June by 14% versus May across the board. Some publications, like the New York Post, enjoyed a jump of up to 155%.
All of these events, however fortunate, echo an old Polish proverb: “Master’s grace rides a motley horse.” It shares the wisdom of a medieval peasant whose fortune depended on his feudal lord’s whim: “Eat the lunch while it’s free.”
Content overload and the death of organic reach
Digital publishers’ own experience and data from various sources confirm that the organic, or free, reach of content published by professional publishers and brands on the main social networks has been steadily decreasing in the past year, while the number of posts has surged.
TrackMaven, a social analytics company, tracked activities of 22,957 brands across major B2B and B2C industries on Facebook, Twitter, Instagram, Pinterest, LinkedIn, and blogs in 2015. It found that the output of content increased on average by 35%, while content engagement decreased by 17%.
“Content overload is quantifiable,” noted analysts of TrackMaven. “When content output spiked in October 2015, engagement levels took the sharpest downturn. There is a ceiling to how much content can be consumed, liked, and shared. Brands and social networks alike are competing now more than ever for their share of engagement.”
Facebook, as the world’s largest network with 1.7 billion monthly active users, both enjoys and struggles most with that flood of content.
Every time someone visits the News Feed, there are, on average, 1,500 potential stories from friends, people, or Pages they follow, and most people don’t have time to see them all.
As news publishers, do you think we compete for attention with other news outlets? Is The New York Times competing mainly with The Washington Post? Is CNN competing with BBC?
No, we all compete with all these individual users sharing private stories and pictures and with more than 60 million businesses that set up Pages on Facebook and put their faith and money in content marketing.
“That’s why stories in News Feed are ranked,” explained Adam Mosseri, vice president of product management for News Feed, on Facebook’s official blog. “So people can see what they care about first and don’t miss important stuff from their friends. If the ranking is off, people don’t engage and leave dissatisfied. So one of our most important jobs is getting this ranking right.”
How the News Feed algorithm works
The algorithm’s details are secret, but we know a lot. And I don’t mean the legends or conspiracy theories found on many of the 97.8 million Web pages that Google returns when asked for the “Facebook algorithm.”
For an upcoming INMA report on Facebook, I reviewed more than a dozen patents filed by Mark Zuckerberg and other Facebook employees since 2006.
What did I learn?
- In general, the News Feed’s algorithm ranks individual news items rather than publishers. But, when selecting stories, Facebook takes into account something it calls “a reputation metric of a publishing entity.” I haven’t found an explanation of this metric. (Source: “Selectively providing content on a social networking system,” 2015)
- News items (posts, links, pictures, videos, or polls) are scored and ranked for each and every user individually, as well as for sets of similar users (grouped according to demographic characteristics like age, ethnicity, income, and language). This means the same piece of content can be scored differently for different users. There seems not to be anything like a universal metric for the quality of a story. (Source: “Adaptive ranking of news feed in social networking systems,” 2013)
- The algorithm determines the relevance of stories based upon:
  - Attributes of an individual user of the News Feed, like her interests or location.
  - Attributes of each story item, like topic, type of content (post, photo, video, Instant Articles, etc.), engagement (clicks, likes and other reactions, shares, comments), or predicted virality.
  - Affinity of the user with the content’s creator, who might be another user or a Page publisher. Is it a direct connection, or a friend of a friend? Do they interact often? Have they ever been tagged together in pictures or checked in to the same place?
  Patents don’t describe the exact weights that Facebook assigns to these attributes, but the company informs the public about changes on its official blog in a category called “News Feed FYI.”
- The algorithm calculates the probability of the user performing different types of interactions with the story and assigns a value score, or a number, to each one. It is Facebook that determines which interactions are more valuable — for example, whether “comment” has a higher value than “share” — or which content generates certain interactions. Patents don’t contain detailed information on the weight of different interactions. (Source: “Arranging stories in News Feed based on expected value scoring on a social network system,” 2014)
- Diversity of sources, topics, storytelling types, and interactions with content is important in the final selection of items to be displayed in the News Feed. This means the algorithm was designed to avoid showing multiple items from the same source, on the same topic, with the same content, or even of the same type in one feed. One of the changes that Facebook announced in June was to limit this plurality and display more stories from the same source. (Source: “Diversity enforcement on a social networking system news feed,” 2014)
- Before anything is displayed to the user, Facebook checks whether she has already seen any of the stories, and then it displays the feed in the following order: the most engaging and the most recent items published since the user’s previous visit go first, other recent stories she has not seen go second, and then the rest of the items (including those already seen) go below in reverse chronological order. It is possible, though, that an older story will be displayed above a newer one if the former has drawn many interactions from the user’s connections.
- Patents confirm that the network prioritises stories from the user’s personal life, such as an engagement, the birth of a child, moving across the country, graduating from college, or starting a new job. Facebook discovers and boosts these stories when the user updates her profile or when the network infers these events from an analysis of the content of her posts and changes in her connections.
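The expected-value scoring and the diversity rule described above can be illustrated with a small sketch. It is not Facebook’s implementation: the patents don’t disclose the weights, so the interaction values, the `Story` fields, and the one-item-per-source cap below are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass

# Hypothetical interaction values: the patents do not disclose real weights.
INTERACTION_VALUES = {"click": 1.0, "like": 2.0, "comment": 4.0, "share": 6.0}


@dataclass
class Story:
    story_id: str
    source: str
    # Predicted probability that one particular user performs each interaction.
    probabilities: dict


def expected_value(story):
    """Sum of P(interaction) * value(interaction) over all interaction types."""
    return sum(
        prob * INTERACTION_VALUES.get(kind, 0.0)
        for kind, prob in story.probabilities.items()
    )


def rank_feed(stories, max_per_source=1, limit=10):
    """Rank by expected value, keeping at most max_per_source items per source."""
    ranked = sorted(stories, key=expected_value, reverse=True)
    shown, per_source = [], {}
    for story in ranked:
        count = per_source.get(story.source, 0)
        if count >= max_per_source:
            continue  # diversity rule: skip sources that are already represented
        per_source[story.source] = count + 1
        shown.append(story)
        if len(shown) == limit:
            break
    return shown


# Made-up candidate stories for a single user.
candidates = [
    Story("friend-wedding", "friend:anna", {"like": 0.6, "comment": 0.3}),
    Story("news-budget", "page:daily", {"click": 0.5, "share": 0.05}),
    Story("news-sports", "page:daily", {"click": 0.4, "like": 0.1}),
    Story("brand-promo", "page:brand", {"click": 0.1}),
]

feed = rank_feed(candidates)
print([story.story_id for story in feed])
```

In this toy feed the second story from the same Page is dropped even though it outscores the brand post, which is the trade-off a diversity rule makes. Raising `max_per_source` mimics the June change that allows more stories from the same source.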
Summary in a video format
Here is a simplified description of the algorithm presented by Mosseri at the Facebook Developer Conference F8 in San Francisco in April 2016.
Big Data in Facebook
To display the most relevant stories to each user, Facebook collects vast amounts of data about users and their behaviour. Patents reveal the extent and scope of information collected and the way it is used to deliver better user experiences.
- When creating a profile of a user, Facebook takes into account whatever information she declared about herself, like age, gender, marital status, education, or employment history. However, it also tries to fill any gaps in her profile by analysing her posts, interactions, connections, GPS data from her phone, or even the IP addresses of devices used to log in to Facebook (to determine her residency). According to the patents, algorithms can infer age, gender, language, interests and preferences, education level and the school, employment status and the company, and affiliation to social organisations. (Source: “Inferring user profile attributes from social information,” 2014)
- Facebook records all the user’s actions in the network such as communications with other users, sharing photos, interactions with applications like social games, responding to polls, adding an interest, or joining an employee network. The network also collects time stamps of when these interactions occurred and duration of the activity.
- Facebook enhances its own data with external data that the network captures when its user accesses other Web sites, clicks on links, etc. All of this information is used to build the user’s profile and predict her future behaviour.
- Facebook uses many machine learning models to predict what news items the user may wish to see or interact with even when it lacks any direct input from her. As I have already written, Facebook divides users into different sets, for example, based on age, ethnicity, income, and language. Then it tracks their interactions with content and generates predictive models for each set of users. Models are periodically refined based on users’ actual actions. These models help Facebook to rank, among other things, items posted by publishers and brands.
- Technically, when a publisher or a brand boosts its post, it sponsors a bump in one or more of the indicators used by Facebook to score items and select them to be displayed in the News Feed. (Source: “Sponsored interfaces in a social networking system,” 2013)
With 1.7 billion users, Facebook is the most popular newspaper in the world. Instead of human curators, algorithms designed by a small team of engineers make most of the editorial decisions. No wonder the way the algorithm works fascinates people all over the world.
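As a rough illustration of the last two points, here is a minimal sketch of a model kept per set of users and of a paid boost applied to its output. Everything in it is an assumption made for the example: the `CohortModel` class, the cohort labels, the smoothing prior, and the idea that a boost is a simple multiplier. The patents only say that models are built per set of users, refined from actual actions, and that a boost bumps one or more scoring indicators.

```python
from collections import defaultdict


class CohortModel:
    """Tracks how often users in a cohort interact with each content type."""

    def __init__(self):
        self.shown = defaultdict(int)
        self.interacted = defaultdict(int)

    def record(self, cohort, content_type, did_interact):
        """Refine the model with one observed impression."""
        key = (cohort, content_type)
        self.shown[key] += 1
        if did_interact:
            self.interacted[key] += 1

    def predict(self, cohort, content_type, prior=0.1, prior_weight=2):
        """Smoothed interaction rate; equals the prior when there is no data."""
        key = (cohort, content_type)
        return (self.interacted[key] + prior * prior_weight) / (
            self.shown[key] + prior_weight
        )


def score(model, cohort, content_type, boost=1.0):
    """Predicted interaction rate times a hypothetical paid boost multiplier."""
    return model.predict(cohort, content_type) * boost


model = CohortModel()
# Made-up observations for one hypothetical cohort.
for did_interact in [True, True, False, True]:
    model.record("18-24/en", "video", did_interact)
for did_interact in [False, False, False, True]:
    model.record("18-24/en", "link", did_interact)

print(score(model, "18-24/en", "video"))            # organic video post
print(score(model, "18-24/en", "link"))             # organic link post
print(score(model, "18-24/en", "link", boost=3.0))  # the same link, boosted
print(score(model, "65+/pl", "video"))              # cohort with no data yet
```

With these invented numbers, the boosted link outscores the organic video for this cohort, and a cohort without any recorded actions falls back to the prior until real interactions arrive.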
Here is a snapshot of how the interest in the News Feed has grown, as measured by the increase in people searching for it on Google.
Since 2013, Facebook has announced a change in its algorithm 23 times. We may expect further changes. Mr. Mosseri of Facebook said, “We view our work as only 1% finished.”
I love this quote by Rusty Coats, executive director of the U.S. Local Media Consortium: “Worrying about algorithms is like worrying about Florida weather in the summer. Give it an hour, and it’ll change.”
Note to readers: Want to learn more about Facebook? I am writing a new strategy report for INMA focused on opportunities and risks of doing business on Facebook. Do you have a question? Is there any insight or experience you’d like to share with your peers? Contact me at firstname.lastname@example.org or @g_piechota.