September 7, 2022

The Facebook Data Breach of 2018 and 2021

It's rare that the corporate impact of a single data breach can be felt twice. Even rarer is when the original incident and the 'flare up' are three full years apart.

Yet somehow, Facebook managed to experience just such a data breach. Their oversight in 2018 came back to bite them in the behind in 2021 when the hackers decided to finally leak the data that they had acquired.

Let's dive into the Facebook Data Breach of 2018, how it became news again in 2021, and what actions could have been taken to prevent both of these incidents from becoming the center of the public's outrage.


The Original Facebook Data Breach in 2018

'Scrape' is a painful word. When associated with dentists it brings up imagery of metal tools on sensitive teeth. When associated with skin, it conjures images of wicked abrasions and road rash.

In the world of data breaches, 'scrape' is just as painful. Often it means that someone used an otherwise legitimate method to access a massive amount of data in an unexpected way. When a scrape happens, the company is stuck trying to defend its open API access policies, its lack of data request throttling, and other complex policies that most normal users won't fully understand.

Worst of all is when the decision comes down from on high to say: 'It wasn't really a data breach because…'

All of these things happened to Facebook when a hacker took advantage of its phone number API. At the time, if you typed in a user's phone number, you could get all the rest of their directory information. This feature was intended to make it easy to connect with friends. But it proved to be a massive weakness, and it came under scrutiny during the fallout from the Cambridge Analytica scandal of 2018. Facebook shut the feature down… but not in time.

This allowed the hacker in question to 'legitimately' obtain user data, one request at a time. They just so happened to perform the request hundreds of millions of times, linking the phone numbers and the details of over 533 million Facebook user accounts in the process. They then sat on this data for 18 months.
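To make the "one request at a time" problem concrete, here is a minimal sketch of per-client request throttling using a token bucket. It is an illustration only, not a description of anything Facebook ran: the class names, the burst size of 5, and the refill rate of one lookup every ten seconds are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Per-client token bucket: each lookup spends one token."""
    capacity: float          # maximum burst size (hypothetical value)
    refill_per_sec: float    # sustained lookups allowed per second (hypothetical)
    tokens: float = 0.0
    last_refill: float = 0.0

    def allow(self, now: float) -> bool:
        # Top up tokens for the time elapsed since the previous call.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class LookupThrottle:
    """Keeps one bucket per client (an account, API key, or IP address)."""

    def __init__(self, capacity: float = 5, refill_per_sec: float = 0.1):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.buckets = {}  # client_id -> TokenBucket

    def allow(self, client_id: str, now: float) -> bool:
        bucket = self.buckets.get(client_id)
        if bucket is None:
            # New clients start with a full bucket.
            bucket = TokenBucket(self.capacity, self.refill_per_sec,
                                 tokens=self.capacity, last_refill=now)
            self.buckets[client_id] = bucket
        return bucket.allow(now)


if __name__ == "__main__":
    throttle = LookupThrottle()
    # Ten back-to-back lookups from one client: only the burst gets through.
    print([throttle.allow("client-1", now=0.0) for _ in range(10)])
```

With these assumed settings, a single client tops out at 8,640 sustained lookups per day, so pulling 533 million records would take tens of thousands of client-days. Throttling alone does not stop a determined scraper with many accounts, but it raises the cost considerably.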

Then, a security researcher found much of the scraped data sitting in an open database online in 2019. All told, around 400 million correlated user records were in that database. After verifying that the data was accurate, they went to Facebook for comment.

Facebook said half of the data was from prior leaks, and the other half must have been scraped before they closed the loophole in 2018. All said, their response seemed utterly unconcerned. They caught a lot of flak for not detecting the systematic scraping of their database, for including a feature that could be so openly abused in the first place, and for having such a tepid reaction to the news.
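Detecting systematic scraping does not have to be elaborate. The sketch below flags any client that looks up an unusually large number of distinct phone numbers within a sliding time window. The class name, the one-hour window, and the 100-number threshold are hypothetical values picked for illustration, not figures from Facebook or from this incident.

```python
from collections import defaultdict, deque


class EnumerationDetector:
    """Flags clients that look up many distinct phone numbers in a time window."""

    def __init__(self, window_sec: float = 3600, max_distinct: int = 100):
        self.window_sec = window_sec      # hypothetical window length
        self.max_distinct = max_distinct  # hypothetical alert threshold
        # client_id -> deque of (timestamp, phone_number), oldest first
        self.events = defaultdict(deque)

    def record(self, client_id: str, phone_number: str, now: float) -> bool:
        """Record one lookup; return True if the client now looks like a scraper."""
        lookups = self.events[client_id]
        lookups.append((now, phone_number))
        # Drop lookups that have fallen out of the window.
        while lookups and lookups[0][0] <= now - self.window_sec:
            lookups.popleft()
        distinct = {number for _, number in lookups}
        return len(distinct) > self.max_distinct


if __name__ == "__main__":
    detector = EnumerationDetector(window_sec=60, max_distinct=3)
    # A normal user repeats a few numbers; a scraper walks through new ones.
    for i in range(5):
        flagged = detector.record("scraper", f"+1555000{i:04d}", now=float(i))
        print(i, flagged)
```

Counting distinct targets rather than raw requests is a deliberate choice here: someone who repeatedly looks up the same handful of friends stays under the threshold, while a client walking through a number range trips it quickly.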

Fast Forward To 2021

In Q2 of 2021, the other shoe finally dropped.

Whereas 'only' a couple hundred million of the scraped users had been included in the old database, a more complete version of the scraped information showed up on hacking forums. It included the details of over 533 million Facebook users, from 106 countries. Records consisted of their phone numbers (of course), Facebook IDs, full names, countries and states, birthdays, and in some cases their email addresses.

Facebook's internal statement on this release got leaked shortly after. It stated:

'Longer term, though, we expect more scraping incidents and think it's important to both frame this as a broad industry issue and normalize the fact that this activity happens regularly.'

Their intention was to present mass data scraping as unavoidable, and not their problem. There was no word on safeguarding their APIs against future abuses. There was no comment on avoiding the release of powerful tools that could be easily scraped.

And most importantly: They planned to issue no warning to users who were on the list of scraped names.

It was a three-stage denial process: Silence. Then try to get the problem to go away with a vague press statement. Then attempt to avoid all responsibility for mass scraping.

Though no medical or password information was in the leaks, that doesn't mean they're a non-issue. In this case, scammers, social engineers, and hackers now have a massive database that they can use to fool users into handing over more information. It's a treasure trove.

Facebook's refusal to even warn users is seen by some as a breach of their duty to protect their users in the context of international law. Several countries have gone after them for this incident. Their already tarnished record among industry professionals sank even lower after this particular train wreck.

The Aftermath

Over the course of the last year, Facebook has done everything possible to distance itself from its own checkered past. Rebranding as 'Meta' was the first step, implying that practices would be different when dealing with virtual reality… or perhaps just hoping that people would forget their past transgressions.

2020 was a record year for social engineering scams. The more information that gets out there, and the more time people have on their hands, the greater the risk to the Internet community. The FBI reported that the top types of fraud were phishing, non-payment scams, and extortion. All of these are facilitated by tying together phone numbers, locations, and online accounts.

To take none of the blame for mass scraping incidents, to intentionally design features that can easily be used to mass scrape without any safeguards… that's a level of irresponsibility and tone-deafness that shouldn't be tolerated.

Please keep your privacy in mind when creating and using social media accounts. You never know when your real-life information, callously leaked by a company and not reported to the affected users, will be used against you by malicious actors. Use the appropriate privacy tools, and never give out your address, photo metadata, or other identifying information on an open forum.

Will R
Hoody Editorial Team

Will is a former Silicon Valley sysadmin and award-winning non-functional tester. After 20+ years in tech, he decided to share his experience with the world as a writer. His recent work involves documenting government hacking methods while probing the current state of privacy and security on the Internet.
