540M Facebook Records Exposed

113 points | themattress | 7 years ago | techcrunch.com

22 comments

[+] testplzignore|7 years ago|reply
The "540 million records" wording seems misleading (probably intentionally so, on the part of UpGuard and/or TechCrunch). The screenshot on https://www.upguard.com/breaches/facebook-user-data-leak leads me to think that this is 540M object records of various types (posts, comments, etc.), not records of 540M distinct users, as some readers might assume.

It sounds like a lot, but it's not. You could probably scrape that much data from public Facebook pages in a few days without even being logged in, especially a few years ago. Heck, you could say right now that Reddit has billions of user records exposed if you define them that way. The Hacker News first page itself links to thousands of user records :)

[+] kerng|7 years ago|reply
Amazing how a third party can harvest that amount of data and Facebook is freely handing it out... they really have no control over the data they process and handle. It's been shown again and again.

It seems Facebook should be forced to disable any kind of data sharing with 3rd parties, since they obviously cannot make it work. They already have enough security issues in their internal data handling procedures that they need to fix before giving data to third parties.

[+] SlowRobotAhead|7 years ago|reply
>It seems Facebook should be forced to disable any kind of data sharing with 3rd parties since they obviously cannot make it work.

That is a massive part of their model, so that will never happen. The alternative of course is to stop giving them data.

[+] anonytrary|7 years ago|reply
Third parties can always just resort to web-scraping if API support is dropped. If you consistently scrape public pages on Facebook, you can amass a trove of data within a year. By supporting an API, Facebook offers a controlled avenue for this to happen, which people pay for because it's easier than scraping. It will still happen if this doesn't exist, though.
[+] anonytrary|7 years ago|reply
These things are very hard to stop. First law of the internet says that if you have a public website, it will be scraped and turned into structured data. Over the years, Facebook has been adding more options to make profiles private, etc. but there are still loopholes around these things with 3rd party "delegated" authentication.
[+] torqueTorrent|7 years ago|reply
They were moving so fast that they broke things, badly!
[+] badwolf|7 years ago|reply
Seems it was 3rd party apps data stored in ... openly accessible S3 buckets. -_-
[+] nvr219|7 years ago|reply
Isn't this the most common reason for these leaks? Does Amazon not have screaming red banners saying "this is gonna be openly accessible?"
[+] collingreene|7 years ago|reply
https://www.facebook.com/data-abuse - as mentioned in the article, this scenario (non-FB companies mishandling FB user data) is exactly the reason Facebook's data abuse bounty program exists. Hopefully the finders of this submitted it to the program.
[+] jakequist|7 years ago|reply
tl;dr - Somebody scraped massive amounts of FB data over a number of years and then abandoned it on a public S3 bucket.
[+] mindfulplay|7 years ago|reply
It's the 21st century and I think it's time to stop calling these 'records'.
[+] nvr219|7 years ago|reply
540M Facebook compact discs exposed
[+] torqueTorrent|7 years ago|reply
Alas, the 21st century provides the opportunity to address the growing scourge of using sounds or combinations of letters that communicate meaning without being divisible into smaller units capable of independent use.