From digging through developers.facebook.com, it appears that Facebook's link scraper creates Objects in their Social Graph. Because these resources are accessible by URL, they are considered self-hosted objects, and they're always public [1].
Their Link Sharing FAQ mentions that Messenger is one of the ways link previews are created, further confirming this behavior [2].
There are lots of private endpoints that are secured by a token that you obtain by being logged in, for instance GitHub's raw view of a file. A link like this, shared in a private Messenger chat, should not become public for the world to see.
I agree that Facebook should try to prevent private information from leaking out, even if they are only indirectly involved.
Although the crux of the issue remains, the document example is a bad one. If anyone is able to open the document using just the link and no authentication, the document is effectively public.
This sort of thing also hints at why FB abandoned XMPP. FB wants to do what they please with your conversational data, which obviously runs counter to the culture of encryption that XMPP is embracing.
niftich | 9 years ago
Really, the problem is that several circumstances coincide to make the result surprising:
- It's not so bad that posting URLs to Facebook generates a link preview and saves that as a public resource in their Object Graph.
- It's not so bad that you can find the Object's object-instance-id by the URL.
- It's not so bad that Facebook correlates a bunch of information about that Object's relationship with other nodes in their graph.
- For data they believe is all-public, it's not so bad that object-instance-ids are not cryptographically secure and are trivially crawlable.
But when taken together, it -is- surprising that URLs shared through Messenger (a setting that most users would assume to be "private") are trivially crawlable.
[1] https://developers.facebook.com/docs/sharing/opengraph/objec...
[2] https://developers.facebook.com/docs/sharing/webmasters/faq
Kaotique | 9 years ago
One would naturally assume private conversations are 100% private and not scrapable by some third party.
starquake | 9 years ago
But I also think sites should never put personally identifiable information in the URL. There are many more sites that cause issues when these kinds of URLs are shared. To name a few: bit.ly, Twitter, and comments on Hacker News.
ikeboy | 9 years ago
Keeping information in URLs is fine as long as the URL has enough entropy that it can't be brute-forced. Bit.ly and t.co links don't, and so aren't secure.
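To put rough numbers on the entropy point, here is a minimal Python sketch. The 7-character length and the 62-character alphabet are assumptions picked for illustration, not a statement about how bit.ly or t.co actually generate their codes, and the function names are made up for this example.

```python
import math
import secrets

ALPHABET_SIZE = 62  # assumption: codes drawn uniformly from a-z, A-Z, 0-9


def keyspace_bits(length, alphabet_size=ALPHABET_SIZE):
    """Bits of entropy in a uniformly random code of the given length."""
    return length * math.log2(alphabet_size)


def expected_guesses(bits):
    """Average number of guesses needed to hit one specific code."""
    return 2 ** (bits - 1)


def new_unguessable_token(num_bytes=16):
    """URL-safe token carrying num_bytes * 8 bits from the OS CSPRNG."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    short_bits = keyspace_bits(7)  # hypothetical 7-character short code
    print(f"short code: about {short_bits:.1f} bits")
    print(f"128-bit token: {expected_guesses(128):.3e} guesses on average")
    print("example token:", new_unguessable_token())
```

Under those assumptions a short code carries a bit under 42 bits, which a crawler can realistically enumerate, while a 128-bit token from `secrets.token_urlsafe(16)` cannot be guessed in practice. None of that helps, of course, once the URL itself has been handed to a third party that republishes it.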
dingo_bat | 9 years ago
thefastlane | 9 years ago
rejectedstone | 9 years ago