I'm the (co)author of the project. Please note that it's just a one-day proof of concept; imagine what a well-motivated corporation could do.
It's nothing new (someone correctly pointed to the EFF project), but I wanted to make a real-world demo out of it.
The demo doesn't store each bit of info separately; it simply creates a hash out of them. If I stored the data separately I could, for example, identify small user agent updates, screen resolution changes, newly installed plugins and so on.
Also, much of the info can be gathered without JS, and browsing with NoJS actually puts you in a very restricted niche, making you even more trackable ;)
The demo is far from perfect, but I believe that even 90% reliability is alarming. Anyway, I'll put all the source code on GitHub for you to review. I hope to be able to add the NoScript code as well soon.
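To make the "one hash vs. separate attributes" point concrete, here is a minimal Python sketch. The attribute names, the sample values, and the use of SHA-1 are assumptions for illustration (the 40-character hex fingerprints quoted in this thread are consistent with SHA-1, but that isn't confirmed); this is not the demo's actual code.

```python
import hashlib
import json


def combined_fingerprint(attrs: dict) -> str:
    """Hash all attributes together into one opaque value."""
    # Sort keys so the same attributes always serialize identically.
    canonical = json.dumps(attrs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def changed_attributes(old: dict, new: dict) -> list:
    """Name the attributes that differ; needs the attributes stored separately."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


# Hypothetical sample values, not taken from the demo.
before = {
    "screenSize": "1920x1080",
    "timezone": -60,
    "plugins": ["Flash", "QuickTime"],
    "fonts": ["Arial", "Helvetica", "Verdana"],
}
after = dict(before, screenSize="1280x800")  # e.g. a different screen

# A single changed attribute produces a completely different hash...
print(combined_fingerprint(before) == combined_fingerprint(after))  # False
# ...while separately stored attributes show exactly what changed.
print(changed_attributes(before, after))  # ['screenSize']
```

With only the combined hash, a returning visitor whose screen size changed looks like a stranger. With the attributes kept apart, a tracker could still say "same browser, new resolution".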
You can view source to see what they're using to generate the fingerprint: screenSize, devicePixelRatio, timezone, mimeTypes, plugins, httpAcceptHeaders, fonts. It's interesting that these are enough to generate a moderately unique fingerprint. I'm sure my fonts list is unique, so that's probably enough to ID me right there. However, not every computer I use has the same fonts installed, nor the same screen dimensions. This won't track me as I go from my desktop to mobile device.
Professional authentication managers such as RSA Adaptive Authentication can gather 40 or 50 data points from which to tell if a user is somewhat the same or not. They apply a ratio to the value generated by which a user can be redirected to a challenge question. It's not foolproof but it prevents a lot of automated phishing or botnet scams from being able to automatically log in with your credentials.
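A rough Python sketch of that kind of scoring. The data points, the 0.8 threshold, and the function names are all hypothetical; this is not how RSA Adaptive Authentication is implemented, only an illustration of comparing stored data points and falling back to a challenge question when too few of them match.

```python
def match_ratio(stored: dict, observed: dict) -> float:
    """Fraction of known data points that are unchanged on this visit."""
    if not stored:
        return 0.0
    same = sum(1 for key, value in stored.items() if observed.get(key) == value)
    return same / len(stored)


def needs_challenge(stored: dict, observed: dict, threshold: float = 0.8) -> bool:
    """Ask a challenge question when too few data points match."""
    return match_ratio(stored, observed) < threshold


# Hypothetical profile with five data points.
stored = {
    "timezone": -300,
    "screen": "1440x900",
    "language": "en-US",
    "platform": "MacIntel",
    "flash": "10.3",
}
same_user = dict(stored, flash="11.0")  # one plugin update
stranger = dict(stored, timezone=120, screen="1024x768", language="ru-RU")

print(match_ratio(stored, same_user))      # 0.8
print(needs_challenge(stored, same_user))  # False
print(needs_challenge(stored, stranger))   # True
```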
How unique the fingerprint really is seems like an important issue. I wonder how much traffic or time it takes before collisions start to occur. In the current setting it would probably suffice if the fingerprint took one of just 100 or 500 values: the traffic is probably rather low, you visit the page for maybe a few minutes, and you probably won't go back to it in 3 days or even in 1 hour just to check whether some other guy has overwritten your secret word.
Regardless, it's a very interesting idea, and it also shows how difficult and counter-intuitive security can be if you do not study such issues. As an API designer I would surely have a hard time foreseeing that exposing the screen size or fonts list could turn out to be a security issue for users.
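The collision question can be put into rough numbers. Below is a small Python sketch under the (unrealistic) assumption that fingerprints are spread uniformly over the possible values; real fingerprints cluster, so real collisions would come sooner. It separates two cases: any two visitors sharing a fingerprint (the birthday approximation), and some other visitor landing on your particular fingerprint and overwriting your word.

```python
import math


def any_collision_probability(visitors: int, values: int) -> float:
    """Birthday approximation: at least two visitors share a fingerprint."""
    return 1.0 - math.exp(-visitors * (visitors - 1) / (2.0 * values))


def overwrite_probability(others: int, values: int) -> float:
    """Chance that at least one of `others` visitors hits your exact fingerprint."""
    return 1.0 - (1.0 - 1.0 / values) ** others


def visitors_for_even_odds(values: int) -> int:
    """Smallest visitor count where some collision is more likely than not."""
    k = 1
    while any_collision_probability(k, values) < 0.5:
        k += 1
    return k


print(visitors_for_even_odds(100))  # 13
print(visitors_for_even_odds(500))  # 27
# Ten other visitors while you test, 500 possible values: roughly 0.02
print(overwrite_probability(10, 500))
```

So with 500 values some pair of visitors collides quickly, but the chance that anyone clobbers your specific word during a short test stays small, which matches the intuition above.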
That said, they can probably eventually start correlating the different fingerprints using other data, like device id, location patterns, etc. It would not be impossible to build a dossier of all your browsers and devices, especially if you ever log in to any online service from multiple machines.
Simply resizing my browser window before pasting the second URL seems to thwart this (but I don't have Flash installed). Without Flash, it falls firmly into the "kinda-works sometimes if everything goes perfect" camp.
Resizing my browser didn't do it, but when I moved it to another screen the fingerprint changed from "1ccf9e9301db4fb87b1d178d77edad5bfa598057" to "ab0e6beb449408b28473dd66a6f4501528087c0e". I don't think this method is perfect at all, or that it should be used for anything that needs to be reliable (like logins).
Not necessarily; they fingerprint you using the info the browser gives them. This site, for example, uses JS to do the fingerprinting, but it would be just as easy (though perhaps less scalable) to do the fingerprinting server-side.
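As a sketch of what server-side fingerprinting could look like, the Python snippet below hashes a handful of request headers that browsers send without any JavaScript running. The choice of headers, the SHA-1 hash, and the sample header values are assumptions for illustration, not what this site does.

```python
import hashlib

# Headers most browsers send on every request; this selection is an assumption.
FINGERPRINT_HEADERS = ("User-Agent", "Accept", "Accept-Language", "Accept-Encoding")


def header_fingerprint(headers: dict) -> str:
    """Hash a fixed set of request headers; no client-side script involved."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    parts = [
        f"{name.lower()}={lowered.get(name.lower(), '')}"
        for name in FINGERPRINT_HEADERS
    ]
    return hashlib.sha1("\n".join(parts).encode("utf-8")).hexdigest()


# Hypothetical request headers.
request_headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.8",
    "Accept-Encoding": "gzip,deflate",
}
print(header_fingerprint(request_headers))
```

Headers alone carry much less information than fonts or plugins, so a fingerprint like this would collide far more often than the JS-based one.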
It seems like the biggest information leak is installed fonts. If I install one extra font above the default for my OS, I am leaking a good deal of information:
https://panopticlick.eff.org/
Would it not make sense for my OS to sandbox which fonts can be accessed by my browser? If a webpage wants to use a special font-family, I could be prompted to allow/block access to my greater font library.
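For a sense of scale, Panopticlick reports each attribute as "bits of identifying information", i.e. the negative base-2 logarithm of the share of browsers with the same value. A short Python sketch with made-up shares (the 1-in-1024 and 1-in-8 figures are hypothetical, and adding bits assumes the attributes are independent, which they often are not):

```python
import math


def surprisal_bits(share: float) -> float:
    """Bits of identifying information for a value seen in `share` of browsers."""
    return -math.log2(share)


# Hypothetical shares, for illustration only.
fonts_bits = surprisal_bits(1 / 1024)  # a font list shared by 1 in 1024 browsers
timezone_bits = surprisal_bits(1 / 8)  # a timezone shared by 1 in 8 browsers

print(fonts_bits)                  # 10.0
print(timezone_bits)               # 3.0
# If independent, the bits add up: 13 bits means about 1 in 8192 browsers.
print(2 ** (fonts_bits + timezone_bits))  # 8192.0
```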
Does anyone have any idea how well this works in corporate environments where the typical workstation is a clone of all the others behind the same (NAT'ed) address?
Underpants could be a synonym for underwear, in which case it could make sense: you'd like to keep your fingerprint private on the web, and your underwear private in real life.
You could accomplish the same thing using local storage in an iframe with postmessage and it'd be a lot more robust with fairly significant browser support. (IE8+)
I built a demo a year ago that let you store personal data and exposed a postmessage API for storing and sharing permissions and personal data with sites as kind of the beginnings of a poor man's client-side only Oauth.
After typing "meow" and hitting enter, the copy-pasted URL had this to say about me: "It seems you didn't save the word. Go to lab.cubiq.org/underpants first."
The unique fingerprint is also different. "93615388f7f54cd79d2f806ac3795c182217aa9b" somehow became "f37ec3fdd05c27c13cbb7fcdef95cc004297f62d" after copy-pasting.
Other than that technical glitch for me (Linux, Chrome latest unstable version), I still think this is actually a pretty good idea. But will websites use it, now that the ones we actually want to worry about are injected into every website via Tweet and Like and + and whatever buttons?
Google in particular is everywhere with their gAnalytics tracking code.
edit: now that I think about it, I may have misunderstood the point. Was it a proof of concept of providing cross-site tracking without tying to a personal identity?
If not, insecurities in cross-site whatever hardly matter when I am logged into every little tidbit that is loaded via iframe and appears on almost every website. Even porn sites have like buttons these days.
You have to remember that it's not just about the physical tracking, it's also about time. If the page takes seconds to run it's not feasible on billions of requests because the tracking software sits in between the page getting paid and the target destination. If your tracking script takes seconds (this technique only works after the page is loaded), or provides a white screen jump period, it isn't something that will be desired. Unless of course all other options are removed.
Getting all pages on the web to remove the old image based cookie tracking in support of JS etc... also will never happen unless extreme circumstances occur. Most of the people running ad sites have no idea what you are talking about anyway in realms outside of Cookie and Tracking.
> This technique can be used to find out some of the softwares installed on your system. For example I can say that you probably don't have Adobe Creative Suite installed.
Sorry, I have CS3 installed and fully licensed. But still a nice demo.
This fingerprinting can be defeated by NoScript. Or by turning off JavaScript. A sensible idea would be to make an extension that purges the unique elements from the set they're tracking (i.e. fonts, plugins, mimeTypes, screen size and pixel ratio, etc.) and provide a white-list for sites you want to have that information.
Fingerprinting can be very useful to you as a user, as well. Imagine that you switch between devices and work locations all the time and you use a core suite of web applications. Fingerprinting could be used, with your permission, to uniquely identify you across all your devices, locations, and browsers for the suite of services that you depend upon. Not having to ever enter a password again while remaining secure sounds pretty nice to me.
It is not the technology, but the evil application of said technology that is evil.
Today I learned that surfing with different firefox profiles out of privacy concerns is only useful if you disable Flash. Or any other plugin, for that matter.
In 2008 I worked at a major credit card company and they were building the exact same thing, only with more like 75 attributes. Of course it was all through a 3rd party so they wouldn't have any PII, but it was their design. They'd do this to build super-cookies and then track prospects across multiple products. It was awful.
It depends on what they are looking for. For example, a company could gather information about your browsing habits using advertising. If a company like Google distributes ads across a wide variety of sites, they can use your fingerprint to gather lots of information about your browsing habits.
This is scary not because it enables them to provide more relevant ads, but because it enables them to sell personal information to organizations like corporations and the government. Imagine your boss being able to buy a package that tells him what type of porn you like, how often you view porn, your most visited subreddits, etc.
And often from this information, these companies (based on aggregate data) can make more sweeping generalizations (that are often incorrect, but also often right on the mark) like income bracket, ethnicity, drug use habits, sexual orientation, etc. These approximations can also be bought.
Imagine that your 'package' has some information in it deducing that you regularly use cocaine. Even if this is not true and the person observing this information knows it might not be true, the fact that it has been stated might be enough to lose you some important opportunity.
I don't know how common it is for someone to buy this information, but I know that the information is already out there and the potential for things like this is very large.
Can anyone explain what use-case this technique enables that is not served by cookie tracking?
Or is the point that disabling cookies is not sufficient to avoid being tracked? Everyone has cookies enabled (or many sites don't work), so if that's all it is, nbd..
[+] [-] cubiq|14 years ago|reply
[+] [-] TomGullen|14 years ago|reply
[+] [-] robterrell|14 years ago|reply
[+] [-] peterwwillis|14 years ago|reply
[+] [-] nodata|14 years ago|reply
[+] [-] stiff|14 years ago|reply
[+] [-] masklinn|14 years ago|reply
May not even track from one browser to the next. Camino and Safari generate different fingerprints on my machine.
[+] [-] jamesaguilar|14 years ago|reply
[+] [-] noonespecial|14 years ago|reply
So it's another demonstration of Flash being ridiculously insecure. These guys did it better, even defeating Tor to reveal the origin IP. http://dl.packetstormsecurity.net/0610-advisories/Practical_...
[+] [-] Keverw|14 years ago|reply
[+] [-] pwenzel|14 years ago|reply
[+] [-] toomanysecrets|14 years ago|reply
[+] [-] justincormack|14 years ago|reply
[+] [-] blahedo|14 years ago|reply
[+] [-] neilk|14 years ago|reply
[+] [-] SoftwareMaven|14 years ago|reply
[+] [-] pwg|14 years ago|reply
[+] [-] brudgers|14 years ago|reply
[+] [-] marco_polo|14 years ago|reply
TL;DR serverside code can fingerprint you
[+] [-] Sivart13|14 years ago|reply
[+] [-] wunderland|14 years ago|reply
[+] [-] david_a_r_kemp|14 years ago|reply
[+] [-] king_magic|14 years ago|reply
[+] [-] wtvanhest|14 years ago|reply
2nd step
3rd step profit
[+] [-] Siverv|14 years ago|reply
[+] [-] 5h|14 years ago|reply
[+] [-] asto|14 years ago|reply
[+] [-] eldude|14 years ago|reply
[+] [-] Swizec|14 years ago|reply
[+] [-] shpoonj|14 years ago|reply
[+] [-] methodin|14 years ago|reply
[+] [-] polynomial|14 years ago|reply
[+] [-] pbhjpbhj|14 years ago|reply
I'm guessing you're on Mac OSX? Can you have fonts available that aren't apparent to the browser somehow?
[+] [-] cubiq|14 years ago|reply
[+] [-] DanielBMarkham|14 years ago|reply
[+] [-] robterrell|14 years ago|reply
[+] [-] jackcviers|14 years ago|reply
[+] [-] finalcut|14 years ago|reply
When I went I saved the word "what" against the fingerprint of 0e24f67890fb99dfd6fa147adc5634224e6cf509
Then, I opened a new tab, copied and pasted the url to ghosttouch and was given this fingerprint: 0373164e6053f6d4d1e0cea156be83e5a45e13d4
So, out of curiosity I copied and pasted the url in the same tab where I had "saved" my word:
0373164e6053f6d4d1e0cea156be83e5a45e13d4
It generated the same code.
So then I went back and re-saved my word, went to the ghosttouch site again and this time it loaded my word.
Something wonky in there but I'm not sure what.
[+] [-] mverwijs|14 years ago|reply
[+] [-] mrgreenfur|14 years ago|reply
[+] [-] reilly3000|14 years ago|reply
[+] [-] PaperclipTaken|14 years ago|reply
[+] [-] robfig|14 years ago|reply
[+] [-] slavak|14 years ago|reply
Just to emphasize that this is meant to demonstrate privacy risks, not to be taken as a feature suggestion...