Another interesting technique to fingerprint users online is called GPU Fingerprinting [1] (2022).
Codenamed 'DrawnApart', the technique relies on WebGL to count the number and speed of the execution units in the GPU, measure the time needed to complete vertex renders, handle stall functions, and more.
Browsers should come with a default software renderer and behave like the mic and camera, where the site has to ask user permission before the hardware GPU render path is released.
I feel like these days (especially given the recent focus on side channel attacks) it is basically a given that adding uniform noise to something that leaks data does not work, because you can always take more samples and remove the noise. Why did Safari add this? I understand that needing more samples is definitely an annoyance to fingerprinting efforts, but as this post shows it's basically always surmountable in some form or the other.
A lot of Apple's "privacy" features nowadays are marketing. It's privacy theater. What matters is whether they can tell a plausible story to the public, not whether it is technically effective.
The essence seems to be that the web audio API has a lot of algorithms that do a lot of math, and every browser has a slightly different implementation, and the exact results depend on the operating system and cpu too. So if you use the web audio API to generate a small signal all browsers will generate something that's really close, but the tiny differences can be used to help tell them apart.
i think it comes from similar tricks that are played with webgl where there is a lot of entropy that comes from pc videocard drivers and the hardware itself.
it's a shame that browser people have to add noise to audio buffer handling to try and thwart it.
TL;DR: different codepaths even within the same codebase (e.g. SIMD variants) can result in subtly different floating point results (iiuc, likely related to the fact that floating point math is unexpectedly sensitive to order of operations etc.)
Probably implementation details and compiler optimizations; float addition is not associative, for example. Implementing the same algorithm with the same formulas correctly can still lead to slightly different results.
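Concretely: floating-point addition is not associative, so grouping (and therefore summation order, which is exactly what scalar vs. SIMD codepaths change) affects the rounded result. A quick Node illustration:

```javascript
// Floating-point addition is commutative but not associative:
// regrouping "the same math" changes the rounding.
const grouped1 = (0.1 + 0.2) + 0.3; // 0.6000000000000001
const grouped2 = 0.1 + (0.2 + 0.3); // 0.6
console.log(grouped1 === grouped2); // false

// Summation order matters too, which is why codepaths that accumulate
// in different orders can disagree in the last bits (or worse):
const xs = [1, 1e100, -1e100];
const leftToRight = xs.reduce((acc, x) => acc + x, 0);      // (1 + 1e100) - 1e100 -> 0
const rightToLeft = xs.reduceRight((acc, x) => acc + x, 0); // (-1e100 + 1e100) + 1 -> 1
console.log(leftToRight, rightToLeft); // 0 1
```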
Someone definitely correct me if I'm wrong, but the success of the fingerprinting workarounds here seems to boil down to the following choice wrt handling oscillator anti-aliasing in the Web Audio API spec:
"There are several practical approaches that an implementation may take to avoid this aliasing. Regardless of approach, the idealized discrete-time digital audio signal is well defined mathematically. The trade-off for the implementation is a matter of implementation cost (in terms of CPU usage) versus fidelity to achieving this ideal.
It is expected that an implementation will take some care in achieving this ideal, but it is reasonable to consider lower-quality, less-costly approaches on lower-end hardware."
AFAICT this means that the OscillatorNode output they are exploiting here is almost guaranteed to not be deterministic across browsers (or even in the same browser on different hardware). The non-determinism is based on whatever anti-aliasing method is chosen by the browser (or, possibly, multiple paths within the same browser which could get chosen based on the underlying hardware). This includes changes/fixes to the same anti-aliasing algos.
I don't really understand this choice of relegating anti-aliasing to the browser given that:
a) any high-quality audio app/library will want full control over how the signals they generate avoid aliasing and will not use these stock oscillators anyway, or
b) the kinds of web applications that would accept arbitrary anti-aliasing algos (and the consequent browser-dependent discrepancies therein) probably wouldn't care whether the aliasing algo is hardcoded SIMD instructions or some 20MB javascript web audio helper framework
Edit 3: I wonder if the same kind of solution could be used here as was used by Hixie to standardize the HTML5 parser. Namely, just have some domain expert specify an exact, deterministic algo for anti-aliasing that works well enough, then have all the browsers use that going forward. I'd bet the only measurable perf hit would be to tutorials that show how to use the web audio api to generate signals from the stock anti-aliased oscillators. :)
I wonder why audio APIs are even available without giving a website permission. It feels like this could easily be fixed with a simple "This site would like to use your sound devices" dialog.
It raises the question of whether the current networking stack is the one we want to have for the next 100 years. The internet in its current form has ruined a lot of the dream of personal computing because companies (and the state) are so asymmetrically powerful versus individuals. Should it be possible for my technology to send data to a server without my explicit approval?
I assumed a level of irony here, from fingerprint.com. It’s like if a website popped up popularising loopholes to get around tax burdens as an attempt to disgust the world into closing those loopholes.
Even if that’s wishful thinking, there’s still immense virtue in publishing this research and getting it out in the open. If an article gets published explaining how a particular brand of green backpack helps with shoplifting do we worry that everyone’s going to shoplift more? I’d err more on the side of knowing shops are more likely to catch on to the tactic.
It seems like rather than adding a random amount to each sample (which lets them compute a mean by recreating the same audio and extracting out the differences), Safari could instead add randomness that is based on a key that rotates every hour. (Function of audio sample and key, so the noise would be the same in a given session, but useless for tracking an hour later).
If you averaged together ten such samples, you'd get something that approaches the true values from the device. The more samples you have, the closer it would get.
Fixing this would require removing the information leak entirely, not just masking it under a layer of random deviations.
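A toy simulation of why uniform noise fails (all numbers here are hypothetical, not Safari's actual parameters): the error of the mean shrinks roughly as 1/sqrt(n), so enough repeated readouts recover the underlying value to any desired precision.

```javascript
// Toy model: a device-specific value masked with fresh uniform noise per readout.
// Averaging repeated readouts recovers the underlying value.
const trueValue = 124.043; // hypothetical device-specific sample
const readOnce = () => trueValue + (Math.random() - 0.5) * 0.01; // noise in ±0.005

function averaged(n) {
  let sum = 0;
  for (let i = 0; i < n; i++) sum += readOnce();
  return sum / n;
}

console.log(Math.abs(averaged(10) - trueValue));     // noticeable error
console.log(Math.abs(averaged(100000) - trueValue)); // error shrinks ~ 1/sqrt(n)
```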
Wouldn’t it help if the noise added were deterministic based on origin? That way it can’t be averaged out by oversampling. So something like RNG_SEED = HMAC_SHA256(PERSISTENT_SECRET,Location.origin)
The problem is that by being "that guy" you're probably giving them 10 bits or more of identification. If they can just scrape a few more bits from somewhere, they'll have you uniquely identified.
But, yeah, these guys can get on the Golgafrincham Ark B with the rest of the adtech industry as far as I am concerned.
Good luck. It's amazing how little of today's web is good old HTML. A while ago I visited a website that used Markdown - but that wasn't compiled into HTML and then statically served, oh no - it was rendered in JS client side. WTF.
Join me, and do it! There is a great Firefox extension called uMatrix, which makes it easy to disable JavaScript not just on a site-by-site basis, but also by subdomain (and easy to re-enable for sites that break without js).
I really don’t see how this can come up with more than a few thousand unique combinations. Browser type x browser version x os version x accelerator version x … what else? That doesn’t seem like enough variation to create anything remotely unique. I don’t get it.
This is similar. Audio algorithms often call OS functions and make use of CPU optimizations. One example they mentioned is the fast Fourier transform (FFT). All OSes include a version of that function, but it tends to be optimized over time, and tends to behave differently on different CPUs depending on what SIMD instructions are available.
Couldn’t you just replace the prototype of the Audio API to return whatever you wanted? The difficulty would be in getting enough fingerprints for your desired imitation, but the article itself seems to have that information.
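In principle, yes - page scripts can patch prototypes before a fingerprinting script runs. A minimal sketch of the pattern (using a stand-in class, since real Web Audio objects like AudioBuffer only exist in a browser):

```javascript
// Stand-in for a browser class like AudioBuffer; in a page you would patch
// e.g. AudioBuffer.prototype.getChannelData the same way.
class FakeAudioBuffer {
  getChannelData() {
    return new Float32Array([0.1, 0.2, 0.3]); // "real" device-specific samples
  }
}

const original = FakeAudioBuffer.prototype.getChannelData;
FakeAudioBuffer.prototype.getChannelData = function (...args) {
  const real = original.apply(this, args);
  // Hand back spoofed samples; a convincing spoof would imitate a plausible
  // device's output rather than zeros.
  return new Float32Array(real.length); // all zeros
};

console.log(new FakeAudioBuffer().getChannelData()); // Float32Array [0, 0, 0]
```

The catch the comment alludes to: detectors can often notice that a native method has been replaced (e.g. via `Function.prototype.toString`), so the spoof itself becomes a signal.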
Wait, is it just me, or is it not wild that there's a company openly advertising their fingerprinting services? Their landing page implies it's primarily for fraud detection / abuse prevention. But one of their customer testimonials is from Neiman Marcus boasting they increased the number of repeat customers they could identify.
> "With the adoption of Fingerprint, we can now recognize and personalize approximately 23% of total visits to NeimanMarcus.com, up from the previous baseline of 8-10%."
Of course, these companies have always been around. But this post reads like it walks the line between "our product defeats Apple's futile defenses" and "we care so much about user privacy that we're white-hat cracking Apple's defenses".
And another "between-the-lines" joke is that this site doesn't throw up a cookie dialog when you load it. What a joke! "We don't need cookies to track you, haha!"
So they say this is for fraud prevention and that all other uses need consent.
On their front page they tell me how often I have visited and that my incognito mode does not prevent their tracking.
Isn’t that “other use”?
> Does Fingerprint Pro require consent?
> Our technology is intended to be used for fraud detection only; for this case, no user consent is required. However, any use outside of fraud detection must comply with GDPR user consent rules.
I expected this article to be published by some hackers or defenders of privacy like EFF, not by a company whose goal is to fingerprint people. Such dystopian times.
There’s a push to make every single last thing a normal application can do, available to web apps through some half-standardized JavaScript API or another. Generally Google comes up with use cases, implements it in Chrome, and tries to call it a standard. Then everyone complains when Apple doesn’t implement these standards fast enough, and that Safari is “holding back the web” or “the new IE” because it’s not keeping up with every last feature Chrome implements.
I would prefer websites just be websites and that we don’t have every single damned API available to whatever trashy site I accidentally click on, but I guess you and I are outliers here. Most people on HN seem to welcome every single JS API because web development is the only platform anyone seems to care about any more.
Did I read this correctly: audio fingerprinting is mainly about identifying the browser version and OS or laptop used, but it can't identify end users in a stable way?
> Fingerprinting is used to identify bad actors when they want to remain anonymous. For example, when they want to sign in to your account or use stolen credit card credentials. Fingerprinting can identify repeat bad actors, allowing you to prevent them from committing fraud. However, many people see it as a privacy violation and therefore don’t like it.
This doesn't seem to acknowledge the use of fingerprinting in intentional violation of the privacy of ordinary people, for marketing profiling and just selling them out because someone is willing to pay.
On https://demo.fingerprint.com/ , they do start to hint at non-anti-fraud purposes, but the use case seems to be full of poo. (Logins or cookies are the way to do this. Anything else is trying to circumvent privacy mechanisms. And if they don't distinguish users perfectly, they're doubly violating privacy by then leaking private information between people.)
> Personalization -- Improve user experience and boost sales by personalizing your website with Fingerprint device intelligence. Provide your visitors with their search history, interface customization, or a persistent shopping cart without having to rely on cookies or logins.
Popup warning on "https://demo.fingerprint.com/personalization":
> Heads up! -- Fingerprint Pro technology cannot be used to circumvent GDPR and other regulations and must fully comply with the laws in the jurisdiction. You should not implement personalization elements across incognito mode and normal mode because it violates the users expectations and will lead to a bad experience. -- This technical demo only uses incognito mode to demonstrate cookie expiration for non-technical folks.
Sounds a bit like a disingenuous bad actor doing CYA while demonstrating their capabilities, nudge, nudge, wink, wink.
________________
1. https://www.bleepingcomputer.com/news/security/researchers-u...
"There are several practical approaches that an implementation may take to avoid this aliasing. Regardless of approach, the idealized discrete-time digital audio signal is well defined mathematically. The trade-off for the implementation is a matter of implementation cost (in terms of CPU usage) versus fidelity to achieving this ideal.
It is expected that an implementation will take some care in achieving this ideal, but it is reasonable to consider lower-quality, less-costly approaches on lower-end hardware."
AFAICT this means that the OscillatorNode output they are exploiting here is almost guaranteed to not be deterministic across browsers (or even in the same browser on different hardware). The non-determinism is based on whatever anti-aliasing method is chosen by the browser (or, possibly, multiple paths within the same browser which could get chosen based on the underlying hardware). This includes changes/fixes to the same anti-aliasing algos.
I don't really understand this choice of relegating anti-aliasing to the browser given that:
a) any high-quality audio app/library will want full control over how the signals they generate avoid aliasing and will not use these stock oscillators anyway, or
b) the kinds of web applications that would accept arbitrary anti-aliasing algos (and the consequent browser-dependent discrepancies therein) probably wouldn't care whether the aliasing algo is hardcoded SIMD instructions or some 20MB javascript web audio helper framework
Web Audio API spec, OscillatorNode: https://webaudio.github.io/web-audio-api/#OscillatorNode
pillusmany | 2 years ago:
So you want to allow the implementation to decide how much to spend on it depending on available compute, battery and so on.
capitainenemo | 2 years ago:
https://web.archive.org/web/20120505042746/https://developer...
FabHK | 2 years ago:
On the other hand, I did clear my browser cache and switched on the VPN, and they mis-identified me as a new visitor.
Still, despicable business model.
DaSHacka | 2 years ago:
It's not even just cloudflare and similar DDOS checks, but now even things that should just be in the HTML of the page are loaded with JS.
chii | 2 years ago:
As the internet gets more and more hostile, this will become more and more correct.
gary_0 | 2 years ago:
I believe there are (or were, hopefully) similar techniques using <canvas> that exposed differences between the underlying graphics devices.
balls187 | 2 years ago:
So as a user my preference not to be fingerprinted or tracked takes a back seat in the name of fraud detection?
So we should allow police to wiretap in the name of crime prevention?