A particularly bad instance of link tracking I've found is in TikTok's link sharing feature.
If you share a link from the TikTok app, it gives you a vm.tiktok.com/[xyz] link to send/post elsewhere. It gives you no indication that this isn't a generic link to the post, nor does it give you an option to expose the generic link to the post.
Instead, when you share that link and someone clicks on it and does not have the app, it opens with a header saying "[First Last] is on TikTok." On the other hand, once you do click on that link (if and only if you don't have the app installed), you get redirected to the static link to the video and finally obtain it.
This is an anti-pattern that enables further tracking and potentially exposes user data without the user's knowledge when links are shared publicly. And there's no indication to the user that this is happening, since the link is structured as if it does not contain any tracking. I.e., a tool like this wouldn't be able to "strip out" the tracking, since it isn't tacked on in any way but embedded in the generated link itself.
Any company running out of mainland China is going to have serious privacy problems due to CCP influence and their need to comply with both local laws and the government’s interest in influencing public sentiment.
With websites, at least you can just copy the URL from the address bar and clean it. Of course, people are being slowly dumbed down by browsers' (mostly Chrome's, but Firefox seems to follow its stupid trends not long afterwards) attempts at removing or hiding the URL, which is no surprise when you realise that herding the userbase to use dedicated "share" buttons (complete with tracking) is one of the reasons they're doing that.
Stack Overflow does something similar, and adds a user tracking ID to any shared link, though apparently it's possible to remove it without breaking the link[1].
I only noticed when I received a badge for how many times it was clicked, and even though it's not nefarious I'd still prefer it to be opt-in rather than done by default.
Yes, I regularly warn people on Reddit that their full name is being leaked in the TikTok link they shared. I have an iOS shortcut that expands the URL and chops off the gross tracking stuff so I can share links in private/public without exposing my TikTok "name" (I don't link any accounts and my name is made up).
VRBO is another egregious example. My friend asked what I thought about a house she was thinking of renting for a trip. VRBO wouldn't let me view the link on my phone unless I downloaded their app. I had her copy and paste the house's description which I then Googled to get to the right listing.
When Twitter's snowflake was lengthened recently I was worried they might be doing this too. I'm afraid of the big ones moving to this: Spotify, Instagram, Twitter, etc.
Assuming any certificate pinning can be defeated, it is easy to manipulate URLs with a loopback-bound forward proxy. It would be great if someone provided an example of one of these TikTok URLs so we could investigate.
A short video platform can hardly be expected to be a paragon of security and privacy. It has no utility whatsoever. I don't see where the concern comes from. A video of someone drinking coffee does not particularly invoke a point of concern.
What may be the real concern is China and the fact that the app is tied to it. That's more of a race/geopolitics/war-mongering issue than a privacy concern.
Also, I see a few interesting comments in this HN thread; this evening when the dust settles, I'll aggregate & bring them to the bug for consideration if/when fixing this bug is considered.
I don't really know how I feel about having the browser mess with URLs without the user engaging it deliberately. It feels to me something that should perhaps be approached with caution. On the other hand, it does make sense. It's a tricky one.
I think the problem with this is that ClearURLs can break legitimate uses for URL params. I need to disable it when I do things like online payment. That's not intuitive for users and means an integrated solution needs to take laypersons into account who wouldn't know how to solve the problem (or even what the actual problem is). Is that realistically solvable?
I am not a fan of making such functionality part of the browser.
I use the HTTPS only mode in Firefox - it breaks some sites, and telling Firefox to disable the mode for a specific site doesn't always work.
I feel like a plugin (HTTPS Everywhere) can deal with this a lot better than something that's integrated and reduced to a single checkbox in the settings.
I would love to see an 'educational' mode on this - rather than just removing the tracking elements, put some info on-screen that shows what was removed and why, so people can use this as a tool to learn more about what types of tracking exist online and how common it is. Hopefully that would lead to a more knowledgeable end user community online and we can have more nuanced discussions in the future about where tracking is benign, and where it is not.
Not exactly what you requested but there's the ability to log all requests that are processed: if you click the extension icon and then under "Configs" enable logging, then at the bottom of the ui there's a button for checking the logs. This will show you the before and after processing urls, the rules that were triggered, and when.
I agree. I uninstalled this add-on precisely because I couldn’t quite figure out what it was doing or where it was doing it. Unlike an ad blocker, there’s very little tangible difference when it’s on or off.
While I greatly value my privacy to the point where I donate to noyb.eu, removing utm campaign tags feels too much. Those do not commonly contain private information. I believe that marketers should feel free to use those to measure the effectiveness of their campaigns, instead of relying on more privacy-intrusive and opaque methods (e.g. cookies, fingerprinting, IP address collecting, etc.).
> I believe that marketers should feel free to use those to measure the effectiveness of their campaigns
I don't. I believe marketers should have exactly zero ways to measure the effectiveness of their mind hacking efforts. Any data they try and collect should have negative value by virtue of being completely randomized by the browser.
Actually I believe marketers shouldn't even exist. Nothing they say is trustworthy by virtue of conflict of interest. The internet would be much better off without these constant attempts to subvert it for their purposes.
Every time any kind of measure to improve people's browsing experience is posted here someone comes along and explains how this one is too much. But they are always wrong. There is no "going too far" in optimizing the browser for the people who are using it.
That's not the only issue. The IDs are then fed back into Facebook.
Facebook can use it to link contacts together. I get a share link, it gives it an ID, I send it to someone, they open it and now they have linked my account with their account.
Same works if I click on a page and get the ID, share just that page, and someone clicks it (and there's some fb element on the page).
Now if several users a day share a link here on HN, facebook will know about us as belonging to a certain group.
I always remove them. They're like referer headers. Where the visitor came from is just like any other info that might be useful to the site operator, but is really not any of their business unless the visitor voluntarily discloses it.
I don't mind people stripping these tags manually for link sharing, but stripping them across the board would be a major issue for websites that finance themselves through affiliate links. Suddenly your referrals are no longer tracked and your main source of revenue dries up.
Related, if you're looking to clean urls on the backend, here's my current pattern used on https://upstract.com and some other news aggregators I've built:
Now the tracking parameters are all encoded in the last segment of the url. The backend just has to decode it accordingly and it will have both the item id and the bag of tracking parameters.
Note that this addon requires the "Access your data for all websites" permission[0], which means:
> The extension can read the content of any web page you visit as well as data you enter into those web pages, such as usernames and passwords.
I'm sure the devs are super trustworthy, but there have been cases of legitimate extensions falling in the wrong hands, and this, coupled with automatic extension updates, could be a big security hole in your setup.
I use this add-on together with Firefox, Bitwarden, uBlock Origin, HTTPS Everywhere and EFF's Privacy Badger to improve my privacy online. Once in a blue moon (a few times per year) I have to switch them off to get a site to work.
Besides that I only have the Tree Style Tab add-on installed, which is much recommended.
It should be noted that this extension strips ETag headers from all responses by default, which can break sites in surprising ways. As a developer of a web application that relies on ETag headers for vital functionality, I see not-infrequent support inquiries from ClearURLs users who don't understand the technical ramifications of this feature - nor do they understand why so many of the websites they use are so broken.
There’s lots of rules and patterns in this implementation, but it’s worth bearing in mind that you can normally get a clean URL by looking at the <link rel=canonical> element.
Sites put this in because they want search engines to index a single clean URL rather than many tracking URLs, so it’s pretty reliable.
That works if you want to get a clean URL to share with others. But if you have instead received a link, then not using built-in patterns means you would first need to retrieve the page with the tracking parameters to get to the canonical URL.
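For anyone curious what the canonical-link approach from the two comments above might look like, here is a minimal sketch using only Python's standard library. It assumes you already have the page's HTML in hand (so, as noted, the request with the tracking parameters has already been made), and the names `CanonicalLinkParser` and `find_canonical_url` are made up for illustration.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalLinkParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> element seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> elements matter, and only the first canonical one.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated tokens; compare case-insensitively.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonical = attr_map["href"]


def find_canonical_url(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if there is none."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonical


page = (
    '<html><head><link rel="canonical" '
    'href="https://www.example.com/item/42"></head><body></body></html>'
)
print(find_canonical_url(page))
```

The sketch trusts whatever the site declares, so it is only as reliable as the site's own markup, and it returns a relative href unchanged if the site uses one.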
Lovely extension, some discussions about its functionality can be found in this thread [0] after the removal of the extension from Chrome's Web Store.
One thing I noticed is that it can be too aggressive from time to time. I encountered this "issue" when creating a Bitwarden account: I was unable to verify my e-mail address because ClearURLs was (unbeknownst to me) removing some of the parameters from the activation URL. While similar cases will most likely not be frequent, it can be really frustrating to determine why something does not work (this also applies to ad blockers).
I love this just for the usability alone, never mind being anti-tracking.
I'm tired of every time I want to share a product page or post a URL or something, of having to strip 300 friggin' nonsense characters from the end of it.
Another way to browse sites one visits only once is through a mirror like https://archive.is/ (I exclusively use mirrors to view posts on content aggregators like Medium, Substack, Buzzfeed, Blogspot, Wordpress; annoying news websites that download a gazillion files; and file-hosting websites like imgur).
A caveat: When you submit a request to archive a url, archive.is sends the client-ip (X-Forwarded-For) to the destination server.
This is one of those things that either few use and it works, or if many start using it, the tracking will just get obfuscated.
I already see many sites use something like ?arg={BASE64 STRING OF ALL THE THINGS}, and no automatic tool can decipher that, as it's a custom list of bytes.
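A rough sketch of why that defeats name-based rules: once the parameters are packed into a single opaque value, a filter that matches on parameter names has nothing left to match. The packing scheme here (URL-encoded pairs, then URL-safe Base64, under a parameter called `arg`) is purely an assumption for illustration; a real site could use any custom byte layout.

```python
import base64
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical prefix rule, in the spirit of the name-based filters discussed here.
TRACKING_PREFIXES = ("utm_",)


def pack_params(params: dict) -> str:
    """Pack all parameters into one opaque URL-safe Base64 string (assumed scheme)."""
    raw = urlencode(params).encode("ascii")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def visible_tracking_names(url: str) -> list:
    """List the query parameter names that a prefix-based rule would catch."""
    query = urlsplit(url).query
    return [name for name, _ in parse_qsl(query) if name.startswith(TRACKING_PREFIXES)]


plain = "https://example.com/post?id=7&utm_source=hn"
packed = "https://example.com/post?arg=" + pack_params({"id": "7", "utm_source": "hn"})

print(visible_tracking_names(plain))   # the rule sees utm_source
print(visible_tracking_names(packed))  # the rule sees nothing to strip
```

Dropping `arg` entirely is not an option either, because in this scheme it also carries the `id` the page needs.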
This is a neat extension but I think we should acknowledge that stripping parameters like these from affiliate links is going to cause major problems for websites that are financed through affiliate revenue, even if they are open and honest about it.
[+] [-] jacobajit|4 years ago|reply
[+] [-] gonehome|4 years ago|reply
https://stratechery.com/2020/the-tiktok-war/
[+] [-] userbinator|4 years ago|reply
[+] [-] imiric|4 years ago|reply
[1]: https://meta.stackoverflow.com/q/277769
[+] [-] joshstrange|4 years ago|reply
[+] [-] Breza|4 years ago|reply
[+] [-] milofeynman|4 years ago|reply
[+] [-] space_fountain|4 years ago|reply
[+] [-] 1vuio0pswjnm7|4 years ago|reply
[+] [-] jtbayly|4 years ago|reply
[+] [-] 3np|4 years ago|reply
[+] [-] vagrantJin|4 years ago|reply
[+] [-] ronjouch|4 years ago|reply
Bug 1697982: "Firefox Tracking Protection should protect against URL/queryparam-based tracking (like ClearURLs/NeatURL addons do)" , https://bugzilla.mozilla.org/show_bug.cgi?id=1697982
Please vote for the bug if you'd like it too.
Also, I see a few interesting comments in this HN thread; this evening when the dust settles, I'll aggregate & bring them to the bug for consideration if/when fixing this bug is considered.
[+] [-] eythian|4 years ago|reply
[+] [-] VortexDream|4 years ago|reply
[+] [-] surround|4 years ago|reply
[+] [-] wackget|4 years ago|reply
Any URLs in the addon description section are all tracked/redirected via `https://outgoing.prod.mozaws.net`
[+] [-] daveoc64|4 years ago|reply
[+] [-] guilhas|4 years ago|reply
And stop trying to "re-implement" features for which there are already far more capable user extensions.
[+] [-] nagarjun|4 years ago|reply
[+] [-] codingdave|4 years ago|reply
[+] [-] uo21tp5hoyg|4 years ago|reply
[+] [-] ycombinete|4 years ago|reply
[+] [-] anticristi|4 years ago|reply
[+] [-] matheusmoreira|4 years ago|reply
[+] [-] maple3142|4 years ago|reply
[+] [-] theshrike79|4 years ago|reply
https://aliexpress.com/item/4000336900709.html?spm=a2g01.126...
vs
https://www.aliexpress.com/item/4000336900709.html
Both take you to the same page.
URLs, especially when clicked from ads, tend to have a HUGE amount of extra crap that's in no way needed for any kind of functionality.
[+] [-] cyborgx7|4 years ago|reply
[+] [-] rplnt|4 years ago|reply
[+] [-] selfhoster11|4 years ago|reply
[+] [-] bottled_poe|4 years ago|reply
[+] [-] throwaway81523|4 years ago|reply
[+] [-] DangerousPie|4 years ago|reply
[+] [-] marban|4 years ago|reply
startswith: 'utm_', 'ga_', 'hmb_', 'ic_', 'fb_', 'pd_rd', 'ref_', 'share_', 'client_', 'service_'
or has: '$/ref@amazon.', '.tsrc', 'ICID', '_xtd', '_encoding@amazon.', '_hsenc', '_openstat', 'ab', 'action_object_map', 'action_ref_map', 'action_type_map', 'amp', 'arc404', 'affil', 'affiliate', 'app_id', 'awc', 'bfsplash', 'bftwuk', 'campaign', 'camp', 'cip', 'cmp', 'CMP', 'cmpid', 'curator', '[email protected]', 'efg', 'ei@google.', 'fbclid', 'fbplay', '[email protected]', 'feedName', 'feedType', '[email protected]', 'forYou', 'fsrc', 'ftcamp', 'ga_campaign', 'ga_content', 'ga_medium', 'ga_place', 'ga_source', 'ga_term', 'gi', '[email protected]', 'gs_l', 'gws_rd@google.', 'igshid', 'instanceId', 'instanceid', '[email protected]', 'maca', 'mbid', 'mkt_tok', 'mod', 'ncid', 'ocid', 'offer', 'origin', 'partner','[email protected]', 'print', 'printable', 'psc@amazon.', '[email protected]', 'rebelltitem', 'ref', 'referer', 'referrer', 'rss', 'ru', '[email protected]', 'scrolla', 'sei@google.', 'sh', 'share', '[email protected]', 'source', '[email protected]', 'sref', 'srnd', 'supported_service_name', 'tag', 'taid', 'time_continue', 'tsrc', 'twsrc', 'twcamp', 'twclid', 'tweetembed', 'twterm', 'twgr', 'utm', 'ved@google.', 'via', 'xid', 'yclid', 'yptr'
Edit: Will turn this into a Gist at some point.
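Until that Gist exists, here is a minimal sketch of how a prefix-plus-name filter like the one above could be applied on the backend, using Python's standard `urllib.parse`. It is not the actual upstract.com code: the lists are a small subset copied from the comment, the domain-scoped entries (the ones with an `@`) are left out, and the names `is_tracking_param` and `clean_url` are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Subset of the "startswith" prefixes listed above.
TRACKING_PREFIXES = ("utm_", "ga_", "fb_", "ref_", "share_")

# Subset of the exact names listed above; domain-scoped entries are omitted.
TRACKING_NAMES = {"fbclid", "igshid", "mkt_tok", "twclid", "yclid", "ref", "referrer"}


def is_tracking_param(name: str) -> bool:
    """True if the parameter name matches a prefix or an exact name (case-sensitive)."""
    return name.startswith(TRACKING_PREFIXES) or name in TRACKING_NAMES


def clean_url(url: str) -> str:
    """Return the URL with matching query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not is_tracking_param(name)
    ]
    # urlencode re-encodes the surviving values, so their escaping may be normalized.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(clean_url("https://example.com/item/42?utm_source=hn&id=7&fbclid=abc"))
```

Generic names from the full list such as `source`, `tag` or `origin` are exactly the ones likely to break legitimate parameters, which is presumably why some entries there are scoped to a domain.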
[+] [-] sdevonoes|4 years ago|reply
Before:
https://example.com/item/4000336900709?spm=a2g01.126...
After:
https://example.com/item/4000336900709000044323234
Now the tracking parameters are all encoded in the last segment of the url. The backend just has to decode it accordingly and it will have both the item id and the bag of tracking parameters.
[+] [-] asymmetric|4 years ago|reply
[0]: https://support.mozilla.org/en-US/kb/permission-request-mess...
PS: Ironically, the link above has utm elements.
[+] [-] cies|4 years ago|reply
[+] [-] jordoh|4 years ago|reply
[+] [-] JimDabell|4 years ago|reply
[+] [-] account42|4 years ago|reply
[+] [-] BerislavLopac|4 years ago|reply
[+] [-] tomudding|4 years ago|reply
[0]: https://news.ycombinator.com/item?id=26564638
[+] [-] crazygringo|4 years ago|reply
[+] [-] l1am0|4 years ago|reply
Open Source and Free to Use
[+] [-] gdsdfe|4 years ago|reply
[+] [-] ignoramous|4 years ago|reply
[+] [-] slver|4 years ago|reply
[+] [-] DangerousPie|4 years ago|reply
[+] [-] ChrisGranger|4 years ago|reply
[+] [-] pellias|4 years ago|reply