top | item 39630985

You cannot simply publicly access private secure links, can you?

420 points| vin10 | 2 years ago |vin01.github.io

218 comments


internetter|2 years ago

The fundamental issue is that links without any form of access control are presumed private, simply because there is no public index of the available identifiers.

Just last month, a story about discovering AWS account IDs via buckets[0] did quite well on HN. The consensus in the comments was that if you are relying on your account identifier being private as some form of security by obscurity, you are doing it wrong. The same concept applies here. This isn't a novel security issue, it's just another method of dorking.

[0]: https://news.ycombinator.com/item?id=39512896

ta1243|2 years ago

The problem is links leak.

In theory a 256-hex-character link (so 1024 bits) is vastly more secure than a 32-character username and 32-character password. To guess

https://site.com/[256chars]

you'd have to search 2^1024 combinations. You'd never brute force it

vs

https://site.com/[32chars] with a password of [32chars]

As there's only 2^256 combinations. Again you can't brute force it, but it's more guessable than the 2^1024 combinations.

Imagine it's

https://site.com/[32chars][32chars] instead.

But while guessing the former is harder than the latter, URLs leak a lot, far more than passwords.
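The arithmetic above can be sketched numerically, assuming (as in the parent's figures) that every character is a hex digit:

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy in a uniformly random string of `length` symbols."""
    return length * math.log2(alphabet_size)

# 256 hex characters in the URL path: 1024 bits
url_secret = entropy_bits(16, 256)

# 32-hex-char username + 32-hex-char password: 256 bits total
user_pass = entropy_bits(16, 64)

print(url_secret, user_pass)  # 1024.0 256.0
```

Both key spaces are far beyond brute force; the point of the comment is that the larger one doesn't help if the URL itself leaks through logs, proxies, and scanners.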

bo1024|2 years ago

There's probably details I'm missing, but I think the fundamental issue is that "private" messages between people are presumed private, but actually the platforms we use to send messages do read those messages and access links in them. (I mean messages in a very broad sense, including emails, DMs, pasted links in docs, etc.)

mikepurvis|2 years ago

Bit of a tangent, but I was recently advised by a consultant that pushing private Nix closures to a publicly-accessible S3 bucket was fine, since each NAR file has a giant hash in the name. I didn't feel comfortable with it, so we ended up going a different route, but I've kept thinking about it since: how different is it, really, to have the "secret" be in the URL vs in a token you submit as part of the request for the URL?

And I think for me it comes down to the fact that the tokens can be issued on a per-customer basis, and access logs can be monitored to watch for suspicious behaviour and revoke accordingly.

Also, as others have mentioned, there's just a different mindset around how much it matters that the list of names of files be kept a secret. On the scale of things Amazon might randomly screw up, accidentally listing the filenames sitting in your public bucket sounds pretty low on the priority list since 99% of their users wouldn't care.

XorNot|2 years ago

Worked for a company that ran into an S3 bucket naming collision when working with a client: turns out both sides decided hyphenated-company-name was a good S3 bucket name (my company lost that race, obviously).

One of those little formative lessons: every time I work with AWS now, all my bucket names are <project>-<deterministic hash from a seed value>.

If it's really meant to be private then you encrypt the project-name too and provide a script to list buckets with "friendly" names.

There's always a weird tradeoff with hosted services, where the technically perfect thing (totally random identifiers) is likely to be an operational burden compared to the imperfect thing (descriptive names).
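That naming scheme can be sketched as below; the project name, seed, hash choice, and truncation length are all hypothetical, not anything AWS prescribes:

```python
import hashlib

def bucket_name(project: str, seed: str, digest_len: int = 12) -> str:
    """Derive a stable, collision-resistant bucket name from a project
    name and a private seed value, per the <project>-<hash> scheme."""
    digest = hashlib.sha256(f"{project}:{seed}".encode()).hexdigest()
    return f"{project}-{digest[:digest_len]}"
```

The name stays human-readable via the project prefix, while the hash suffix makes collisions with another org's identically-named project vanishingly unlikely.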

bachmeier|2 years ago

> The fundamental issue is that links without any form of access control are presumed private, simply because there is no public index of the available identifiers.

Is there a difference between a private link containing a password and a link taking you to a site where you input the password? Bitwarden Send gives a link that you can hand out to others. It has # followed by a long random string. I'd like to know if there are security issues, because I use it regularly. At least with the link, I can kill it, and I can automatically have it die after a few days. Passwords generally don't work that way.

r2b2|2 years ago

To create private shareable links, store the private part in the hash of the URL. The hash is not transmitted in DNS queries or HTTP requests.

Ex. When links.com?token=<secret> is visited, that link will be transmitted and potentially saved (search parameters included) by intermediaries like Cloudflare.

Ex. When links.com#<secret> is visited, the hash portion will not leave the browser.

Note: It's often nice to work with data in the hash portion by encoding it as a URL-safe Base64 string (i.e. JS Object ↔ JSON String ↔ URL-safe Base64 String).
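A rough sketch of both points, using the hypothetical domain links.example: standard URL parsers keep the fragment separate from the parts that go on the wire, and the JSON ↔ URL-safe Base64 round trip is a few lines:

```python
import base64
import json
from urllib.parse import urlsplit

# The fragment is parsed out client-side; it is not in the path or query.
parts = urlsplit("https://links.example/share#eyJrZXkiOiJzZWNyZXQifQ")
# parts.fragment holds the secret; parts.path and parts.query do not.

def encode_fragment(obj) -> str:
    """JSON -> unpadded URL-safe Base64, suitable for use after the '#'."""
    raw = json.dumps(obj, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

def decode_fragment(frag: str):
    """Inverse of encode_fragment; restores stripped padding first."""
    padded = frag + "=" * (-len(frag) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

For example, `{"key": "secret"}` round-trips through the fragment string shown in the URL above.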

jmholla|2 years ago

> Ex. When links.com?token=<secret> is visited, that link will be transmitted and potentially saved (search parameters included) by intermediaries like Cloudflare.

Note: When over HTTPS, the parameter string (and path) is encrypted so the intermediaries in question need to be able to decrypt your traffic to read that secret.

Everything else is right. Just wanted to provide some nuance.

phyzome|2 years ago

Huge qualifier: Even otherwise benign Javascript running on that page can pass the fragment anywhere on the internet. Putting stuff in the fragment helps, but it's not perfect. And I don't just mean this in an ideal sense -- I've actually seen private tokens leak from the fragment this way multiple times.

andix|2 years ago

Is there a feature of DNS I'm unaware of, that queries more than just the domain part? https://example.com?token=<secret> should only lead to a DNS query with "example.com".

klabb3|2 years ago

Thanks, finally some thoughts about how to solve the issue. In particular, email based login/account reset is the main important use case I can think of.

Do bots that follow links in emails (for whatever reason) execute JS? Is there a risk they activate the thing with a JS induced POST?

loginatnine|2 years ago

It's called a fragment FYI!

nightpool|2 years ago

The secret is still stored in the browser's history DB in this case, which may be unencrypted (I believe it is for Chrome on Windows last I checked). The cookie DB on the other hand I think is always encrypted using the OS's TPM so it's harder for malicious programs to crack

eterm|2 years ago

If it doesn't leave the browser, how would the server know to serve the private content?

rpigab|2 years ago

Links that are not part of a fast redirect loop will be copied, pasted, and shared, because that's what URLs are for: they're universal, and they facilitate access to a resource available over a protocol.

Access control on anything that is not short-lived must be done outside of the URL.

When you share links on any channel that is not E2EE, the first agent to access that URL is not the person you're sending it to, it is the channel's service. It can be legitimate, like Bitwarden looking for favicons to enhance UX, or malicious, like the FB Messenger crawler that wants to know more about what you are sharing in private messages.

Tools like these scanners won't get better UX, because if you explicitly tell users that the scans are public, some of them will think twice about using the service, and that's bad for business, whether they're using it for free or paying for a pro license.

QuercusMax|2 years ago

I've always been a bit suspicious of infinite-use "private" links. It's just security through obscurity. At least when you share a Google Doc or something, there's an option that explicitly says "anyone with the URL can access this".

Any systems I've built that need this type of thing have used Signed URLs with a short lifetime - usually only a few minutes. And the URLs are generally an implementation detail that's not directly shown to the user (although they can probably see them in the browser debug view).
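A minimal sketch of such a short-lived signed URL, not any particular cloud provider's scheme; the key, path, and signing format are assumptions for illustration:

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SIGNING_KEY = b"server-side-key"  # hypothetical; never leaves the server

def sign_url(path, ttl=300, now=None):
    """Attach an expiry timestamp and an HMAC over path+expiry."""
    exp = int(now if now is not None else time.time()) + ttl
    msg = f"{path}|{exp}".encode()
    sig = hmac.new(SIGNING_KEY, msg, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'exp': exp, 'sig': sig})}"

def verify_url(path, exp, sig, now=None):
    """Reject if expired, or if path/expiry were tampered with."""
    if int(now if now is not None else time.time()) > exp:
        return False
    msg = f"{path}|{exp}".encode()
    expected = hmac.new(SIGNING_KEY, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)
```

Because the signature covers the expiry, a leaked URL goes stale after the TTL, which is exactly the property infinite-use links lack.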

empath-nirvana|2 years ago

There's functionally no difference between a private link and a link protected by a username and password or an API key, as long as the key space is large enough.

voiper1|2 years ago

>At least when you share a Google doc or something there's an option that explicitly says "anyone with the URL can access this".

Unfortunately, it's based on the document ID, so you can't re-enable access with a new URL.

scblock|2 years ago

When it comes to the internet, if something like this is not protected by anything more than a random string in a URL, then it isn't really private. Same story with all the internet-connected webcams you can find if you go looking. I thought we knew this already. Why doesn't the "Who is responsible" section even mention this?

AnotherGoodName|2 years ago

Such links are very useful in an 'it's OK to have security match the use case' type of way. You don't need maximum security for everything. You just want a barrier to widespread sharing in some cases.

As an example, I hit 'create link share' on a photo in my photo gallery and send someone the link to that photo. I don't want them to have to enter a password. I want the link to show the photo. It's OK for the link to do this. One of the examples they have here is exactly that, and it's fine for that use case. In terms of privacy fears, the end user could re-share a screenshot at that point anyway, even if there was a login. The security matches the use case. The user now has a link to a photo; they could reshare, but I trust they won't intentionally do this.

The big issue here isn't the links imho. It's the security analysis tools scanning all links a user received via email and making them available to other users in that community. That's more re-sharing than I intended when I sent someone a photo.

Terr_|2 years ago

A workaround for this "email-based authentication" problem (without going to a full "make an account with a password" step) is to use temporary one-time codes, so that it doesn't matter if the URL gets accidentally shared.

1. User visits "private" link (Or even a public link where they re-enter their e-mail.)

2. Site e-mails user again with time-limited single-use code.

3. User enters temporary code to confirm ownership of e-mail.

4. Flow proceeds (e.g. with HTTP cookies/session data) with reasonable certainty that the e-mail account owner is involved.
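The four steps above can be sketched as an in-memory store (a real service would persist codes and actually email them; all names here are hypothetical):

```python
import secrets
import time

class OneTimeCodes:
    """Single-use, time-limited login codes keyed by email address."""

    def __init__(self, ttl_seconds=600):
        self.ttl = ttl_seconds
        self._pending = {}  # email -> (code, expiry)

    def issue(self, email):
        """Step 2: generate a code; in production this is emailed, not returned."""
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[email] = (code, time.time() + self.ttl)
        return code

    def redeem(self, email, code):
        """Step 3: pop() makes the code single-use even on a failed guess."""
        entry = self._pending.pop(email, None)
        if entry is None:
            return False
        stored, expiry = entry
        return time.time() <= expiry and secrets.compare_digest(stored, code)
```

Since the emailed code is single-use and short-lived, an accidentally shared or scanner-indexed link can't complete step 3 on its own.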

amanda99|2 years ago

Off topic: but that links to cloudflare radar which apparently mines data from 1.1.1.1. I was under the impression that 1.1.1.1 did not use user data for any purposes?

kube-system|2 years ago

CF doesn't sell it or use it for marketing, but the entire way they even got the addresses was because APNIC wanted to study the garbage traffic to 1.1.1.1.

victorbjorklund|2 years ago

Can someone smarter explain to me what the difference is between:

1) domain.com/login user: John password: 5 char random password

2) domain.com/12 char random url

If we assume both either have the same bruteforce/rate limiting protection (or none at all). Why is 1 more safe than 2?

koliber|2 years ago

From the information theory angle, there is no difference.

In practice, there is.

There is a difference between something-you-have secrets and something-you-know secrets.

A URL is something you have. It can be taken from you if you leave it somewhere accessible. Passwords are something you know, and if managed well they cannot be taken (except via the lead-pipe attack).

There is also something-you-are, which includes retina and fingerprint scans.

rkangel|2 years ago

This article is the exact reason why.

(1) Requires some out-of-band information to authenticate. Information that people are used to keeping safe.

On the other hand, the URLs in (2) are handled as URLs. URLs are often logged, recorded, shared, passed around. E.g. your work firewall logging the username and password you used to log into a service would obviously be bad, but logging URLs you've accessed would probably seem fine.

[the latter case is just an example - the E2E guarantees of TLS mean that neither should be accessible]

amanda99|2 years ago

Two things:

1. "Password" is a magic word that makes people less likely to just paste it into anything.

2. Username + passwords are two separate pieces of information that are not normally copy-pasted at the same time or have a canonical way of being stored next to each other.

wetpaste|2 years ago

In the context of this article, it is that security scanning software that companies/users are using seems to be indexing some of the 12-char links out of emails, which in some cases end up on a public scan page. Additionally, if domain.com/12-char-password is requested without HTTPS, even if there is a redirect, that initial request went over the wire unencrypted and could therefore be MITMed, whereas with a login page there are more ways to guarantee that the password submission only ever happens over HTTPS.

ApolloFortyNine|2 years ago

I researched this a while ago when I was curious if you could put auth tokens as query params.

One of the major issues is that many logging applications will log the full URL somewhere, so now you're logging 'passwords'.

jarofgreen|2 years ago

As well as what the others have said, various bits of software assume that (1) may be private and should be handled with care, while (2) isn't.

E.g. your web browser will automatically save any URLs to its history, for any user of the computer to see, but will ask first before saving passwords.

E.g. any web proxies your traffic goes through, or other software that's watching (like virus scanners), will probably log URLs but probably won't log form contents (yes, HTTPS makes this one more complicated, but still).

munk-a|2 years ago

Assuming that 5-char password is handled in a reasonable way, it is not part of the publicly visible portion of the request that anyone along the communication chain can trivially eavesdrop on. In many cases, the mere presence of that password will also transform a request from a cacheable one into an uncacheable one, so intermediate servers won't keep a copy of the response in case anyone else wants the document (there are other ways to force this, but this will do it too).

kube-system|2 years ago

The difference is that people (and software that people write) often treat URLs differently than a password field. 12 characters might take X amount of time to brute force, but if you already have the 12 characters, that time drops to zero.

hawski|2 years ago

You can easily write a regex to filter out URLs. There is no universal regex (other than maybe a costly LLM) to match the URL, the username, and the password together.
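That asymmetry can be sketched in a few lines: anything URL-shaped is mechanically matchable (the pattern below is one simple choice, not a complete URL grammar), while a bare password has no distinguishing shape:

```python
import re

# Matches http(s) URLs up to the first whitespace/quote/bracket character.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def redact_urls(text):
    """Blank out anything URL-shaped before logging or storing text."""
    return URL_RE.sub("[redacted-url]", text)
```

A log pipeline can apply this blindly to every line; there is no equivalent filter that finds a 12-character random password sitting in free text.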

sbr464|2 years ago

All media/photos you upload to a private airtable.com app are public links. No authentication required if you know the url.

andix|2 years ago

There is a dilemma for web developers with images loaded from CDNs or APIs. Regular <img> tags can't set an Authorization header with a token for the request, like you can do with fetch() for API requests. The only possibility is adding a token to the URL or by using cookie authentication.

Cookie auth only works if the CDN is on the same domain, even a subdomain can be problematic in many cases.

internetter|2 years ago

This is actually fairly common for apps using CDNs – not just airtable. I agree it's potentially problematic

ttymck|2 years ago

Zoom meeting links often have the password appended as a query parameter. Is this link a "private secure" link? Is the link without the password "private secure"?

bombcar|2 years ago

If the password is randomized for each meeting, the URL link is not so bad, as the meeting will be dead and gone by the time the URL appears elsewhere.

But in reality, nobody actually cares and just wants a "click to join" that doesn't require fumbling around - but the previous "just use the meeting ID" was too easily guessed.

boxed|2 years ago

Outlook.com leaks links to bing. At work it's a constant attack surface that I have to block by looking at the user agent string. Thankfully they are honest in the user agent!

snthd|2 years ago

"private secure links" are indistinguishable from any other link.

With HTTP auth links you know the password is a password, so these tools would know which part to hide from public display:

> https://username:password@example.com/page
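Standard URL parsers already expose that spec-defined userinfo component, so the part to hide is machine-identifiable; a sketch using Python's stdlib and the example URL above:

```python
from urllib.parse import urlsplit

u = urlsplit("https://username:password@example.com/page")

# The spec marks off the userinfo, so a scanner could redact it mechanically
# instead of guessing which path segment might be a secret.
print(u.username, u.password, u.hostname)
```

Compare that with a secret embedded in the path or query, where nothing in the URL grammar flags it as sensitive.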

jeroenhd|2 years ago

I think it's quite funny that the URL spec has a section dedicated to authentication, only for web devs to invent ways to pass authentication data in any way but using the built-in security mechanism.

I know there are valid reasons (the "are you sure you want to log in as username on example.com?" prompt, for example), but this is just one of the many ways web dev has built hacks upon hacks where implementing standards would've sufficed. See also: S3 vs WebDAV.

andix|2 years ago

A while ago I started to only send password-protected links via email, with the plaintext password right inside the email. This might seem absurd and unsafe at first glance, but it safely prevents exactly this kind of attack. Adding an expiration time is also a good idea, even if it's as long as a few months.

godelski|2 years ago

There's a clear UX problem here. If you submit a scan it doesn't tell you it is public.

There is a simple fix: make clear that the scan is public! When submitting a scan it isn't clear, as the article shows. You also have the opportunity to tell the user it is public during the scan, which takes time, and again after the scan is done. There should be a clear button to delist.

urlscan.io does a bit better, but the language doesn't quite make clear that the scan is visible to the public, and the colors just blend in. If something isn't catching your eye, it might as well be invisible. If there is a way to misinterpret language, it will be misinterpreted. If you have to scroll to find something, it'll never be found.

heipei|2 years ago

Thanks for your feedback. We show the Submit button on our front page as "Public Scan" to indicate that the scan results will be public. Once the scan has finished it will also contain the same colored banner that says "Public Scan". On each scan result page there is a "Report" button which will immediately de-list the scan result without any interaction from our side. If you have any ideas on how to make the experience more explicit I would be happy to hear it!

dav43|2 years ago

A classic one with a business built on this is Pigeonhole: literally private links for events, with people hosting internal company events and users posting private, sometimes confidential, information. And even banks sign on to these platforms!

kgeist|2 years ago

Tried it with the local alternative to Google Drive. Oh my... Immediately found lots of private data, including photos of credit cards (with security codes), scans of IDs, passports... How do you report a site?

JensRantil|2 years ago

I'm surprised no one has mentioned creating a standard that allows these sites to check whether a link is private or not.

For example, either a special HTTP header returned when making a HEAD request for the URL, or downloading a file similar to robots.txt that defines globs which are public/private.

At least this would (mostly) avoid these links becoming publicly available on the internetz.
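A sketch of the scanner's side of such a scheme. To be clear, no such standard exists; the header name and "deny" convention are invented here purely for illustration:

```python
# Hypothetical protocol: before publishing a scan result, the scanner makes
# a HEAD request to the scanned URL and honors an opt-out response header.
OPT_OUT_HEADER = "x-public-scan"  # invented name, not a real standard

def may_publish_scan(response_headers):
    """Return False if the origin opted out of public scan listings."""
    lowered = {k.lower(): v for k, v in response_headers.items()}
    return lowered.get(OPT_OUT_HEADER, "").strip().lower() != "deny"
```

Like robots.txt, this would only deter well-behaved scanners, but that covers exactly the urlscan-style services the article is about.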

egberts1|2 years ago

Sure, you can!

This is the part where IP filtering by country and subnet can keep your ports hidden.

Also, a stateful firewall can be crafted to only let certain IPs through after they send a specially-crafted TOTP in an ICMP packet, just to open the firewall for your IP.

qudat|2 years ago

Over at pico.sh we are experimenting with an entirely new type of private link by leveraging ssh local forward tunnels: https://pgs.sh/

We are just getting started but so far we are loving the ergonomics.

65|2 years ago

Well this is interesting. Even quickly searching "docs.google.com" on urlscan.io gets me some spreadsheets with lists of people's names, emails, telephone numbers, and other personal information.

getcrunk|2 years ago

What's wrong with using signed URLs and encrypting the object with a unique per-user key? It adds some CPU time, but if it's encrypted, it's encrypted.

* this obviously assumes the objects have a 1-1 mapping with users

rvba|2 years ago

Reminds me how some would search for bitcoin wallets via google and kazaa.

On a side note, can someone remind me what the name of the file was? I think I have some tiny fraction of a bitcoin on an old computer.

figers|2 years ago

We have done one-time-use query string codes at the end of a URL, sent to a user's email address or as a text message, to allow for this...

overstay8930|2 years ago

Breaking news: Security by obscurity isn't actually security

panic|2 years ago

“Security by obscurity” means using custom, unvetted cryptographic algorithms that you believe others won’t be able to attack because they’re custom (and therefore obscure). Having a key you are supposed to keep hidden isn’t security by obscurity.

makapuf|2 years ago

Well, I like my password/ssh private key to be kept in obscurity.

zzz999|2 years ago

You can if you use E2EE and not CAs

BobbyTables2|2 years ago

What happened to REST design principles?

A GET isn’t supposed to modify server state. That is reserved for POST, PUT, PATCH…

AJ007|2 years ago

[deleted]