
Clubhouse data leak: 1.3M user records leaked online for free

306 points | 0xmohit | 5 years ago | cybernews.com

82 comments

[+] mittermayr|5 years ago|reply
I reported this to Clubhouse in February and got no response whatsoever (I am not involved in this leak, just to be extra clear). Essentially, anyone with the token from the iOS app (obtained via mitmproxy + SSL Kill Switch) can query through the entire public (records are cleaned) user profile database. It supports wildcard queries and just responds with some 20M records you can page through if you have the time. Luckily (!) it doesn't expose e-mail addresses or phone numbers, which is why I also agree with others here that this is only mildly interesting. The news won't care, however. I think at around 4M users or so they switched from auto-incrementing IDs to a better numbering format; records created before that keep their sequential IDs.

I think Clubhouse can fix this quite easily (limit the records returned in search!!!) and apply some harsher rate limits on a per-token basis (tokens never expire, that's another thing).
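A minimal sketch of those two fixes, assuming nothing about Clubhouse's actual backend: a hard cap on search page size and a token bucket keyed by API token. The names `MAX_PAGE_SIZE`, `clamp_page_size` and `TokenBucket` are made up for illustration, and the numbers are placeholders.

```python
import time

MAX_PAGE_SIZE = 50  # assumed cap; the real limit would be a product decision


def clamp_page_size(requested: int, cap: int = MAX_PAGE_SIZE) -> int:
    """Never return more than `cap` records per search call."""
    return max(1, min(requested, cap))


class TokenBucket:
    """Per-token bucket: refills `rate` requests per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._state = {}    # api token -> (requests_left, last_seen)

    def allow(self, token: str) -> bool:
        now = self.clock()
        left, last = self._state.get(token, (float(self.capacity), now))
        # Refill in proportion to the time elapsed since the last request.
        left = min(self.capacity, left + (now - last) * self.rate)
        if left < 1:
            self._state[token] = (left, now)
            return False
        self._state[token] = (left - 1, now)
        return True
```

A request handler would call `allow(token)` before running the search and `clamp_page_size(n)` on whatever page size the client asked for. This slows bulk paging; it does not by itself decide what a token is allowed to see.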

I think they relied a bit too much on certificate pinning. Once that's bypassed, it's relatively easy to query your way through the data. If you managed to grab someone else's token (which doesn't expire), you impersonate them (without logging the other session out), and continue to show up/talk in rooms using the Agora SDK as that person.
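For the non-expiring token part, here is a rough sketch of what an expiring, signed token could look like, using only the Python standard library. This is not how Clubhouse issues tokens; `SECRET`, `issue_token` and `verify_token` are hypothetical names, and a real service would more likely use an established token format and server-side revocation.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"demo-only-secret"  # assumption: a server-side key, never shipped in the app


def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(user_id: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return 'user_id.expiry.signature'; the signature covers user id and expiry."""
    now = time.time() if now is None else now
    payload = f"{user_id}.{int(now) + ttl_seconds}"
    return f"{payload}.{_sign(payload)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        user_id, expiry, sig = token.rsplit(".", 2)
        expires_at = int(expiry)
    except ValueError:
        return None  # wrong shape or non-numeric expiry
    expected = _sign(f"{user_id}.{expiry}")
    # Constant-time comparison, on bytes so odd input cannot raise.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    return user_id if now < expires_at else None
```

With a short lifetime, a token grabbed through a proxy stops working on its own, which limits how long the impersonation described above can last.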

They also upload the phone numbers from the address book in clear text (non-hashed), although I can see that hashing wouldn't add much, because unsalted hashes of phone numbers can probably be reversed quite easily.
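To illustrate why an unsalted hash of a phone number is weak, here is a toy brute force. The number is made up, and the known prefix plus four unknown digits is an assumption chosen only to keep the demo fast; `reverse_unsalted_hash` is a hypothetical helper.

```python
import hashlib
from typing import Optional


def sha256_hex(s: str) -> str:
    return hashlib.sha256(s.encode("utf-8")).hexdigest()


def reverse_unsalted_hash(target: str, prefix: str, unknown_digits: int) -> Optional[str]:
    """Try every number of the form prefix + N digits until the hash matches."""
    for n in range(10 ** unknown_digits):
        candidate = f"{prefix}{n:0{unknown_digits}d}"
        if sha256_hex(candidate) == target:
            return candidate
    return None


# Made-up number, for illustration only.
leaked = sha256_hex("+12025550142")
print(reverse_unsalted_hash(leaked, "+1202555", 4))
```

A full ten-digit number space is only 10^10 candidates, which is small for a fast hash function, so the same loop scales to whole numbering plans.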

[+] ramoz|5 years ago|reply
I was in some of those CH convos with you. I was actually suspended for a little while and tried clearing it up with them. Sent all the details I had and the original google doc I published w/ a lot of styprs work. They never responded but I was unbanned and given some fresh invites... but yea... strange it hasn't been cleared up.

Ultimately I think the premise is around a completely open and a transparent digital experience. Clubhouse still needs to defend against those with malicious intent and a new realm of psychographics to abuse.

Side note: I was hooked on the app until that suspension (lasted ~2w)... I haven't been able to get back into a groove. I rarely log on anymore.

[+] rsj_hn|5 years ago|reply
Reading your post, it's amazing the checkboxes of failed access control efforts:

* trying to control clients

* obfuscating IDs

* rate limiting data

...rather than the more boring yet standard approach of thinking through an access control policy and then enforcing that at the server.
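As a sketch of what "enforce the policy at the server" might look like: one function decides which profile fields a given viewer gets, and every endpoint goes through it. The field names and the three-tier policy here are assumptions for illustration, not Clubhouse's real schema.

```python
from dataclasses import dataclass, field

PUBLIC_FIELDS = {"user_id", "name", "username", "photo_url"}  # assumed policy
FOLLOWER_FIELDS = PUBLIC_FIELDS | {"invited_by"}              # assumed policy


@dataclass
class Profile:
    user_id: int
    name: str
    username: str
    photo_url: str
    invited_by: str
    phone: str
    followers: set = field(default_factory=set)


def visible_fields(profile: Profile, viewer_id) -> dict:
    """Return only what the policy lets `viewer_id` see; None means anonymous."""
    data = vars(profile)
    if viewer_id == profile.user_id:
        allowed = set(data) - {"followers"}  # owner sees their own record
    elif viewer_id in profile.followers:
        allowed = FOLLOWER_FIELDS
    else:
        allowed = PUBLIC_FIELDS
    return {k: v for k, v in data.items() if k in allowed}
```

The point is that the check depends only on who is asking, not on which client they use or how hard the ID was to guess.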

[+] aasasd|5 years ago|reply
> Once that's bypassed

Do you mean that you trick the app into accepting a wrong cert? How does one do that, apart from decompilation?

[+] bitexploder|5 years ago|reply
Nothing wrong with auto incrementing identifiers if actual security controls (authorization) are implemented for already authenticated users.
[+] rvz|5 years ago|reply
From [0]:

> This is misleading and false. Clubhouse has not been breached or hacked. The data referred to is all public profile information from our app, which anyone can access via the app or our API.

So just like what happened to Parler and LinkedIn. A so-called 'data breach' of its public data via scraping.

But last time I checked the private API in a GitHub repo, Clubhouse was using plain integer IDs for its users rather than random alphanumeric strings.

This can essentially be scraped by a while loop, incrementing all the way to whoever last signed up.
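A toy comparison of the two ID schemes, run against an in-memory dict rather than any real API. `make_users` and `enumerate_by_counting` are hypothetical helpers, and the random IDs are assumed to be 128-bit hex strings.

```python
import secrets


def make_users(n: int, sequential: bool) -> dict:
    """Build a toy user table keyed by either 1..n or random 128-bit hex IDs."""
    if sequential:
        return {str(i): f"user{i}" for i in range(1, n + 1)}
    return {secrets.token_hex(16): f"user{i}" for i in range(1, n + 1)}


def enumerate_by_counting(users: dict, attempts: int) -> int:
    """Guess IDs 1, 2, 3, ... and count how many guesses hit a real record."""
    return sum(1 for i in range(1, attempts + 1) if str(i) in users)


print(enumerate_by_counting(make_users(1000, sequential=True), 1000))   # 1000
print(enumerate_by_counting(make_users(1000, sequential=False), 1000))  # 0
```

Random IDs only remove the counting shortcut. They do not replace authorization or rate limits, and a search endpoint that lists users makes the ID format irrelevant.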

Did Clubhouse even implement rate limiting to combat this?

[0] https://twitter.com/joinClubhouse/status/1381066324105854977

[+] sschueller|5 years ago|reply
Does anyone remember the AT&T "hack"? These two just used curl to get the e-mail addresses and ICC-IDs of AT&T iPad users, which were publicly accessible. [1] It was still labeled a hack and went through the brain-dead media that way. Instead of AT&T getting in trouble, Auernheimer got a 41-month sentence, and the judge also ordered him and Spitler to pay $73,000 in restitution.

[1] https://www.wired.com/2013/03/att-hacker-gets-3-years/

[+] Kye|5 years ago|reply
It's only a problem if you think it's a problem for someone to trivially build a social graph for every person on your exclusive social network with lots of high profile people.

So...it's a problem.

[+] eswat|5 years ago|reply
Not a great response on their part since the article they reference in the tweet does not say that they have been breached or hacked. Only that there's a limited dataset of users out there and that Techmeme reached out to Clubhouse to know if they are aware of any breaches of their systems.

Pretty bad optics if the other stuff is true: incremental IDs, no rate limiting, tokens that don't expire.

[+] mvanaltvorst|5 years ago|reply
Correct, and judging from someone else in this thread, it was even possible to use wildcard matching to get access to an entire list of users at once.
[+] benja123|5 years ago|reply
I understand why people are saying this is not a breach and I tend to agree. I do think there are some basic measures you can put in place to make this kind of abuse harder.

The real problem is that most users don’t understand when they sign up for a service like clubhouse, what information is public, how easy it is for bad actors to get access to that information and how this information can be used to harm them later (phishing, identity theft etc.).

Who should be educating the average non-technical user about the risk of agreeing to share their information publicly? And even if they knew, would it actually change anything?

Personally, I have hit the point where I have accepted that all of my information, and my family's, is public. For that reason, with people like my parents, I tend to focus on teaching them to avoid falling for phone scams and phishing.

[+] lovedswain|5 years ago|reply
I guess a leak requires private data to be exposed, this is just a collection of public data.
[+] p49k|5 years ago|reply
Is it public info who invited you to the Clubhouse app? If not, that would assume some kind of breach, since that info is part of the leak.
[+] xyst|5 years ago|reply
I agree. This "hack" is the equivalent of any search engine indexing public Facebook or LinkedIn profiles.
[+] gabipurcaru|5 years ago|reply
Same as the recent FB and LinkedIn incidents. It's all scraped data. That doesn't mean collecting public data at scale isn't a bad thing.
[+] jtokoph|5 years ago|reply
It looks like someone just scraped all of the public profiles.
[+] Zealotux|5 years ago|reply
It looks more like a SQL dump to me. The data doesn't seem to be too critical, however.
[+] bloudermilk|5 years ago|reply
Yup, this seems to be the case. I don’t know how this could be characterized as a leak?
[+] monkey_monkey|5 years ago|reply
Perhaps we need to add a term such as "harvesting", to better distinguish between hacks/leaks and mass aggregation of public profile data.
[+] hashhar|5 years ago|reply
Looks like it's a scrape of public profile information from Clubhouse.

Also it reads more like an advertisement for the author's services.

I'd like to see a more credible source.

[+] coldcode|5 years ago|reply
That someone wrote an iOS app with such a lame concept of security that anyone could dump the entire database (even if it's only "public" data) with a script is not surprising, as most startups and even big companies don't give a crap about security. I've seen this way too often. If you are this cavalier about security in simple REST queries, imagine what lurks beneath, not yet discovered.
[+] rogers18445|5 years ago|reply
This data seems to have been public and free for a while... Here: https://www.kaggle.com/johntukey/clubhouse-dataset
[+] xyst|5 years ago|reply
The generated graph is interesting. I guess the people in the middle are early adopters and people with high numbers of followers. Then the clusters on the outer edges are people catering to a niche audience. Then those niche audiences spawn their own microcosms.
[+] ulzeraj|5 years ago|reply
This is the cherry on the top for their policy that requires real names. People never learn.
[+] asimjalis|5 years ago|reply
They also know your birthdate and phone number. The only thing they don’t know is the name of your first pet.
[+] swiley|5 years ago|reply
I'm completely done with centralized social media "apps." I'm not signing up for any more and other than HN I've stopped using all of them and recommended that my friends do the same (surprisingly, many have listened.)
[+] asimjalis|5 years ago|reply
I feel this validates their decision to only release on iOS first. On Android there would be even fewer barriers to this kind of scraping.
[+] cblconfederate|5 years ago|reply
I think we're watching the implosion of the cloud. Those leaks are not even illegal, yet they will lead to a lot of spam, a lot of phishing, and a lot of other clumsy actions by clumsy actors that will alienate users and make them more reluctant to give their information next time. At least, I hope we're past peak cloud and falling fast into the norm of the internet the way it was meant to be: pseudonymous.
[+] runeks|5 years ago|reply
I don’t get it. This “leaked” information looks like something that would be displayed on a public website for each user. As far as I can see it’s just public information, like user names and avatars on e.g. stack overflow.
[+] _trampeltier|5 years ago|reply
2021 the year of leaks ..
[+] o_m|5 years ago|reply
Not really. It seems like we have redefined what a leak is.
[+] tonetheman|5 years ago|reply
All of their devs must be too busy working on an Android app to fix these minor security bugs... :)