The unattributable “db8151dd” data breach

Dataset for sale: [redacted]

Covve: This simple yet state-of-the-art app will revolutionise your business relations like you've never seen.

Edit: Response: https://twitter.com/covve/status/1261287954967941120

amatecha|5 years ago

haha, I found exactly the same! https://twitter.com/amatecha/status/1261231178423517184

A user who replied to me also shared some anecdotes that indicate further evidence towards that being the source (a private email address only used for GSuite admin purposes, on her iOS device, upon which she had Covve installed) -- thread here https://twitter.com/angelalgibson/status/1261314415829237761

Nextgrid|5 years ago

The metadata in the breached records like "Imported from EverContacts" or similar supports the theory that it comes from a contacts app.

Redoubts|5 years ago

Oh man, what is even going on with that raid forum.

eganist|5 years ago

The responses to the comment just below you (https://news.ycombinator.com/item?id=23190102) (and the nature of some of the corporate hits I've seen) seem to be consistent with a contacts database of sorts.

Not sure I'd go so far as to accuse a specific company on a public forum. But in this regard, the idea that a contact management app could be behind this DB is plausible.

mattlondon|5 years ago

Forked the stackblitz for posterity https://stackblitz.com/edit/angular-3nxvlm?file=src/app/app....

viro|5 years ago

thank you

alexproto|5 years ago

Hi all, Alex here, CTO at Covve. Just got alerted to incident db8151dd. We’re investigating as top priority with our security experts what relation this may have with Covve. We are monitoring the feedback in this blog and would really appreciate any additional information you may have on this as we investigate (alex@covve.com).

service_bus|5 years ago

It appears your organization left an elasticsearch database exposed to the internet. This happens frequently due to poor configuration.

You're either going to have logs pointing to an IP that the individual used to siphon your data, or nothing.

With an exposed elasticsearch database, you possibly had the data being siphoned by many parties, and are only aware now because of this particular incident.

If you have any operations regarding customers in Europe, you need to notify your relevant Data Protection Authority

https://edpb.europa.eu/about-edpb/board/members_en

You should also sign your engineers up for this course:

https://www.elastic.co/training/specializations/elastic-stac...

mdip|5 years ago

Interesting; based on what I'm seeing, it certainly looks like a matching structure and it's got enough uncommon fields in it to suggest that it's likely to be related to Covve software. There is a link in the comments to the source in question, and I don't know enough about Covve's product -- can someone run this on-prem or in their own defined infrastructure (is all of it on GitHub?), or is this a case where the data/server is proprietary to Covve, making it unlikely that someone created a compatible server with a similar structure?

Kudos for reaching out to the greater HN community as a channel for information. A lot of companies are concerned that such a public request gives the impression that they don't have a handle on things. Let's be honest: there isn't a company on the planet that, immediately following a breach, has a handle on things. Honesty is a pre-requisite to re-establishing trust in the (seemingly likely) event that this is a breach of your customers' data.

I don't envy the position you're in. By now, you've hopefully downloaded the link to the data dump[0] and have compared it against your own data to confirm that it is or is not a breach of your own operations. Please put out a communication as soon as possible if you confirm it's your data: immediately after closing off access to the data (and I'd consider taking the whole thing offline[1]), and before you take the additional steps to protect your environment from breach.

The next step is to lock it all down, everywhere. Rank the risks associated with your data; bubble that up to the components that touch it. Encrypt data and protect your private keys (HSM/virtual HSM), to the extent possible, segregate your data by risk, assign separate accounts to different risk categories and ensure lower risk accounts lack permissions to the data and cannot acquire the key to decrypt. Your "Staging", "Development" and "Test" databases ... any chance they have a snapshot of production from some point in the past[2]? Reduce the public exposure of your infrastructure -- create multiple private networks; ensure data can be accessed only by the thing or things that need to access it on the permissions and network layer. Depending on how you're set up, isolate management interfaces to a private network requiring separate authentication in addition to device authentication. Grant permissions to staff on a "minimum required to work" policy. For staff that require day-to-day permissions to high-risk assets, minimally get them a separate (individually assigned) administrative account to avoid accidental changes. But generally stick with "this person, and this alternate (bus factor), only, can alter permissions related to accounts used in production infrastructure"; ideally, requiring both for permissions changes would be awesome, but I'm not aware of broad adoption anywhere.

Audit roles assigned to everything. If this is AWS, you're going to be spending some time in IAM-related tools. Look at every account, every permission and everything it's assigned to and challenge it: does it need this much access? Can I make the access more specific (device narrowed)? Can I assign less access and achieve the same result? Can I separate out these two services with different risk profiles so permissions can be assigned more carefully?

All the best -- not a fun situation to be in.

[0] Someone posted one in the comments; might be gone, dig around the usual places and find a link from a "direct download" site if it's been taken down. (aim for mega.co.nz links; less costly, or google awesome-piracy for workarounds).

[1] I co-authored large parts of the internal security policy at Global Crossing (carried forward to Level 3) about a decade ago - we had a "Critical" category -- when triggered, a situation call started and didn't end until the issue was stable and root causes/solutions were identified. It also meant "if a device was categorized as being able to be infected (we were often dealing with aggressive malware), it was allowed to be taken offline regardless of impacts to the business" - i.e. the cost of failing to contain this is higher than the cost of turning off customers' service. We threw the switch a handful of times. It was hell.

[2] I used to lose it when I saw people doing this with live customer data... except that I've encountered it on 80% of projects I've worked, so I'm numb to it. You can roll fake data pretty easily with various different tools (online and CLI); nobody protects staging/test/dev like they protect prod and since you've determined you must protect this data in production, you don't want to have to protect dev/staging/test that same way.
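For anyone wondering what "rolling fake data" might look like, here is a minimal sketch using only the Python standard library. The field names and name lists are made up for illustration and are not taken from any real schema; the emails use example.com and the phones use the fictional 555-01xx range so nothing can collide with a real person.

```python
import random

# Hypothetical sample values, purely for illustration.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor"]
LAST_NAMES = ["Rivera", "Nguyen", "Okafor", "Lindqvist"]


def fake_contact(rng):
    """Build one synthetic contact record from the given random generator."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    return {
        "name": f"{first} {last}",
        # example.com is reserved for documentation, so no real inbox is hit.
        "email": f"{first}.{last}{rng.randrange(1000)}@example.com".lower(),
        # 555-0100 through 555-0199 is the range set aside for fictional use.
        "phone": f"555-01{rng.randrange(100):02d}",
    }


def fake_contacts(count, seed=0):
    """Return `count` synthetic contacts; the same seed gives the same rows."""
    rng = random.Random(seed)
    return [fake_contact(rng) for _ in range(count)]
```

Seeding the generator keeps staging fixtures reproducible between runs, which is usually what you want for tests.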

unknown|5 years ago

[deleted]

bob33212|5 years ago

Did you have GuardDuty or VPC flow logs turned on?

ItsPrivacyFool|5 years ago

[deleted]

dirtydroog|5 years ago

[deleted]

xenophonf|5 years ago

Troy's fighting the good fight, but it's so freaking depressing. If he has hundreds of millions of records worth of personal data from just the breaches that have been shared with him, what _else_ is out there in the hands of criminals and corporations, neither of which have the public interest at heart—only naked self interest in exploiting members of the public for as much money as they can get?

tialaramex|5 years ago

Millions per day. This used to be part of one of my old jobs. A feed of stolen PII would drop into our SFTP server every morning and we'd process it.

There's no honour among thieves so there were a bunch of duplicates pretending to be "new" data, but yes there is a cottage industry of stealing smaller quantities of PII, focused particularly on email addresses and passwords (because those get re-used elsewhere) and credit card data (because you may be able to either buy something with it or at least fool your way past an immediate check on the card)

Do not re-use passwords. Like, that's the really easy "Wash your fucking hands" level lesson here. As someone who isn't employed to work with this data any more I'd say that 99% of the value isn't with like stolen passports (though we did see some passport data) or even credit cards, but the passwords.

If you hate that this is even a problem adopt and (if you write code or specify software) implement WebAuthn. Nobody would steal passwords if they didn't work. Not only does stealing WebAuthn credentials from a site's database not work (they're public, the secret that's valuable never leaves the user's FIDO dongle) crooks also wouldn't bother doing it, just like crooks don't steal farm machinery to pull candy vending machines off the wall and steal candy, whereas they do attack ATMs in exactly this way.

cantrevealname|5 years ago

> what _else_ is out there in the hands of criminals and corporations

Don't forget governments. Whatever criminals and corporations have that they shouldn't have, governments probably have an order of magnitude more.

Nextgrid|5 years ago

For the people that use unique per-merchant e-mail addresses (like someone+amazon@...), could you try some of those aliases on HaveIBeenPwned and see which ones come up in this breach? That might shed some light onto its origin.

deng|5 years ago

BTW, since many people don't seem to be aware of this: If you have your own domain, you can get informed by haveibeenpwned automatically if any mail address from that domain is in a breach. All that is required is that you're reachable on that domain through an address like 'postmaster'. This feature can be found under 'domain search'. Since I use a new address for pretty much anything this is very handy.

huhtenberg|5 years ago

I am listed, but it's an address that was never used to register or subscribe to anything online. It's also under a year old.

It must've been vacuumed up from other people's contact or email data.

css|5 years ago

For me, the HaveIBeenPwned domain search only lists one item in this breach: my LinkedIn@... email. Searching my inbox shows that the only emails sent to that address are from LinkedIn, so it probably came from a company I sent a job application (LinkedIn Easy Apply) to at some point.

edent|5 years ago

I use unique emails. My record in this breach is just a generic "contact@" address.

StavrosK|5 years ago

I use the format you mention for almost everything, but my email address in this breach is one I haven't used in something like ten years.

alberts00|5 years ago

HaveIBeenPwned now has a feature to find e-mail addresses that were breached under a domain, so there is normally no need to search for separate aliases if you own the e-mail domain.

https://haveibeenpwned.com/DomainSearch

willvarfar|5 years ago

Does hibp know enough about the regular providers such as gmail that support this, to be able to attribute someone+amazon@gmail.com with someone@gmail.com?
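A minimal sketch of the attribution being asked about: splitting a plus-addressed alias into its base mailbox and its tag. It assumes the provider treats everything after the first "+" in the local part as a tag, which holds for Gmail but is not universal. The function name `split_plus_alias` is hypothetical, and nothing here reflects how HIBP actually handles these addresses.

```python
def split_plus_alias(address):
    """Return (base_address, tag) for a plus-addressed email.

    Assumption: everything after the first '+' in the local part is a tag.
    The tag is None when the address carries no '+'.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    base, plus, tag = local.partition("+")
    # Only the domain is lowercased; local parts can be case-sensitive.
    return f"{base}@{domain.lower()}", (tag if plus else None)
```

With this, someone+amazon@gmail.com maps to the base address someone@gmail.com with the tag "amazon", so the tag hints at which merchant the alias was given to.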

alias_neo|5 years ago

That was my first thought, I also use "company@mydomain" sometimes. Too many to go through... if only I could get hold of my record....

eganist|5 years ago

I follow this pattern exclusively, though I haven't actually received any recent HIBP notifications. I'll do a manual check.

Edit: three personal domains registered nothing. One corporate domain registered a double digit hit. If I discern any clues I'll get back to the thread.

m-p-3|5 years ago

I'm waiting for Firefox Relay to become available just to better control who has my email address and the flow of emails, but I'm worried it will make the task more difficult to follow breaches.

Maybe Mozilla could partner with HaveIBeenPwned to help dealing with that?

tinus_hn|5 years ago

Remember that once you try an email on a service like that, it’s no longer unique to the merchant.

VectorLock|5 years ago

So many things disallow + in email addresses I don't even bother any more.

mattlondon|5 years ago

My gmail is on it, but not my burner-domain. So either the data is old (year or two), or they got my gmail from somewhere else.

I'd be interested to see the whole dump to see my full record...

cr3ative|5 years ago

It's got my generic one (firstname@), and an older Facebook login email address (facebook@, changed now since Kickstarter leaked that one). Interesting.

PanMan|5 years ago

I did, and I usually use site specific emails (eg amazon@username ) but it found my "generic" firstname@username email... So no insights there.

simias|5 years ago

I suspect that Troy Hunt would have noticed if there were many emails with "+someservice" in the dump since he can easily dump them all.

dgellow|5 years ago

> Why load it at all? Because every single time I ask about whether I should add data from an unattributable source, the answer is an overwhelming "yes"

To be fair, you’re asking your followers on Twitter. That’s as biased as you can get; I would be really surprised if the majority said no.

SideburnsOfDoom|5 years ago

I got notified that I'm in this breach, and I honestly don't know what (if anything) I can do with this information, which implies "If it's not actionable, why bother telling me at all?"

Unique passwords per site, with a password manager? Done a long time ago. Should I change some of them? OK, which ones? there are hundreds.

Details of what else about me is in this breach? Not clear where I can find that.

onefuncman|5 years ago

This is a positive bias IMO, and any negative reactions that bubble up in the replies are going to be more useful.

numpad0|5 years ago

Could it be Google+? All 3 of my Gmail addresses that were associated with a G+ profile in some way were on it. Two of them I might have used to register a domain, but the last one I used for G+ and one other website only, and none of my friends know it. Also, I'm not in the US and don't have a US background, so it can't be from American friends' phones or a retailer CRM.

onefuncman|5 years ago

This seems like a winner to me. Iterating a graph along some association explains the ordering mentioned in the blog post, and explains the breadth of connectivity.

unknown|5 years ago

[deleted]

londons_explore|5 years ago

> Recommended by Andie [redacted last name]. Arranged for carpenter apprentice Devon [redacted last name] to replace bathroom vanity top at [redacted street address], Vancouver, on 02 October 2007.

Given that, surely Troy can contact those people and ask "who knew this info?". Not many people would know who replaced my bathroom vanity top...

pfundstein|5 years ago

Sure but perhaps Devon used a SAAS CRM system whose servers were breached... Or maybe Andie posted on Devon's public Facebook page to organise the job. Maybe it's just the LinkedIn leaks resurfacing, etc, etc.

typpo|5 years ago

I use a unique email on my personal domain for everything I sign up for.

The email contained in this breach is the one I provided to Facebook. It was probably hacked or sold from one of the handful of apps I've connected with FB over the years.

secfirstmd|5 years ago

One of my emails is currently on:

"Pwned on 19 breached sites and found 5 pastes."

If these are just the public breaches, I would guess that in reality it's on double or triple that number, counting sites that have been breached but whose data hasn't been posted online.

wincent|5 years ago

I don't really get the utility of HIBP. The answer to the "have I been pwned?" question is, of course, yes, multiple times. I think about the only way to keep your email out of the hands of the bad guys is to not use it or give it to anyone ever, at which point you don't need an email address.

What am I supposed to do whenever I'm involved in a new breach? Burn all my accounts and start again?

koheripbal|5 years ago

If you use a password manager to give you unique passwords per site, then these alerts allow you to only change the impacted site's passwords.

...though in a case like this it wouldn't help since we don't know the site.

Normal_gaussian|5 years ago

The monitoring service is useful, when a leak is detected you can reset that password.

Knowing that you have been historically breached is less useful.. Until I need to convince somebody to start taking account security seriously.

It's quite sobering to discover that data breaches are commonplace.

scrollaway|5 years ago

The biggest contribution HIBP makes is in teaching people not to reuse passwords (and use a password manager instead).

multidim|5 years ago

>What am I supposed to do whenever I'm involved in a new breach? Burn all my accounts and start again?

If you reuse passwords, then change your passwords for all the accounts that use the breached password. Hopefully, it'll spur you to start using a password manager so you can easily have strong, unique passwords.

If you don't reuse passwords, then change your password for the breached account. Sometimes services don't tell you about breaches and it is HIBP that first informs you about the breach.

If there is some email address that you really, really don't want bad guys to know about (perhaps a dedicated email address for your important financial accounts), then it helps you know when to switch to another email address.

HIBP helps you know how often a service has been breached in the past, and that might help guide what services you want to use/not-use in the future.

numpad0|5 years ago

Check account recovery procedures, change the password for that website, check login history and active sessions, and see if anyone has done anything that could be done with those credentials, on top of using randomly generated passwords in the first place.

And I think you’re about to describe Sign In with Apple.

sbarre|5 years ago

As the other comment also said, it's a public education service.

Remember that most of us on here have extremely advanced knowledge of the Internet and its workings. This is not the case for the vast majority of Internet users.

xondono|5 years ago

It depends on how many email addresses you keep. If you get a hit, it’s a good idea to ensure that you keep control of the services related to that address (change passwords, set any extra security measures).

I mostly use it through 1Password, because it also notifies you when a service has enabled new security features like 2FA.

EmilioMartinez|5 years ago

For me it's a shortcut to explain why it's always a risk to divulge personal information to 3rd parties, however trustworthy they seem.

polote|5 years ago

After how many breaches of ES clusters will Elastic decide to make their DB inaccessible from external IPs by default?

zaat|5 years ago

That's the default for a long time already, but people actually want to use it from outside the server and so they configure the listener.

https://www.elastic.co/guide/en/elasticsearch/reference/6.3/...
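To make the "configure the listener" point concrete, here is a rough sketch of a check you could run against whatever value you put in `network.host`. The helper name `is_loopback_only` is made up, and the special value `_local_` is what I recall from the Elasticsearch network settings docs, so treat that as an assumption and verify it for your version. Anything the function cannot positively identify as loopback is treated as exposed.

```python
import ipaddress

# Values assumed to resolve to loopback addresses only (verify for your version).
LOOPBACK_ALIASES = {"_local_", "localhost"}


def is_loopback_only(network_host):
    """Return True if the given network.host value binds only to loopback."""
    value = network_host.strip()
    if value in LOOPBACK_ALIASES:
        return True
    try:
        return ipaddress.ip_address(value).is_loopback
    except ValueError:
        # Hostnames, interface names, other special values: unknown, so
        # err on the side of treating the node as reachable from outside.
        return False
```

A value like 0.0.0.0 fails this check, which is exactly the configuration that tends to end up in these breach stories when there is no firewall or authentication in front of it.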

r1ch|5 years ago

Is this dump online anywhere? I got the notification from HIBP but it only tells me my email address appeared and I'm curious how accurate the rest of the data is.

esnard|5 years ago

> Back in Feb, Dehashed reached out to me with a massive trove of data

I guess searching on https://www.dehashed.com/ should give you some additional data.

celticninja|5 years ago

Exactly what I want to check. It's almost expected that at some point my email address is going to end up in a breach, but there is a chance that by reviewing the data I can ascertain where it came from, at least in part.

guessmyname|5 years ago

> Email addresses, Job titles, Names, Phone numbers, Physical addresses, Social media profiles

I just got the email notification from HIBP (Have I Been Pwned) a few minutes ago [1], but I am not worried about the compromised data because 1) my personal email address, job title and phone number are all visible in my resume, which is publicly available on my website (I actually encourage people, mostly tech recruiters, to download the PDF and contact me via email or phone all the time) and 2) my physical address is irrelevant because I have been moving houses every year for the last seven (7) years (even across countries a couple of times). All the social media accounts I have are completely empty; I just keep them around to hold on to my nickname.

I recently found, in my website’s HTTP logs, several requests from a web crawler controlled by ZoomInfo [2], an American subscription-based software as a service (SaaS) company that sells access to its database of information about business people and companies to sales, marketing and recruiting professionals. I was going to configure my firewall to block these requests, but then I remembered that my website only has information I am comfortable sharing, so it doesn’t matter. Still, I’ve been thinking it is just a matter of time before someone hacks one of their systems and leaks their database.

In my previous-previous job I found a fairly simple (persistent) XSS vulnerability in BambooHR that allowed non-authorized users to access data from all employees registered in the website including Social Security Numbers (SSN). I told my boss and we immediately edited everything before migrating to a different system. We never knew if BambooHR fixed the vulnerabilities and I wouldn’t be surprised if the data was leaked before or after I found the security hole.

Software security is such a Whac-A-Mole game, even if you get the budget to conduct security audits on your code, there is always going to be a weak link somewhere in the chain and that will be your doom. This is one of the many reasons why I left that job as a Security Engineer, the other reasons were Meltdown [3] and Spectre [4] they both made me realize I was fighting for a lost cause.

[1] https://haveibeenpwned.com/NotifyMe

[2] https://en.wikipedia.org/wiki/ZoomInfo

[3] https://en.wikipedia.org/wiki/Meltdown_%28security_vulnerabi...

[4] https://en.wikipedia.org/wiki/Spectre_%28security_vulnerabil...

cpv|5 years ago

> Email addresses, Job titles, Names, Phone numbers, Physical addresses, Social media profiles

Probably these can have a different impact if your threat model is a bit different (money, status, living area, position held, etc).

Reminds me of the story about an investigative reporter known in these parts, who was swatted: https://krebsonsecurity.com/2013/03/the-world-has-no-room-fo...

or received a drug package from an investigated person, basically it was a trap: https://krebsonsecurity.com/2015/10/hacker-who-sent-me-heroi...

The journalist knew about this and informed the police beforehand. Happy end.

To add a little more, I have seen people posting on social media answers to posts like "your favorite car, your place of birth, name of mother, name of pet". Guess who uses those words for similar secret questions?

Some personally identifiable information can be used to fabricate fake IDs, for various purposes.

And if we have a linked graph with all the personal, job, address, interacted people, geo-places, etc, it can get creepy (sounds like Facebook, but much more open).

Not saying we all should get paranoid, but leaked data could be used in different ways.

sirius87|5 years ago

The BambooHR theory is interesting. I looked up email addresses of co-workers at a startup I worked for a few years ago (Jul'15-Jun'16). I was with them earlier in 2012-13. My work email isn't there. But the slice of people between Apr'13-Jul'15...all there. I guess we ran through a bunch of HR software during the period, BambooHR being one of them. So either it's a subset of BambooHR or it's some other product a bunch of people at my workplace signed up for.

Thoughtful|5 years ago

On the BambooHR issue, can you elaborate a bit more?

throwaway834792|5 years ago

Based on a large (over 50 results) domain search for a company I work for, the data I found was very old, circa 2014.

I know this because almost everyone in the domain search stopped working for the company on or after 2014. Everyone else has worked at the company since 2013 or earlier.

eganist|5 years ago

Heads up, found at least one match for 2019 from a dataset I'm working with.

lawnchair_larry|5 years ago

That doesn’t set an upper bound on when the breach happened, it sets a lower bound. Old email addresses aren’t deleted by whoever had them. It just means it contains data from at least 2014, up to and including 2019.
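The bound logic can be sketched in a few lines. The dates below are hypothetical, loosely echoing the 2014 and 2019 sightings mentioned in this thread: the newest record tells you the data cannot have been taken earlier than that, while old records tell you nothing about how late it was taken.

```python
from datetime import date


def earliest_possible_exfiltration(record_dates):
    """Data cannot have been copied before its newest record existed,
    so the latest record date is a lower bound on the breach date."""
    return max(record_dates)


# Hypothetical "last seen" dates for records found in a domain search.
observed = [date(2013, 6, 1), date(2014, 3, 15), date(2019, 1, 10)]
lower_bound = earliest_possible_exfiltration(observed)
```

Here `lower_bound` is the 2019 date; the pile of 2013 and 2014 records does not pull it back.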

koheripbal|5 years ago

The email notification doesn't list the emails impacted. Do you need to rerun the full report to get the details?

tru3_power|5 years ago

I did some quick searching for the data format included in the snippets from the article. Lots of repos with stored secrets that match:

https://github.com/acalvoa/SRID_CHANGER/blob/da367e68433b3fd...

Stored secret:

https://github.com/acalvoa/SRID_CHANGER/blob/master/config.p...

Will look more into this later

amatecha|5 years ago

Ehhh, to me those seem like pretty common fields for any kind of contact data. It doesn't have some of the more unusual or IMO implementation-specific fields like "ShowableNonVisibleToOthers" or "PopulatedCleanNumber", for example.

killswitched|5 years ago

Some emails that turned up on my end: Dr. Dobbs and New Relic, although the leaks occurred from parties to whom these sites had provided my data, including at least unique email addresses.

unknown|5 years ago

[deleted]

forgotmypw23|5 years ago

The first thing that comes to mind is reCAPTCHA with some overlays. They would know almost every account you've registered for.

cm2187|5 years ago

Does elasticsearch have no authentication by default like mongodb or did someone deliberately make it public?

tyingq|5 years ago

Fixed now, but this was a common sequence of events at one time: https://discuss.elastic.co/t/ransom-attack-on-elasticsearch-...

leetbulb|5 years ago

No authentication by default.

wnevets|5 years ago

Am I the only one who dislikes some of those column names?

isNonIndividual, IsNonVisibleToOthers, ShowableNonVisibleToOthers

akersten|5 years ago

I can smell the enterprise ball-of-mud spaghetti code from here :)

outworlder|5 years ago

Negative flags suck.
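A small illustration of why, as a sketch: the `Contact` class is invented for this example, and only the flag name is borrowed from the column names quoted above. Reading the negative flag at a call site forces a double negative, while a positive wrapper reads directly.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    name: str
    is_non_visible_to_others: bool  # negative flag, in the style of the dump

    @property
    def is_visible_to_others(self):
        # Positive wrapper so callers never have to write "not ... non ...".
        return not self.is_non_visible_to_others


contact = Contact(name="Example Person", is_non_visible_to_others=False)

# Negative flag: the reader has to untangle a double negative.
shown_via_negative = not contact.is_non_visible_to_others
# Positive flag: same answer, readable at a glance.
shown_via_positive = contact.is_visible_to_others
```

Both variables end up with the same value; the difference is only in how much mental work the call site demands.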

wjnc|5 years ago

Question: It was my understanding that a lawyer could sue the cloud provider for customer details of the cloud service in detail? It would be relevant information in determining liability for leaking this PII.

voidmain0001|5 years ago

Firefox Monitor includes the db8151dd data: https://monitor.firefox.com/?breach=db8151dd

yahelc|5 years ago

Probably because they include HIBP data https://www.troyhunt.com/were-baking-have-i-been-pwned-into-...

jonykakarov|5 years ago

What I can't understand is that I had never heard of this Covve app, and neither had most of the affected users in the comment sections on Reddit, on Troy's website, or even here, since no one thought of it. Yet my email does exist in the breach. The data also seems huge (103,150,616 rows/90GB) for an app that has about 100k installs. Need some explanations here.

bluesign|5 years ago

It’s contact data from iOS and android phones probably scraped via some malware app/apps

akmarinov|5 years ago

Contact data doesn’t contain CRM references
