> We believe that no data has been lost, unless the [...] GitLab copy was the only one.
avar | 6 years ago
One difference between how GitLab and GitHub run their infrastructure is that GitLab doesn't keep reflogs, and uses git's default "gc" settings.
As a result they won't have the data in question anymore in many cases[1]. Well, I don't 100% know that for sure, but it's the default configuration of their software, and I'm assuming they run it like that themselves.
Whereas GitHub does keep reflogs, and runs "git repack" with the "--keep-unreachable" option. They don't usually delete git data unless someone bothers to manually do it, and they usually have data to reconstruct repositories as they were at any given point in time.
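A quick way to see the difference is a throwaway repo with stock git (this is a sketch of the mechanism, not GitHub's actual configuration, which isn't public beyond what's described above):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git config user.email you@example.com && git config user.name you
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
git reset -q --hard HEAD~1        # "second" is now unreachable from any ref
git reflog | grep second          # ...but the reflog still remembers it
# With default settings, gc would prune "second" after the reflog entry and
# the gc.pruneExpire grace period (2 weeks) expire. A host that never wants
# to lose it can keep unreachable objects packed instead:
git repack -q -a -d --keep-unreachable
git cat-file -t "$(git rev-parse HEAD@{1})"   # prints "commit": still there
```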
GitHub doesn't expose that to users in any way, although perhaps they'd take pity on some of their users after such an incident.
This isn't a critique of GitLab, just trivia about the storage trade-offs different major Git hosting sites have made, which might be informative to some other people.
I'm surprised no major Git hosting site has opted to provide such a "we have a snapshot of every version ever" feature. People would probably pay for it; you could even charge for access to the backups you'd already kept when they screw things up :)
1. Well, maybe as disaster backups or something. But those are harder to access...
Re GitHub keeping unreachable data, if I understand it right, isn't that GitHub painting a giant target on their back?
Wouldn't that imply every secret accidentally committed and then 'deleted' is still accessible, when one would expect it not to be?
It's one thing to have your source code in the wild, but pairing it up with thought-to-be-deleted secrets would be an absolute disaster.
Certainly one should not ever keep using a secret once it has escaped into a Git repo, but I'm sure it happens quite frequently.
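To make that concrete, here's a minimal sketch in a throwaway repo with a fake credential: deleting the file in a later commit removes nothing from history.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git config user.email you@example.com && git config user.name you
echo "API_KEY=not-a-real-secret" > config
git add config && git commit -q -m "add config"
git rm -q config && git commit -q -m "remove secret"
git show HEAD~1:config      # prints API_KEY=not-a-real-secret
```

This is exactly why the standard advice is to rotate any credential that ever touched a repo, not just delete the file.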
That is a crazy-ass quote. “We believe that no data has been lost... well, except for the data we keep. But you weren’t actually relying on us to save any data, right?”
I know, back up everything at least twice. But still, when somebody loses one of your copies, they don’t get to say “it’s cool, no data was lost, you have other copies, right?”
I'm surprised github runs regular git. I'd always assumed they were emulating it, especially with the lag we've observed between github-api and github-git at $DAYJOB (update repo 1 via api, update repo 2 via api, fetch repo 1 and repo 2 via git, we've had cases where the repo 2 update was visible but not the repo 1).
"Gitlab.com was compromised" is a bad title. _Accounts on_ GitLab, and the credentials to access them, were compromised, but the title suggests that the whole platform was affected, which doesn't seem to be the case.
Looks like they changed the title. But I would have to say that the ability to delete a full repo with the credentials is a bit of a vulnerability.
jjeaff | 6 years ago
To me, it seems like a good measure would be to mark deleted repos as "delete requested" then notify the users involved and give them a week or two to undo a total delete. Especially if it is an older repo with lots of commits.
Purging user data is one of the most common action attackers take when compromising an account. This makes it prudent for storage service providers to silently delay mass deletions to the extent allowed by their data deletion policy/GDPR to allow time to discover any breaches, or perhaps require second factor verification like a link sent via email.
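A hosting-side sketch of that idea (the paths and 14-day window are hypothetical; a real service would tie this to its deletion policy):

```shell
# Instead of deleting a bare repo outright, quarantine it for a grace period.
soft_delete() {
  repo="$1"
  trash="${TRASH_DIR:-/srv/git-trash}"
  mkdir -p "$trash"
  # Timestamp suffix avoids collisions between repos with the same name.
  mv "$repo" "$trash/$(basename "$repo").$(date +%Y%m%d%H%M%S)"
}

# A periodic job then purges anything older than the grace period:
purge_expired() {
  find "${TRASH_DIR:-/srv/git-trash}" -mindepth 1 -maxdepth 1 -mtime +14 \
    -exec rm -rf {} +
}
```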
There was a Docker Hub breach a few days ago, that's probably related.
Ayesh | 6 years ago
I took a good look at how my personal tokens were used in GitHub and GitLab.
- Enable 2FA.
- Enable commit signing with GPG. For the past 2-3 years, I have slowly moved to signing commits and tags. GPG keys take a lot of hygiene to work with (subkeys, revocation, etc.), but they definitely can help in a situation like this.
Git is a distributed VCS. If you have a repo cloned in a secure location (your server, Dev machine, etc), that is just as good as your Gitlab/hub hosted copy.
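For reference, the signing setup amounts to a few config lines; the key id and tag name below are placeholders, and this assumes a GPG key already exists:

```shell
# One-time setup: tell git which key to use and sign by default.
git config --global user.signingkey 0xDEADBEEFDEADBEEF   # placeholder key id
git config --global commit.gpgsign true                  # sign every commit
git config --global tag.gpgSign true                     # and every tag

# Verification on the other end:
git log --show-signature -1
git verify-tag v1.0.0                                    # hypothetical tag
```

Note that signing doesn't prevent a repo from being wiped; it lets collaborators detect commits that the supposed author never made.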
The current title "Gitlab.com Was Compromised" doesn't seem accurate. Someone (or a group) is currently attacking online repositories (GitLab is not the only affected provider) using passwords found in scans for files like .gitconfig and the like. Unless new information comes to light about GitLab specifically being compromised, I'd say this is more about individual private repos being in the sights of a targeted attack.
So git doesn't let you add the `.git` directory to the index. Most reports I've seen mention that SourceTree was used as a git client. Is it possible that SourceTree committed `.git` and pushed it to remotes which were then scraped?
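Git itself indeed never stages anything under `.git`, as a quick sketch shows, so a leaked `.git/config` would more plausibly come from a `.git` directory exposed by a web server (consistent with the scans mentioned elsewhere in the thread) than from a push:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git config user.email you@example.com && git config user.name you
echo hi > file
git add .               # stages "file", but never anything under .git/
git commit -q -m "initial"
git ls-files            # prints only: file
```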
I don't get the ransom thing: users of a git repository have a clone of the repo that contains the whole history, no? So isn't it trivial to recreate the repository?
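Right, any up-to-date clone can rebuild the hosted copy. A local sketch, with file-path remotes standing in for real hosting URLs:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q --bare server.git              # stand-in for the hosted repo
git clone -q server.git work
cd work
git config user.email you@example.com && git config user.name you
git commit -q --allow-empty -m "history"
git push -q origin HEAD
rm -rf ../server.git                       # the "attacker" wipes the remote
git init -q --bare ../server.git
git push -q --mirror ../server.git         # full restore from the clone
git -C ../server.git log --oneline         # history is back
```

What a clone can't restore is anything that lived only on the hosting side: issues, merge requests, CI configuration stored outside the repo, and any commits pushed after the last fetch.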
The attacker is also threatening to make these private repos public, or misuse their access to the repos in other ways (likely additional types of breaches).
Weird aside question: I notice the article says "at approximately 10:00pm GMT". Can someone explain why GMT might be chosen as a reference point here? Is there something I'm missing about the usage of GMT (and not UTC). It just seems particularly odd given that GMT is not (to my knowledge) actually being used as a concrete time-zone at the minute (BST is in effect for daylight savings).
I guess people who aren't overly pedantic say "GMT" to mean "UTC", just like everyone says "SSL" when it's actually "TLS". Older but better sounding names stick around.
While the fault lies with the users for not following security best practices, including enabling 2FA, there are things GitLab (or any site) can do to help defend against these sorts of attacks. Some suggestions:
- Treat logins from datacenters as suspicious. (In this case the identified IP block belongs to World Hosting Farm Limited.)
- Treat logins from a new/different ISP as suspicious: limit access to the account and verify the login via email.
It's not foolproof, but as part of a defense-in-depth strategy it can be quite effective.
gitlab-security | 6 years ago
Thank you for your feedback and suggestions. Unfortunately, for each of these proposals, we're likely to have users asking us why we are restricting and/or blocking access.
A better defense-in-depth strategy would be to scan each public repo for credentials, and act accordingly when credentials are discovered in repos. We are working on this strategy, currently.
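A toy version of such a scan, run across every revision so that "deleted" credentials are found too (the pattern list is illustrative; production scanners use many patterns plus entropy checks):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git config user.email you@example.com && git config user.name you
echo "aws_access_key_id=AKIAIOSFODNN7EXAMPLE" > cfg   # planted fake key
git add cfg && git commit -q -m "add cfg"
git rm -q cfg && git commit -q -m "remove cfg"        # "deleted" it
# Scan every revision, not just the tip:
git grep -nE 'AKIA[0-9A-Z]{16}|BEGIN (RSA|OPENSSH) PRIVATE KEY' \
  $(git rev-list --all)
```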
A bit of a misleading headline. Somebody had passwords/tokens for certain repos, either from previous breaches at other services or those passwords being stored in plaintext as part of a deployment.
Since the threat is to make the code public, there is nothing more gitlab can do to shut down the attempted blackmail. It seems unlikely to be a real threat to most?
Odd threat in that paying the ransom doesn't assure they wouldn't make it public anyway.
Also, someone else noted the ransom email domain has no MX or A records, so the instructions to email them won't work. They seem to be hoping someone will blindly pay the ransom.
"...to wipe their Git repositories and hold them for ransom."
What an idiotic strategy to take with git repositories. Every local copy is a complete and fully-functioning copy of not just the code, but all history, etc. It's a non-centralized protocol.
A client of mine was hit by this. Makes me think I ought to go into security. I brought up many concerns, specifically about git access (using very weak mechanisms), a few months ago, to which I was told to "clean things up when I can" but that it wasn't a priority.
This is why I host GitLab myself, even though I'm the sole user of the instance. For one thing I'm less of a target; for another, this is not the first major problem with hosted GitLab. I also have better uptime on my instance than the hosted version.
Someone I know had one of their private repos on GitHub replaced despite having 2FA enabled so it may have been from a leaked personal access token somewhere. What's odd was that this user has push access to multiple active private repos yet only one was hit with the ransom.
Out of all the things to hold for ransom, git repos seem like a bad choice. Most of the time there are multiple clones lying around anyway. I agree that having the source code leak can be bad news, but the code itself being secret should not be a critical part of the business.
the_duke | 6 years ago
https://about.gitlab.com/handbook/engineering/infrastructure...
saagarjha | 6 years ago
Well, links to orphaned commits still work, and GitHub has recently started surfacing UI when you force push a branch.
jtl999 | 6 years ago
Does this also apply to self hosted GitLab CE/EE? Also how does Gogs/Gitea handle this?
blumomo | 6 years ago
https://github.com/search?o=desc&q=1ES14c7qLb5CYhLMUekctxLgc...
cdubzzz | 6 years ago
> If we dont receive your payment in the next 10 Days, we will make your code public or use them otherwise.
nerdy | 6 years ago
It still suggests Gitlab's infrastructure (internally) was compromised: "Suspicious git activity detected on Gitlab"
Something like "Gitlab users' repos held for ransom" seems more appropriate.
greggyb | 6 years ago
For a long time GMT was a good reference point. Times have changed.
I used to work with a gentleman who would always schedule meetings on the phone as:
> Great, let's put that on the schedule for 2:00 o'clock Eastern Standard Time.
There was always a bit of officiousness to his tone and I think he just liked the idea of being precise.
And he certainly was precise. He was also off by an hour for half the year. Somehow no one ever missed a meeting, though.
I always sat on the other side of the room and ground my teeth.
chiefalchemist | 6 years ago
For those who might not be aware: It's possible to configure your .git config to push to different remotes.
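For example, using local bare repos to stand in for two hosting providers; note the gotcha that the first `set-url --add --push` replaces the default push URL, so the original must be listed explicitly too:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q --bare host-a.git && git init -q --bare host-b.git
git init -q work && cd work
git config user.email you@example.com && git config user.name you
git remote add origin ../host-a.git
git remote set-url --add --push origin ../host-a.git
git remote set-url --add --push origin ../host-b.git
git commit -q --allow-empty -m "mirrored"
git push -q origin HEAD        # one push, delivered to both remotes
git -C ../host-a.git log --oneline && git -C ../host-b.git log --oneline
```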