My point is: why should you change the hashing algorithm in Git? Let's elaborate:

1. Does SHA-1 pose a security risk in Git?

2. Is that practically exploitable in any way?

In some applications, for example password hashing, SSH MACs, etc., you have good reasons to change the hashing algorithm when it becomes obsolete: an attacker can gain a computational advantage in cracking a password, compromising the integrity of transmitted packets, etc.

But a hashing algorithm that has become obsolete for some applications is not obsolete for ALL possible applications. Moreover, in some specific applications a faster hashing algorithm could be DESIRABLE.

So why should you change SHA-1 in Git?

>> "But a few more of these tricks and I can see those "garbage comments" collision happening"

I don't think so; it is computationally astronomically difficult whatever tricks you invent. The point here IS NOT to generate a collision by adding "garbage comments"; again, it is to alter the behaviour of committed code in a functional way.

>> "Even without language models you could use something like a language's EBNF grammar as a token generator for source code which would probably pass any glance checks, but definitely not dedicated inspection like a code review. That is probably something that IS PRACTICAL TODAY for SHA1"

Yeah, prove it!

cyph3r0|1 year ago

I agree, the necessity of something stronger than SHA-1 should be demonstrated.

TrueDuality|1 year ago

I don't need to prove that I can do a thing to prove that a thing is possible, and the burden of proof is on you for claiming that this isn't an active security problem, because that's basically well known and well understood. The only outstanding questions are how detectable, impactful, and available those attacks are.

Specifically, you need to counter at least one of the things in the following list:

* Hash security: SHA1 collisions are feasible to generate, and companies are actively moving away from SHA1 with good reason and have been doing so for at least seven years (https://security.googleblog.com/2017/02/announcing-first-sha..., https://www.howtogeek.com/238705/what-is-sha-1-and-why-will-...)

* Content generation: As I've already discussed, the contents of what you use to make that collision can be anything you want and meet any requirements you have the ability to produce a generator for. To counter this you're going to have to prove to me that no engineer can make a seeded random number generator that uses a language's grammar to produce plausible tokens that compile, or just use a language model to produce plausible code and comments (also requiring a seed). This is a _trivial_ thing to do.

* The attack: Git relies on a chain of hashes based on SHA1. Those hashes are over the complete files included in the repository, so if you can generate a collision for a file in git's history you can replace the file in that commit and all subsequent commits will remain valid. This is the attack everyone is worried about with git. The only thing that protects against this right now is the security of SHA1. Additionally, signatures on commits and tags DO NOT protect against this: they're over the hash, commit message, and list of objects, not the objects themselves. The attacked files will still look like they came from a valid signed commit.
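To make the chain of hashes in that last bullet concrete, here is a minimal Python sketch of how Git derives object IDs: SHA-1 over a small header plus the object body. The file name `alloc.c`, the author line, and the commit message are made up for illustration, and the tree serialization is simplified (flat, files only).

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over "<type> <size>\\0<body>", which is how Git names an object."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize (mode, name, hex_id) entries; simplified to a flat list of files."""
    out = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        out += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(oid)
    return out


# Hypothetical one-file repository.
blob_id = git_object_id("blob", b"ptr=calloc(SIZE, sizeof(long));\n")
tree_id = git_object_id("tree", tree_body([("100644", "alloc.c", blob_id)]))

# The commit body names the tree only by its ID; the file content never appears here.
commit_body = (
    f"tree {tree_id}\n"
    "author A U Thor <author@example.com> 0 +0000\n"
    "committer A U Thor <author@example.com> 0 +0000\n"
    "\n"
    "Allocate buffer\n"
).encode()
commit_id = git_object_id("commit", commit_body)

print(blob_id, tree_id, commit_id, sep="\n")
```

The commit body contains only the tree ID, and the tree contains only the blob ID, so a different file with the same blob ID would leave the tree, the commit, and anything signed over the commit byte-for-byte unchanged.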

The extra scary part of that attack is that the malicious/changed file will not be visible to any existing checkouts: those clients will believe they have the correct object and will continue to show that correct object. But anything that does regular fresh checkouts, like say a CI system that deploys to prod, will get the poisoned object. Even if it's checking the signatures on every commit, it won't see this coming.
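A toy model of that "existing checkouts vs. fresh checkouts" behaviour, as a sketch only. It assumes a hash truncated to 16 bits so that a matching object can be brute-forced in about a second; real SHA-1 is 160 bits, and the published attacks work very differently from this brute force. The `fetch` function is a deliberately simplified stand-in for Git's transfer logic, not its real protocol.

```python
import hashlib
from itertools import count


def toy_id(obj_type: str, body: bytes) -> str:
    """Git-style object ID truncated to 16 bits (toy only, so matches are cheap)."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()[:4]


good = b"ptr=calloc(SIZE, sizeof(long));\n"
target = toy_id("blob", good)

# Brute-force a functionally different line whose toy ID matches the original.
# The trailing comment carries the counter that is varied until the IDs match.
for n in count():
    evil = f"ptr=calloc(SIZE-10, sizeof(long)); /* {n} */\n".encode()
    if toy_id("blob", evil) == target:
        break


def fetch(local: dict, remote: dict) -> None:
    """Copy only the objects whose IDs the local store does not already have."""
    for oid, body in remote.items():
        local.setdefault(oid, body)


server = {target: evil}          # attacker swapped the object stored under this ID
existing_clone = {target: good}  # already holds an object with that ID
fresh_clone = {}                 # e.g. a CI runner cloning from scratch

fetch(existing_clone, server)
fetch(fresh_clone, server)

print(existing_clone[target])  # still the original line
print(fresh_clone[target])     # the modified line
```

In this model the existing clone never notices anything, because it only asks for IDs it is missing, while the fresh clone receives the swapped content under the same ID.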

So the security of all our git repos, our production environments, and our new devs is foundationally rooted in either the security of write access to the repository OR the security of SHA1.
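On the content-generation bullet earlier in this comment, a seeded grammar-driven generator really is only a few lines. This is a rough sketch with a made-up toy grammar for C-like statements (not a real C grammar), so the output looks plausible but is not guaranteed to compile; every rule and name in `GRAMMAR` is an assumption for illustration.

```python
import random

# Hypothetical toy grammar: each key is a nonterminal, each value a list of rules.
GRAMMAR = {
    "stmt": [["type", " ", "ident", " = ", "expr", ";"]],
    "type": [["int"], ["long"], ["size_t"]],
    "ident": [["n"], ["len"], ["count"], ["idx"]],
    "expr": [["number"], ["ident"], ["expr", " ", "op", " ", "expr"]],
    "op": [["+"], ["-"], ["*"]],
    "number": [["0"], ["1"], ["8"], ["64"]],
}


def expand(symbol: str, rng: random.Random, depth: int = 0) -> str:
    """Recursively expand a symbol; anything not in GRAMMAR is a terminal."""
    if symbol not in GRAMMAR:
        return symbol
    rules = GRAMMAR[symbol]
    if depth >= 3:
        # Deep in the tree, prefer rules that do not recurse, so expansion terminates.
        rules = [r for r in rules if symbol not in r] or rules
    rule = rng.choice(rules)
    return "".join(expand(part, rng, depth + 1) for part in rule)


def generate(seed: int, lines: int = 3) -> str:
    """Produce `lines` statements; the same seed always yields the same text."""
    rng = random.Random(seed)
    return "\n".join(expand("stmt", rng) for _ in range(lines))


print(generate(seed=1))
```

Each seed gives a different, reproducible candidate, which is what a collision search needs: a cheap way to enumerate many plausible-looking variants.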

I would say that is a practical and useful attack. A faster hashing algorithm will EXACERBATE this problem, as you're almost always trading collision resistance for speed. Any hashing algorithm that lets you calculate its hashes faster makes every attempt cheaper, so it is MORE vulnerable to collision attacks, not less.

"Computationally astronomical" isn't a very good argument. 20 years ago SHA1 was insane in its security. These things get weaker over time and need to be periodically replaced, not because they're failing, but because increased resource capacity has fundamentally changed the original assumptions the algorithm was designed for.
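As rough arithmetic for how far those assumptions have moved: a generic collision search on a 160-bit hash costs about 2^80 evaluations (the birthday bound). The work figure below is the "nine quintillion SHA1 computations" quoted, as far as I recall, in the Google announcement linked above; treat it as an assumption rather than a measured value.

```python
import math

SHA1_BITS = 160
generic_collision_bits = SHA1_BITS // 2  # birthday bound: about 2^80 evaluations

# Assumed figure: total SHA-1 computations reported for the first public collision.
shattered_work = 9_223_372_036_854_775_808
shattered_bits = math.log2(shattered_work)

# How many times cheaper the reported attack is than the generic search.
advantage = 2 ** (generic_collision_bits - int(shattered_bits))

print(f"generic search:  2^{generic_collision_bits}")
print(f"reported attack: 2^{shattered_bits:.0f}")
print(f"advantage:       {advantage:,}x")
```

Under that assumption the reported attack is about 131,072 times cheaper than the brute-force collision search the design was originally sized against.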

Even with the computationally astronomical argument, that is a matter of cost and resources, not practicality. It absolutely is practical to do if the outcome is worth the cost. What is the most famous git based project? Maybe the thing it was originally designed to manage... Think maybe _any_ nation state would be happy to pay less than ~$100k USD (https://sha-mbles.github.io/) to get some malicious code running in production builds of the Linux kernel? The kernel project specifically has extra manual checks and multiple "known good repos" with commits literally being added by hand to protect against this attack. It's practical, it's a problem. It needs to be fixed.

If you still insist on a working example pay me $125k and I'll produce one for you.

cyph3r0|1 year ago

If someone can change a committed file inside a git repository, the main problem is that your system is FUBAR. Let's say I'm the attacker and I'm inside: I can change committed files and I can generate a collision for each. If my goal is to deface the repository I can insert a file with gibberish, i.e. I have a file with source code:

... omitted ...

ptr=calloc(SIZE, sizeof(long));

... etc ...

then I have:

aDjw'pfojqe'rf[24oijgfpoemgl;m,g02ir-9u13]9fu24[efgje2ioprn

Same sha1 hash.

But wait, why should I waste 1000 GPUs to deface a Git repository when I can simply delete it? I can change the files, I can delete them. It's simply stupid.

An attack that makes sense is to change this:

ptr=calloc(SIZE, sizeof(long));

into:

ptr=calloc(SIZE-10, sizeof(long));

Now I have a buffer overflow (BOF) with the same hash, and only a code review can find the fraudulent change.

This is beyond "I make a collision inserting commented gibberish", like this:

// adojwqf'pjqeworivhneq;lnvl;dqjnfvljeqrvneljvn

You have to insert a change that works and implements an attack, while making it invisible.

Good luck with that. I also read in some comments some AI nonsense that I find to be Star Trek bullshit.

> If you still insist on a working example pay me $125k and I'll produce one for you

Even with a $100M budget, you can't.

But why would I even want to do that? I have access, I can replace the whole repo with one full of exploitable bugs!

So, the initial question: "If I replace SHA-1 in Git with some newer algorithm, is that a security improvement?" I feel the answer is "NO".