Proof of Concept or GTFO issue 0x14 is a PDF document that can also be run as a NES ROM. The file displays its own MD5 hash in a PDF viewer, and also displays its own MD5 hash in a NES emulator (only the first 40KB+16 bytes are actually loaded there).
I made https://github.com/DavidBuchanan314/monomorph, which packs up to 4KB of shellcode into an executable that always has the same hash. So you're not just limited to a good/evil pair, you can arbitrarily change the behaviour in future without changing the hash.
Also, a more recent innovation in MD5 collisions is textcoll, which creates colliding blocks that are completely plaintext. This would allow for colliding PHP source files like in OP but without any obvious binary artefacts (although this requires identical prefixes).
Not only is MD5 broken as shown here, on a modern CPU it's also quite slow compared to good, non-broken alternatives. See for example this comparison[1] (the post says JavaScript, but it's OpenSSL's implementation that's actually being tested).
I only see new CPUs benchmarked, maybe that's because newer CPUs have SHA acceleration extensions? I'd expect SHA256 to be more complex and therefore be more computationally expensive.
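For anyone who wants to check the speed question on their own hardware, here is a minimal sketch using Python's `hashlib` (which is typically backed by OpenSSL in CPython builds, but that is an assumption about your install). The helper name `throughput_mb_s`, the payload size, and the round count are arbitrary choices for illustration, and the numbers will vary a lot depending on whether the CPU has SHA extensions.

```python
import hashlib
import time


def throughput_mb_s(algorithm: str, payload: bytes, rounds: int = 20) -> float:
    """Return approximate hashing throughput in MB/s for one algorithm."""
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(algorithm, payload).digest()
    # Guard against a zero interval on very small inputs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (len(payload) * rounds) / elapsed / 1e6


if __name__ == "__main__":
    data = bytes(8 * 1024 * 1024)  # 8 MiB of zero bytes, an arbitrary test payload
    for name in ("md5", "sha1", "sha256"):
        print(f"{name:>7}: {throughput_mb_s(name, data):8.1f} MB/s")
```

Running it on both an older and a newer machine should show whether the gap comes from hardware SHA acceleration.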
1. You can upload scripts that get scanned for malicious code
2. These scripts can be executed once deemed "safe"
3. The server is using MD5 hashes to determine if you uploaded the same file or if it should re-scan it
Step 3 is where the issue is. It should probably always re-scan the file, and it definitely should not be using MD5.
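The three steps above can be sketched roughly as follows. This is only an illustration of the caching pattern, not the code from the post: `ScanCache`, `toy_scanner`, and `sha256_hex` are hypothetical names, and the "scanner" is a trivial stand-in. The sketch keys the verdict cache on SHA-256 instead of MD5, so two different files cannot currently be made to share a cache entry.

```python
import hashlib
from typing import Callable, Dict


def sha256_hex(data: bytes) -> str:
    """Digest used as the cache key; SHA-256 has no known practical collisions."""
    return hashlib.sha256(data).hexdigest()


class ScanCache:
    """Caches scan verdicts keyed by a digest of the uploaded file's contents."""

    def __init__(self, scanner: Callable[[bytes], bool],
                 digest: Callable[[bytes], str] = sha256_hex) -> None:
        self._scanner = scanner
        self._digest = digest
        self._verdicts: Dict[str, bool] = {}
        self.scans = 0  # number of times the real scanner was invoked

    def is_safe(self, data: bytes) -> bool:
        key = self._digest(data)
        if key not in self._verdicts:
            # Unseen digest: run the (expensive) scan and remember the verdict.
            self.scans += 1
            self._verdicts[key] = self._scanner(data)
        return self._verdicts[key]


def toy_scanner(data: bytes) -> bool:
    """Hypothetical stand-in for a real malware scanner."""
    return b"eval(" not in data
```

If the `digest` argument were an MD5 function instead, a colliding benign/malicious pair would hit the same cache entry and the malicious file would inherit the "safe" verdict, which is the scenario described above.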
the safe file is not a valid php file? it might be executed if php is like javascript ignoring valid chars, but i doubt something actually 'looking at it' would accept it as benign or valid.
If you don't know, then you aren't the target audience.
But there are two applications: the first is breaking in to a system under some very obscure set of circumstances that you are very unlikely to encounter in the real world. The second is to bump up your karma on HN.
After the initial scan (when there is one), the security and AV industry deals with file hashes, not actual files. This means that if you wrote a legitimate, harmless program and a malicious version with the same hash, you would be able to troll the security tools in many cases. Basically, those two files would look the same to the security program.
The thing that makes this blog post not realistic is:
* Such tricks would make much more sense with normal programs, where you're trying to trick a user into downloading and executing them. Webshells are downloaded by the attacker knowingly.
* MD5 is not used anymore (although I know security vendors who used it for an embarrassingly long time). If this was SHA256, that attack would be devastating for many more severe reasons.
honestly, normal.php is not a valid php file. i do understand that it might bypass some checks if, say, normal.php was somehow flagged as a valid / benign file, but in all honesty that would be a really bad sec product that you'd want to swap for something that classifies files more intelligently... additionally, most products these days also use sha1, sha2 and sometimes things like ssdeep, so they have multiple hash variants to check. this mitigates collisions, since there's no known way yet to make 1 file match on all of these different types of hashes, despite collisions being possible in a number of them for sure.
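A rough sketch of the "multiple hash variants" idea, using only Python's standard `hashlib`. The names `Fingerprint`, `fingerprint`, and `same_file` are made up for this example, and ssdeep is left out because it is a third-party fuzzy hash rather than something in the standard library.

```python
import hashlib
from typing import NamedTuple


class Fingerprint(NamedTuple):
    """Several independent digests of the same bytes."""
    md5: str
    sha1: str
    sha256: str


def fingerprint(data: bytes) -> Fingerprint:
    return Fingerprint(
        md5=hashlib.md5(data).hexdigest(),
        sha1=hashlib.sha1(data).hexdigest(),
        sha256=hashlib.sha256(data).hexdigest(),
    )


def same_file(a: bytes, b: bytes) -> bool:
    """Treat two inputs as identical only if every digest matches."""
    return fingerprint(a) == fingerprint(b)
```

An MD5-colliding pair would agree on the `md5` field but differ on `sha1` and `sha256`, so `same_file` would still tell them apart.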
if normal.php had actual php code in there, being really 'normal' as the name implies, this would be much more severe / interesting because it might be easier to convince modern security products it's actually a benign file.
Currently, if it were analysed, it would be flagged as suspicious simply because it's not a valid file. and really, it doesn't need to be php; it could be any valid file format, as long as it's an actual file that has benign behavior or contents.
plaintext might be easier to generate, but you'd need it to be 'executable' format or something interpretable like a script to have it actually stored in databases marking files as malicious or benign. matching filetype with the malicious file, in a valid form that does actual benign behavior would be 'best'.
don't take me wrong tho. still fun to see these things and honestly props, if it bypasses anything that's always a 'nice result' :)
normal.php is a perfectly valid php file. Sure, it doesn't contain php code, but that doesn't make it an invalid php file. If it did have <?php somewhere and what followed wasn't syntactically valid PHP code, then you could say it's not a valid php file.
Dwedit|5 months ago
https://github.com/angea/pocorgtfo#0x14
And yes, documents are not normally supposed to be able to display their own MD5 hash.
Retr0id|5 months ago
https://github.com/cr-marcstevens/hashclash?tab=readme-ov-fi...
magicalhippo|5 months ago
[1]: https://lemire.me/blog/2025/01/11/javascript-hashing-speed-c...
gruez|5 months ago
andreareina|5 months ago
o11c|5 months ago
Incipient|5 months ago
chipsrafferty|5 months ago
sim7c00|5 months ago
dsab|5 months ago
lisper|5 months ago
integralid|5 months ago
But it's still a fun PoC.
chipsrafferty|5 months ago
h4ck_th3_pl4n3t|5 months ago
IshKebab|5 months ago
> Can use it bypass some cached webshell detections.
sim7c00|5 months ago
Blahagun|5 months ago