Collisions aren't a major risk with MD5 when you also give someone the file size (even approximate).
Finding a collision in MD5 is costly; finding a collision in MD5 that is also within ±10% of the actual size is extremely costly (technically possible, but maybe not in your lifetime).
As to the other reply "because it is zip something something" I disagree. Zip is an extremely good format for crafting fake files which match a checksum. Really any format which can take arbitrary metadata (which is MOST) is pretty easy.
I suspect the reason they use MD5 is that everything supports it and it is "good enough," particularly combined with the file size. Plus the person downloading them knows the files are malware, so what could the security services do, inject even more malicious malware that they then expect the user to run?! Seems dumb. You're likely more at risk from day-to-day application installers that aren't digitally signed.
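To illustrate the point about formats that accept arbitrary metadata, here is a minimal sketch using Python's standard `zipfile` module. It does not attempt a collision; it only shows that the archive comment is free-form space that changes the file's checksum without changing what gets extracted. The member name `sample.bin` and the payload bytes are made up for the example.

```python
import hashlib
import io
import zipfile


def build_zip(payload: bytes, comment: bytes) -> bytes:
    """Build an in-memory zip with one member and a free-form archive comment."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Fixed timestamp so the two archives differ only in the comment.
        info = zipfile.ZipInfo("sample.bin", date_time=(2014, 1, 1, 0, 0, 0))
        zf.writestr(info, payload)
        zf.comment = comment  # arbitrary bytes, ignored on extraction
    return buf.getvalue()


def read_member(archive: bytes) -> bytes:
    """Extract the single member from an in-memory zip."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read("sample.bin")


payload = b"hypothetical sample contents"
plain = build_zip(payload, b"")
padded = build_zip(payload, b"attacker-chosen filler bytes")

# Same extracted contents, different bytes on disk, different MD5.
same_contents = read_member(plain) == read_member(padded)
different_md5 = hashlib.md5(plain).hexdigest() != hashlib.md5(padded).hexdigest()
print(same_contents, different_md5)
```

The comment field is only one such slot; it is used here because the standard library exposes it directly.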
They should definitely not be using MD5 for anything. Even if the people in this thread saying that finding MD5 collisions is hard were correct, and they aren't, why take the risk? The performance benefits of MD5 over competing hash functions that don't have known systemic weaknesses aren't large, and MD5 attacks will only get better.
Use SHA-256, SHA-3 or MD6 (I like MD6, others may disagree. Disclaimer: I worked on proving the differential resistance of MD6).
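As a rough sketch of what publishing a stronger checksum alongside the file size could look like, the snippet below uses Python's standard `hashlib`. The function name `describe_file` and the demo file are assumptions made for illustration; MD6 is left out because `hashlib` does not ship it.

```python
import hashlib
import os
import tempfile


def describe_file(path: str, chunk_size: int = 1 << 20) -> dict:
    """Return the file size plus SHA-256 and SHA3-256 digests, read in chunks."""
    sha256 = hashlib.sha256()
    sha3 = hashlib.sha3_256()
    size = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            sha256.update(chunk)
            sha3.update(chunk)
            size += len(chunk)
    return {
        "size": size,
        "sha256": sha256.hexdigest(),
        "sha3_256": sha3.hexdigest(),
    }


# Demo on a throwaway file standing in for a real download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
    demo_path = tmp.name

info = describe_file(demo_path)
os.remove(demo_path)
print(info)
```

Reading in chunks keeps memory use flat for large archives, and both digests come out of a single pass over the file.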
Collision resistance is more interesting when hashes are used in cryptographic protocols and large amounts of data can be captured, seen and analyzed.
I can't think of a scenario where a collision between a non-malicious sample and a malicious file could be used by an attacker (let alone the same attacker). In addition, there is a lot of historical threat data (tactical intelligence) that is based on md5sums. Newer tools support newer checksums, but will more than likely just add to the types of checksums supported rather than deprecate the old ones.
Checksums are less and less useful when the malware can be configured, recompiled and re-assembled for a particular target. There are some good discussions on HN about fuzzier detection techniques that can't be evaded by changing inert parts of the payload, but that is orthogonal to using stronger checksums. Indicator of Compromise data including md5sums can be useful for general security, but because a determined attacker will mutate the files, it is better suited to commodity malware.
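A small sketch of why exact-match checksums are easy to evade: changing a single inert byte produces a different digest, so a lookup against a list of known hashes misses the mutated file. The sample bytes and the `known_bad` set are invented for the example, and SHA-256 is used only for convenience; the same holds for MD5 or any other exact hash.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def is_known(sample: bytes, indicators: set) -> bool:
    """Exact-match lookup, the way a hash-based indicator list is consulted."""
    return digest(sample) in indicators


# Hypothetical sample: some logic followed by padding that is never executed.
original = b"HEADER" + b"payload-logic" + b"\x00" * 8
known_bad = {digest(original)}

# Flip one byte in the inert padding; behavior would be unchanged.
mutated = original[:-1] + b"\x01"

print(is_known(original, known_bad), is_known(mutated, known_bad))
```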
Because it would be very hard to find a collision of a file that behaves exactly like a ZipFile.
To make a collision work, you would need to inject the payload into the program, and find a specific blob to put into the zip file, that once compressed and hashed would cause a collision. This isn't computationally efficient.
It's a shame that they're spending so much money, which is basically the Mongolian people's tax money, on a surveillance tool while Mongolians there are living in hell. Shame on them.
[+] [-] Strom|11 years ago|reply
[+] [-] q3k|11 years ago|reply
[+] [-] JetSpiegel|11 years ago|reply
[+] [-] JetSpiegel|11 years ago|reply
[+] [-] markvdb|11 years ago|reply
[+] [-] Someone1234|11 years ago|reply
[+] [-] EthanHeilman|11 years ago|reply
[+] [-] un1xl0ser|11 years ago|reply
[+] [-] arturventura|11 years ago|reply
[+] [-] billyboar|11 years ago|reply
[+] [-] andy_ppp|11 years ago|reply
[+] [-] DennisP|11 years ago|reply
[+] [-] D4AHNGM|11 years ago|reply
finspy_master.zip: Permission denied