Some security tools still stick to MD5 when identifying malware samples years after practical collisions were shown against the algorithm. This can be exploited by first showing these tools a harmless sample (Sheep) and then a malicious one (Wolf) that have the same MD5 hash. Please use this code to test if the security products in your reach use MD5 internally to fingerprint binaries and share your results by issuing a pull request updating the contents of results/!
Works-on-a-different-machine-than-mine version, feedback is welcome!
- 32-bit Windows (virtual) machine (64-bit breaks stuff)
- Visual Studio 2012 to compile the projects (Express will do)
- Fastcoll for collisions
- Optional: Cygwin+MinGW to compile Evilize
Extract Fastcoll to the fastcoll directory. Name the executable fastcoll.exe.
Use shepherd.bat to generate wolf.exe and sheep.exe (in the VS Development Command Prompt):
> shepherd.bat YOURPASSWORD your_shellcode.raw
After this step you should have your two colliding binaries (sheep.exe and wolf.exe in the evilize directory).
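A quick way to confirm that the two binaries really collide is to compare their MD5 digests alongside a second hash. The following is a minimal Python sketch, not part of this repository; the function names file_digests and is_md5_collision are hypothetical, and SHA-256 is used only as an independent check that the file contents differ.

```python
import hashlib


def file_digests(path, chunk_size=65536):
    """Return (md5_hex, sha256_hex) for the file at `path`."""
    md5, sha256 = hashlib.md5(), hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large binaries do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


def is_md5_collision(path_a, path_b):
    """True if both files share an MD5 digest but differ according to SHA-256."""
    md5_a, sha_a = file_digests(path_a)
    md5_b, sha_b = file_digests(path_b)
    return md5_a == md5_b and sha_a != sha_b
```

Calling is_md5_collision("evilize/sheep.exe", "evilize/wolf.exe") should return True for a correctly generated pair. It returns False both for unrelated files and for two identical copies of the same file.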
For more information see the tutorial of Peter Selinger, older revisions of this document or the source code...
- shepherd.bat executes shepherd.exe with the user supplied command line arguments
  - shepherd.exe generates a header file (sc.h) that contains the encrypted shellcode, the password and the CRC of the plain shellcode
- shepherd.bat executes the build process of sheep.exe
  - sheep.exe is built with sc.h included by Visual Studio
- shepherd.bat executes evilize.exe
  - evilize.exe calculates a special IV for the chunk of sheep.exe right before the block where the collision will happen
  - evilize.exe executes fastcoll.exe with the IV as a parameter
    - fastcoll.exe generates two 128 byte colliding blocks: a and b
  - evilize.exe replaces the original string buffers of sheep.exe so that they contain combinations of a and b
- The resulting files (evilize/wolf.exe and evilize/sheep.exe) have the same MD5 hashes but behave differently. The real code to be executed only appears in the memory of evilize/wolf.exe.
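The IV step in the list above can be illustrated with a minimal Python sketch. This is not the code of evilize.exe. It assumes that the "special IV" is the MD5 chaining value reached after hashing the bytes that precede the collision blocks, and that this prefix is a multiple of 64 bytes long. The names md5_compress, chaining_value and finish are hypothetical, and how the value is formatted for fastcoll.exe is not shown.

```python
import math
import struct

# Per-step rotation amounts and sine-derived constants as defined in RFC 1321.
_SHIFTS = ([7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 +
           [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4)
_K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]
MD5_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)


def _rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def md5_compress(state, block):
    """Apply the MD5 compression function to one 64-byte block."""
    if len(block) != 64:
        raise ValueError("block must be exactly 64 bytes")
    m = struct.unpack("<16I", block)
    a, b, c, d = state
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + _K[i] + m[g]) & 0xFFFFFFFF
        a, d, c = d, c, b
        b = (b + _rotl(f, _SHIFTS[i])) & 0xFFFFFFFF
    # Feed-forward: add the block result to the incoming state.
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d)))


def chaining_value(prefix, state=MD5_INIT):
    """Return the MD5 state after `prefix` (length must be a multiple of 64)."""
    if len(prefix) % 64:
        raise ValueError("prefix length must be a multiple of 64 bytes")
    for off in range(0, len(prefix), 64):
        state = md5_compress(state, prefix[off:off + 64])
    return state


def finish(state, tail, total_len):
    """Pad `tail`, continue hashing from `state`, and return the 16-byte digest.

    `total_len` is the length in bytes of the whole message (prefix + tail).
    """
    padded = tail + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", (total_len * 8) & 0xFFFFFFFFFFFFFFFF)
    return struct.pack("<4I", *chaining_value(padded, state))
```

Because MD5 processes its input block by block, two different 128 byte blocks that lead from the same chaining value to the same next state keep the digests equal no matter which identical bytes follow. That is why the collision has to be computed for the IV of the actual prefix rather than for the standard MD5 initial value.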
To test the security products in your reach you should generate two pairs of samples (SHEEP1-WOLF1 and SHEEP2-WOLF2), preferably with the same payload. Since samples (or their fingerprints) are usually uploaded to central repositories (or "the cloud") precompiled samples are not included to avoid conflicts between independent testers.
After the samples are ready follow the methodology shown on the diagram below:
(*) If the product is not able to detect the first malicious sample, there are more serious problems to worry about than crypto-fu. In fact, the simple cryptography included in the provided boilerplate code poses a hard challenge for various products... Try to use more obvious samples!
(**) The product most probably uses some trivial method to detect the boilerplate instead of the actual payload. You can try to introduce simple changes to the code, like removing debug strings.
Please don't forget to share your positive results by issuing a pull request to the RESULTS.md file!
- Poisonous MD5 - Wolves Among the Sheep
- Peter Selinger: MD5 Collision Demo
- How to make two binaries with same MD5
- Stop using MD5 now!
Licensed under GNU/GPL unless otherwise stated.