MD5 Checksum for Images (Binaries) in CF

I've been working on a piece of an application that needs to verify, using MD5 hashes, that the images it receives have not been modified or tampered with.

Note: If you want to skip my ramble and get to the solution, scroll to the bottom for the example code and download.

I thought it would be nice and easy to do with the CF Hash() function, and on the first attempt it appeared to be:

<cffile action="read" file="#getDirectoryFromPath(getCurrentTemplatePath())#firework.jpg" variable="myTextFile">
Hash: <cfoutput>#hash(myTextFile)#</cfoutput>
Gives:
Hash: 2A3643821420D5349665D2231842FD89

However, the hashes are created by CF but verified by a Flex application, so I was using as3corelib's MD5 function and couldn't get the hashes to match.

So I decided to try another MD5 implementation to see which of my functions was wrong - unfortunately it turned out that both of them were wrong but for two different reasons. I'll cover the AS3 problem in a different post.

The output from md5sum was as follows:

$ md5sum firework.jpg
2346b6ab017de31688ee35949612db07 firework.jpg

This didn't match what I had generated. A bit more head scratching and I realised that the problem was that I was trying to do an md5 hash on a binary file but the CF Hash() only takes a "string".
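To see the effect in isolation, here is a minimal Java sketch (Java because that is what CF runs on). It hashes the same bytes twice: once raw, and once after a round trip through a String, which is roughly what reading a binary file as text and then hashing it does. The class and method names are made up for illustration, and UTF-8 is an assumed charset; the charset CF actually uses depends on the server configuration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class TextVsBinaryHash {
    // MD5 of the raw bytes, exactly as they sit on disk.
    static byte[] md5OfBytes(byte[] data) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("MD5").digest(data);
    }

    // MD5 after a round trip through a String. Bytes that are not valid in
    // the charset get replaced during decoding, so the hashed data changes.
    // UTF-8 is an assumption made for this sketch.
    static byte[] md5ViaString(byte[] data) throws NoSuchAlgorithmException {
        String asText = new String(data, StandardCharsets.UTF_8);
        return md5OfBytes(asText.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        // First four bytes of a typical JPEG file; this sequence is not valid UTF-8.
        byte[] jpegHeader = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        boolean same = Arrays.equals(md5OfBytes(jpegHeader), md5ViaString(jpegHeader));
        System.out.println("Hashes match: " + same); // prints "Hashes match: false"
    }
}
```

Plain ASCII text survives the round trip unchanged, which is why hashing a text file as a string can appear to work while an image does not.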

After loading the file with "readbinary" the hash function didn't work (no surprise) and a bit of digging around led me to use some Java functionality to handle the generation of the Hash.

<cffile action="readbinary" file="#getDirectoryFromPath(getCurrentTemplatePath())#firework.jpg" variable="myBinaryFile">
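<!--- Inside a cffunction (as in the CFC), declare this with "var"; outside a function "var" is not allowed --->
<cfset md5 = createObject("java","java.security.MessageDigest").getInstance("MD5")>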
<cfset md5.update(myBinaryFile,0,len(myBinaryFile))>
<cfset checksumByteArray = md5.digest()>

This leaves us with a ByteArray digest of the file - which we need to convert to a familiar hex encoded MD5 Hash.

<!--- Initialise the result string before appending to it --->
<cfset checkSumHex = "">
<cfloop from="1" to="#len(checksumByteArray)#" index="i">
   <cfset hexCouplet = formatBaseN(bitAND(checksumByteArray[i],255),16)>
   <!--- Pad with 0's --->
   <cfif len(hexCouplet) EQ 1>
      <cfset hexCouplet = "0#hexCouplet#">
   </cfif>
   <cfset checkSumHex = "#checkSumHex##hexCouplet#">
</cfloop>
Binary Hash: <cfoutput>#checkSumHex#</cfoutput>
This gives the following checksum:
Binary Hash: 2346b6ab017de31688ee35949612db07

Bingo - all working nicely now. To make my life easier, I've packaged it all up into a nice CFC - Crypto.cfc.
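For reference, the two steps above (digest the raw bytes, then hex-encode the result) look roughly like this in plain Java. This is only a sketch, not the code inside Crypto.cfc, and the class and method names are hypothetical. It also shows why the mask with 255 is needed: Java bytes are signed, so without it any byte above 127 would come out as a negative number.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class BinaryMd5 {
    // Convert a digest to lowercase hex, two characters per byte.
    static String toHex(byte[] digest) {
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            // Java bytes are signed (-128..127); masking with 0xFF maps them
            // to 0..255, the same job bitAND(..., 255) does in the CF loop.
            int unsigned = b & 0xFF;
            if (unsigned < 16) {
                hex.append('0'); // pad single-digit values, e.g. "7" becomes "07"
            }
            hex.append(Integer.toHexString(unsigned));
        }
        return hex.toString();
    }

    // Digest the raw bytes with MD5 and return the familiar hex string.
    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        md5.update(data, 0, data.length);
        return toHex(md5.digest());
    }

    public static void main(String[] args) throws Exception {
        byte[] sample = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(md5Hex(sample)); // 900150983cd24fb0d6963f7d28e17f72
    }
}
```

The output is lowercase, like md5sum, whereas CF's Hash() returns uppercase, so it is worth comparing hashes case-insensitively when they come from different tools.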

Solution and Download

Here's the example code and a running demo of it.

The Crypto CFC is available from the subversion repository.

Here is the code to use it:

<cffile action="readbinary" file="#getDirectoryFromPath(getCurrentTemplatePath())#firework.jpg" variable="myBinaryFile">
<cfset md5 = createObject("component","au.com.lynchconsulting.cfc.utility.Crypto").hashBinary(myBinaryFile)>

<h3>MD5 Sum calculated from hashBinary()</h3>
Hash Binary: <cfoutput>#md5#</cfoutput>

Hope it helps. Cheers, Mark

Comments
Very clever solution indeed.
# Posted By Marko Tomic | 1/18/08 5:27 AM
Fantastic! We've been doing a project that uses MD5 hashing; however, due to the number of collisions, we've decided to use SHA. Is there any chance you can update your code to also use SHA?

Cheers
ST
# Posted By Steven | 4/9/08 12:34 AM
Hi Steven,

I haven't tested this but I'm pretty sure that this will work for any algorithm that Java supports. You just need to know the standard names, and there is a java reference here:
http://java.sun.com/j2se/1.4.2/docs/guide/security...

If you change this line:
<cfset var md5 = createObject("java","java.security.MessageDigest").getInstance("MD5")>

to this:
<cfset var md5 = createObject("java","java.security.MessageDigest").getInstance("SHA-1")>

Then it should work - I would also change the md5 variable name to something more sensible.

Let me know how you go.

Cheers,
Mark
# Posted By Mark Lynch | 4/14/08 4:58 AM
Hi Steven,

I was intrigued so had a quick look - the crypto CFC is now updated to pass through an optional algorithm parameter - everything else works as before and it defaults to MD5.

The reference for the java names is here:
http://java.sun.com/javase/6/docs/technotes/guides...

The new code is here:
https://developer.lynchconsulting.com.au//svn/open...

Cheers,
Mark
# Posted By Mark Lynch | 4/14/08 8:52 AM
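
The optional algorithm parameter discussed in the comments can be sketched in Java along these lines. This is not the updated Crypto.cfc; the class is hypothetical and only borrows the hashBinary name and the MD5 default mentioned above. The algorithm string is passed straight to MessageDigest.getInstance(), so it has to be one of the standard Java names such as "MD5", "SHA-1" or "SHA-256".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestWithAlgorithm {
    // No algorithm given: fall back to MD5, the default described in the comment thread.
    static String hashBinary(byte[] data) throws NoSuchAlgorithmException {
        return hashBinary(data, "MD5");
    }

    // Hash the raw bytes with the named algorithm and return lowercase hex.
    // An unknown name makes getInstance() throw NoSuchAlgorithmException.
    static String hashBinary(byte[] data, String algorithm) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(data)) {
            // Mask to 0..255 and zero-pad to two hex digits.
            hex.append(String.format("%02x", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] sample = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(hashBinary(sample));          // 900150983cd24fb0d6963f7d28e17f72
        System.out.println(hashBinary(sample, "SHA-1")); // a9993e364706816aba3e25717850c26c9cd0d89d
    }
}
```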