Thursday, 15 December 2011

MD5s as IDs

While there is a chance of duplication, it can be handy to use MD5s to create a "unique" ID for strings. I do this in my current project, and I've done it before, but there is a potential problem which is more likely than duplicate IDs.

There is a chance that you'll get an MD5 that begins with a zero and, potentially even worse, one that begins with a zero and consists entirely of decimal digits (contains none of the hex digits a-f).

In this instance, with the loose typing of PHP, you might find your MD5 gets converted to a number and loses its leading zero (or zeroes). At that point it's useless as an ID, and the bug will take you a very long time to track down. The first time it happened to me it took a couple of days.
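To make the failure concrete, here is a minimal sketch. The 32-character value is made up to stand in for an all-decimal MD5 (it is not the hash of anything in particular), and the short '0123' string is only there to keep the comparison easy to read.

```php
<?php
// Hypothetical value standing in for an MD5 that happens to be all decimal digits.
$id = '01234567890123456789012345678901';

// PHP considers the string numeric, so loose contexts may treat it as a number.
var_dump(is_numeric($id));            // bool(true)

// Any arithmetic converts it to a float; converting back does not restore the ID.
$roundTripped = (string) ($id + 0);
var_dump($roundTripped === $id);      // bool(false)

// Shortened illustration: a loose comparison ignores the leading zero.
var_dump('0123' == '123');            // bool(true)
```

Comparing with === avoids the last case, but it doesn't help once the value has already passed through something that treated it as a number.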

But there is a very simple solution: when you create your MD5, do an immediate search and replace to change all zeroes into 'g'.

$id = str_replace('0', 'g', md5($source));

Now you can be sure you will never lose your leading zero, because there isn't one.
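As a rough sketch, the one-liner can be wrapped in a small helper. The function name make_id and the sample input are my own inventions for illustration. Since 'g' never appears in a hexadecimal digest, the swap can be reversed, so it shouldn't add any collisions beyond those MD5 already has.

```php
<?php
// Hypothetical helper name; it only wraps the str_replace() call shown above.
function make_id($source)
{
    return str_replace('0', 'g', md5($source));
}

$id = make_id('example string');

var_dump(strlen($id) === 32);          // still 32 characters long
var_dump(strpos($id, '0') === false);  // no zeroes anywhere, leading or otherwise

// The original digest can be recovered by reversing the replacement.
var_dump(str_replace('g', '0', $id) === md5('example string'));
```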

(By the way, after 12 weeks my field_extract module has received no complaints or bug reports so I shall be promoting it to a full version.)