Thursday 15 December 2011

MD5s as IDs

Although there is a small chance of duplication, MD5s can be a handy way to create a "unique" ID for a string. I'm doing this in my current project, and I've done it before, but there is a potential problem that is more likely than duplicate IDs.

There is a chance that you'll get an MD5 that begins with a zero and, potentially even worse, one that begins with a zero and is all decimal numeric (contains none of the hex digits a-f).

In this instance, thanks to PHP's loose typing, you might find your MD5 gets converted to a number and loses its leading zero (or zeroes). At that point it's useless as an ID, and the bug can take a very long time to track down. The first time it happened to me it took a couple of days.
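To make the failure concrete, here is a minimal sketch. The value of $id is a made-up stand-in for an all-decimal MD5 with a leading zero, not the hash of any particular string, and the "+ 0" simply forces the kind of numeric context that loose typing can trigger elsewhere in a codebase.

```php
<?php
// Hypothetical value: a 32-character, all-decimal string standing in
// for an MD5 that happens to start with a zero.
$id = '01234567890123456789012345678901';

// PHP considers the whole string numeric, leading zero included.
var_dump(is_numeric($id));

// Any numeric context converts it. The value is too large for an
// integer, so it becomes a float and the leading zero is gone.
$converted = $id + 0;
var_dump($converted);

// Turning the number back into a string does not restore the ID.
var_dump((string) $converted === $id);
```

The last check should come out false: once the string has been through a numeric conversion, the original ID cannot be recovered from it.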

But there is a very simple solution: when you create your MD5, do an immediate search and replace to change all zeroes into 'g'.

$id = str_replace('0', 'g', md5($source));

Now you can be sure you will never lose your leading zero, because there isn't one.
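As a rough sketch, the one-liner can be wrapped in a small helper so every ID in a project is built the same way. The function name safe_md5_id and the sample input are made up for illustration; only md5() and str_replace() come from PHP itself.

```php
<?php
// Hypothetical helper name; it wraps the one-liner from the post.
function safe_md5_id(string $source): string
{
    // 'g' is not a hex digit, so it never appears in a raw MD5 and
    // the replacement cannot make two different hashes look the same.
    return str_replace('0', 'g', md5($source));
}

// Sample input chosen only for illustration.
$id = safe_md5_id('hello world');

// The ID is still 32 characters long and contains no zeroes at all.
var_dump(strlen($id) === 32);
var_dump(strpos($id, '0') === false);
```

This only removes zeroes. An ID made up entirely of the digits 1-9 would still look numeric to PHP, although that should be extremely rare, so comparing IDs with === rather than == remains a sensible habit.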

(By the way, after 12 weeks my field_extract module has received no complaints or bug reports so I shall be promoting it to a full version.)

3 comments:

KK said...

How do you compare the strings? Do you have a real example?

Adaddinsane said...

Can you explain what you mean? This isn't about comparing strings it's just about creating "safe" MD5s.

KK said...

Do you want to avoid strict comparisons? I think MD5s are safe, but the way they are compared may not be.