Tanjent (tanjent) wrote,

MurmurHash, final version.

UPDATE - If you're reading this via a link from Google or Reddit, please go here - http://murmurhash.googlepages.com. All future updates about MurmurHash will be posted there.

UPDATEUPDATE - MurmurHash is now at version 2.0. The new version uses a different mix function than the below that is much faster & mixes better. Code is on the website linked above.





OK, I'm done with this for the time being. Figured out a clever way to generate and test mixing constants and found a much better one, replaced the "h ^= data" with "h += data" (very slightly better collision resistance), and ended up removing the rotate instruction and replacing with a shift-xor (should I rename it MusxmusxHash?).

This version passes Jenkin's frog.c torture-test up to 2^25 keypairs (beats Hsieh's 2^17; previous Murmur failed after 2^9), distribution is still excellent (only 3 collisions on the SCOWL english-words.95 list vs. Hsieh's 41), and still runs faster than Hsieh and Jenkins at 1300 mb/sec vs. their ~900 mb/sec.

You should be able to use this hash function anywhere you like with very good results. I'll work on getting it posted up to interesting places this week.

Performance results from Hsieh's test app -

CRC32           :  4.7810s
oneAtATimeHash  :  3.5470s
alphaNumHash    :  2.2500s
FNVHash         :  2.1870s
Jenkins lookup3 :  1.2650s
SuperFastHash   :  1.2970s
MurmurHash      :  0.9060s


-tanjent


//-------------------------------------------------------------------
unsigned int MurmurHash ( const unsigned char * data, int len, unsigned int h )
{
	const unsigned int m = 0x7fd652ad;
	const int r = 16;

	h += 0xdeadbeef;

	while(len >= 4)
	{
		h += *(unsigned int *)data;
		h *= m;
		h ^= h >> r;

		data += 4;
		len -= 4;
	}

	switch(len)
	{
	case 3:
		h += data[2] << 16;
	case 2:
		h += data[1] << 8;
	case 1:
		h += data[0];
		h *= m;
		h ^= h >> r;
	};

	h *= m;
	h ^= h >> 10;
	h *= m;
	h ^= h >> 17;

	return h;
}
  • Post a new comment

    Error

    default userpic

    Your reply will be screened

  • 31 comments