SMhasher

Alternative timings with my Sony Xperia L4 with an Octa-core 2.0 GHz Cortex-A53 aarch64 rev4 phone (-march=armv8.4-a) with 26.0 BogoMIPS and the features: fp asimd evtstrm aes pmull sha1 sha2 crc32.

CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03

Hash function	MiB/sec	cycl./hash	cycl./map	size	Quality problems
donothing32	4526930.92	5.00	-	13	test NOP
donothing64	4489668.65	5.01	-	13	test NOP
donothing128	4445924.24	5.00	-	13	test NOP
NOP_OAAT_read64	24033.35	16.51	-	47	test NOP
BadHash	1715.32	36.84	-	47	test FAIL
sumhash	7347.43	22.18	-	363	test FAIL
sumhash32	18330.75	22.14	-	863	UB, test FAIL
multiply_shift	3630.03	32.00	-	345	fails all tests
pair_multiply_shift	4285.38	28.77	588.39 (50)	609	fails all tests

crc32	589.16	92.16	800.67 (1)	422	insecure, 8590x collisions, distrib
md5_32a	808.90	300.50	1026.34 (3)	4419	8590x collisions, distrib
sha1_32a	802.81	477.70	1202.77 (82)	5126	collisions, 36.6% distrib
md5-128	807.97	302.53	994.18 (84)	4419
sha1-160	802.90	554.79	1223.65 (84)	5126	Comb/Cyclic low32
sha2-224	346.43	663.12	1408.28 (3)		Cyclic low32
sha2-224_64	346.04	658.74	1403.09 (10)		Cyclic low32
sha2-256	346.35	661.08	-
sha2-256_64	346.14	658.24	1403.63 (38)
sha1ni	2019.96	135.84	564.40 (6)	989	insecure,sanity, Permutation, Zeroes, amd epyc only
sha1ni_32	2019.94	136.82	589.46 (1)	989	insecure,sanity, Permutation, Zeroes, TwoBytes, amd epyc only
sha2ni-256	1906.77	145.47	603.08 (22)	4241	insecure,sanity, Permutation, Zeroes, amd epyc only
sha2ni-256_64	1920.36	145.47	603.08 (2)	4241	insecure,sanity, Permutation, Zeroes, TwoBytes, amd epyc only
blake3_c	549.14	340.01	552.63 (3)		no 32bit portability
rmd128	850.79	326.33	1003.77 (91)
rmd160	568.31	434.45	1068.76 (103)
rmd256	824.23	335.39	1068.31 (106)
edonr224	525.27	365.89	504.28 (4)
edonr256	525.27	365.89	504.28 (4)
blake2s-128	646.95	385.84	959.61 (113)
blake2s-160	647.33	385.42	1072.25 (107)
blake2s-224	647.28	385.94	-
blake2s-256	646.99	385.49	-
blake2s-256_64	647.92	383.10	-
blake2b-160	1019.27	446.51	1214.96 (73)
blake2b-224	1002.03	446.87	1562.28 (29)
blake2b-256	1001.43	445.38	1177.98 (5)		Sparse high 32-bit
blake2b-256_64	1016.87	446.89	1060.34 (114)
asconhashv12	223.37	635.49	-
asconhashv12_64	198.84	336.31	1020.51 (20)
sha3-256	114.64	3399.99	4139.02 (2)		PerlinNoise
sha3-256_64	114.75	3404.12	-		PerlinNoise
hasshe2	1650.87	87.66	484.20 (87)	445	Permutation,TwoBytes,Zeroes,Seed
poly_1_mersenne	2051.08	40.84	654.14 (88)	479	fails most tests
poly_2_mersenne	2051.07	46.04	623.78 (82)	479
poly_3_mersenne	2032.53	50.81	716.83 (61)	479
poly_4_mersenne	2051.00	56.23	704.80 (69)	479
tabulation32	5500.42	26.54		848	collisions
tabulation	3419.86	42.19	179.93 (2)	554
crc32_hw	6921.11	35.95	768.17 (5)	653	insecure, 100% bias, collisions, distrib, machine-specific (SSE4.2/NEON)
crc64_hw	8731.63	35.45	751.88 (2)	652	insecure, 100% bias, collisions, distrib, machine-specific (SSE4.2/NEON)
crc32_pclmul			-	481	insecure, 100% bias, collisions, distrib, machine-specific (x86 PCLMUL)
crc64_jones1					insecure, 100% bias, collisions, distrib
crc64_jones2					insecure, 100% bias, collisions, distrib
crc64_jones3					insecure, 100% bias, collisions, distrib
crc64_jones					insecure, 100% bias, collisions, distrib
o1hash	2571095.19	9.81	166.13 (1)	101	insecure, zeros, fails all tests
fibonacci	12565.38	18.31	705.64 (2)	1692	UB, zeros, fails all tests
FNV1a	1270.88	40.11	177.84 (2)	204	zeros, fails all tests
FNV1A_Totenschiff	9119.65	16.77	198.20 (1)	270	UB, zeros, fails all tests
FNV1A_Pippip_Yurii	9115.62	17.74	184.41 (2)	147	UB, sanity, fails all tests
FNV1a_YT	13563.63	18.06	175.19 (2)	321	UB, fails all tests
FNV2	7020.94	24.83	142.89 (1)	278	fails all tests
FNV64	879.98	55.81	159.29 (1)	79	fails all tests
FNV128	791.82	70.24	159.29 (1)	171	fails all tests
fletcher2	17074.04	15.97	298.60 (1)	248	UB, fails all tests
fletcher4	17009.74	15.88	293.49 (2)	371	UB, fails all tests
bernstein	1905.66	28.34	180.71 (2)	41	fails all tests
sdbm	1272.88	37.66	177.06 (2)	41	fails all tests
x17	1429.60	37.17	184.09 (2)	79	99.98% bias, fails all tests
JenkinsOOAT	1143.79	57.26	213.93 (2)	153	53.5% bias, fails all tests
JenkinsOOAT_perl	1143.80	48.31	194.78 (1)	65	1.5-11.5% bias, 7.2x collisions, LongNeighbors
MicroOAAT	1382.03	39.06	185.06 (2)	68	100% bias, distrib
beamsplitter	764.67	456.19	-		UB, too many bad seeds
discohash	3160.29	164.25	343.63 (4)	1294
VHASH_32	2719.68	71.21	250.57 (2)	1231	sanity, Seed, MomentChi2
VHASH_64	2717.88	71.27	227.92 (2)	1231	sanity, Seed, Sparse
farsh32	3258.47	84.51	808.53 (1)	944	insecure: AppendedZeroes, collisions+bias, MomentChi2, LongNeighbors
farsh64	1630.16	162.29	882.37 (1)	944	insecure: AppendedZeroes, collisions+bias, MomentChi2, LongNeighbors
farsh128	813.92	331.06	1055.40 (1)		??
farsh256	407.09	633.14	302.44 (3)		??
jodyhash32	3806.84	20.26	185.85 (3)	102	bias, collisions, distr, LongNeighbors
jodyhash64	7597.40	15.60	164.36 (1)	118	bias, collisions, distr, LongNeighbors
lookup3	3425.80	23.52	194.15 (2)	341	UB, 28% bias, collisions, 30% distr
superfast	3806.29	26.64	180.10 (3)	210	UB, 91% bias, 5273.01x collisions, 37% distr, BIC
MurmurOAAT	953.28	56.73	197.83 (2)	47	collisions, 99.998% distr., BIC, LongNeighbors
Crap8	4152.87	23.32	195.11 (1)	342	UB, 2.42% bias, collisions, 2% distrib
Murmur1	3808.43	25.50	352.67 (0)		UB, fails all tests, 1 bad seed
Murmur2	2286.52	29.18	187.89 (2)	358	UB, 1.7% bias, 81x coll, 1.7% distrib, BIC
Murmur2A	2286.56	35.14	191.96 (4)	407	UB, 12.7% bias, LongNeighbors
Murmur2B	2857.40	30.44	149.43 (2)	433	UB, 1.8% bias, collisions, 3.4% distrib, BIC
Murmur2C	3889.90	30.38	164.65 (2)	444	UB, 91% bias, collisions, distr, BIC, LongNeighbors
Murmur3A	2935.50	33.25	182.37 (3)	351	UB, Moment Chi2 69
PMurHash32	2828.78	37.45	196.43 (4)	1862	UB, Moment Chi2 69
Murmur3C	4013.17	43.12	198.00 (2)	859	UB, LongNeighbors, DiffDist
PMPML_32	10950.76	87.30	432.56 (20)	1084	Avalanche >512, unseeded: Seed, MomentChi2
PMPML_64	12574.53	37.84	380.22 (26)	1305	unseeded: Seed, MomentChi2
xxHash32	7133.88	32.51	177.91 (4)	738	LongNeighbors, collisions with 4bit diff, MomentChi2 220
metrohash64_1	13247.85	28.52	152.31 (2)	624	UB, LongNeighbors, BIC, MomentChi2
metrohash64_2	13247.88	28.21	164.30 (2)	627	UB, LongNeighbors
metrohash64crc_1	5740.80	44.78	769.99 (2)	632	UB, cyclic collisions 8 byte, BIC, MomentChi2, machine-specific (SSE4.2/NEON)
metrohash64crc_2	5738.67	44.71	772.18 (2)	632	UB, cyclic collisions 8 byte, BIC, machine-specific (SSE4.2/NEON)
metrohash128_1	13245.58	45.19	175.94 (2)	773	UB, LongNeighbors
metrohash128_2	13242.49	45.31	167.43 (2)	773	UB, LongNeighbors
cmetrohash64_1o	13988.03	38.02	346.37 (19)	3506	LongNeighbors, MomentChi2
cmetrohash64_1	18199.72	35.68	360.79 (22)	652	LongNeighbors, BIC, MomentChi2
cmetrohash64_2	14207.75	41.29	407.54 (55)	655	LongNeighbors
City64noSeed	12984.39	22.49	171.53 (3)	1038	Avalanche, Sparse, TwoBytes, MomentChi2, Seed
City64	13054.65	34.67	197.78 (1)	1120	Sparse, TwoBytes
t1ha1_64le	14079.61	27.82	176.91 (1)	517	Avalanche
t1ha1_64be	11609.75	28.45	193.22 (2)	555	Avalanche
t1ha0_32le	7696.82	34.30	193.53 (2)	509	Sparse, LongNeighbors
t1ha0_32be	6201.19	36.11	183.45 (2)	533	Sparse, LongNeighbors
t1ha2_stream	11557.53	56.32	219.85 (6)	1665	Sparse, Permutation, LongNeighbors
t1ha2_stream128	11555.91	63.49	236.50 (3)	1665	Sparse, Permutation, LongNeighbors
MeowHash				1764	Sparse, machine-specific (x64 AES-NI)
tifuhash_64	83.80	672.88	-	276
floppsyhash	83.80	806.97	-	623
chaskey	1458.27	73.27	288.26 (2)	1609	PerlinNoise
SipHash	2557.58	50.93	379.41 (1)	1071
HalfSipHash	1336.56	60.87	395.95 (4)	700	zeroes
GoodOAAT	1260.90	46.70	192.19 (1)	237
pearsonbhash64	2408.16	64.63	381.25 (0)	683
pearsonbhash128	2405.80	72.68	390.00 (0)	1134
pearsonbhash256	1475.59	110.08	-	844
prvhash64_64m	4626.31	32.68	368.75 (1)	349
prvhash64_64	4835.63	32.76	370.49 (1)	384
prvhash64_128	4201.37	53.82	394.00 (0)	718
prvhash64s_64	6378.92	128.72	469.88 (0)	2640
prvhash64s_128	6018.48	310.72	1115.55 (30)	2799
SipHash13	3633.57	54.61	778.11 (2)	778	0.9% bias
discohash1	4131.12	199.00	398.35 (5)	1294	bad seeds
discohash1-128	4072.95	234.17	438.43 (5)	1294
discohash2	3986.52	207.52	421.99 (2)	1294
discohash2-128	4094.73	236.61	433.35 (4)	1294
discoNONG	3698.45	399.67	597.78 (9)		bad seeds
TSip	6493.56	30.64	211.71 (3)	519	!msvc
seahash	5998.17	35.86	201.58 (2)	871	PerlinNoise, !msvc
seahash32low	5982.16	35.86	227.31 (4)	871	PerlinNoise, !msvc
clhash	15065.60	69.98	490.26 (105)	1809	PerlinNoise, machine-specific (SSE4.2/NEON)
HighwayHash64	12276.60	91.89	494.56 (30)	2546
Murmur3F	4682.05	38.64	175.85 (1)	699	UB
fasthash32	4113.29	30.99	181.86 (2)	566	UB
fasthash64	4113.26	29.38	164.87 (2)	509	UB, Moment Chi2 5159 !
MUM	4879.06	36.22	172.34 (1)	1912	UB, too many bad seeds, machine-specific (32/64 differs)
MUMlow	4879.07	36.23	197.92 (3)	1912	UB, 5 bad seeds
xmsx32	2039.10	46.39	249.30 (7)	192	2 bad seeds
mirhash	2949.56	36.39	154.47 (3)	1112	2^36 bad seeds, UB, LongNeighbors, machine-specific (32/64 differs)
mirhash32low	2949.60	36.45	182.13 (3)	1112	4 bad seeds, UB, Cyclic, LongNeighbors, machine-specific (32/64 differs)
mirhashstrict	2315.14	49.05	182.07 (2)	1112
mirhashstrict32low	2315.14	48.86	190.59 (4)	1112	1 bad seed, MomentChi2 9
mx3	5889.70	40.55	173.09 (3)	734	UB
pengyhash	11213.81	45.98	222.45 (4)	421
City32	4977.90	32.59	212.04 (3)	1319
City64low	13040.13	34.66	201.73 (1)	1120
City128	12301.53	54.23	415.92 (15)	1841
CityCrc128	16203.64	58.52	491.33 (100)	295
CityCrc256	19603.25	42.27	149.40 (4)
FarmHash32	4979.63	31.38	214.71 (2)	11489	machine-specific (SSE4/AVX/NEON)
FarmHash64	13893.34	34.63	200.51 (4)	3758
FarmHash128	13316.34	46.00	210.25 (3)	163
farmhash32_c	19718.60	47.50	544.34 (107)	762	machine-specific (SSE4/AVX/NEON)
farmhash64_c	11742.66	44.03	447.32 (74)	3688
farmhash128_c	11340.55	56.54	719.54 (134)	1890
metrohash64	13247.81	27.49	150.74 (2)	624	LongNeighbors
metrohash128	13242.64	44.50	167.53 (2)	624	UB
metrohash128crc_1	5738.65	70.77	790.86 (1)	723	UB, machine-specific (SSE4.2/NEON)
metrohash128crc_2	5740.67	70.77	793.03 (2)	723	UB, machine-specific (SSE4.2/NEON)
xxHash64	7590.63	47.29	174.34 (3)	1999
Spooky32	15284.64	37.11	196.96 (4)	2221	UB
Spooky64	15230.69	37.01	191.71 (2)	2221	UB
Spooky128	15284.57	37.66	192.47 (2)	2221	UB
t1ha2_atonce	11565.14	30.18	194.32 (2)	541
t1ha2_atonce128	11564.68	41.92	203.53 (4)	613	LongNeighbors
xxh3	12040.53	21.45	184.86 (2)	744	Moment Chi2 14974, BIC
xxh3low	12125.36	21.34	199.79 (2)	756	Moment Chi2 1.8e+9 !
xxh128	10938.99	26.50	195.65 (2)	1012	Moment Chi2 14974
xxh128low	10919.75	23.58	187.05 (2)	1012	Moment Chi2 14974, BIC
MeowHash				1764	Sparse low32, machine-specific (x64 AES-NI)
MeowHash32low				1764	Sparse, machine-specific (x64 AES-NI)
wyhash	12997.46	24.91	358.83 (2)	556
wyhash32	2532.89	37.95	222.17 (4)	556	4 bad and broken seeds, 32-bit
wyhash32low	12966.37	25.49	363.49 (5)	556	5 bad seeds
w1hash	2532.89	37.95	222.17 (4)
rapidhash	23789.79	22.80	138.71 (7)	574
rapidhash_unrolled	23892.88	23.41	139.47 (12)	782
umash32				1530
umash32_hi				1530
umash64				1530
umash128				1530
halftime_hash64	4707.69	82.51	418.81 (0)	1530	UB overflow
halftime_hash128	10472.76	67.77	407.26 (1)	1530	UB overflow
halftime_hash256	7546.13	70.11	too slow	1530	UB overflow
halftime_hash512	3255.39	155.01	436.32 (2)	1530
nmhash32	6310.27	58.13	385.80 (1)	2445
nmhash32x	6391.35	27.37	360.90 (2)	1494
komihash	2208.60	134.35	268.71 (6)	2799
polymur	1017.15	178.25	313.64 (9)	1128
gxhash32					oob UB, AES only
gxhash64					oob UB, AES only

Other timings:

Summary

I added some SSE assisted hashes and fast intel/arm CRC32-C and AES HW variants, but not the fastest crcutil yet. See our crcutil results. See also the old https://code.google.com/p/smhasher/w/list.

So the fastest hash functions on x86_64 without quality problems are:

wyhash v3
t1ha2_atonce
umash
xxh3low
metrohash64crc (x86 only)
FarmHash (not portable, too machine specific: 64 vs 32bit, old gcc, ...)
Spooky32
fasthash
MUM/mir (different results on 32/64-bit archs)

Hash functions for symbol tables or hash tables typically use 32 bit hashes, for databases, file systems and file checksums typically 64 or 128bit, for crypto now starting with 256 bit.

Typical median key size in perl5 is 20, the most common 4. Similar for all other dynamic languages. See github.com/rurban/perl-hash-stats

When used in a hash table the instruction cache will usually beat the CPU and throughput measured here. In my tests the smallest FNV1A beats the fastest crc32_hw1 with Perl 5 hash tables. Even if those worse hash functions will lead to more collisions, the overall speed advantage and inline-ability beats the slightly worse quality. See e.g. A Seven-Dimensional Analysis of Hashing Methods and its Implications on Query Processing for a concise overview of the best hash table strategies, confirming that the simplest Mult hashing (bernstein, FNV*, x17, sdbm) always beat "better" hash functions (Tabulation, Murmur, Farm, ...) when used in a hash table.

The fast hash functions tested here are recommendable as fast for file digests and maybe bigger databases, but not for 32bit hash tables. The "Quality problems" lead to less uniform distribution, i.e. more collisions and worse performance, but are rarely related to real security attacks, just the 2nd sanity zeroes test against \0 invariance is security relevant.

Columns

MiB/sec: The average of the Bulk key speed test for alignments 0-7 with 262144-byte keys. The higher the better.

cycl./hash: The average of the Small key speed test for 1-31 byte keys. The smaller the better.

cycl./map: The result of the Hashmap test for /usr/dict/words with std::unordered_map get queries, with the standard deviation in brackets. This tests the inlinability of the hash function (see size). The smaller the better.

size: The object size in byte on AMD64. This affects the inlinability in e.g. hash tables. The smaller the better.

Quality problems: See the failures in the linked doc. The less the better.

Other

http://www.strchr.com/hash_functions lists other benchmarks and quality of most simple and fast hash functions.
http://bench.cr.yp.to/primitives-hash.html lists the benchmarks of all currently tested secure hashes.
The Hash Function Lounge overviews the known weaknesses and attacks.

SECURITY

The hash table attacks described in SipHash against City, Murmur or Perl JenkinsOAAT or at Hash Function Lounge are not included here.

Such an attack avoidance cannot be the problem of the hash function, but only the hash table collision resolution scheme. You can attack every single hash function, even the best and most secure if you detect the seed, e.g. from language (mis-)features, side-channel attacks, collision timings and independly the sort-order, so you need to protect your collision handling scheme from the worst-case O(n), i.e. separate chaining with linked lists. Linked lists chaining allows high load factors, but is very cache-unfriendly. The only recommendable linked list scheme is inlining the key or hash into the array. Nowadays everybody uses fast open addressing, even if the load factor needs to be ~50%, unless you use Cuckoo Hashing.

I.e. the usage of SipHash for their hash table in Python 3.4, ruby, rust, systemd, OpenDNS, Haskell and OpenBSD is pure security theatre. SipHash is not secure enough for security purposes and not fast enough for general usage. Brute-force generation of ~32k collisions need 2-4m for all these hashes. siphash being the slowest needs max 4m, other typically max 2m30s, with <10s for practical 16k collision attacks with all hash functions. Using Murmur is usually slower than a simple Mult, even in the worst case. Provable secure is only uniform hashing, i.e. 2-5 independent Mult or Tabulation, or using a guaranteed logarithmic collision scheme (a tree) or a linear collision scheme, such as Robin Hood or Cockoo hashing with collision counting.

One more note regarding security: Nowadays even SHA1 can be solved in a solver, like Z3 (or faster ones) for practical hash table collision attacks (i.e. 14-20 bits). All hash functions with less than 160 bits tested here cannot be considered "secure" at all.

The '' vulnerability attack with binary keys is tested in the 2nd Sanity Zero test.

CRYPTO

The official NIST hash function testsuite does not do such extensive statistical tests, to search for weak ranges in the bits. Also crypto does not change the initial state, which we do here for our random 32bit seed. Crypto mostly cares about unreversable key -> hash functions without changing the initial fixed state and timings/sidechannel attacks.

The NIST "Cryptographic Algorithm Validation Program" (CAVP) involves the testing of the implementations of FIPS-approved and NIST-recommended cryptographic algorithms. During the NIST SHA-3 competition, the testing methodology was borrowed from the "CAVP", as the KATs and MCTs of the SHA-3 Competition Test Suite were based on the CAVP tests for SHA-2. In addition to this, the “Extremely Long Message Test,” not present in the CAVP for SHA-2, required the submitters to generate the hash value corresponding to a message with a length of 1 GiB. “NIST - Cryptographic Algorithm Validation Program (CAVP),” June 2017. Available: http://csrc.nist.gov/groups/STM/cavp (No testing source code provided, just high-level descriptions)

Two other independent third party testsuites found an extensive number of bugs and weaknesses in the SHA3 candidates. "Finding Bugs in Cryptographic Hash Function Implementations", Nicky Mouha, Mohammad S Raunak, D. Richard Kuhn, and Raghu Kacker, 2017. https://eprint.iacr.org/2017/891.pdf

Maybe independent researchers should come together to do a better public SHA-4 round, based on better and more testing methods, open source code for the tests, and using standard industry practices, such as valgrind, address-sanitizer and ubsan to detect obvious bugs.

PROBLEMS

Typical undefined behaviour (UB) problems:

Misaligned

Many word-wise hashes (in opposite to safe byte-wise processing) don't check the input buffer for proper word alignment, which will fail with ubsan or Sparc. word being int32_t or int64_t or even more. On some old RISC hardware this will be a BUS error, you can even let Intel HW generate such a bus error by setting some CPU flag. But generally using misaligned accesses is fine.

These are: mx3, Spooky, mirhash (but not strict), MUM, fasthash, Murmur3*, Murmur2*, metrohash* (all but cmetro*), Crap8, beamsplitter, lookup3, fletcher4, fletcher2, all sanmayce FNV1a_ variants (FNV1a_YT, FNV1A_Pippip_Yurii, FNV1A_Totenschiff, ...), fibonacci

The usual mitigation is to check the buffer alignment either in the caller, provide a pre-processing loop for the misaligned prefix, or copy the whole buffer into a fresh aligned area. Put that extra code inside #ifdef HAVE_ALIGNED_ACCESS_REQUIRED.
oob - Out of bounds

Some hash function assume a padded input buffer which can be accessed past its length up to the word size. This allows for faster loop processing, as no 2nd loop or switch table for the rest is needed, but it requires a cooperative calling enviroment and is as such considered cheating.
Signed integer overflow

A simple type error, this hash needs to use unsigned integer types internally, to avoid undefined and inconsistent behaviour. i.e. SuperFastHash: signed integer overflow: -2147483641 + -113 cannot be represented in type 'int'
shift exponent overflow

With: FNV1A_Pippip_Yurii, FNV1A_Totenschiff, pair_multiply_shift, sumhash32 shift exponent 64 is too large for 64-bit type 'long unsigned int'