This page looks at some non-cryptographic hash designs. The hashes are from n-bit inputs to n-bit outputs. These are mainly for use with PRNG designs.

Methodology

Our goal is to find functions that are good at mixing/diffusing.

Evaluation

We will evaluate how the functions behave using an avalanche diagram. An avalanche diagram is a graph that displays the probability of each output bit flipping in response to each input bit flipping. It allows us to quickly see if the function has any terrible biases.

Design criteria

It is not difficult to make a good hash function by combining a lot of subdiffusions. What we want is to achieve good mixing with the smallest number of operations. Pretty much all uses cases of non-cryptographic hash functions need to execute quickly and consume minimal resources.

Algorithms

Identity function

The identity function (f(x) = x) does not modify the input value, so each bit only affects itself. This cannot be considered a hash function by any means, but serves as a nice sanity check to see if the graph is what we expect.

In [2]:
%%cython

ctypedef unsigned long long u64

cpdef u64 identity_function(u64 x):
    return x
In [3]:
plot_hash(identity_function)
Out:
<Figure size 900x600 with 2 Axes>

Non-deterministic random values

On the opposite end of the spectrum, we completely throw out determinism and just use real random numbers.

In [4]:
def urandom64(x):
    return int.from_bytes(os.urandom(8), 'big')

plot_hash(urandom64)
Out:
<Figure size 900x600 with 2 Axes>
In [5]:
def urandom32(x):
    return int.from_bytes(os.urandom(4), 'big')

plot_hash(urandom32, bits=32)
Out:
<Figure size 900x600 with 2 Axes>

Multiplication with a large prime

In [6]:
%%cython

ctypedef unsigned long long u64

cpdef u64 prime_multiplication(u64 x):
    cdef u64 prime = 10115642443237858459;
    return x * prime
In [7]:
plot_hash(prime_multiplication)
Out:
<Figure size 900x600 with 2 Axes>

Rotate-XOR-Prime (rxprime)

This hash is based on multiplying the input by primes, and XORing it with rotated versions of itself.

32-bit rxprime

In [8]:
%%cython

ctypedef unsigned char u8
ctypedef unsigned long u32

cdef u32 rotl32(u32 n, u8 k):
    cdef u32 a = n << k
    cdef u32 b = n >> (32 - k)
    return a | b

cpdef u32 rxprime32(u32 x):
    x *= 7919
    x ^= rotl32(x, 7)
    x *= 7723
    x ^= rotl32(x, 11)
    x *= 7561
    x ^= rotl32(x, 13)
    return x
In [9]:
plot_hash(rxprime32, bits=32)
Out:
<Figure size 900x600 with 2 Axes>

64-bit rxprime

In [10]:
%%cython

ctypedef unsigned char u8
ctypedef unsigned long long u64

cdef u64 rotl64(u64 n, u8 k):
    cdef u64 a = n << k
    cdef u64 b = n >> (64 - k)
    return a | b

cpdef u64 rxprime64(u64 x):
    x *= 7919
    x ^= rotl64(x, 7)
    x *= 7723
    x ^= rotl64(x, 11)
    x *= 7561
    x ^= rotl64(x, 13)
    x *= 7411
    x ^= rotl64(x, 17)
    return x
In [11]:
plot_hash(rxprime64)
Out:
<Figure size 900x600 with 2 Axes>

SplitMix64

In [24]:
%%cython -a

ctypedef unsigned long long u64

cpdef u64 splitmix64(u64 x):
    x ^= x >> 30
    x *= <u64>0xbf58476d1ce4e5b9
    x ^= x >> 27
    x *= <u64>0x94d049bb133111eb
    x ^= x >> 31
    return x
Out [24]:
In [13]:
plot_hash(splitmix64)
Out:
<Figure size 900x600 with 2 Axes>

Prospector

https://nullprogram.com/blog/2018/07/31/

In [14]:
%%cython

ctypedef unsigned long u32

cpdef u32 prospector32(u32 x):
    x ^= x >> 15
    x *= <u32>0x2c1b3c6d
    x ^= x >> 12
    x *= <u32>0x297a2d39
    x ^= x >> 15
    return x
In [15]:
plot_hash(prospector32, bits=32)
Out:
<Figure size 900x600 with 2 Axes>
In [16]:
%%cython

ctypedef unsigned long u32

cpdef u32 triple32(u32 x):
    x ^= x >> 17
    x *= <u32>0xed5ad4bb
    x ^= x >> 11
    x *= <u32>0xac4c1b51
    x ^= x >> 15
    x *= <u32>0x31848bab
    x ^= x >> 14
    return x
In [17]:
plot_hash(triple32, bits=32)
Out:
<Figure size 900x600 with 2 Axes>

ARX - Add, Rotate, XOR

This hash is based on ARX, which is usually cheap to execute in software. One advantage is that there are no multiplications. In exchange for having cheaper operations, we have to execute more of them.

In [18]:
%%cython

ctypedef unsigned char u8
ctypedef unsigned long u32

cdef u32 rotl32(u32 n, u8 k):
    cdef u32 a = n << k
    cdef u32 b = n >> (32 - k)
    return a | b

cpdef u32 arx32(u32 a):
    cdef u32 b = 0
    cdef u32 c = 0
    cdef u32 d = 0
    
    for _ in range(3):
        b ^= rotl32(a + d, 7)
        c ^= rotl32(b + a, 9)
        d ^= rotl32(c + b, 13)
        a ^= rotl32(d + c, 18)
    
    return a ^ c
In [19]:
plot_hash(arx32, bits=32)
Out:
<Figure size 900x600 with 2 Axes>
In [20]:
%%cython

ctypedef unsigned char u8
ctypedef unsigned long long u64

cdef u64 rotl64(u64 n, u8 k):
    cdef u64 a = n << k
    cdef u64 b = n >> (64 - k)
    return a | b

cpdef u64 arx64(u64 a):
    cdef u64 b = 0
    cdef u64 c = 0
    
    for _ in range(4):
        b ^= rotl64(a + c, 7)
        c ^= rotl64(b + a, 9)
        a ^= rotl64(c + b, 13)
    
    return a
In [21]:
plot_hash(arx64)
Out:
<Figure size 900x600 with 2 Axes>

References and useful links