Today I was reading the Rumba20 paper and realized I had no idea how to take a compression function and bootstrap useful cryptographic functionality from it. So I’ll be attempting to do that today.
Rumba20 is a compression function developed by Daniel J. Bernstein. This article provides an implementation of Rumba20, and explores common use cases for compression functions.
Warning! I haven’t seen any implementations of Rumba20 in the wild. This means it has not been battle-tested as much as other functions. Even fewer than the number of Rumba20 implementations is the number of people using it the way I am using it in this post. I may be “holding it wrong”, so there are no security guarantees about any of the things below.
Rumba20 is a compression function based on the Salsa20 function.
In the future I will write a detailed article about Salsa20. For this article, all you need to know is that Salsa20 is a function that maps 512-bit blocks to other 512-bit blocks. The Wikipedia article for Salsa20 is really good and provides all the necessary details. It’s linked at the bottom of this page; make sure to check it out.
The traditional use of Salsa20 is for encryption, as a stream cipher. In this use case, the input block is filled with a 128-bit constant, a 256-bit key, and 128 bits of nonce and counter. After the Salsa20 core is executed on this block, the output block is well-diffused, hard to predict and hard to invert.
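To make the stream cipher layout concrete, here is a minimal sketch of how that input block is assembled for a 256-bit key. The function name `salsa20_input_block` is my own, and the split into a 64-bit nonce and a 64-bit little-endian block counter follows the original Salsa20 layout; treat it as an illustration rather than a reference implementation.

```python
import struct

# The 128-bit constant used with 256-bit keys, split into four 4-byte pieces.
SIGMA = b"expand 32-byte k"


def salsa20_input_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    """Build the 64-byte Salsa20 input block (hypothetical helper name)."""
    assert len(key) == 32   # 256-bit key
    assert len(nonce) == 8  # 64-bit nonce
    return b"".join([
        SIGMA[0:4],                  # word 0: constant
        key[0:16],                   # words 1-4: first half of the key
        SIGMA[4:8],                  # word 5: constant
        nonce,                       # words 6-7: nonce
        struct.pack("<Q", counter),  # words 8-9: block counter
        SIGMA[8:12],                 # word 10: constant
        key[16:32],                  # words 11-14: second half of the key
        SIGMA[12:16],                # word 15: constant
    ])
```

The constant ends up in words 0, 5, 10 and 15, which is 16 of the 64 bytes. The remaining 48 bytes are the key, nonce and counter.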
For Rumba20, we will use the same Salsa20 core with different contents in the input block.
Rumba20 uses the Salsa20 core to produce transform functions. A different 128-bit constant is used for each function, leaving 384 bits for the input. This results in a transform function that takes 384 bits of input and produces 512 bits of output.
This is hardly compression. In fact, our input just became larger. The next section explains how Rumba20 uses these transform functions to produce an actual compression function.
The way Rumba20 fills the block is different from Salsa20. Instead of using a key, nonce or a counter, Rumba20 fills the blocks with message bits and constants. Each transform function uses a different constant. The four transform functions of Rumba20 are listed below.
- f1(M) → Salsa20(M, “firstRumba20bloc”)
- f2(M) → Salsa20(M, “secondRumba20blo”)
- f3(M) → Salsa20(M, “thirdRumba20bloc”)
- f4(M) → Salsa20(M, “fourthRumba20blo”)
Aside from these 16-byte constants, each Salsa20 block can fit 48 bytes of message to produce a 64-byte output. Since Rumba20 uses 4 blocks, we end up compressing 192 bytes to 64 bytes, or a 1536-bit block to a 512-bit block.
Rumba20(m1, m2, m3, m4) = f1(m1) ⊕ f2(m2) ⊕ f3(m3) ⊕ f4(m4)
Where m1, m2, m3, m4 are the 192 message bytes split into 48-byte chunks and ⊕ is the XOR operation.
Since this is an article exploring the Rumba20 function, and Rumba20 is based on Salsa20, it is assumed that there is a working implementation of Salsa20 available to us. Just like a good cooking show, I have one prepared in advance. It transforms 64-byte inputs into 64-byte outputs.
    block = bytes(list(range(64)))
    block.hex()
    salsa20_block(block).hex()
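If you don’t have a Salsa20 core lying around, here is a minimal sketch of what `salsa20_block` could look like. I’m assuming the standard 20-round core (ten double rounds, followed by adding the input words back into the result). The `buf_xor` helper is also an assumption: a plain byte-wise XOR of two equal-length buffers. This is not necessarily identical to the implementation prepared for this post.

```python
import struct

MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit word left by `shift` bits."""
    return ((value << shift) & MASK32) | (value >> (32 - shift))


def quarter_round(x, a, b, c, d):
    """Apply the Salsa20 quarter-round in place to four words of the state."""
    x[b] ^= rotl32((x[a] + x[d]) & MASK32, 7)
    x[c] ^= rotl32((x[b] + x[a]) & MASK32, 9)
    x[d] ^= rotl32((x[c] + x[b]) & MASK32, 13)
    x[a] ^= rotl32((x[d] + x[c]) & MASK32, 18)


def salsa20_block(block):
    """Map a 64-byte input to a 64-byte output using the Salsa20 core."""
    assert len(block) == 64
    words = list(struct.unpack("<16I", bytes(block)))  # 16 little-endian words
    x = words[:]
    for _ in range(10):  # 10 double rounds = 20 rounds
        # Column round
        quarter_round(x, 0, 4, 8, 12)
        quarter_round(x, 5, 9, 13, 1)
        quarter_round(x, 10, 14, 2, 6)
        quarter_round(x, 15, 3, 7, 11)
        # Row round
        quarter_round(x, 0, 1, 2, 3)
        quarter_round(x, 5, 6, 7, 4)
        quarter_round(x, 10, 11, 8, 9)
        quarter_round(x, 15, 12, 13, 14)
    # Feed-forward: add the original input words to the mixed state.
    out = [(a + b) & MASK32 for a, b in zip(x, words)]
    return struct.pack("<16I", *out)


def buf_xor(a, b):
    """XOR two equal-length byte buffers (assumed helper)."""
    assert len(a) == len(b)
    return bytes(p ^ q for p, q in zip(a, b))
```

One handy sanity check: an all-zero block maps to an all-zero block, because every addition, rotation and XOR of zeros stays zero. That is also why real uses of the core always put a non-zero constant in the input.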
We are hardcoding the position of the bytes where the constant is supposed to go. Every position that is not meant for a constant is used for a message byte.
    const_indices = [
        0, 1, 2, 3,
        20, 21, 22, 23,
        40, 41, 42, 43,
        60, 61, 62, 63,
    ]
    msg_indices = [x for x in range(64) if x not in const_indices]
    def salsa20_transform(constant, message):
        constant = constant.encode("ascii")
        assert len(constant) == 16
        assert len(message) == 48

        block = bytearray(64)

        for i, x in enumerate(const_indices):
            block[x] = constant[i]

        for i, x in enumerate(msg_indices):
            block[x] = message[i]

        return salsa20_block(block)
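The hardcoded byte positions are not arbitrary. If you view the 64-byte block as sixteen 4-byte words, they are exactly words 0, 5, 10 and 15, the same slots where Salsa20 keeps its constant in stream cipher mode. Here is a small sketch that derives the same lists from word numbers; the name `CONST_WORDS` is mine.

```python
# Word positions of the constant within the 16-word block.
CONST_WORDS = [0, 5, 10, 15]

# Each word covers 4 consecutive bytes, starting at byte 4 * word.
const_indices = [4 * word + i for word in CONST_WORDS for i in range(4)]

# Every remaining byte position carries a message byte.
msg_indices = [x for x in range(64) if x not in const_indices]
```

This leaves 48 message positions, which matches the 48-byte chunks used by the transform functions.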
Let’s define f1, f2, f3 and f4 exactly like we described above.
    from functools import partial

    def make_transform(constant):
        return partial(salsa20_transform, constant)

    f1 = make_transform("firstRumba20bloc")
    f2 = make_transform("secondRumba20blo")
    f3 = make_transform("thirdRumba20bloc")
    f4 = make_transform("fourthRumba20blo")
The compression function itself is very short. Split the input block into four chunks, transform them using the Salsa20 core, and XOR the resulting blocks into one.
    def rumba20_block(block):
        assert len(block) == 192

        # Split the message into four blocks.
        # Each block will be 48 bytes.
        m1, block = block[:48], block[48:]
        m2, block = block[:48], block[48:]
        m3, m4 = block[:48], block[48:]

        # buf_xor is assumed to XOR two equal-length byte buffers;
        # it comes from the same place as salsa20_block.
        result = bytearray(64)
        result = buf_xor(result, f1(m1))
        result = buf_xor(result, f2(m2))
        result = buf_xor(result, f3(m3))
        result = buf_xor(result, f4(m4))

        return result
That’s the whole compression function. We can see that it takes 192-byte inputs and produces 64-byte outputs.
    x = bytearray(192)
    output = rumba20_block(x)
    len(output)
But on its own, this is not very usable. Let’s implement some familiar functionality.
Merkle–Damgård hash from Rumba20
Merkle–Damgård is a method of building hash functions from compression functions. We can use the rumba20_block function we wrote above and turn it from a function that accepts a 192-byte input into one that takes arbitrary-sized inputs.
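    def rumba20_md(data):
        # The initial chaining value is the message length, encoded in 64 bytes.
        length = len(data)
        block = length.to_bytes(64, 'little')

        while data:
            # Absorb up to 128 bytes of message per compression call.
            data_block, data = data[:128], data[128:]
            block += data_block

            # Zero-pad a short final chunk up to the full 192 bytes.
            while len(block) < 192:
                block += b"\x00"

            # The 64-byte output becomes the next chaining value.
            block = rumba20_block(block)

        # Note: for empty input the loop never runs, so the
        # length block is returned without being compressed.
        return block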
    rumba20_md(b"Hello, world!").hex()
    rumba20_md(b"Hello, World!").hex()
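The two inputs above differ only in the case of one letter, so comparing the digests is a quick way to see how well the change spreads. Here is a small sketch for counting how many bits differ between two digests. The helper name `bit_difference` is my own; for a well-mixed 64-byte output I would expect a count somewhere around half of the 512 bits, but that is a rule of thumb, not a guarantee.

```python
def bit_difference(a, b):
    """Count the bit positions where two equal-length buffers differ."""
    assert len(a) == len(b)
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Intended usage with the hash from this post (not run here):
# bit_difference(rumba20_md(b"Hello, world!"), rumba20_md(b"Hello, World!"))
```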
HMAC is a construction that turns a hash function into a keyed Message Authentication Code (MAC). This can be used as a signature with a secret key. It also mitigates some of the problems with Merkle–Damgård hashes. We can easily convert our rumba20_md function into an HMAC by following the HMAC specification.
    def hmac_rumba20(key, message):
        if len(key) > 192:
            key = rumba20_md(key)

        while len(key) < 192:
            key += b"\x00"

        outer_pad = bytes([x ^ 0x5c for x in key])
        inner_pad = bytes([x ^ 0x36 for x in key])

        return rumba20_md(outer_pad + rumba20_md(inner_pad + message))
As we can see below, computing the HMAC of the same message with two different keys produces completely different results.
    hmac_rumba20(b"key1", b"Test message").hex()
    hmac_rumba20(b"key2", b"Test message").hex()
Similarly, using the same key with two different messages will yield different results.
    hmac_rumba20(b"key", b"First message").hex()
    hmac_rumba20(b"key", b"Second message").hex()
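Producing a tag is only half of using a MAC; the receiver also has to check it. Here is a minimal sketch of a verification helper that recomputes the tag and compares it in constant time with `hmac.compare_digest` from the standard library. The names `verify_tag` and `standin_mac` are hypothetical. The stand-in uses HMAC-SHA-256 only so the example runs on its own; the idea is that you would pass `hmac_rumba20` as `mac_fn` instead.

```python
import hashlib
import hmac


def verify_tag(mac_fn, key, message, tag):
    """Recompute the MAC and compare it to `tag` in constant time."""
    expected = mac_fn(key, message)
    # bytes() makes this work whether mac_fn returns bytes or bytearray.
    return hmac.compare_digest(bytes(expected), bytes(tag))


def standin_mac(key, message):
    """Stand-in MAC with the same (key, message) signature as hmac_rumba20."""
    return hmac.new(key, message, hashlib.sha256).digest()


tag = standin_mac(b"key", b"First message")
verify_tag(standin_mac, b"key", b"First message", tag)   # accepted
verify_tag(standin_mac, b"key", b"Second message", tag)  # rejected
```

Comparing with `==` would also work functionally, but a constant-time comparison avoids leaking how many leading bytes of a forged tag were correct.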
    def hkdf_extract(input_key, salt):
        if not salt:
            salt = bytes(64)
        return hmac_rumba20(salt, input_key)

    def hkdf_expand(key, info, length):
        result = b""
        t = b""
        i = 1

        while len(result) < length:
            t = hmac_rumba20(key, t + info + bytes([i]))
            result += t
            i += 1

        return result[:length]

    def hkdf_rumba20(input_key, salt = b"", info = b"", length = 32):
        prk = hkdf_extract(input_key, salt)
        return hkdf_expand(prk, info, length)
    hkdf_rumba20(b"Test key", info=b"Map").hex()
    hkdf_rumba20(b"Test key", info=b"Config").hex()
We started our journey by reading the Rumba20 paper and implemented progressively higher-level cryptographic constructions. The building blocks we created allow us to make:
- Random number generators
- Hash functions
- Message authentication codes
- Key derivation functions
- Randomness extractors
- One-time passwords