This page includes my solutions to the Cryptopals Crypto Challenges. These are small problems that build upon each other in order to learn about and attack progressively more complex cryptographic constructions. The challenges are divided into 8 sets, made up of 8 challenges each.
The first challenge asks us to convert a hex encoded buffer into a Base64 encoded one. While Python has built-in modules to perform these operations, I’ve implemented them below in order to get familiar with how they work.
To turn a binary buffer into a printable string, you can encode it as a hexadecimal number. Each byte will become two hex digits (two characters from 0 to f).
In [2]:
hex_digits = “0123456789abcdef”
def split_chunks(l, n): ch = [] for x in l: ch.append(x) if len(ch) == n: yield ch ch = [] if ch: yield ch
def hex_decode(text): lookup = {y:x for x, y in enumerate(hex_digits)}
res = []
for first, second in split_chunks(text, 2):
first = lookup[first]
second = lookup[second]
byte = first << 4 | second
res.append(byte)
return bytes(res)
hex_decode(“48656c6c6f”)
Out [2]:
b’Hello’
Another way decode hex-encoded values is to use a lookup table. Here’s how it looks in practice.
In [3]:
hex_lookup = {}
i = 0 for x in hex_digits: for y in hex_digits: hex_lookup[x+y] = i i += 1
def hex_decode(text): pairs = split_chunks(text, 2) res = [hex_lookup[x+y] for x, y in pairs] return bytes(res)
hex_decode(“48656c6c6f”)
Out [3]:
b’Hello’
In [4]:
import string
values = string.ascii_uppercase + string.ascii_lowercase + string.digits + “+/”
def binary_digits(data): for byte in data: for i in range(8)[::-1]: i = (byte >> i) & 1 yield str(int(i))
def groups(data, n): group = [] for x in data: group.append(x) if len(group) == n: yield “”.join(group) group = [] if group: yield “”.join(group)
def b64_groups(data): digits = binary_digits(data) for num in groups(digits, 6): num = int(num, 2) yield values[num]
def base64_encode(data): res = “”
for b64 in groups(b64_groups(data), 4):
res += b64
for _ in range(4 - len(b64)):
res += '='
return res
In [5]:
hex_input = “49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d” expected = “SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t”
buffer = hex_decode(hex_input) base64_encode(buffer) base64_encode(buffer) == expected
Out [5]:
‘SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t’
Out [5]:
True
Let’s try to make a base64 decoder as well. The logic should be mostly similar.
In [6]:
def b64_to_binary(text): for c in text: x = values.find(c) if x != -1: for i in range(6)[::-1]: bit = (x >> i) & 1 yield str(int(bit))
def base64_decode(text): res = [] for gr in groups(b64_to_binary(text), 8): if len(gr) != 8: break res.append(int(gr, 2))
return bytes(res)
base64_decode(“VGVzdCBtZXNzYWdlCg==”)
Out [6]:
b’Test message\n’
This challenge asks us to produce the XOR of two equal-length buffers.
In [7]:
buffer1 = hex_decode(“1c0111001f010100061a024b53535009181c”) buffer2 = hex_decode(“686974207468652062756c6c277320657965”)
def xor_buffers(buf1, buf2): result = [x ^ y for x, y in zip(buf1, buf2)] return bytes(result)
xor_buffers(buffer1, buffer2).hex()
Out [7]:
‘746865206b696420646f6e277420706c6179’
https://en.wikipedia.org/wiki/Letter_frequency
In [8]:
def single_xor(ciphertext, key): plain = [x ^ key for x in ciphertext] return bytes(plain)
single_xor(b’Test message’, 23) single_xor(b’Crdc7zrddvpr’, 23)
Out [8]:
b’Crdc7zrddvpr’
Out [8]:
b’Test message’
Since the key is only a single byte, it is possible to iterate through all the options and pick the correct one with your eye. But it is useful to automate this, and this automation will come in handy in more complex scenarios.
We will make our program try all possible keys, and pick the key that results in the most English-looking decryption.
In [9]:
def english_score(data):
s = 0
data = data.lower()
common = b“etaoin shrdlu“[::-1]
for c in data:
if chr(c) not in string.printable:
return 0
i = common.find(c)
if i != -1:
s += i
return s
In [10]:
ciphertext = hex_decode(“1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736”)
key = max(range(255), key=lambda x: english_score(single_xor(ciphertext, x))) key
Out [10]:
88
In [11]:
single_xor(ciphertext, key)
Out [11]:
b“Cooking MC’s like a pound of bacon“
In [12]:
import requests
URL = ‘https://cryptopals.com/static/challenge-data/4.txt’ data = requests.get(URL).text.split(“\n”) data = list(map(hex_decode, data))
In [13]:
candidates = []
for ciphertext in data: for key in range(255): plain = single_xor(ciphertext, key) candidates.append(plain)
print(max(candidates, key=english_score))
Out:
b’Now that the party is jumping\n’
In [14]:
data = b“Burning ’em, if you ain’t quick and nimble\nI go crazy when I hear a cymbal“ key = b“ICE“
def repeating_xor(data, key): res = [] for i, c in enumerate(data): k = key[i % len(key)] res.append(c ^ k) return bytes(res)
encrypted = repeating_xor(data, key).hex()
for gr in groups(encrypted, 25): print(gr)
Out:
0b3637272a2b2e63622c2e696 92a23693a2a3c6324202d623d 63343c2a26226324272765272 a282b2f20430a652e2c652a31 24333a653e2b2027630c692b2 0283165286326302e27282f
In [15]:
def edit_dist(buf1, buf2): bin1 = binary_digits(buf1) bin2 = binary_digits(buf2)
dist = 0
for b1, b2 in zip(bin1, bin2):
if b1 != b2:
dist += 1
return dist
edit_dist(b“this is a test“, b“wokka wokka!!!“) == 37
Out [15]:
True
In [16]:
URL = ‘https://cryptopals.com/static/challenge-data/6.txt’ text = requests.get(URL).text data = base64_decode(text)
len(data)
Out [16]:
2876
In [17]:
def keysize_edit_distance(ciphertext, keysize): prev = None diff = 0 n = 0 for i in range(0, len(data), keysize): chunk = data[i:i+keysize] if prev: diff += edit_dist(chunk, prev) / keysize n += 1 prev = chunk diff /= n return diff
keysize = min(range(2, 40), key=lambda x: keysize_edit_distance(data, x))
keysize
Out [17]:
29
In [18]:
key = []
blocks = [data[i:i+keysize] for i in range(0, len(data), keysize)]
for key_i in range(keysize): chunk = b““ for bl in blocks: if key_i < len(bl): chunk += bytes([bl[key_i]])
k = max(range(255), key=lambda x: english_score(single_xor(chunk, x)))
key.append(k)
If we’re lucky, we should have the correct key now. Let’s see if the key was human-readable.
In [19]:
bytes(key).decode(‘ascii’)
Out [19]:
‘Terminator X: Bring the noise’
Looks like it was. Just to be sure we got the right answer, let’s try to decode the data.
In [20]:
plaintext = repeating_xor(data, bytes(key)).decode(‘ascii’)
print(plaintext[:150])
Out:
I’m back and I’m ringin’ the bell A rockin’ on the mike while the fly girls yell In ecstasy in the back of me Well that’s my DJ Deshay cuttin’ all
In [21]:
URL = ‘https://cryptopals.com/static/challenge-data/7.txt’ ciphertext = requests.get(URL).text ciphertext = base64_decode(ciphertext)
In [22]:
from Crypto.Cipher import AES
def aes_ecb_encrypt(key, block): aes = AES.new(key, AES.MODE_ECB) return aes.encrypt(block)
def aes_ecb_decrypt(key, block): aes = AES.new(key, AES.MODE_ECB) return aes.decrypt(block)
for i in range(0, len(ciphertext), 16): chunk = ciphertext[i:i+16] chunk = aes_ecb_decrypt(b“YELLOW SUBMARINE“, chunk) chunk = chunk.decode(‘ascii’) print(chunk, end=“”) if i > 150: break
Out:
I’m back and I’m ringin’ the bell A rockin’ on the mike while the fly girls yell In ecstasy in the back of me Well that’s my DJ Deshay cuttin’ all them Z’s Hittin’ hard and
In [23]:
URL = ‘https://cryptopals.com/static/challenge-data/8.txt’ ciphertexts = requests.get(URL).text.split(“\n”)
len(ciphertexts)
Out [23]:
205
In [24]:
for num, ciphertext in enumerate(ciphertexts): data = hex_decode(ciphertext) chunks = [data[i:i+16] for i in range(0, len(data), 16)] for i, ch in enumerate(chunks): count = chunks.count(ch) if count != 1: print(num, i, count)
Out:
132 1 4 132 3 4 132 5 4 132 7 4
In [31]:
def pkcs7(data, block_size): for block in split_chunks(data, block_size): last = len(block) diff = block_size - len(block) while len(block) != block_size: block.append(diff) yield from block if last == block_size: yield from [block_size] * block_size
bytes(pkcs7(b“YELLOW SUBMARINE“, 20))
Out [31]:
b’YELLOW SUBMARINE\x04\x04\x04\x04’
In [32]:
def pkcs7_unpad(data): padbyte = data[-1] if padbyte > len(data): return None padding = data[padbyte:] if padding.count(padbyte) != padbyte: return None return data[:-padbyte]
In [53]:
valid = [ b“ICE ICE BABY\x04\x04\x04\x04“, b“NICE ICE BABY\x03\x03\x03“, ]
invalid = [ b“ICE ICE BABY\x05\x05\x05\x05“, b“ICE ICE BABY\x01\x02\x03\x04“, ]
for buf in valid + invalid: print(str(buf).ljust(31), “=>”, pkcs7_unpad(buf))
Out:
b’ICE ICE BABY\x04\x04\x04\x04’ => b’ICE ICE BABY’ b’NICE ICE BABY\x03\x03\x03’ => b’NICE ICE BABY’ b’ICE ICE BABY\x05\x05\x05\x05’ => None b’ICE ICE BABY\x01\x02\x03\x04’ => None