Cryptanalysis: Recovering an Affine Encryption Scheme Using GF(2) Linear Algebra

Cryptanalysis: Recovering an Affine Encryption Scheme Using GF(2) Linear Algebra
Welcome to a cryptanalysis challenge. In this challenge, we will learn how a block cipher built enti 2026-7-1 10:14:27 Author: infosecwriteups.com(查看原文) 阅读量:2 收藏

Welcome to a cryptanalysis challenge. In this challenge, we will learn how a block cipher built entirely from linear components can be broken, and why secure block ciphers require nonlinear components.

Press enter or click to view image in full size

This is not a very basic level challenge. It covers moderate level cryptographic concepts, so some familiarity with basic mathematics, cryptographic functions, and encryption algorithms will be helpful. I’ll try to explain every part as clearly as possible.

In this challenge, a hash value and the corresponding encryption code are provided. The goal is to understand how the plaintext is transformed into that hash. We need to reverse engineer or perform cryptanalysis on the encryption process to recover the original plaintext from the given hash.

Here is the encryption and hashing code used in the challenge.

You are given the following hash:
1d6a2172025e1858754075123b6658532d1a4e775e1f43336e227d5a4529734f

Below is the complete source code used in the challenge.

#!/usr/bin/env python3# This cipher uses 100 rounds and therefore provides extremely strong security.
# Repeated substitution and permutation remove all algebraic structure.
# Since the SBox is key dependent, recovering the plaintext without brute force is infeasible.
BLOCK_SIZE = 32
KEY_SBOX = [211, 243, 147, 179, 83, 115, 19, 51, 210, 242, 146, 178, 82, 114, 18, 50, 209, 241, 145, 177, 81, 113, 17, 49, 208, 240, 144, 176, 80, 112, 16, 48, 215, 247, 151, 183, 87, 119, 23, 55, 214, 246, 150, 182, 86, 118, 22, 54, 213, 245, 149, 181, 85, 117, 21, 53, 212, 244, 148, 180, 84, 116, 20, 52, 219, 251, 155, 187, 91, 123, 27, 59, 218, 250, 154, 186, 90, 122, 26, 58, 217, 249, 153, 185, 89, 121, 25, 57, 216, 248, 152, 184, 88, 120, 24, 56, 223, 255, 159, 191, 95, 127, 31, 63, 222, 254, 158, 190, 94, 126, 30, 62, 221, 253, 157, 189, 93, 125, 29, 61, 220, 252, 156, 188, 92, 124, 28, 60, 195, 227, 131, 163, 67, 99, 3, 35, 194, 226, 130, 162, 66, 98, 2, 34, 193, 225, 129, 161, 65, 97, 1, 33, 192, 224, 128, 160, 64, 96, 0, 32, 199, 231, 135, 167, 71, 103, 7, 39, 198, 230, 134, 166, 70, 102, 6, 38, 197, 229, 133, 165, 69, 101, 5, 37, 196, 228, 132, 164, 68, 100, 4, 36, 203, 235, 139, 171, 75, 107, 11, 43, 202, 234, 138, 170, 74, 106, 10, 42, 201, 233, 137, 169, 73, 105, 9, 41, 200, 232, 136, 168, 72, 104, 8, 40, 207, 239, 143, 175, 79, 111, 15, 47, 206, 238, 142, 174, 78, 110, 14, 46, 205, 237, 141, 173, 77, 109, 13, 45, 204, 236, 140, 172, 76, 108, 12, 44]
PBOX = [41, 214, 131, 48, 221, 138, 55, 228, 145, 62, 235, 152, 69, 242, 159, 76, 249, 166, 83, 0, 173, 90, 7, 180, 97, 14, 187, 104, 21, 194, 111, 28, 201, 118, 35, 208, 125, 42, 215, 132, 49, 222, 139, 56, 229, 146, 63, 236, 153, 70, 243, 160, 77, 250, 167, 84, 1, 174, 91, 8, 181, 98, 15, 188, 105, 22, 195, 112, 29, 202, 119, 36, 209, 126, 43, 216, 133, 50, 223, 140, 57, 230, 147, 64, 237, 154, 71, 244, 161, 78, 251, 168, 85, 2, 175, 92, 9, 182, 99, 16, 189, 106, 23, 196, 113, 30, 203, 120, 37, 210, 127, 44, 217, 134, 51, 224, 141, 58, 231, 148, 65, 238, 155, 72, 245, 162, 79, 252, 169, 86, 3, 176, 93, 10, 183, 100, 17, 190, 107, 24, 197, 114, 31, 204, 121, 38, 211, 128, 45, 218, 135, 52, 225, 142, 59, 232, 149, 66, 239, 156, 73, 246, 163, 80, 253, 170, 87, 4, 177, 94, 11, 184, 101, 18, 191, 108, 25, 198, 115, 32, 205, 122, 39, 212, 129, 46, 219, 136, 53, 226, 143, 60, 233, 150, 67, 240, 157, 74, 247, 164, 81, 254, 171, 88, 5, 178, 95, 12, 185, 102, 19, 192, 109, 26, 199, 116, 33, 206, 123, 40, 213, 130, 47, 220, 137, 54, 227, 144, 61, 234, 151, 68, 241, 158, 75, 248, 165, 82, 255, 172, 89, 6, 179, 96, 13, 186, 103, 20, 193, 110, 27, 200, 117, 34, 207, 124]
xor = lambda a,b: bytes([x ^ y for x,y in zip(a,b)])
class SecureCipher:
    def __init__(self, key):
        if len(key) < BLOCK_SIZE:
            pad = BLOCK_SIZE - len(key)
            key += bytes([pad])*pad
        self.key = key[:BLOCK_SIZE]
        self.sbox = [KEY_SBOX[i] ^ self.key[0] for i in range(256)]
    def substitute(self, data):
        return bytes(self.sbox[b] for b in data)
    def permute(self, data):
        out = [0]*BLOCK_SIZE
        for num in range(256):
            outnum = PBOX[num]
            inbyte = num // 8
            inbit = 7 - (num % 8)
            outbyte = outnum // 8
            outbit = 7 - (outnum % 8)
            if data[inbyte] & (1 << inbit):
                out[outbyte] |= (1 << outbit)
        return bytes(out)
    def encrypt(self, data):
        data = self.pad(data)
        blocks = [data[i:i+BLOCK_SIZE]
                  for i in range(0, len(data), BLOCK_SIZE)]
        out = b''
        for block in blocks:
            for _ in range(100): # No linear or any cryptanalysis here!
                block = self.substitute(block)
                block = self.permute(block)
                block = xor(block, self.key)
            out += block
        return out
    @staticmethod
    def pad(data):
        if len(data) % BLOCK_SIZE == 0:
            return data
        diff = BLOCK_SIZE - (len(data) % BLOCK_SIZE)
        return data + bytes([diff])*diff
class SecureHash:
    def __init__(self, data=None):
        self.state = b"\x00"*BLOCK_SIZE
        if data:
            self.update(data)
    def update(self, data):
        data = SecureCipher.pad(data)
        blocks = [data[i:i+BLOCK_SIZE]
                  for i in range(0, len(data), BLOCK_SIZE)]
        for block in blocks:
            c = SecureCipher(block)
            self.state = xor(self.state, c.encrypt(block))
    def digest(self):
        return self.state
    def hexdigest(self):
        return self.state.hex()
if __name__ == "__main__":
    plaintext = input("Message: ").encode()
    h = SecureHash(plaintext)
    print("\nDigest:")
    print(h.hexdigest())
    cipher = SecureCipher(plaintext)
    ct = cipher.encrypt(plaintext)
    print("\nCiphertext:")
    print(ct.hex())

Your task is to analyze the implementation, determine whether the cipher and hash construction are secure, and recover the original plaintext if a weakness exists. Study the source carefully and verify every assumption yourself.

Now, let’s try to solve the challenge. I’ll explain the solution in detail. how the encryption works, what vulnerability was discovered, which cryptanalysis methods were used to identify it, and how to build a solver to reverse the hash and recover the original plaintext.

How the Encryption Works

Let’s first understand what the hashing code does with our input and how the digest 1d6a2172025e1858754075123b6658532d1a4e775e1f43336e227d5a4529734f is generated.

You can see that the cipher uses a fixed block size of 32 bytes, which means each block is 256 bits long. Therefore, the input is processed in 32 byte blocks. If the input length is not exactly 32 bytes or is not a multiple of 32, the remaining bytes are filled using padding. We’ll discuss how the padding works later.

The code also defines a KEY_SBOX and a PBOX. Both are simply rearrangements of the numbers from 0 to 255. The KEY_SBOXis used during the substitution step, while the PBOX is used during the permutation step.

KEY_SBOX = [211, 243, 147, 179, …….., 44]

PBOX = [41, 214, 131, 48, …….., 124]

The cipher reads the input and uses a self keyed design, meaning that the plaintext itself is used as the encryption key.

First, a new SBox is derived by XORing the first byte of the key with every entry of the original KEY_SBOX. Since the key is the plaintext itself, the generated SBOX depends entirely on the input.

After deriving this new SBOX, each block goes through the following operations:

Substitute the bytes using the derived SBOX.
Permute the bits using the PBOX.
XOR the result with the key, which is again the plaintext.

These three operations constitute a single round, and the cipher repeats this process 100 times. The output produced after the 100th round becomes the final ciphertext.

SBOX Reconstruction, Substitution, Permutation, and XOR with the Key

Let’s discuss how the encryption process works in detail by breaking down each operation step by step.

Code Breakdown:

BLOCK_SIZE = 32 The cipher processes data in blocks of 32 bytes, which corresponds to 256 bits. Two large lookup tables are defined KEY_SBOX = [211, 243, 147, 179, …, 44]
PBOX = [41, 214, 131, 48, …, 124]. KEY_SBOX is used for byte substitution, while PBOX is used for bit permutation. An XOR helper function is also defined:

xor = lambda a, b: bytes([x ^ y for x, y in zip(a, b)])

This performs a byte by byte XOR operation between two byte strings.

Constructor:

The user supplied key is first padded to 32 bytes if necessary:

if len(key) < BLOCK_SIZE:
    pad = BLOCK_SIZE - len(key)
    key += bytes([pad]) * pad

self.key = key[:BLOCK_SIZE]

If the input is shorter than 32 bytes, the padding routine extends it to exactly 32 bytes. The number of missing bytes is calculated and that value is repeatedly appended until the block reaches 32 bytes.

For example, if the input length is 29 bytes:

32 - 29 = 3

the following padding is added:

0x03 0x03 0x03

A key dependent SBOX is then generated:

self.sbox = [KEY_SBOX[i] ^ self.key[0] for i in range(256)]

Each entry of KEY_SBOX is XORed with the first byte of the key, producing a different SBOX for every key. For example, suppose the original SBOX begins as: KEY_SBOX = [211, 243, 147, 179, …, 44]. Assume the input is: “sharafu”. Since the cipher uses the plaintext itself as the key, the first byte of the key is: ‘s’ ASCII value = 115. Therefore, every entry of KEY_SBOX is XORed with 115:

211 XOR 115 = 160
243 XOR 115 = 128
147 XOR 115 = 224
179 XOR 115 = 192
…
44 XOR 115 = 95

As a result, the reconstructed SBOX begins as: self.sbox = [160, 128, 224, 192, …, 95]

Substitution Layer:

def substitute(self, data):
    return bytes(self.sbox[b] for b in data)

Each byte of the input block is replaced using the derived SBOX.

Input byte -> S-Box lookup -> Output byte

Only byte values change, their positions remain unchanged.Suppose the input block begins with: [115, 104, 97, 114]

which corresponds to: “s h a r” s = 115 h = 104 a = 97 r = 114 The substitution layer performs: Take the byte value, Use it as an index into the SBOX, Replace the original byte with the value stored at that index. For example: Take first byte ‘s’ = 115. In sbox table count 0 to 115th poition The value is 90. So replace 115 with 90. After substitution: [115, 104, 97, 114] became [90, 17, 204, 88].

Permutation Layer:

def permute(self, data):
    for num in range(256):
        outnum = PBOX[num]

The permutation layer works at the bit level. Every input bit is moved to a new position according to PBOX.

For example:

PBOX[0] = 41

means:

Input bit #0 position -> Output bit #41 position

Bit values remain the same, but their positions are rearranged. After substitution, suppose we obtain: [90, 17, 204, 88]. These values are bytes, but the PBOX operates at the bit level. For example, the first byte: 90 = 01011010. Since each block contains 32 bytes, the total number of bits is: 32 × 8 = 256 bits. The PBOX specifies where each input bit should be moved: The PBOX operates on all 256 bits of the block. For example, consider the first byte 90 to bit = 01011010, which represents only the first 8 bits of the 256 bit block. The PBOX begins as [41, 214, 131, 48, …, 124], meaning that the first bit ( 0 ) is moved to position 41, the second bit ( 1 ) is moved to position 214, the third bit ( 0 ) is moved to position 131, and so on. The bit values themselves never change. only their positions are rearranged according to the PBOX. This process continues for all 256 bits, resulting in a heavily shuffled block and providing diffusion across the entire state.

Finally, the resulting 256 bit block is XORed with the key. In this challenge, the plaintext itself is used as the key, so the permuted block is XORed with the original input block. This key mixing step is performed at the end of every round before the next round begins.

Get Muhammad Ashraf Ali’s stories in your inbox

Join Medium for free to get updates from this writer.

Remember me for faster sign in

Encryption Routine:

for _ in range(100):
    block = self.substitute(block)
    block = self.permute(block)
    block = xor(block, self.key)

Each block undergoes 100 rounds consisting of:

Substitution
      ↓
Permutation
      ↓
XOR with Key

The output of the final round becomes the ciphertext.

Hash Construction:

self.state = b"\x00" * BLOCK_SIZE

for block in blocks:
    c = SecureCipher(block)
    self.state = xor(self.state, c.encrypt(block))

The hash state starts as 256 zero bits.

For each block:

The plaintext block itself is used as the encryption key.
The block is encrypted.
The result is XORed into the running hash state.

The final state is returned as the digest. In this challenge, the resulting digest is: 1d6a2172025e1858754075123b6658532d1a4e775e1f43336e227d5a4529734f

The Weakness: The Vulnerability

The core weakness of this cipher is that every component is either linear or affine over GF(2). Since the substitution layer, permutation layer, and key mixing step all preserve linearity, repeating them 100 times does not improve security. Instead, the entire 100 round cipher collapses into a single affine transformation, which can be modeled and inverted efficiently using linear algebra.

What is GF(2)?

GF stands for Galois Field. GF(2) is the simplest possible field and contains only two elements: 0 and 1. All arithmetic is performed modulo 2, meaning addition is equivalent to XOR.

0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 0

while multiplication follows the usual binary rules:

0 × 0 = 0
0 × 1 = 0
1 × 0 = 0
1 × 1 = 1

Since computers represent data as bits, cryptographic algorithms are often analyzed over GF(2).

Linear vs. Nonlinear Functions

A function is considered linear over GF(2) if it satisfies: f(a XOR b) = f(a) XOR f(b) for every possible pair of inputs. For example, the function: f(x) = x so f(5 XOR 3) = 5 XOR 3 = 6 f(6) = 6 and f(5) XOR f(3) = 5 = 5 XOR 3 =3 = 5 XOR 3 = 6 Since both sides are equal, the function satisfies the linearity property. a linear function means that it does not matter whether we apply the function before or after combining the inputs. In other words, “apply the function first, then XOR” and “XOR first, then apply the function” always produce the same result.

Modern ciphers intentionally introduce strong nonlinear components because linear systems can be solved efficiently. For example, AES relies heavily on its nonlinear SBOX.

What Is an Affine Function?

An affine function is simply a linear function followed by the addition of a constant. Mathematically, it is written as: f(x) = Mx XOR c. where M is a matrix and C is a constant vector. For example: f(x) = x XOR 1010. The XOR with a constant means the function is no longer strictly linear, but it still retains a highly structured form. Affine functions are therefore still easy to analyze and manipulate using linear algebra.

Identifying the Vulnerability

The first step was analyzing the SBOX. The SBOX is expected to introduce nonlinearity. we tested whether: S(a XOR b) = S(a) XOR S(b) XOR S(0).

Every possible pair was tested. This formula comes directly from the definition of an affine function. Suppose a function is affine: f(x) = Mx XOR c. where M is a matrix and c is a constant vector. Evaluating f(a XOR b) gives: f(a XOR b) = M(a XOR b) XOR c = Ma XOR Mb XOR c.

On the other hand: f(a) XOR f(b) = (Ma XOR c) XOR (Mb XOR c) = Ma XOR Mb. since cXOR c = 0. We are missing one copy of the constant c. To recover it, we evaluate the function at zero: f(0) = M0 XOR c = c because M0 = 0. Therefore, for every affine function: f(a XOR b) = f(a) XOR f(b) XOR f(0). This is why we test the SBOX using S(0) when checking whether it is affine. For affine functions, this equation must always be true.

To verify whether the SBOX is affine, every possible pair of input bytes was tested. Since an SBOX operates on 8 bit values, there are 256 x 256 = 65536 possible pairs (a, b). The test returned True for all 65,536 cases, meaning that the affine property holds for every possible input pair. Therefore, the entire SBOX is affine over GF(2)

Why is the PBOX Linear?

The PBOX is linear because it only rearranges the positions of bits without changing their values. For example, given an input such as 10110010, the PBOX may move bit 0 to position 5, bit 1 to position 7, and so on, but the actual bit values (0 or 1) remain unchanged. Since no arithmetic is performed and only positions are shuffled, the permutation always preserves XOR operations, meaning P(A XOR B) = P(A) XOR P(B). Therefore, the PBOX is a linear transformation over GF(2).

Why Doesn’t 100 Rounds Improve Security?

One might expect that performing 100 rounds would make the cipher secure, but this is not the case. Each round consists of substitution, permutation, and XOR with the key, all of which are either affine or linear operations. The composition of affine transformations remains affine, meaning Affine * Affine * Affine is still just an affine transformation. Therefore, even after repeating the round function 100 times, the entire cipher still collapses into a single affine transformation over GF(2).

Mathematical Representation

Since every component of the cipher is either linear or affine, the entire 100-round construction can be represented as a single affine transformation: E(x) = Mx XOR c. where x is the 256 bit plaintext vector, M is a 256 x 256 binary matrix, and c is a constant vector. In other words, the complete cipher behaves like one large algebraic equation.

Understanding the Matrix

A matrix simply describes how input bits influence output bits. For example, consider:

M =
1 0 1
0 1 1
1 1 0

If the input is x = [1 0 1], matrix multiplication over GF(2) computes the output bits as bit0 XOR bit2, bit1 XOR bit2, and bit0 XOR bit1. Thus, the matrix completely captures how every input bit affects the ciphertext.

Recovering the Matrix

Because the cipher satisfies E(x) = Mx XOR c, an attacker only needs to recover M and c. Encrypting the all zero plaintext immediately reveals the constant since M0 = 0, giving E(0) = 0. The attacker then encrypts the 256 basis vectors (1000…, 0100…, 0010…, …) all possible combinations in 256 bits. each containing exactly one bit set. After XORing each resulting ciphertext with c, the remaining value directly reveals one column of M. Repeating this process for all 256 basis vectors reconstructs the entire 256 x 256 matrix.

Building the Exploit

So, let’s build the exploit. First, we verify the linearity of the SBOX by testing whether S(a XOR b) = S(a) XOR S(b) XOR S(0) holds for all possible input pairs. If the property holds for all 65536 combinations, the SBOX is confirmed to be affine. Next, we reconstruct the affine representation by encrypting the all zero vector (0…0) and each basis vector (100..0, 010..0, 001..0, … ) which allows us to recover both the matrix M and the constant vector c. Using Gaussian elimination over GF(2), we then solve the resulting linear system to recover the plaintext. Given a ciphertext y, we solve Mx = y XOR c to obtain x, which directly yields the plaintext. No brute force is required, as the entire encryption collapses into solving a system of linear equations. The cipher completely lacks a nonlinear component. Because the SBOX itself is affine over GF(2), the entire 100 round construction remains affine and can be modeled, reconstructed, and inverted using linear algebra. This allows complete plaintext recovery without knowledge of the secret key.

Now that we understand the vulnerability, we can construct a solver that uses the SBOX and PBOX tables to recover the original plaintext. In this writeup, however, I'll provide a complete recovery script specifically for this hashing scheme, which can be used to recover the plaintext corresponding to the digests created with this hashing scheme.The solver reconstructs the affine transformation implemented by the cipher and then solves the resulting system of linear equations over GF(2) to recover the original message.

Run the script and provide the target digest as input.

Press enter or click to view image in full size

Now you can see that the hash has been successfully reversed and the original plaintext has been recovered.

Press enter or click to view image in full size

recovered plain text

This challenge is inspired by the HTB challenge “Always Has Been”. However, since the challenge is still active and continues to award points,

I cannot publish a direct writeup or solver, as doing so would violate HTB’s Terms of Service. Instead, I reconstructed the challenge with minimal modifications and wrote this explanation to demonstrate the underlying vulnerability and attack methodology for educational purposes.

Happy Hacking

文章来源: https://infosecwriteups.com/cryptanalysis-recovering-an-affine-encryption-scheme-using-gf-2-linear-algebra-ef21a98880b6?source=rss----7b722bfd1b8d---4
如有侵权请联系:admin#unsafe.sh