Collatzeral Damage: Bitwise and Proof Foolish
2025-01-06 · Author: soatok.blog

Let’s talk about the Collatz Conjecture, which is like mathematicians’ original version of this programmer joke:

Dear programmer:
When I wrote this code, only god and I knew how it worked. Now, only god knows it!

Therefore, if you are trying to optimize this routine and it fails (most surely), please increment this counter as a warning for the next person:

total_hours_wasted_here = 254
Except the number of mathematician hours wasted is much larger, possibly too large for uint32_t to hold it.

The Collatz conjecture is an infamous trap for the young and ambitious. Despite its simple construction, it has evaded proofs and general solutions for nearly a century. Veritasium made a video about this conjecture, which I recommend:

The Collatz conjecture involves a recursive function that contains one branch: If a number is odd, multiply it by 3 then add 1. If it is even, divide it by 2.

The conjecture states that repeating this operation will eventually reach 1 for all numbers.

You can write recursive code that implements the Collatz function like so:

function collatz(num) {
  console.log(num);
  if (num === 1) {
    return;
  }
  return (num % 2 === 1) 
    ? collatz((3 * num) + 1)
    : collatz(num / 2);
}

If the Collatz conjecture is false, there is some integer for which the return statement will never be reached.

We don’t know if the conjecture is true or not.

We do know that it has held up for a hell of a lot of integers (from a human perspective), and we have yet to find a counterexample, but we don’t know whether it’s necessarily true for all integers.

What if there’s actually a cycle somewhere (similar to what I discussed in the context of hash functions)?

That mathematicians don’t know the answer isn’t really interesting for the readers of this blog, but why the answer is so elusive (despite the intuitive simple construction of the function central to the Collatz conjecture) is something I think we can say something interesting about.

But first, let’s talk about a class of cryptographic algorithm that serves as the building block for several types of hash functions and stream ciphers used across the Internet today.

Important

I am taking a lot of liberties in this blog post, and I am prioritizing clarity over technical precision.

Readers will be better served by cross-referencing this entertainment-focused blog post with the work of actual mathematicians.

And for the pedants in the audience: if something seems imprecise, it’s probably because I made a trade-off to help a wider audience gain a basic intuition.

Add, Rotate, XOR (ARX)

ARX is a category of cryptographic algorithms used to build various cryptographic building blocks. The SHA-2 family of hash functions and the ChaCha stream cipher are both ARX constructions, and each secures a great deal of Internet traffic.

Let’s focus on ChaCha for the moment, focusing on the reference implementation that ships with libsodium:

#define U32C(v) (v##U)

#define U32V(v) ((uint32_t)(v) & U32C(0xFFFFFFFF))

#define ROTATE(v, c) (ROTL32(v, c))
#define XOR(v, w) ((v) ^ (w))
#define PLUS(v, w) (U32V((v) + (w)))
#define PLUSONE(v) (PLUS((v), 1))

#define QUARTERROUND(a, b, c, d) \
    a = PLUS(a, b);              \
    d = ROTATE(XOR(d, a), 16);   \
    c = PLUS(c, d);              \
    b = ROTATE(XOR(b, c), 12);   \
    a = PLUS(a, b);              \
    d = ROTATE(XOR(d, a), 8);    \
    c = PLUS(c, d);              \
    b = ROTATE(XOR(b, c), 7);

At the core of ChaCha is the quarter round function. This is applied on alternating columns and diagonals of the input state until the desired number of rounds has been completed.

for (i = 20; i > 0; i -= 2) {
    QUARTERROUND(x0, x4, x8, x12)
    QUARTERROUND(x1, x5, x9, x13)
    QUARTERROUND(x2, x6, x10, x14)
    QUARTERROUND(x3, x7, x11, x15)
    QUARTERROUND(x0, x5, x10, x15)
    QUARTERROUND(x1, x6, x11, x12)
    QUARTERROUND(x2, x7, x8, x13)
    QUARTERROUND(x3, x4, x9, x14)
}

After all rounds are complete, the initial state is added to the output. This 512-bit state includes the key (which consists of up to 256 bits), nonce, and some constant values. Because half of the input bytes are your secret key, an attacker without knowledge of the key cannot invert the calculation.

ChaCha is an improvement on the eSTREAM finalist Salsa20, a stream cipher from the same family. ChaCha improved both the diffusion per round and the performance. This makes ChaCha more resistant to cryptanalysis, even in extremely reduced-round variants (e.g., ChaCha8 vs ChaCha20).

As interesting as all that is, the important bit to know is that the ChaCha update emphasized improving diffusion.

What does that mean, exactly?

ARX Diffusion

ARX consists of three operations: Addition, Rotation (sliding bits around like a flywheel), and eXclusive OR (also known as XOR).

Comparing Salsa20 and ChaCha’s quarter round, using the notation from the source code on Wikipedia, you see:

Salsa20 Quarter Round

b ^= (a + d) <<<  7;
c ^= (b + a) <<<  9;
d ^= (c + b) <<< 13;
a ^= (d + c) <<< 18;

Addition then rotation then XOR.

ChaCha Quarter Round

a += b; d ^= a; d <<<= 16;
c += d; b ^= c; b <<<= 12;
a += b; d ^= a; d <<<=  8;
c += d; b ^= c; b <<<=  7;

Addition then XOR then rotation.

Each step of the quarter round function still involves addition, rotation, and XOR, but their usage is different. (Also, ChaCha updates each value in place, rather than routing everything through an extra temporary value that implicitly occupies a register.)

And it’s subtle, but if you play with these different quarter rounds using slightly different inputs, you can see how the second construction achieves better diffusion in fewer rounds.
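If you want to play with this yourself, here’s a minimal Python sketch (my own, following the Wikipedia pseudocode above) that applies each quarter round once and counts how many output bits change when a single input bit is flipped. A single application of one quarter round is only a toy measurement, not a rigorous diffusion metric:

```python
MASK = 0xFFFFFFFF  # work in 32-bit words

def rotl(v, c):
    """Rotate a 32-bit word left by c bits."""
    return ((v << c) | (v >> (32 - c))) & MASK

def salsa_qr(a, b, c, d):
    # Salsa20: add, then rotate, then XOR
    b ^= rotl((a + d) & MASK, 7)
    c ^= rotl((b + a) & MASK, 9)
    d ^= rotl((c + b) & MASK, 13)
    a ^= rotl((d + c) & MASK, 18)
    return a, b, c, d

def chacha_qr(a, b, c, d):
    # ChaCha: add, then XOR, then rotate
    a = (a + b) & MASK; d = rotl(d ^ a, 16)
    c = (c + d) & MASK; b = rotl(b ^ c, 12)
    a = (a + b) & MASK; d = rotl(d ^ a, 8)
    c = (c + d) & MASK; b = rotl(b ^ c, 7)
    return a, b, c, d

def flipped_bits(qr, state, bit):
    """Count output bits that differ after flipping one input bit of word 0."""
    base = qr(*state)
    tweaked = list(state)
    tweaked[0] ^= 1 << bit
    other = qr(*tweaked)
    return sum(bin(x ^ y).count("1") for x, y in zip(base, other))

state = (0x61707865, 0x3320646E, 0x79622D32, 0x6B206574)
print(flipped_bits(salsa_qr, state, 0), flipped_bits(chacha_qr, state, 0))
```

Running it with different states and bit positions gives a feel for how a one-bit change spreads through each variant.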

“Why does diffusion matter?”

Bit diffusion in ARX constructions is one of the ways that ciphers ensure their output remains indistinguishable from a random oracle.

If you’ve ever looked at a cryptographic hash function before, or heard about the “avalanche effect”, that’s precisely what we want out of these ARX constructions.

“So what?”

As some of you might remember from your studies, XOR is just addition without carry (mod 2).

If you repeat the same experimentation but drop one of the operations (using only AR or RX), you’ll find that your diffusion is poor.

This is because addition is an abstraction that hides a very important feature that’s often taken for granted.

Carry Propagation

Let’s say, for a learning exercise, you wanted to build integer addition entirely out of bitwise operators: AND, OR, NOT, XOR, and the left and right bit shift operators.

As already mentioned above, XOR is just addition without carry. So that part’s easy:

def add_bits_no_carry(x, y):
    return x ^ y
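To see why this is “easy,” note that wherever the two operands have no set bits in common, there is nothing to carry, and XOR really is addition. A quick check (my own illustration):

```python
def add_bits_no_carry(x, y):
    # XOR: addition without carry
    return x ^ y

# no overlapping bits: nothing to carry, so XOR agrees with +
assert add_bits_no_carry(0b1010, 0b0101) == 0b1010 + 0b0101 == 0b1111

# overlapping bits: the carry is silently dropped
assert add_bits_no_carry(0b1, 0b1) == 0   # 1 + 1 should be 0b10
```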

How about carrying values to the next place? Well, consider the following table:

| X | Y | Calculated Carry Value |
|---|---|------------------------|
| 0 | 0 | 0                      |
| 1 | 0 | 0                      |
| 0 | 1 | 0                      |
| 1 | 1 | 1                      |

That third column sure looks like an “AND” operator, does it not?

Great, but what if you had a carry value from the previous step?

Well, now you have to chain two half-adders together into a full adder: one adds the two input bits, and the other adds the incoming carry to that partial sum. The carry outputs of both combine to produce the next output carry.

def half_adder(x, y):
    # returns [sum without carry, carry]
    return [x ^ y, x & y]

def add_bits(x, y, c_in):
    [a, b] = half_adder(x, y)     # add the two input bits
    [d, e] = half_adder(a, c_in)  # fold in the incoming carry
    return [d, b ^ e]             # b and e are never both 1, so XOR == OR here
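To watch carry propagation in action, you can chain that full adder across bit positions into a ripple-carry adder. This is my own sketch (the function names are made up), built only from bitwise operators and shifts:

```python
def half_adder(x, y):
    # (sum, carry) for two single bits; same idea as above
    return x ^ y, x & y

def full_adder(x, y, c_in):
    s1, c1 = half_adder(x, y)       # add the two input bits
    s2, c2 = half_adder(s1, c_in)   # fold in the incoming carry
    return s2, c1 | c2              # at most one of c1, c2 is ever set

def ripple_add(x, y, width=32):
    """Add two nonnegative ints one bit at a time, propagating the carry."""
    result, carry = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result

assert ripple_add(6, 7) == 13
```

The carry computed at each position feeds the next one, which is exactly the hidden state that plain XOR doesn’t have.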

If you feel lost, this hardware tutorial explains it with diagrams.

The main thing I want you to take away is that addition is much more complicated than XOR because of carry propagation.

Soatok with nerd glasses held together by scotch tape, with a speech bubble with "I did the math"
Original sticker made by CMYKat
(Poor edits made by me)

On Computation and Information Theory

We use XOR to mix data (which could be plaintext, or could be all zeroes) with pseudo-random bytes, since it’s perfectly hiding so long as the bytes we’re mixing them with are unknown. This is the intuition underlying one-time pads and modern stream ciphers (including the ones we’re discussing).

In the context of ARX, because some operations (addition) propagate carries and others don’t (XOR), when you combine these steps with rotating the bits in-place, it becomes very easy to mix the output bits in a short number of rounds of operations. Cryptographers measure how well bits are mixed across a large number of inputs and reject designs that don’t perform well (generally speaking).

But a direct consequence of the hidden complexity of addition with carry is that the state you’re operating within is larger than the output. This means that some information is used (carried over from previous bits or limbs) that is not revealed directly in the output bit(s).

It’s easy to add two numbers together, but given only the sum, you cannot determine either of the numbers that produced it unless you already know the other. (Unless, of course, a side-channel leaks some of these carry values through power usage or timing information.)

“That’s neat and all, but what does it imply?”

Don’t worry, I’m going somewhere with this.

Turing the Page

Let’s briefly talk about Turing machines.

The relevant Wikipedia article covers them adequately well. For everyone else, another Veritasium video:

A Turing machine is a mathematical model for computation.

The basic idea is that you have a tape of symbols, a head that reads from the tape, and an internal state that determines the next move.

We don’t need too formal of a treatment here. I’m not exactly trying to prove the halting problem is undecidable.

A dumb joke I like to tell my computer science friends:

I’ve solved the Halting problem! It’s called: “the heat death of the universe,” at which point the program fucking halts!

But do put a pin in this, because it will come up towards the end.

Bitwise Collatz Functions

Above, I wrote a bit of code that implements the Collatz function, but I was a bit lazy about it.

In truth, you don’t need multiplication or the modulo operator. You can, instead, use bitwise operations and one addition.

  • Dividing by 2 (when the least significant bit is 0) is the same as right-shifting by 1.
  • Multiplying by 3 then adding 1 (when the least significant bit is 1) can be rewritten as the following steps:
    • Left shift by 1 (2n)
    • Set the lower bit to 1 (+1), using bitwise OR
    • Add the original number (+n)

Thus, our function instead looks like:

function collatz(num) {
  // Caveat: JS bitwise operators truncate to 32 bits; use BigInt for large inputs.
  console.log(num);
  if (num === 1) {
    return;
  }
  return (num & 1)
    ? collatz(((num << 1) | 1) + num)
    : collatz(num >> 1);
}

Suddenly, the discussion above about carry propagation might seem a lot more relevant!

Small Example

Imagine you encode a number as a binary string. For example, 257.

When you work through the algorithm sketched out above, you end up doing this:

       n == 0001_0000_0001
     2*n == 0010_0000_0010 # left shift by 1
 2*n + 1 == 0010_0000_0011 # bitwise OR with 1

       add: 0001_0000_0001 # n
            0010_0000_0011 # 2n + 1
            # This is where carry propagation comes in!

    result: 0011_0000_0100

When you perform the 3n+1 branch of the Collatz function the way I constructed it, that last addition of n will propagate carries.
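You can sanity-check this bitwise rewrite against plain arithmetic in Python, where integers are arbitrary-precision and nothing truncates:

```python
def bitwise_3n_plus_1(n):
    # left shift (2n), set the low bit (2n + 1), then add n (3n + 1)
    return ((n << 1) | 1) + n

assert bitwise_3n_plus_1(257) == 772 == 3 * 257 + 1

# The identity holds for any positive n, because (n << 1) always has a 0
# in its low bit, so the bitwise OR with 1 is the same as adding 1.
assert all(bitwise_3n_plus_1(n) == 3 * n + 1 for n in range(1, 1000))
```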

And that carry propagation is where the trouble starts.

Collatz Machines

The input and output of the Collatz function are integers of arbitrary size. The behavior branches depending on the least significant bit of the input.

You can think of the least significant bit as the “head” of a machine similar to a Turing machine.

However, instead of moving the head along a tape, the Collatz function does one of two things:

  1. Moves the symbols on the tape one space to the right (somewhat familiar territory for Turing Machines).
  2. Rewrites all of the symbols on the tape to the left of the head, according to some algorithm. This algorithm makes the tape longer.
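To make that concrete, here is a toy sketch (my own, not a formal construction) of the Collatz function as a machine operating on a binary tape, with the head parked at the least significant symbol:

```python
def collatz_tape_step(tape):
    """One Collatz step on a binary string 'tape' (most significant bit first).

    The 'head' only ever reads the last (least significant) symbol.
    """
    if tape[-1] == "0":
        return tape[:-1]            # even: drop the last symbol (divide by 2)
    n = int(tape, 2)                # odd: rewrite the entire tape...
    return bin(3 * n + 1)[2:]       # ...and the tape gets longer

tape = bin(27)[2:]
while tape != "1":                  # halts iff the conjecture holds for 27
    tape = collatz_tape_step(tape)
```

The even branch is a tidy, local tape operation; the odd branch rewrites every symbol at once, because carries can ripple all the way across the tape.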

As we observed previously, the carry propagation implicit to addition makes the bits diffuse in a way that’s hard to generalize faster than simply performing the addition and seeing what results from it.

Proving that this Collatz machine halts for all inputs would also prove the Collatz Conjecture. But as we saw with proper Turing Machines, this might not be possible.

Is It Unsolvable?

With all this in mind, in the general case, the Collatz Conjecture may very well one day prove to be as undecidable as the Halting Problem.

Or, maybe someone will find a cycle within the integer space that fails to ever reach 1.

As it stands right now, there have been a lot of interesting approaches to try to solve it. The first Veritasium video linked above talked about some of these ideas.

Maybe we need new mathematical tools first. Or perhaps the Langlands program will uncover a relationship between currently unrelated areas of mathematical research that yields an answer to this nearly century-old conjecture.

Either way, I hope you find this topic… mildly interesting. Enough to appreciate the problem, not so much that you think you can solve it yourself.

Stay safe, don’t drink and derive, and happy hacking.


Header art: AJ and CMYKat.


Source: https://soatok.blog/2025/01/06/collatzeral-damage-bitwise-and-proof-foolish/