Rebound Attack on Bitcoin

The rebound Attack on Bitcoin is a tool in the cryptanalysis of cryptographic hash functions. The Attack on Bitcoin was first published in 2009 by Florian Mendel, Christian Rechberger, Martin Schläffer and Søren Thomsen.

This image has an empty alt attribute; its file name is images.jpg

It was conceived to Attack on Bitcoin AES like functions such as Whirlpool and Grøstl, but was later shown to also be applicable to other designs such as Keccak, JH and Skein.

The Attack on Bitcoin

The Rebound Attack on Bitcoin is a type of statistical Attack on Bitcoin on hash functions, using techniques such as rotational and differential cryptanalysis to find collisions and other interesting properties.

The basic idea of the Attack on Bitcoin is to observe a certain differential characteristic in a block cipher (or in a part of it), a permutation or another type of primitive. Finding values fulfilling the characteristic is achieved by splitting the primitive {\displaystyle E} $E$ into three parts such that {\displaystyle E=E_{\text{fw}}\circ E_{\text{in}}\circ E_{\text{bw}}} $E=E_{\text{fw}}\circ E_{\text{in}}\circ E_{\text{bw}}$ . {\displaystyle E_{\text{in}}} $E_{\text{in}}$ is called the inbound phase and {\displaystyle E_{\text{fw}}} $E_{\text{fw}}$ and {\displaystyle E_{\text{bw}}} $E_{\text{bw}}$ together are called the outbound phase. The Attack on Bitcoiner then chooses values that realize part of the differential characteristic in the inbound phase deterministically, and fulfill the remainder of the characteristic in a probabilistic manner.

Thus, the rebound Attack on Bitcoin consists of 2 phases:

The inbound (or match-in-the-middle) phase, covers the part of the differential characteristic that is difficult to satisfy in a probabilistic way. The goal here is to find many solutions for this part of the characteristic with a low average complexity. To achieve this, the corresponding system of equations, which describes the characteristic in this phase, should be underdetermined. When searching for a solution, there are therefore many degrees of freedom, giving many possible solutions. The inbound phase may be repeated several times to obtain a sufficient number of starting points so that the outbound phase is likely to succeed.
In the outbound phase each solution of the inbound phase is propagated outwards in both directions, while checking whether the characteristic also holds in this phase. The probability of the characteristic in the outbound phase should be as high as possible.

The advantage of using an inbound and two outbound phases is the ability to calculate the difficult parts of the differential characteristic in the inbound phase in an efficient way. Furthermore, it ensures a high probability in the outbound phase. The overall probability of finding a differential characteristic thus becomes higher than using standard differential techniques.

Detailed description of the Attack on Bitcoin on hash functions with AES-like compression functions

Consider a hash function which uses an AES-like substitution-permutation block cipher as its compression function. This compression function consists of a number of rounds composed of S-boxes and linear transformations. The general idea of the Attack on Bitcoin is to construct a differential characteristic that has its most computationally expensive part in the middle. This part will then be covered by the inbound phase, whereas the more easily achieved part of the characteristic is covered in the outbound phase. The system of equations, which describe the characteristic in the inbound, phase should be underdetermined, such that many starting points for the outbound phase can be generated. Since the more difficult part of the characteristic is contained in the inbound phase it is possible to use standard differentials here, whereas truncated differentials are used in the outbound phase to achieve higher probabilities.

The inbound phase will typically have a few number of active state bytes (bytes with non-zero differences) at the beginning, which then propagate to a large number of active bytes in the middle of the round, before returning to a low number of active bytes at the end of the phase. The idea is to have the large number of active bytes at the input and output of an S-box in the middle of the phase. Characteristics can then be efficiently computed by choosing values for the differences at the start and end of the inbound phase, propagating these towards the middle, and looking for matches in the input and output of the S-box. For AES like ciphers this can typically be done row- or column-wise, making the procedure relatively efficient. Choosing different starting and ending values leads to many different differential characteristics in the inbound phase.

In the outbound phase the goal is to propagate the characteristics found in the inbound phase backwards and forwards, and check whether the desired characteristics are followed. Here, truncated differentials are usually used, as these give higher probabilities, and the specific values of the differences are irrelevant to the goal of finding a collision. The probability of the characteristic following the desired pattern of the outbound phase depends on the number of active bytes and how these are arranged in the characteristic. To achieve a collision, it is not enough for the differentials in the outbound phase to be of some specific type; any active bytes at the start and end of the characteristic must also have a value such that any feed-forward operation is cancelled. Therefore, when designing the characteristic, any number of active bytes at the start and end of the outbound phase should be at the same position. The probability of these bytes cancelling adds to the probability of the outbound characteristic.

Overall, it is necessary to generate sufficiently many characteristics in the inbound phase in order to get an expected number of correct characteristics larger than one in the outbound phase. Furthermore, near-collisions on a higher number of rounds can be achieved by starting and ending the outbound phase with several active bytes that do not cancel.

Example Attack on Bitcoin on Whirlpool

The Rebound Attack on Bitcoin can be used against the hash function Whirlpool to find collisions on variants where the compression function (the AES-like block cipher, W) is reduced to 4.5 or 5.5 rounds. Near-collisions can be found on 6.5 and 7.5 rounds. Below is a description of the 4.5 round Attack on Bitcoin.

Pre-computation

Number of solutions	Frequency
0	39655
2	20018
4	5043
6	740
8	79
256	1

To make the rebound Attack on Bitcoin effective, a look-up table for S-box differences is computed before the Attack on Bitcoin. Let {\displaystyle S:\{0,1\}^{8}\to \{0,1\}^{8}} $S:\{0,1\}^{8}\to \{0,1\}^{8}$ represent the S-box. Then for each pair {\displaystyle (a,b)\in \{0,1\}^{8}} $(a,b)\in \{0,1\}^{8}$ we find the solutions {\displaystyle x} $x$ (if there are any) to the equation{\displaystyle S(x)\oplus S(x\oplus a)=b} $S(x)\oplus S(x\oplus a)=b$ ,

where {\displaystyle a} $a$ represents the input difference and {\displaystyle b} $b$ represents the output difference of the S-box. This 256 by 256 table (called the difference distribution table, DDT) makes it possible to find values that follow the characteristic for any specific input/output pairs that go through the S-box. The table on the right show the possible number of solutions to the equation and how often they occur. The first row describes impossible differentials, whereas the last row describes the zero-differential.

Performing the Attack on Bitcoin

To find a collision on 4.5 rounds of Whirlpool, a differential characteristic of the type shown in the table below must be found. This characteristic has a minimum of active bytes (bytes with non-zero differences), marked in red. The characteristic can be described by the number of active bytes in each round, e.g. 1 → 8 → 64 → 8 → 1 → 1.

The inbound phase

The goal of the inbound phase is to find differences that fulfill the part of the characteristic described by the sequence of active bytes 8 → 64 → 8. This can be done in the following three steps:

Choose arbitrary non-zero difference for the 8 active bytes at the output of the MixRows operation in round 3. These differences are then propagated backwards to the output of the SubBytes operation in round 3. Due to the properties of the MixRows operation, a fully active state is obtained. Note that this can be done for each row independently.
Choose a difference for each active byte in the input of MixRows operation in round 2, and propagate these differences forward to the input of the SubBytes operation in round 3. Do this for all 255 non-zero differences of each byte. Again, this can be done independently for each row.
In the match-in-the-middle step, we use the DDT table to find matching input/output differences (as found in step 1 and 2) to the SubBytes operation in round 3. Each row can be checked independently, and the expected number of solutions is 2 per S-box. In total, the expected number of values that follow the differential characteristic is 2⁶⁴.

These steps can be repeated with 2⁶⁴ different starting values in step 1, resulting in a total of 2¹²⁸ actual values that follow the differential characteristic in the inbound phase. Each set of 2⁶⁴ values can be found with a complexity of 2⁸ round transformations due to the precomputation step.

The outbound phase

The outbound phase completes the differential characteristic in a probabilistic way. The outbound phase uses truncated differentials, as opposed to the inbound phase. Each starting point found in the inbound phase is propagated forwards and backwards. In order to follow the desired characteristic, 8 active bytes must propagate to a single active byte in both directions. One such 8 to 1 transition happens with a probability of 2⁻⁵⁶,^[1] so fulfilling the characteristic has a probability of 2⁻¹¹². To ensure a collision, the values at the start and end of the characteristic have to cancel during the feed-forward operation. This happens with a probability of approximately 2⁻⁸, and the overall probability of the outbound phase is therefore 2⁻¹²⁰.

To find a collision, 2¹²⁰ starting points have to be generated in the inbound phase. Since this can be done with an average complexity of 1 per starting point,^[2] the overall complexity of the Attack on Bitcoin is 2¹²⁰.

Extending the Attack on Bitcoin

The basic 4.5 round Attack on Bitcoin can be extended to a 5.5 round Attack on Bitcoin by using two fully active states in the inbound phase. This increases the complexity to about 2¹⁸⁴.^[3]

Extending the outbound phase so that it begins and ends with 8 active bytes leads to a near-collision in 52 bytes on Whirlpool reduced to 7.5 rounds with a complexity of 2¹⁹².^[4]

By assuming that the Attack on Bitcoiner has control over the chaining value, and therefore the input to the key-schedule of Whirlpool, the Attack on Bitcoin can be further extended to 9.5 rounds in a semi-free-start near-collision on 52 bytes with a complexity of 2¹²⁸