Block cipher

A block cipher is a symmetric cipher that operates on fixed-size blocks of plaintext, giving a block of ciphertext for each. The other main type of cipher is a stream cipher, which generates a stream of keying material to be mixed with messages. Block ciphers can be used in various modes when multiple block are to be encrypted.

There is an extensive literature both on the design of block ciphers and on methods of attacking them. For the design, see below and articles on specific ciphers. For the attacks, see cryptanalysis and articles on specific attacks. A useful web index of current work is the Block Cipher Lounge.

Among the best-known and most widely used block ciphers are two US government standards. The Data Encryption Standard from the 1970s is now considered obsolete; the  Advanced Encryption Standard replaced it in 2002. Both of these ciphers, and others, are discussed in more detail below.

Common techniques
Some of the principles discussed below for block ciphers also apply to other cryptographic primitives. Hash algorithms generally use iteration and require avalanche In both hashes and stream ciphers, non-linearity is an important design criterion; s-boxes can be used in either.

Iterated block ciphers
Nearly all block ciphers use iteration; define some relatively simple transformation and apply it repeatedly to create the cipher. At setup time the primary key undergoes key scheduling giving a number of round keys. The actual cipher then has multiple rounds, each applying the same transformation to the output of the previous round using the round key for the current round.

Some ciphers have an additional step called whitening; additional material derived from the key is mixed into the plaintext before the first round, or into the ciphertext after the last, or both.

There is a trade-off that can be made in the design. With a simple fast round function many rounds may be required to achieve adequate security. A more complex round function might allow fewer rounds but, since each round is then more expensive, there may be little or no gain in overall speed. Secure and reasonably efficient ciphers can be designed either way; examples below include IDEA with only 8 rounds but some fairly expensive math in each round, and  GOST with a very simple round function but 32 rounds. A safety margin is often applied; with a round function thought to be secure after n rounds, the actual cipher may use considerably more than n rounds. When a block cipher is constructed from another cryptographic primitive, there may be no need to iterate because the other primitive provides adequate security. For example, RSA can be used as a block cipher with block size equal to the RSA modulus, and other public key techniques can be used in the same way. In effect, this is one extreme of the trade-off described in the previous paragraph; if the round function is itself cryptographically secure, then only one round is needed.

In cryptanalysis it is common to attack reduced round versions of a cipher. For example, instead of full 16-round DES, the analyst might start by trying to break a two-round or four-round version. Such attacks are easier and success there may lead to insights that are useful in work against the full cipher, or even to an attack that can be extended to break the full cipher. Ideally (for the analyst), a break of the two-round system might lead to an attack on a four-round system which in turn might break the eight-round version and finally all 16 rounds.

Avalanche
The designer wants changes to propagate through the cipher so that, for example, a single-bit change at round n affects all bits of the ciphertext by round n+x for some reasonably small x. Certainly x must be much less than the total number of rounds. If the round function design gives a large x then the cipher will need more rounds to be secure.

This was named the avalanche effect in a paper by Horst Feistel. The idea is that changes should build up like an avalanche, so that a tiny initial change quickly creates large effects. The term and its exact application were new, but the basic concept was not; avalanche is a variant of Claude Shannon's diffusion and that in turn is a formalisation of ideas that were already in use.

The strict avalanche criterion is a strong version of the requirement for good avalanche properties. Complementing any single bit of input should give a 50% chance of a change in any given bit of output.

Feistel structure
Many block ciphers use the Feistel structure, devised by Horst Feistel of IBM and used in DES. Such ciphers are known as Feistel ciphers. Each round uses a function F whose input and output are each half a block. Splitting the block into right and left halves and showing XOR as ^ and round key for round n as kn, even numbered rounds are then:

leftn = leftn-1 ^ F(rightn-1, kn) rightn = rightn-1

and odd-numbered rounds are

rightn = rightn-1 ^ F(leftn-1, kn) leftn = leftn-1

Since XOR is its own inverse and the half-block that is used in the F function is unchanged in each round, reversing this is straightforward. For example, the decryption step matching the first example above is:

leftn-1 = leftn ^ F(rightn, kn) rightn-1 = rightn

In some ciphers, all operations must be reversible so that decryption can work. In a Feistel cipher, the F function itself need not be reversible, only repeatable. This gives the designer extra flexibility; the F function can do almost anything.

A single round in a Feistel cipher has far from ideal avalanche properties; in our first example above, half the output bits (on the right) are unchanged and the leftn-1 input bits each affect only one output. However, the other half is changed in the next round and the F function can be designed so that small changes in its inputs (half-block or round key) produce large output changes. Within a few rounds, a Feistel cipher can have excellent overall avalanche properties.

The hard part of Feistel cipher design is of course the F function. Design goals include efficiency, easy implementation, and good avalanche properties. Also, it is critically important that the F-function be highly non-linear. All other operations in a Feistel cipher are linear and a cipher without enough non-linearity is weak; see the next section.

Non-linearity
To be secure, every block cipher must contain some non-linear operations. If all operations in a cipher were linear &mdash; in any algebraic system, with the attacker making the choice of system and allowed to try as many as he likes &mdash; then the cipher could be reduced to a system of linear equations. Any system of simultaneous linear equations, in any algebra, can be solved straightforwardly if the number of equations matches or exceeds the number of variables. The attacker need only plug in known plaintext/ciphertext pairs until that condition holds, then solve for the key.

For example, for a cipher with 64-bit blocks and a 128-bit key,the attacker could write 64 equations each expressing one output bit in terms of 64 inputs and 128 key bits. Plug in one known plaintext/ciphertext pair and he has 64 equations in 128 variables, not a soluble system. However, if he has a second known pair, that gives him a different set of 64 equations in the same 128 key variables. The total system is now 128 equations in 128 variables. If the equations are all linear, this is soluble by standard techniques. However, he also has the option of plugging in a third known pair to get a system with 192 equations in 128 variables if that is easier to solve, or going even further if that helps.

Solving non-linear systems of equations is far harder so the cipher designer strives to introduce non-linearity to the system. Combined with good avalanche properties, this makes algebraic analysis prohibitively difficult. There are several ways to do add non-linearity; some ciphers rely on only one while others use several.

One method is mixing operations from different algebras. If the cipher relies only on Boolean operations, the cryptanalyst can try to attack using Boolean algebra; if it uses only arithmetic operations, he can try normal algebra. If it uses both, he has a problem. Of course arithmetic operations can be expressed in Boolean algebra or vice versa, but the expressions are inconveniently (for the cryptanalyst!) complex and non-linear whichever way he tries it. For example, in the CAST-128 or Blowfish F function, it is necessary to combine four 32-bit words into one. This is not done with a straightforward x = a+b+c+d or x=a^b^c^d but instead with something like x = (a+b)^(c-d) in one round and x = (a^b)+(c^d) in another. On most computers, this costs no more but it may make the analyst's job harder.

Other operations can also be used, albeit at higher costs. IDEA uses multiplication modulo 216+1 and AES does matrix multiplications in a field.

Rotations, also called circular shifts, on words or registers are non-linear in normal algebra, though they are easily described in Boolean algebra. Some ciphers such as GOST use the same rotation by a constant amount in every round; it would be possible to use different constant rotations in different rounds. CAST-128 uses a key-dependent rotation in the F function. Ron Rivest used data-dependent rotations in RC-5 and RC-6.

A general operation for introducing non-linearity is the substitution box or s-box; see following section.

S-boxes
S-boxes or substitution boxes are just large look-up tables. The basic operation involved is a = sbox[b] which, at least for reasonable sizes of a and b, is easily done on any computer.

There is an entire literature on the design of good s-boxes, most of it emphasizing achieving high non-linearity.

Block cipher modes
Various modes of operation for block cipher usage were originally defined for DES in a US Federal Information Processing Standard (FIPS). The most recent NIST recommendations are in "Recommendation for Block Cipher Modes of Operation"

These modes can be applied to any block cipher.

Electronic Code Book, ECB
In Electronic Code Book mode, the cipher is just applied to each block of plaintext independently.

The disadvantage is that the same plaintext block always encrypts to the same ciphertext; this gives an enemy some information. ECB is therefore generally not used.

Cipher Block Chaining, CBC
In cipher block chaining mode, the ciphertext output from the previous block is XORed into the plaintext before encryption. Encryption of block n is then:

cn = encrypt( pn XOR cn-1)

For this to work for n=1, an initialisation vector (IV) must be provided to act as c0. This need not be secret, but it should be different for each message. If the same IV is repeatedly used, then if two or more messages start with the same text, they will encrypt identically for the first block or the first few blocks. This is an unnecessary weakness; using unique IVs is therefore standard practice.

DES
The Data Encryption Standard, DES, is among the the best known and most thoroughly analysed block ciphers. It was invented by IBM, and was made a US government standard for non-classified government data and for regulated industries such as banking, in the late 70s. From then until about the turn of the century, it was very widely used. However, it is now considered obsolete; its 56-bit key size makes it highly vulnerable to a brute force attack, given modern computers. Some applications still use Triple DES, a variant which applies DES three times with two or three different keys; see next section.

DES operates on 64-bit blocks and takes a 64-bit key. It is a Feistel cipher with 16 rounds and a 48-bit round key for each round, To generate the round keys, the 56-bit key is split into two 28-bit halves and those halves are circularly shifted after each round by one or two bits. Then 48 bits from them are selected and permuted to form the round key.

DES uses eight S-boxes, each 6 bits in and 4 out. The F function works as follows: expand the 32-bit input to 48 bits, simply by copying some bits twice XOR with the 48-bit round key split the result into 8 6-bit chunks pass each chunk through a different s-box, giving 32 output bits permute the output bits The permutation ensures rapid avalanche; a one-bit change in key affects one s-box; a one-bit change in the input block affects one or two s-boxes. With the permutation, changing the output of one s-box affects several in the next round. After a few rounds, the effect spreads to the entire output.

Every new cryptanalytic technique invented since DES became a standard has been tested against DES. None of them have broken it completely, but two &mdash; differential cryptanalysis and linear cryptanalysis &mdash; give attacks theoretically significantly better than brute force. This does not appear to have much practical importance since both require enormous numbers of known plaintexts and since DES has been repeatedly broken by brute force anyway. All the older cryptanalytic techniques have also been tried, or at least considered, for use against DES; none of them work.

The generation of block ciphers which followed DES in the 80s and 90s &mdash; such as GOST, Blowfish, CAST-128 and IDEA (see below for all) &mdash; nearly all used 64-bit blocks, like DES, but all used 128-bit or longer keys for better resistance to brute force. Some of their design principles came from analysis of DES; most are designed to be immune to both differential an linear cryptanalysis.

Triple DES
DES has only a 56-bit key, so it is vulnerable to a brute force attack. However the basic design seems very strong; it has withstood decades of intensive analysis without any catastrophic flaws being found. People have therefore looked for ways to use the basic DES cipher while applying a larger key to avoid brute force attacks.

IDEA
IDEA Is the International Data Encryption Algorithm, a European standard. It is a iterated block cipher, but does not have a Feistel structure. Block size is 64 bits, key 128. No s-boxes are used.

IDEA achieves non-linearity by mixing operations from three different algebraic systems. All operations have 16-bit words as both input and output. Two are just bitwise XOR and addition modulo 216. The third is basically multiplication, modulo 216+1, but with some additional code so the "x*0 yields zero for all x" case does not weaken the cipher.

To see how this works, consider this multiplication table modulo 5:

0 1  2  3  4

0 0  0  0  0  0 1  0  1  2  3  4 2  0  2  4  1  3 3  0  3  1  4  2 4  0  4  3  2  1

Note that, ignoring multiplications by zero, every column and every row is a permutation of the set (1,2,3,4). This is true for any prime modulus.

C code for IDEA multiplication of 2-bit numbers (range 0-3) would be:


 * 1) define NBITS 2
 * 2) define MAX (1<<NBITS)
 * 3) define MOD (MAX+1)

unsigned idea_multiply( unsigned x, unsigned y) { unsigned z ;

// make sure inputs are in range x %= MAX ; y %= MAX ;

// adjust the range // avoid multiplying by zero if( x == 0 ) x = MAX ; if( y == 0 ) y = MAX ;

// calculate the result // see table above z = (x*y) % MOD ;

// adjust it   // avoid returning MAX if( z == MAX ) z = 0 ;

return( z ) ; }

Change NBITS to 16 and you have real IDEA multiplication, operating on 16-bit quantities. This works correctly because MOD is then the prime 216+1. On a 32-bit processor, you need to add a bit of code to avoid having MAX*MAX overflow a 32-bit register, but that is the only special case.

This operation is not nearly as cheap as addition or XOR; in many environments it will also be more expensive than s-box lookups. However, it is highly non-linear and reasonably fast on a modern 32-bit or larger CPU. In some environments, such as an embedded processor with limited cache, it might even be cheaper than s-box lookups.

AES
In the late 90s, the US National Institute of Standards and Technology ran a contest to find a block cipher to replace DES. The result is the Advanced Encryption Standard. AES.

In October 299, they announced the winner &mdash; Rijndael (pronounced approximately "rhine doll"), from two Belgian designers. The NIST page on AES has much detail.