Cryptanalysis

The goal of cryptanalysis is to find some weakness or insecurity in a cryptographic scheme, thus permitting its subversion. It is an essential part of communications intelligence. Cryptanalysis might be undertaken by a malicious attacker, attempting to subvert a system, or by the system's designer (or others) attempting to evaluate whether a system has vulnerabilities. In modern practice, however, quality cryptographic algorithms and protocols have usually been carefully examined and many have been proved that establish practical security of the system (at least, under clear -- and hopefully reasonable -- assumptions).

It is a commonly held misconception that every encryption method can be broken. In connection with his WWII work at Bell Labs, Claude Shannon proved that the one-time pad cipher is unbreakable, provided the key material is truly random, never reused, kept secret from all possible attackers, and of equal or greater length than the message. That is, an enemy who intercepts an encrypted message has provably no better chance of guessing the contents than an enemy who only knows the length of the message.

Even two-time use of the keys can lead to compromise, as shown by the VENONA project that allowed cryptanalysis of Soviet espionage traffic, in which a one-time pad was used more than once.

Any cipher except a one-time pad can be broken with enough computational effort (by brute force attack if nothing else), but the amount of effort needed to break a cipher may be exponentially dependent on the key size, as compared to the effort needed to use the cipher. In such cases, effective security can still be achieved if some conditions (e.g., key size) are such that the effort ('work factor' in Shannon's terms) is beyond the ability of any adversary.

Before discussing classic cryptanalyis, be aware that mathematical cryptanalysis is not the only way to access protected content.

Practical cryptanalysis
"Practical cryptanalysis" is a euphemism for using physical or social means to compromise the cryptosystem, such as clandestinely breaking into a communications center and copying the keys.

And, of course, social engineering, and other attacks against personnel who work with cryptosystems or the messages they handle (e.g., bribery, extortion, blackmail, espionage, ...) may be most productive attacks of all. One variant is referred to as "rubber hose cryptanalysis" &mdash; beating, torturing or threatening someone to get him or her to reveal keys.

Side channel attacks
While pure cryptanalysis uses weaknesses in the algorithms themselves, other attacks on cryptosystems are based on actual use of the algorithms in real devices, known as side-channel attack. If a cryptanalyst has access to, say, the amount of time the device took to encrypt a number of plaintexts or report an error in a password or PIN character, he may be able to use a timing attack to break a cipher that is otherwise resistant to analysis. An attacker might also study the pattern and length of messages to derive valuable information; this is known as traffic analysis, and can be quite useful to an alert adversary.

General types of cryptanalytic attacks
There are a wide variety of cryptanalytic attacks, and they can be classified in any of several ways. One distinction turns on what an attacker knows and can do.

Ciphertext-only
The ciphertext-only attack is the most common case, where the cryptanalyst has access only to the ciphertext (modern cryptosystems are usually effectively immune to ciphertext-only attacks).

Known plaintext
In a known-plaintext attack, the cryptanalyst has access to a ciphertext and its corresponding plaintext (or to many such pairs).

Sometimes it is enough for the attacker to have partial knowledge of the plaintext &mdash; perhaps that it is ASCII text with the top bit of every byte zero, or that it is radar data in a known format. This gives him something to go on, a way to check if a decryption or partial decryption is correct; that may be all he needs.

Often the attacker can guess some plaintext. In British World War II ULTRA codebreaking, such guesses were known as "cribs". Many messages contain fixed text like dates or formal phrases like "your humble and obedient servant", and various system such as compression algorithms or email handlers insert fixed-format headers; all these are free gifts to the cryptananlyst. In war, names of enemy officers, bases, ships or units (or their codenames) are good guesses, also perhaps words like "order" and "ammunition". An intelligence organisation that knows the enemy well may have additional cribs available, looking for "congratulations" to a promoted officer. "happy birthday" to a general, and so on.

Language structure may also provide cribs. Consider ordinary English text, where about one in seven characters are spaces, and "the" and "of" are the commonest words. Suppose the cipher uses 64-bit (8 character) blocks. The chance that one of them encodes the 8-character string " of the " is significant. If the cipher user (foolishly) sends large volumes of data with the same key and the attacker has the determination and the (huge) resources to test them all, this crib is almost certain to break the cipher eventually.

Generally, if a true known plaintext attack (where the attacker actually knows some plaintext) is feasible, then variants based on guessed plaintext or on partial knowledge of plaintext will be more difficult, but not prohibitively so. Suppose there is a known plaintext attack that breaks the cipher at reasonable cost, but the attacker has only some guessed plaintext that has a 10% chance of being right. That gives him a one-in-ten chance of solving the cipher on the first try. If he has many such guessed cribs available, he is almost certain to solve it eventually at some cost not horrifically more than the cost of a pure known plaintext attack.

Or suppose he only knows the plaintext is ASCII; the top bit of every byte is zero. Suppose we are dealing with a block cipher that has 64-bit blocks. If there is a feasible known plaintext attack, then an enemy who knows only 64 bits of plaintext in a single block can break the cipher. However, if the data is known ASCII and the enemy has intercepted N blocks, then he knows that 8N bits of the plaintext are zero. Whether this lets him break the cipher or not is an extremely complex question depending on all the details of the cipher. However, if you are trying to keep the data secure, you should guess "yes" and choose a cipher that is secure against known plaintext attacks..

In general, '''if there is an effective known plaintext attack on the cipher. then the cipher must be considered insecure'''.

A number of attacks require known (including guessed) plaintext to work:
 * a brute force search tries all possible keys; you need to know one block of plaintext so you can tell when you have found the right key
 * a meet-in-the-middle attack finds a middle value in two ways, by half-encrypting a block of known plaintext and half-decrypting the matching cyphertext, and searches for matching "middle" results; this is much more efficient than brute force but is not applicable to most ciphers
 * an algebraic attack writes the cipher operations as equations in some algebraic system, usually Boolean, then plugs in known values for plaintext and ciphertext and solves for the key. depending on various detals, this may need anywhere from one to a few dozen plaintexts.
 * linear cryptanalysis and differential cryptanalysis are often very efficient in terms of the attacker's effort, significantly better than brute force. However, they require large numbers of known or chosen plaintexts.

All these should be completely impractical against any well-designed cipher.

Chosen plaintext
In a chosen-plaintext attack, the cryptanalyst may choose a plaintext and learn its corresponding ciphertext (perhaps many times); an example is the gardening used by the British during WWII.

Chosen ciphertext
In a chosen-ciphertext attack, the cryptanalyst may choose ciphertexts and learn their corresponding plaintexts. Also important, often overwhelmingly so, are mistakes (generally in the design or use of one of the protocols involved; see ULTRA for some historical examples of this).

Related key attack
Using two or more related keys for different messages, different links, or different sessions may give a cryptanalyst an entry point.

The best-known failure of his type is for the WEP protocols used in wireless networking. WEP generates keys for different connections by concatenating a connection-specific intialisation value with another secret value, and this creates a weakness. See for example, "Breaking 104 bit WEP in less than 60 seconds".

Strategies against symmetric cryptosystems
Cryptanalysis of symmetric-key techniques typically involves looking for efficient attacks against block ciphers or stream ciphers. Against an ideal cipher, there is no attack better than brute force.

For example, a simple brute force attack against DES requires one known plaintext and 255 decryptions, trying approximately half of the possible keys, before chances are better than even the key will have been found. But this may not be enough assurance; a linear cryptanalysis attack against DES requires 243 known plaintexts and approximately 243 DES operations. This is a considerable improvement on brute force attacks.

See also the stream cipher article.

Strategies against asymmetric cryptosystems
Public-key algorithms are based on the computational difficulty of various problems. The most famous of these is integer factorization (the RSA cryptosystem is based on a problem related to factoring), but the discrete logarithm problem is also important. Much public-key cryptanalysis concerns numerical algorithms for solving these computational problems, or some of them, efficiently. For instance, the best algorithms for solving the elliptic curve-based version of discrete logarithm are much more time-consuming than the best known algorithms for factoring, at least for problems of equivalent size. Thus, other things being equal, to achieve an equivalent strength of attack resistance, factoring-based encryption techniques must use larger keys than elliptic curve techniques. For this reason, public-key cryptosystems based on elliptic curves have become popular since their invention in the mid-1990s.

Vulnerabilities of cryptographic primitives
Much of the theoretical work in cryptography concerns cryptographic primitives &mdash; algorithms with basic cryptographic properties &mdash; and their relationship to other cryptographic problems. For example, a one-way function is a function intended to be easy to compute but hard to invert. In a very general sense, for any cryptographic application to be secure (if based on such computational feasibility assumptions), one-way functions must exist. However, if one-way functions exist, this implies that P ≠ NP. . Since the P versus NP problem is currently unsolved, we don't know if one-way functions exist. If they do, however, we can build other cryptographic tools from them. For instance, if one-way functions exist, then secure pseudorandom generators and secure pseudorandom functions exist.

Other cryptographic primitives include cipher algorithms themselves, one-way permutations, trapdoor permutations, etc.

Vulnerabilities of cryptographic protocols
In many cases, cryptographic techniques involve back and forth communication among two or more parties in space or across time (e.g., cryptographically protected backup data). The term cryptographic protocol captures this general idea. Cryptographic protocols have been developed for a wide range of problems, including relatively simple ones like interactive proofs, secret sharing , and zero-knowledge , and much more complex ones like electronic cash and secure multiparty computation.

When the security of a cryptographic system fails, it is rare that the vulnerabilty leading to the breach will have been in a quality cryptographic primitive. Instead, weaknesses are often mistakes in the protocol design (often due to inadequate design procedures or less than thoroughly informed designers), in the implementation (e.g., a software bug), in a failure of the assumptions on which the design was based (e.g., proper training of those who will be using the system), or some other human error. Many cryptographic protocols have been designed and analyzed using ad hoc methods. Methods for formally analyzing the security of protocols, based on techniques from mathematical logic (see for example BAN logic), and more recently from concrete security principles, have been the subject of research for the past few decades. Unfortunately, these tools are cumbersome and not widely used for complex designs.