It is named for its inventors Ron Rivest, Adi Shamir and Leonard Adleman. The original paper defining it is "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems"   . by those three authors.
The three authors were joint recipients of the 2002 Turing Award, primarily for that work.
The inventors started a company, RSA Laboratories, which became a major player in the information security industry. Rivest in particular went on to invent additional systems which the company marketed, including the MD4 and MD5 hash algorithms and various encryption algorithms with names of the form RCn including RC4, a widely used stream cipher.
Staff at RSA Laboratories wrote a specification for use of the RSA algorithm in Internet protocols. The current version is RFC 2437. RSA is used in many protocols, including IPsec, Open PGP and DNS security.
How it works
Any public key system requires that keys be created in matched pairs, such that if one key encrypts then only the other key from that pair can decrypt. One key from a pair can be published as the user's public key; the other becomes his or her private key, and must be kept secret.
To generate an RSA key pair, the system first finds two distinct primes p, q and the product N = pq. Take p-1 and q-1 and find the totient function of N, the product T = (p-1)(q-1). Alternately, find the least common multiple T = lcm( p-1, q-1) ( discussion). Then choose encryption exponent e and decryption exponent d such that d*e == 1 modulo T. The public key is then the pair (N,e) and the private key (N,d).
The strength parameter of the system is the length of N in bits. As of mid-2010, 1024-bit keys are in widespread use but are generally in the process of being replaced by 2048-bit keys for greater security. Some users choose larger sizes to be on the safe side.
Using t for plaintext and c for ciphertext, encryption is then:
c = te modulo N
and decryption, using m for the decrypted message, is:
m = cd modulo N
so we have:
m = (te)d modulo N m = tde modulo N
whence (via the proof below)
m = t modulo N
That is, the encrypt/decrypt pair of operations always give the correct result.
The proof is based on theorems proved by Fermat, back in the 17th century,
For prime p and any x:
xp == x modulo p
and for non-zero x:
xp-1 == 1 modulo p
Whence, for any k and non-zero x:
xk(p-1) == 1 modulo p
so for non-zero x:
xk(p-1)(q-1) == 1 modulo p
xk(p-1)(q-1)+1 == x modulo p
but that also holds if x is zero modulo p, since then both sides are zero.
xk(p-1)(q-1)+1 == x modulo q
so if p and q are distinct primes
xk(p-1)(q-1)+1 == x modulo pq
But we have:
de == 1 mod T de = k(p-1)(q-1)+1 for some k
m = tde modulo N which is modulo pq
m = t in all cases
A few considerations are critical for the security of RSA:
- Generating primes which an enemy cannot guess requires a strong random number generator; any weakness in the generator reduces security.
- Using the system securely requires that the private key never be revealed; losing that key to an enemy instantly reduces security to zero.
- Discovery of an efficient solution to the integer factorisation problem would break RSA. See below.
For more subtle problems, see Attacks on RSA.
RSA and factoring
Given an efficient solution to the integer factorisation problem, breaking RSA would become trivial. The attacker is assumed to have the public key (N,e). If he can factor N, he gets p, q and therefore p-1, q-1, and T. He knows e and can calculate its inverse mod T using the efficient Extended Euclidean algorithm. That gives him d and he already has N, so now he knows the private key (N,d). With the private key he can both read encrypted messages and forge the digital signature of the key owner. The cryptosystem would be rendered worthless.
The problem with that is that no really efficient (polynomial in the number of bits in N) solution for factoring is known, despite considerable effort by quite a few people over several decades to find one. It seems possible no such algorithm exists, though no-one has proven that.
However, while no really efficient methods are known, there are various methods that do work (see integer factorisation), those methods are improving, and computers get faster all the time (see Moore's Law). It is advisable to use large RSA moduli to provide a safety factor against these changes.
T can be calculated in two ways. The original RSA paper, and other references such as Schneier  and PKCS1v1, RFC 2313, give T = (p-1)(q-1). Other references such as PKCS1v2, RFC 2437, give T = lcm(p-1, q-1). Menezes at al. give both . Using the least common multiple has now become the usual implementation practice. It is slightly more efficient, but the system works either way. In one sense, the difference is not important.
However it can lead to interoperation problems if two implementations do it differently. In an example from IPsec, the FreeS/WAN implementer used the product while PGPnet used the least common multiple. About half of the signatures generated with PGPnet failed to verify on FreeS/WAN. Several users, both implementers, and some managers attempted to find the problem, without success until the specification discrepancy was noticed.
- Rivest et al. "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems"
- Communications of the ACM, Vol. 21 (2), pp.120–126. 1978
- Previously released as an MIT "Technical Memo" in April 1977, and published in Martin Gardner's Scientific American "Mathematical Recreations" column
- Schneier, Bruce (2nd edition, 1996,), Applied Cryptography, John Wiley & Sons, ISBN 0-471-11709-9
- Menezes, AJ; PC van Oorschot & SA Vanstone (Fifth Edition, 2001), Handbook of Applied Cryptography, ISBN 0-8493-8523-7