Asymmetric Cryptography Basics: Theory
Asymmetric cryptography is a fundamentally new branch of cryptography that deals with public-key crypto algorithms, including digital signature algorithms, asymmetric encryption, and key exchange protocols.
The Essence
Before 1976, even though humanity already had strong ciphers like Lucifer, there was a serious problem with distributing keys for such algorithms. To start encrypting, two parties needed to agree on a key through another secure, eavesdropper-proof channel, such as meeting in person. While logical, workarounds were eventually found.
Initially, clever people proposed various makeshift solutions. For example, in radio communication, one side could jam the airwaves with random noise while the other transmitted the key, then subtract the noise from the received data to recover the key. An eavesdropper couldn’t distinguish between the noise and the secret data.
But in 1976, Whitfield Diffie and Martin Hellman introduced a universal solution that allowed secret communication without prior knowledge of a shared key. The only requirement was that the channel could be eavesdropped on but not modified. As with any breakthrough, many alternative methods based on different principles soon appeared.
Key Exchange
Among asymmetric algorithms, there are both encryption and digital signature algorithms. However, both actions can also be performed symmetrically if both parties share a secret key, which can be used for standard encryption (like AES) or for creating an HMAC.
Merkle’s Puzzle
Before the invention of the first asymmetric algorithm, there were attempts to create something similar using symmetric schemes. For example, American student Ralph Merkle (who later became a renowned cryptographer and invented the Merkle tree used in Bitcoin) described a simple scheme in his coursework:
- I create 240 large random numbers and encrypt each with a weak 40-bit key, adding some redundancy (like appending zeros) before encryption. This results in several terabytes of pure noise, which I send to you by email.
- You randomly pick any encrypted number and brute-force it on your computer. This is feasible because the key is weak.
- Once you decrypt the number, you use it as our shared key and let me know which one you picked, for example, by sending its hash.
The point is, I can identify which number you chose because I have all of them and know their hashes. You also know which one you picked. But an attacker doesn’t know and would have to decrypt all numbers and check their hashes. Decrypting one number is easy (240 operations), but with 240 numbers, the attacker faces 280 operations. For comparison, Bitcoin’s best result is 79 bits.
Diffie-Hellman Algorithm
The first true asymmetric algorithm was invented by two Americans in 1976. The NSA later claimed it knew about this possibility in 1966, but no one believed them.
The algorithm is based on a commutative, one-way operation. This can be exponentiation modulo a prime number or multiplying a number by a point on an elliptic curve in a finite field. The simplest case is exponentiation:
- Choose a base a (a public number, can be published or hardcoded).
- I generate a large secret number b and compute B = ab. I publish B; b is my private key.
- You generate your own large secret number c and compute C = ac. You publish C; c is your private key.
- I take your C and compute our shared secret key K = Cb.
- You take my B and compute K = Bc.
Now we both have the same shared key K, which no one else knows. My K = your K because K = (ac)b = (ab)c. We did this without knowing each other’s secret numbers, using only the public values. An attacker can’t compute K without at least one secret number, as each operation is followed by a modulo operation, making it irreversible.
Security
It’s important to note that exponentiation modulo a prime or scalar multiplication on an elliptic curve (discrete logarithm) is a potentially one-way operation. Many asymmetric algorithms rely on the belief that NP and P complexity classes are not equal, which hasn’t been proven. This means that one day, someone could break these algorithms by solving any NP-complete problem, which would break all such algorithms at once. This day will come with the advent of powerful quantum computers.
Merkle’s Puzzle and a few other unusual asymmetric algorithms aren’t vulnerable to this, but they’re rarely used due to impracticality or lack of study.
Digital Signatures
Besides key exchange algorithms, there are specialized algorithms for signing or encrypting, or both. A digital signature is a set of bytes linked to a message that can only be generated by the owner of the private key and verified by anyone with the public key. A signed document cannot be modified without invalidating the signature, making it useful for authorization. Once signed, a document’s signature cannot be repudiated, making it suitable for contracts.
Usually, the hash of the message is signed, not the message itself, since hashes are short and fixed-length.
Lamport’s Scheme
Lamport’s scheme is the most straightforward digital signature algorithm, but it can only sign one message. If I come up with two random strings and publish their hashes, that’s my public key. I can sign one bit of information: if it’s 0, I publish the first string; if 1, the second. Anyone can check the hash and verify the signature.
The logical extension is to publish more than one pair of strings to sign multiple bits. Ralph Merkle made many improvements, including the Merkle tree, allowing multiple signatures with one key.
Encryption
There are also special algorithms for encryption. The public key is not secret; anyone can use it to encrypt a message, but only the owner of the private key can decrypt it.
Rabin’s Scheme
Michael Rabin (who also invented the primality test used for generating large primes) created a simple asymmetric encryption scheme: c = m2 mod n. The plaintext is converted to a number, squared, and the result is the remainder after division by the public key n.
A crucial property of this scheme is its provable security: finding the square root modulo n is as hard as factoring n. RSA (a similar algorithm) can be broken without solving the NP-complete factoring problem (and such attacks have been demonstrated for small exponents).
To decrypt, you need the prime factors (p and q) of n, which are the private key and must be large. For easier decryption using the Chinese Remainder Theorem, p and q are chosen so that they are congruent to 3 mod 4. The decryption process is complex and results in four possible messages, one of which is correct. Therefore, some redundancy is added to the message before encryption so the recipient can identify the correct version.
More
This article describes the simplest algorithms for general understanding. There’s no point in covering all algorithms and technical details here—books and documentation exist for that.
There are many asymmetric ciphers, and new ones with interesting properties appear periodically. The most widely used are RSA, Diffie-Hellman (in HTTPS), ElGamal derivatives, and their elliptic curve versions (DSA, ECDSA, and the Russian GOST R 34.10-94). Once you understand Diffie-Hellman, the rest are easy to grasp. There are also academic variants like Rabin, Schnorr, and GHR schemes—a more complete list is available on English Wikipedia.
Man-in-the-Middle Attack
All asymmetric algorithms are vulnerable to the “man-in-the-middle” attack, and nothing can be done about it.
As mentioned earlier, during communication over an unsecured channel, asymmetric ciphers guarantee that an eavesdropper cannot learn the content of the conversation. But if the attacker can also modify the transmitted data, they can easily intercept the conversation.
The idea is that during the exchange of public keys, the attacker intercepts and replaces the keys with their own. The parties cannot detect this and end up using the attacker’s keys, communicating with the attacker, who simply re-encrypts and forwards all messages to avoid detection.
From an ethical standpoint, this isn’t a problem if both parties are anonymous; it’s as if each is talking to the attacker, and where the answers come from doesn’t matter. It’s like playing chess online by relaying moves to a computer game.
If one party is not anonymous, linking a real identity to keys becomes important, and help from the real world is needed. That’s why trusted HTTPS connections require certificate authorities, who legally vouch that a given key belongs to a specific entity, like a bank.
If you need to link a key to a human-readable string (like a domain name), you also need external intervention, since ownership of that string must be established. However, the latest technologies allow this to be solved without a centralized authority—using a decentralized one.