Detail the process of key exchange in end-to-end encryption, including the steps and security considerations involved to prevent man-in-the-middle attacks.
Key exchange is a crucial process in end-to-end encryption (E2EE): it establishes a shared secret key between the communicating parties over an insecure channel, and that key is then used to encrypt the actual message content. The goal is to do this without exposing the shared secret to eavesdroppers, and in particular without letting an attacker who actively tampers with the exchange, a man-in-the-middle (MitM), obtain it. Several key exchange methods exist, but Diffie-Hellman (DH) and its variants are the most commonly used in modern E2EE systems. Let’s outline the general process and the related security considerations.
The traditional DH key exchange can be described with two participants, Alice and Bob. First, Alice and Bob agree on a publicly known large prime number ‘p’ and a generator ‘g’. In practice these parameters are usually fixed by the underlying protocol rather than negotiated; modern systems such as the Signal Protocol use the elliptic-curve variant (ECDH over Curve25519), in which a standardized curve and base point play the role of ‘p’ and ‘g’. These values are public and can be known by an attacker without impacting security. Next, each party generates a random private key: Alice generates ‘a’ and Bob generates ‘b’. These keys are kept secret. Alice computes her public key ‘A’ by raising the generator to the power of her private key modulo ‘p’, that is, ‘A = g^a mod p’. Similarly, Bob computes his public key as ‘B = g^b mod p’. They then exchange public keys over the insecure channel: Alice sends ‘A’ to Bob and Bob sends ‘B’ to Alice. An eavesdropper who merely observes these values learns nothing useful, because recovering a private key from its public key requires solving the discrete logarithm problem, which is computationally infeasible for properly chosen parameters. Once Alice receives ‘B’, she computes the shared secret ‘s’ by raising it to the power of her own private key modulo ‘p’, i.e. ‘s = B^a mod p’. Bob performs the same operation with Alice’s public key and his own private key, giving ‘s = A^b mod p’. Because (g^a)^b mod p equals (g^b)^a mod p, Alice and Bob independently arrive at the same shared secret ‘s’ without ever transmitting it. In practice ‘s’ is usually passed through a key derivation function rather than used directly, and the result serves as the symmetric key for encrypting messages on the end-to-end encrypted channel.
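The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parameters p = 23 and g = 5 are toy values chosen so the numbers stay readable, and the helper names are hypothetical. Real systems use parameters of 2048 bits or more, or an elliptic curve.

```python
import secrets

# Toy public parameters (assumption: picked for readability, NOT secure).
P = 23  # small prime modulus
G = 5   # a generator of the multiplicative group modulo 23


def generate_private_key(p: int = P) -> int:
    """Pick a random private exponent in the range [2, p - 2]."""
    return secrets.randbelow(p - 3) + 2


def public_key(private: int, g: int = G, p: int = P) -> int:
    """Compute g^private mod p, the value that is sent over the wire."""
    return pow(g, private, p)


def shared_secret(peer_public: int, private: int, p: int = P) -> int:
    """Compute peer_public^private mod p, which is never transmitted."""
    return pow(peer_public, private, p)


# Each side generates its own private key and derives a public key.
a = generate_private_key()
b = generate_private_key()
A = public_key(a)
B = public_key(b)

# After swapping A and B, both sides derive the same secret independently.
s_alice = shared_secret(B, a)
s_bob = shared_secret(A, b)
print("secrets match:", s_alice == s_bob)
```

With the fixed private keys a = 6 and b = 15, the same functions give A = 8, B = 19 and a shared secret of 2 on both sides.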
While the DH key exchange allows two parties to agree on a shared secret over an insecure channel, on its own it provides no authentication and is therefore vulnerable to MitM attacks, in which a malicious third party intercepts the communication and impersonates each party to the other. In such an attack, the attacker, let’s call her Mallory, intercepts Alice’s public key ‘A’ on its way to Bob and sends her own public key ‘M’ to Bob instead. Likewise, Mallory intercepts Bob’s public key ‘B’ on its way to Alice and sends ‘M’ to Alice instead. Mallory then computes two shared secrets: one with Alice, from Mallory’s private key and Alice’s public key, and one with Bob, from Mallory’s private key and Bob’s public key. Alice and Bob believe they are communicating directly, but Mallory decrypts each message with one key and re-encrypts it with the other before forwarding it, which lets her both read and modify the conversation.
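The attack can be reproduced with the same toy arithmetic. The sketch below assumes the illustrative parameters p = 23 and g = 5 and fixed, hypothetical private keys for Alice, Bob, and Mallory; it shows only the key computations, not message encryption.

```python
# Toy parameters and fixed private keys (assumptions for illustration only).
P, G = 23, 5
a, b, m = 6, 15, 13  # Alice, Bob, Mallory

A = pow(G, a, P)  # Alice's public key, intercepted by Mallory
B = pow(G, b, P)  # Bob's public key, intercepted by Mallory
M = pow(G, m, P)  # Mallory's public key, delivered to both victims

# Alice and Bob each receive M instead of the other's real public key.
s_alice = pow(M, a, P)  # Alice believes she shares this with Bob
s_bob = pow(M, b, P)    # Bob believes he shares this with Alice

# Mallory derives both secrets from the public keys she intercepted.
s_mallory_alice = pow(A, m, P)
s_mallory_bob = pow(B, m, P)

print("Mallory shares a key with Alice:", s_alice == s_mallory_alice)
print("Mallory shares a key with Bob:  ", s_bob == s_mallory_bob)
print("Alice and Bob share a key:      ", s_alice == s_bob)
```

Alice and Bob end up with different secrets, each of which Mallory also holds, so nothing in the arithmetic itself reveals that the exchange was tampered with.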
To prevent this type of MitM attack, DH key exchange is combined with a mechanism that provides authentication, so that Alice and Bob can confirm they are communicating with the intended counterpart and not with a malicious third party. Typically, a digitally signed public key or a pre-shared secret is used to verify the identities of the communicating parties. For example, the Signal Protocol uses an extended triple Diffie-Hellman (X3DH) key agreement in which each party has a long-term identity key: the recipient’s prekey is signed with that identity key, and the identity keys themselves are mixed into the DH computations. A key substituted by Mallory therefore either fails signature verification or produces a different shared secret, so both parties authenticate each other as part of the key exchange and the MitM attack fails, provided the identity keys themselves are genuine.
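As a rough sketch of the pre-shared secret option, the code below attaches an HMAC tag to each DH public key so the receiver can reject a substituted key. This is an assumption-laden illustration, not how the Signal Protocol works (Signal uses signatures and identity keys): the function names, the message layout, and the example secret are all hypothetical, and HMAC is used here because it is available in the Python standard library.

```python
import hashlib
import hmac

# Hypothetical secret that Alice and Bob agreed on beforehand.
PSK = b"example pre-shared secret"


def tag_public_key(psk: bytes, sender: str, public: int) -> bytes:
    """Bind a DH public key to its sender using HMAC-SHA256."""
    message = f"{sender}:{public}".encode()
    return hmac.new(psk, message, hashlib.sha256).digest()


def verify_public_key(psk: bytes, sender: str, public: int, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = tag_public_key(psk, sender, public)
    return hmac.compare_digest(expected, tag)


# Alice sends her public key (8 in the toy example) together with its tag.
alice_tag = tag_public_key(PSK, "alice", 8)
print(verify_public_key(PSK, "alice", 8, alice_tag))   # genuine key

# Mallory swaps in her own public key (21) but cannot produce a valid tag.
print(verify_public_key(PSK, "alice", 21, alice_tag))  # reused tag
mallory_tag = tag_public_key(b"mallory's guess", "alice", 21)
print(verify_public_key(PSK, "alice", 21, mallory_tag))  # forged tag
```

Including the sender's name in the tagged message is a design choice that prevents a tag issued by one party from being replayed as if it came from the other.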
Other techniques include relying on a centralized server as a trusted authority: both parties upload their public keys to the server, and the server hands each party the other’s public key. This introduces a point of central trust, however, because a compromised or malicious server could substitute its own keys, which weakens the end-to-end guarantee. Another approach is out-of-band verification, in which the parties use a separate channel to check that the public keys each of them sees actually match. A common example is the ‘security code’ or ‘verification code’ that messaging apps display as a string of digits or a QR code: you compare it against the code shown on the other party’s device, usually in person or over another trusted communication medium. If the codes match, both users have confirmed each other’s keys directly, without having to trust the server that delivered them.
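A verification code of this kind can be approximated by hashing both public keys and displaying part of the digest as digit groups. The sketch below is a hypothetical scheme for illustration only; it is not the algorithm any particular app uses, and the function name, group count, and example key bytes are assumptions.

```python
import hashlib


def verification_code(key_1: bytes, key_2: bytes, groups: int = 6) -> str:
    """Derive a short, human-comparable code from two public keys."""
    # Sort the keys so both devices compute the same code
    # regardless of which party is considered "first".
    first, second = sorted([key_1, key_2])
    # Length-prefix each key so the concatenation is unambiguous.
    data = b"".join(len(k).to_bytes(2, "big") + k for k in (first, second))
    digest = hashlib.sha256(data).digest()
    # Render each 2-byte chunk of the digest as a zero-padded 5-digit group.
    chunks = [int.from_bytes(digest[2 * i:2 * i + 2], "big") for i in range(groups)]
    return " ".join(f"{chunk:05d}" for chunk in chunks)


# Placeholder key bytes standing in for real public keys.
alice_key = b"alice-public-key-bytes"
bob_key = b"bob-public-key-bytes"
mallory_key = b"mallory-public-key-bytes"

# Without an attacker, both devices show the same code.
print(verification_code(alice_key, bob_key))
print(verification_code(bob_key, alice_key))

# If Mallory replaced Bob's key, Alice's device shows a different code.
print(verification_code(alice_key, mallory_key))
```

The code is only as trustworthy as the channel used to compare it, which is why the comparison is done in person or over a medium the attacker does not control.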
To summarize, key exchange in E2EE involves generating and exchanging public keys using a method such as DH, after which each party computes the same shared secret without that secret ever being transmitted. On its own this exchange is vulnerable to MitM attacks, so it must be combined with authentication. By binding the exchanged keys to long-term identity keys through digital signatures or a pre-shared secret, and by giving users a way to verify those identity keys out of band, systems can ensure that the communication stays secure and private.