Strong cryptography is at the heart of the blockchain and many other modern technologies, so it does not hurt to get familiar with the basics. In this post I will explain the foundations of one very commonly used algorithm called elliptic curve digital signature. This post will be a bit lengthy and theoretical but do not worry – we will see how all this works in practice in the next post.
First we need to understand what private and public keys are. At the end of the day, cryptography is about encoding and decoding messages. Suppose, for instance, that two parties (which we will call Alice and Bob to follow the usual naming conventions) want to exchange information, but expect that a third party is able to obtain a copy of every message that they send forth and back, for instance because Alice and Bob communicate over the internet and a third party could control some of the routers sitting between them (yes, we all know this is not just theory…).
One approach that Alice and Bob could use is as follows. In a first step, they both agree on a key. This could be a key phrase or some other sequence of bytes, depending on the exact algorithm they use. To send a message to Bob, Alice would then take the message and encrypt it, i.e. apply a function that depends on the key to obtain an encrypted message .
Alice would then send the encrypted message to Bob. Bob would apply a second function to the received message to decrypt it again. If and are chosen properly, then they will be inverses of each other, i.e.
Therefore Bob will be able to obtain the original message from the key and the encrypted message. The algorithm is secure if it is virtually impossible to derive the original message from the encrypted message without knowing the key.
This class of algorithms is called symmetric because both parties, Alice and Bob, use the same key, i.e. the same key is used to encrypt a message and to decrypt it again. Unfortunately, there is an obvious challenge when using this in practice: Alice and Bob need to exchange the key and therefore need a separate secure communication channel. In a peer-to-peer network like the blockchain where the nodes can only communicate via the unsecure public internet, this is virtually impossible, and a different approach is needed.
Asymmetric algorithms and public key cryptography are designed to overcome exactly this difficulty. These algorithms uses keys that come in pairs. Every key pair consists of a public key and a private key. As the naming suggests, the public key is intended to be shared, while the private key needs to be kept secret at all times. When Alice wants to send a message to Bob, she will first need to retrieve Bobs public key. Bob could for instance publish his key on a webpage or add it to the signature of an e-mail. Alice would then encrypt the message using the public key. When Bob receives the encrypted message, he uses his private key to decrypt the message again.Knowing the public key and the encrypted message is not sufficient to perform the encryption step, the private key is needed to do this. As only Bob has access to his private key, only he can read a message that has been encrypted with his public key, even though his public key is freely available and known to an attacker.
Public key cryptography relies on a mathematical operation that is easy to perform in one direction, but virtually impossible in the other direction. A classical example is the factorization of large numbers. Whereas it is easy to multiply two large prime numbers and to obtain their product , it requires a lot of computing power to execute the reverse operation, i.e. to determine and from . The well known RSA algorithm is based on this.
Other algorithms with this property are certain operations in properly chosen finite groups. Recall that a group is a mathematical structure in which elements can be added and subtracted such that the usual rules like associativity that we are used to from the addition of ordinary numbers apply. In particular, given a group as well as an integer , we can form the product which is obtained by adding performing k additions.
In some groups, the result of this operation can easily be calculated, but given and , it is very difficult to determine . This property can be exploited to design cryptographic algorithms, as we will see in a few minutes.
One large class of finite groups with this property are elliptic curves over finite fields. Suppose that we are given some finite field (most of the time, is the prime field for some prime number ). Let us also assume that the characteristic of is odd. For our purposes, an elliptic curve over is the set of points that obeys an equation of the form
with constants .
Finite fields are a bit difficult to visualize. To get an idea how an elliptic curve looks like, let us take a look at an example over the reals.
The curve displayed here has the parameters and and features a prominent role in the bitcoin protocol (this curve over a certain finite field is known as SECP256K1, more on this in a later post). The blue line visualizes all pairs on the curve. This picture also demonstrates the reason why elliptic curves are so useful for our purposes – points on the curve can be added, and in fact the set of points on a curve forms an abelian group.
To see how the addition works, look at the points marked as A and B in the picture above. To add these points, draw a line through A and B. This line will (ignoring a few special cases and multiplicities for the time being) intersect the curve in exactly one other point, called C in this example. Then reflect on the x-axis to arrive at the point which has the same x-coordinate as C, but minus its y-coordinate. This point is, by definition, the sum of A and B.
If you have a bit of a background in algebraic geometry and that rings a bell, you are right – it has to do with divisors. In fact, the set of points on an elliptic curve is in a one-to-one correspondence with a subgroup of the divisor class group, and the group structure inherited by this relation is the one that I have just described. See  for more details.
So now we have a finite abelian group (remember that in reality, we do not do this over the reals but over some finite field) – what can we do with it? To illustrate this, let us describe one application to cryptography called elliptic curve digital signature algorithm (ECDSA). A digital signature is like a watermark that you add to a message to allow others to verify that you have seen and approved the message and that the message has not been altered in transit. Again, most digital signatures involve public and private key pairs. When Alice wants to sign a message , she uses her private key to encode her message and obtains a signature . She then sends that signature along with the original message and her public key to Bob. Using that information, Bob can verify the signature to confirm that it has been created using the private key that belongs to the public key and that the message that he has received is identical to the message that Alice has signed. Assuming that only Alice has access to the private key, he can then deduce that Alice has actually composed and approved the message. In practice, it is not the actual message that is signed, but a hash value of the message to keep the signature short.
Let us see how this works (in addition,  is a readable and valuable resource for some of the details). Suppose Alice wants to sign a message . The first thing she will do is to agree with Bob on some set of elliptic curve parameters, i.e. a finite field, the parameters and and some point on the curve called the generator. In , a few standard sets have been defined and published, among them the standard SECP256K1, on which Alice and Bob could agree. That information is public and does not need to be protected.
Next, Alice will pick a secret or private key. In our case, this is simply a number between zero and the order of the point . Alice will then determine a public key
which is simply a point on the curve. That key can be freely published – as we have mentioned before, it is computationally very hard to determine the secret from that information.
All this only needs to be done once. Now comes the part that is specific to the message. First, Alice create a hash value of the message by applying a hash function:
The hash function needs to be chosen such that the resulting hash value can be interpreted as a number between zero and . Next Alice picks a random number between one and and determines the coordinates of the point . Let denote the x-coordinate of this point. Then Alice determines
where the inverse is taken modulo n. The signature that Alice will publish along with the original message is then the pair .
When Bob wants to verify the signature, he would then again compute and determine the point
on the curve. Let us do a short calculation to see what the expected outcome is. First,
Now by the choice of , we also have
As is the order of , this implies that
Thus we expect that the x-coordinate of the newly computed point is equal to the x-coordinate of which Alice has published as . If Bob makes this comparison and arrives at this result, he can be almost sure that someone who has access to the private key has signed the message and that the message has not been altered since then.
Note that, if we knew the signature and , we could compute and therefore the secret . It is therefore extremely important to treat the number with care – use a strong random number generator to obtain it, do not reuse it for further messages and delete it immediately after using it.
We have now seen how arithmetic on an elliptic curve can be used to design a secure and reliable digital signature. I admit that this was a bit theoretical, but now comes the fun part – in the next post we will see how this can be done in Python and code a few examples. Stay tuned…
1. R.Hartshorne, Algebraic Geometry, Springer, New York 1997, example 6.10.2
2. Standards for efficient cryptography 1 and 2 (SEC1, SEC2), SECG Group, available online at http://www.secg.org