Signing and verifying bitcoin transactions

If you have followed my blockchain posts so far, you know how to create bitcoin transactions by assembling transaction inputs and transaction outputs into a transaction data structure and serializing it. However, there is one subtlety that we have ignored so far – what exactly do you sign?

You cannot sign the entire bitcoin transaction, simply because the signature is part of it! So there is a sort of chicken-or-egg problem. Instead, you need to build a transaction without the signature, serialize it, create a hash value, sign this hash value, then add the signature to the transaction and publish this transaction.

This is essentially how signing actually works. However, there are a few details that we still miss. So let us look at the source code of the reference client once more to make sure that we really understand how this works.

As a starting point, remember that there is a script opcode called OP_CHECKSIG that verifies a signature. So let us look at the corresponding method which we find in script/interpreter.cpp. When this opcode is executed, the function EvalScript delegates to the method TransactionSignatureChecker::CheckSig. This method removes the last byte of the signature script (which is the hash type) and then calls the function SignatureHash. This function receives the following parameters (you might have to trace back a bit starting at EvalScript to verify this):

  • the script code of the public key script
  • the transaction that is to be signed
  • the index of the transaction input within that transaction to which the signature refers
  • the hash type
  • the amount
  • a version number which is zero in our case and indicates whether the segregated witness feature is in effect
  • a data structure that serves as a cache

We have not yet talked about the hash type. Essentially the hash type controls which parts of a transaction are included in the signature and therefore protected against being changed. The most common case is SIGHASH_ALL = 1, and this is the value for this field that we assume in this post (if you understand this case, it will not be difficult to figure out the other cases as well).

In any case, the function SignatureHash is what we are searching for. It serializes the transaction and creates a hash value which is then passed to the crytographic subroutines to actually verify the signature. This function again delegates the actual work to an instance of the helper class CTransactionSignatureSerializer, more precisely to its method Serialize. This method is in fact not very surprising.

template
void Serialize(S &s) const {
    // Serialize nVersion
    ::Serialize(s, txTo.nVersion);
    // Serialize vin
    unsigned int nInputs = fAnyoneCanPay ? 1 : txTo.vin.size();
    ::WriteCompactSize(s, nInputs);
    for (unsigned int nInput = 0; nInput < nInputs; nInput++)
         SerializeInput(s, nInput);
    // Serialize vout
    unsigned int nOutputs = fHashNone ? 0 : (fHashSingle ? nIn+1 : txTo.vout.size());
    ::WriteCompactSize(s, nOutputs);
    for (unsigned int nOutput = 0; nOutput < nOutputs; nOutput++)
         SerializeOutput(s, nOutput);
    // Serialize nLockTime
    ::Serialize(s, txTo.nLockTime);
}

Ignoring the flag fAnyoneCanPay, this method behaves similar to ordinary serialization – first the version number is written, then the number of inputs, followed by the actual inputs, then the number of outputs, the actual outputs and finally the lock time.

The actual difference to an ordinary transaction serialization is hidden in the method SerializeInput.

template
void SerializeInput(S &s, unsigned int nInput) const {
    // In case of SIGHASH_ANYONECANPAY, only the input being signed is serialized
    if (fAnyoneCanPay)
        nInput = nIn;
    // Serialize the prevout
    ::Serialize(s, txTo.vin[nInput].prevout);
    // Serialize the script
    if (nInput != nIn)
        // Blank out other inputs' signatures
        ::Serialize(s, CScript());
    else
        SerializeScriptCode(s);
    // Serialize the nSequence
    if (nInput != nIn && (fHashSingle || fHashNone))
        // let the others update at will
        ::Serialize(s, (int)0);
    else
        ::Serialize(s, txTo.vin[nInput].nSequence);
}

Again, this starts out as usual by serializing the reference to the spent transaction output (transaction ID and index). However, we soon find the first difference – if the index of the transaction input to be signed differs from the input that is currently serialized, an empty script is serialized instead of the signature script – which translates into a single byte 0x0 which is the length of the empty script.

The second difference is located in the method SerializeScriptCode. Here, instead of serializing the not yet existing signature, the content of the variable scriptCode is serialized. Going back a bit, we see that this is the public key script of the corresponding transaction output! So basically, this script takes the position of the signature script in this serialization step.

But we are not yet done – let us return to the function SignatureHash and see what it does close to the end.

CTransactionSignatureSerializer txTmp(txTo, scriptCode, nIn, nHashType);

// Serialize and hash
CHashWriter ss(SER_GETHASH, 0);
ss << txTmp << nHashType;
return ss.GetHash();

Using the operator <<, we see that the method Serialize that we have just studied is invoked. However, we see that in addition, the hash type – which was stripped off at the beginning – is now added again at the end. All this is routed into a CHashWriter which is defined in hash.h and is essentially a wrapper around a 256 bit hashing routine. This hash is then returned to TransactionSignatureChecker::CheckSig which then tries to verify the signature with that message.

After that tour through the reference client, let us try to verify the signature of a real transaction from the bitcoin network. In the library btc/script.py, I have implemented a function signatureHash which does – for the special case discussed – the same as the function that we have just looked at. Using that function, it is now not difficult to verify a signature, given a transaction input and the corresponding spent output. Here is what we need to do in an ipython session, again assuming that you have downloaded my btc package.

First, we get our sample transaction that we have already used before and the transaction output spent by the first transaction input.

In [1]: import btc.utils
In [2]: import btc.txn
In [3]: raw = btc.utils.getRawTransaction("ed70b8c66a4b064cfe992a097b3406fa81ff09641fe55a709e4266167ef47891")
In [4]: txn = btc.txn.txn()
In [5]: txn.deserialize(raw)
In [6]: txin = txn.getInputs()[0]
In [7]: raw = btc.utils.getRawTransaction(txin.prevTxid)
In [8]: prev = btc.txn.txn()
In [9]: prev.deserialize(raw)
In [10]: spentTxo = prev.getOutputs()[txin.getVout()]

Next, we determine the signature hash and interpret it as a big endian encoded integer.

In [11]: h = int.from_bytes(btc.script.signatureHash(txn, 0, spentTxo), "big")

Next we need to determine the other ingredients of the signature – the r and s values.

In [12]: r = txin.getScriptSig().getSignatureR()
In [13]: s = txin.getScriptSig().getSignatureS()

Now we need the ECDSA cryptography library. We get the SECP256K1 curve and its parameters from there and determine the point on the curve that corresponds to the public key.

In [14]: import ecdsa
In [15]: curve = ecdsa.curves.SECP256k1
In [16]: G = curve.generator
In [17]: p = curve.curve.p()
In [18]: txin.getScriptSig().getScriptType()
Out[18]: 'P2PKH'
In [19]: pubKey = txin.getScriptSig().getPubKeyHex()
In [20]: pubKey
Out[20]: '038c2d1cbe4d731c69e67d16c52682e01cb70b046ead63e90bf793f52f541dafbd'

We see that the first byte of the public key is 0x03. If you remember what we have learned about the public key encoding, this means that y is odd. We can therefore determine the x and y coordinate as follows.

In [21]: x = int.from_bytes(bytes.fromhex(pubKey[2:66]), 'big')
In [22]: y = ecdsa.numbertheory.square_root_mod_prime((x**3 +7) % p , p)
In [23]: y
Out[23]: 91238655223056717466178782248030327689649326505413694215614940048183357985838
In [24]: y = (p - y) % p
In [25]: y
Out[25]: 24553434014259477957392202760657580163620658160226869823842643959725476685825
In [26]: assert(curve.curve.contains_point(x, y))

Now we are almost done.

In [27]: Q = ecdsa.ellipticcurve.Point(curve.curve, x, y)
In [28]: pKey = ecdsa.ecdsa.Public_key(G, Q)
In [29]: signature = ecdsa.ecdsa.Signature(r,s)
In [30]: pKey.verifies(h, signature)
Out[30]: True

So our signature is valid, as expected – after all this is a real transaction from the bitcoin network. The full source code for this exercise is also contained in btc/script.py in the function verifySignature.

Of course we could now revert this process – given a transaction, we could use the function signatureHash to create a hash value which we then sign using the ECDSA library. This now puts us in a position to actually create a transaction that we can then push into the bitcoin network. Now, this is clearly something you do not want to try in the real bitcoin network. Therefore I will show you in my next post how you can set up a small test network at home to be able to play with this without the risk of losing actual bitcoins.

Bitcoin security and the Mt Gox incident

In February 2014, the bitcoin exchange Mt Gox – at this time one of the largest trading platforms in the market – had to suspend withdrawals completely, apparently in an attempt to recover from a massive attack. A few days later, the company filed for bankruptcy, claiming that it had lost 850.000 BTC, at this time worth roughly 500 Mio. USD. Earlier press releases issued by Mt Gox indicated that a weakness of the bitcoin protocol called transaction malleability had been leveraged by an attacker to steal the bitcoin – a claim that could not be confirmed by an independent analysis carried out by two researchers from the ETH Zürich a few weeks later. Still, that research also showed that this weakness was actively exploited, even though not at the scale claimed by Mt Gox. The purpose of this post is to explain this weakness in a bit more detail and show how it is related to the scripting language used in the bitcoin protocol.

We have seen that signatures and public keys in a bitcoin transaction come as parts of scripts, and that verifying a signature is equivalent to executing a script built out of a transaction input and the corresponding transaction output. Theoretically, this approach is extremely flexible – we could use any valid script and obtain a valid bitcoin transaction.

However, in practice, the bitcoin reference implementation has a few built in checks that restrict the set of scripts that are accepted, and also the scripting language itself has some limitations, there are, for instance, no loops. This has several reasons.

The first reason is the possibility of a denial-of-service attack on a bitcoin node. Suppose the scripting language were powerful enough to build a program that never terminates. When a transaction containing such a script is validated by a node, it starts the scripting engine to validate the transaction and gets stuck in an infinite loop, thus blocking at least one thread forever. Even worse, a very powerful and complicated scrip could leverage errors in the implementation of the scripting engine and so malicious code could be injected into a bitcoin node as part of a transaction. Thus, from a security point of view, it makes sense to place some limitations on the set of allowed scripts.

However, there is another, more subtle reason which is a consequence of the fact that, if a bitcoin transaction is signed, the signature script is not part of the data to which the signature is actually applied, as we will see in a later post. To illustrate why this could be a problem, suppose Alice owes Bob 50 bitcoins and makes a corresponding payment. So she would assemble a transaction – let me call that A – and sign it with her private key. She would then publish that transaction in the network. For bookkeeping purposes, her wallet would probably remember the ID of the transaction. By convention, the ID of a bitcoin transaction is simply its hash value.

Bob could now try the following attack. He could obtain the transaction from the network before it is added to a block, change its signature script by adding some code to the script that essentially does nothing, and publish that transaction as transaction B, as indicated below.

TransactionMalleability

If the new, updated signature script is functionally equivalent to the script contained in A, both transactions would be valid. Moreover, both refer to the same unspent transaction output. However, as they have different signature scripts, they have different hash values and therefore different transaction IDs.

What Bob is hoping for is that the manipulated transaction B created by him is picked up by a miner first and added to a block. Bob would then obtain the bitcoin payment. However, Alice transaction A would be rejected due to double spend! Therefore Bob could now approach Alice and claim that he never received the payment. If Alice checks the status of the transaction ID that she knowns – that of transaction A – she would in fact learn that the transaction has been rejected by the network. In the worst case, she would feel obliged to initiate another payment, and Bob would receive the amount twice.

A similar situation could arise if Bob simply alters the signature created by Alice by replacing s by -s. This would still be a valid signature, but also changes the hash value and therefore the transaction ID.

This is the weakness that has been termed transaction malleability – the ability to modify a transaction without having to sign it again.

Can that be exploited to actually steal bitcoins? Well, maybe. Suppose that Alice is in fact a bitcoin exchange like Mt Gox, managing a large amount of bitcoins for different users. Suppose further that this exchange has implemented a wallet that automatically re-issues a transaction if the first transaction is rejected by the network. In this case, a malicious attacker like Bob could request withdrawals from the exchange, i.e. transfer to his own bitcoin address, and could then create and post modified transactions for each of the resulting transactions generated by the exchange. Even if only a certain percentage of the modified transactions are actually included in the blockchain, this could in fact lead to multiple payments triggered in favor of Bob, so that Bob can actually steal a significant amount of bitcoins from the exchange.

With BIP 62 and BIP 66, certain standards for scripts and the decoding of the signature are enforced to avoid these issues. With the segregated witness features (BIP 141 (Segregated witness)), additional protection is implemented as the signature is no longer part of the transaction that is signed. This at least addresses the currently known sources of transaction malleability. However, as always, this does not guarantee that similar flaws will not be detected and actively exploited in the future. The lesson is obvious – like any other complex distributed system, the blockchain is not immune to this kind of problems and will never be – developers and users should be aware of this.