In my previous posts on the bitcoin protocol, I have described those objects that constitute participants – private and public keys and bitcoin addresses. Now we will look at those objects that represent actual transfers of bitcoins between these participants, namely at transactions.
Essentially, a bitcoin transaction consists of two parts. First, a transaction contains a list of one or more transaction outputs. A transaction output describes the target of the bitcoin transfer and basically consists of a recipient, usually identified by the bitcoin address, and an amount that this recipient is supposed to receive. A transaction can have one or more outputs and thus pay bitcoins to different recipients as part of one transaction (in fact, this is typically the case due to the need for change as we will see later).
There are different types of transaction outputs. The most common one is called a Pay to Public Key Hash (P2PKH) transaction output and refers to a recipient by a hash of the public key of that recipient, which is essentially the same as the bitcoin address. Only the owner of that public key, i.e. technically speaking whoever has access to the private key, can spend these transaction outputs – we will see later how the signing process in the blockchain takes care of this.
Similar to outputs, which determine the target of the payment, there are transaction inputs, which determine the source of the payment. Each transaction input refers to an earlier transaction output. Transaction outputs that are consumed by some transaction input are called spent; outputs that are not yet referenced by any transaction input are called unspent transaction outputs, abbreviated UTXO.
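To make the distinction between spent and unspent outputs concrete, here is a minimal Python sketch. The classes `TxOut`, `TxIn` and `Tx` and the function `unspent_outputs` are names made up for this illustration and are not taken from any bitcoin library; the short transaction IDs and recipient names are placeholders as well.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TxOut:
    recipient: str  # stand-in for the public key hash of the recipient
    amount: int     # amount in satoshi


@dataclass(frozen=True)
class TxIn:
    prev_txid: str   # ID of the transaction that holds the referenced output
    prev_index: int  # position of that output in the earlier transaction


@dataclass
class Tx:
    txid: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def unspent_outputs(transactions):
    """Return {(txid, index): TxOut} for every output no input refers to."""
    spent = {(i.prev_txid, i.prev_index) for tx in transactions for i in tx.inputs}
    return {
        (tx.txid, index): out
        for tx in transactions
        for index, out in enumerate(tx.outputs)
        if (tx.txid, index) not in spent
    }


# Hypothetical mini-chain: "t2" spends the first output of "t1".
t1 = Tx("t1", outputs=[TxOut("alice", 50_000), TxOut("carol", 20_000)])
t2 = Tx("t2", inputs=[TxIn("t1", 0)], outputs=[TxOut("bob", 45_000)])
utxo = unspent_outputs([t1, t2])
```

In this toy chain, the first output of `t1` is spent, while the second output of `t1` and the only output of `t2` remain unspent.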
Let us look at an example to illustrate this. Suppose Alice wants to transfer 1.0 BTC to Bob. She (respectively her wallet software) will first scan all the unspent transaction outputs available to Alice, i.e. all transaction outputs that refer to public keys for which Alice has the private key. In the example displayed above, her wallet might locate two transactions (the transactions on the left of the picture) that together contain three transaction outputs with amounts 0.3, 0.5 and 0.4 (the outputs not circled in red).
These outputs sum up to 1.2 BTC and are therefore sufficient to fund the payment to Bob. Alice will therefore construct a transaction – displayed in the middle of the picture above – that has three inputs, each referring to one of the selected transaction outputs. She could now try to add only one output to the transaction and put Bob's public key hash into this output, but this would mean that she transfers 1.2 BTC to Bob, not only 1.0 BTC. Therefore Alice will add a second transaction output to her transaction that transfers a certain amount, called the change, back to herself. Many wallets use dedicated addresses for this, which are called change addresses.
In the example above, the change – represented by the second transaction output – is 0.1 BTC. Thus the total inputs of the transaction sum up to 1.2 BTC, while the total outputs sum up to 1.1 BTC. The difference of 0.1 BTC is called the fee and will be credited to the miner who creates the block in which the transaction will be included – we will look at the process of mining in a separate post.
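The arithmetic of Alice's transaction can be replayed in a few lines of Python. This is only a sketch: the function `build_payment` and its simple "take outputs in order until the amount is covered" strategy are assumptions made for illustration, and real wallets use more elaborate selection strategies. Amounts are kept as integers (in units of one hundred-millionth of a bitcoin) to avoid rounding problems.

```python
COIN = 100_000_000  # smallest units per bitcoin


def build_payment(utxo_amounts, pay_amount, fee):
    """Select outputs until payment plus fee is covered; return (selected, change)."""
    selected, total = [], 0
    for amount in utxo_amounts:
        if total >= pay_amount + fee:
            break
        selected.append(amount)
        total += amount
    if total < pay_amount + fee:
        raise ValueError("insufficient funds")
    return selected, total - pay_amount - fee


# Alice's three unspent outputs: 0.3, 0.5 and 0.4 BTC.
available = [30_000_000, 50_000_000, 40_000_000]
selected, change = build_payment(available, pay_amount=1 * COIN, fee=10_000_000)

# The fee is never stored in the transaction; it is implied by inputs minus outputs.
outputs = [1 * COIN, change]
implied_fee = sum(selected) - sum(outputs)
```

With these numbers, all three outputs are selected, the change is 0.1 BTC and the implied fee is 0.1 BTC, matching the example.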
We now see that there is no such thing as a “bitcoin balance” stored somewhere in the blockchain. There are only transactions, and ownership of bitcoin is equivalent to owning the private keys that match unspent transaction outputs. Transactions form a chain, where each transaction is linked to previous transactions via the transaction inputs, and this chain represents the flow of bitcoin ownership.
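A wallet that displays a "balance" therefore has to derive it. A minimal sketch, assuming the wallet already holds a set of unspent outputs keyed by transaction ID and output index (the dictionary layout, the key names and the function `balance` are hypothetical):

```python
def balance(utxo_set, my_key_hashes):
    """Sum the amounts of all unspent outputs locked to one of our keys."""
    return sum(
        amount
        for key_hash, amount in utxo_set.values()
        if key_hash in my_key_hashes
    )


# (txid, output index) -> (public key hash, amount in satoshi); all placeholders.
utxo_set = {
    ("tx_a", 0): ("alice_key_1", 30_000_000),
    ("tx_a", 1): ("alice_key_2", 50_000_000),
    ("tx_b", 0): ("bob_key_1", 70_000_000),
    ("tx_b", 2): ("alice_key_1", 40_000_000),
}
alice_total = balance(utxo_set, {"alice_key_1", "alice_key_2"})
```

Here `alice_total` is 120,000,000 satoshi, i.e. 1.2 BTC, although no such number is stored anywhere in the data.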
You might ask yourself whether this is not a “chicken and egg” problem. If each transaction can only spend bitcoins that are present in previously unspent transaction outputs, where do all the bitcoins initially come from? The answer is again provided by the process of mining, where special bitcoin transactions called coinbase transactions are generated that only have an output (and a dummy input) so that bitcoin supply is created.
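The dummy input of a coinbase transaction can be recognized by what it refers to: in the reference implementation, it carries a previous transaction ID consisting only of zero bytes and an output index of 0xffffffff. A small sketch of that check (the function name `is_coinbase_input` is made up for this illustration):

```python
NULL_TXID = "00" * 32      # 32 zero bytes, written as hex
NULL_INDEX = 0xFFFFFFFF    # largest value of a 32 bit unsigned integer


def is_coinbase_input(prev_txid, prev_index):
    """True if the input points at "nothing", as the coinbase dummy input does."""
    return prev_txid == NULL_TXID and prev_index == NULL_INDEX
```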
With this preparation, let us now take a look at the source code of the reference implementation. The transaction data structure is defined in primitives/transaction.h. After removing some comments and constants, the relevant part of the code is
    class CTransaction {
    public:
        const int32_t nVersion;
        const std::vector<CTxIn> vin;
        const std::vector<CTxOut> vout;
        const uint32_t nLockTime;
        // ...
    };
We see that in addition to the attributes that we expect – a vector of transaction inputs and a vector of transaction outputs – a transaction has two additional attributes. The first attribute is the version number. The current version number is 2 (CURRENT_VERSION in the header file), but you will also find older transactions with version number 1. The second additional field is the locktime, which can be used to define an earliest time (or block) at which the transaction can be added to the chain.
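Whether a locktime value denotes a block or a point in time is decided by its magnitude: the reference implementation treats values below 500,000,000 as block heights and values at or above that threshold as Unix timestamps, while zero means that the transaction is not locked. A hedged sketch of that rule (the function `describe_locktime` is an illustrative name, and the sketch ignores the interplay with the input sequence numbers):

```python
LOCKTIME_THRESHOLD = 500_000_000  # boundary between block heights and timestamps


def describe_locktime(n_lock_time):
    """Return a human readable interpretation of a locktime value."""
    if n_lock_time == 0:
        return "not locked"
    if n_lock_time < LOCKTIME_THRESHOLD:
        return f"block height {n_lock_time}"
    return f"unix time {n_lock_time}"


examples = [describe_locktime(v) for v in (0, 511_599, 1_600_000_000)]
```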
In the same header file, the transaction input and the transaction output structure are defined. We start with the transaction output.
    class CTxOut {
    public:
        CAmount nValue;
        CScript scriptPubKey;
        // ...
    };
We see that a transaction output consists of a value that represents the amount that the transaction transfers, and a field called scriptPubKey which essentially contains the hash of the public key of the recipient – we will look at scripts in the bitcoin protocol in more detail in a later post. The definition of CAmount is located in amount.h:
    /** Amount in satoshis (Can be negative) */
    typedef int64_t CAmount;

    static const CAmount COIN = 100000000;
    static const CAmount CENT = 1000000;
The amount is thus specified in a unit called Satoshi, which is the smallest unit of bitcoin that can be transferred. The constant COIN is the number of Satoshi that make up one bitcoin and is 10^8.
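When converting between bitcoin and Satoshi in your own code, it is safer to avoid floating point numbers. The following sketch uses Python's `decimal` module; the helper names `btc_to_satoshi` and `satoshi_to_btc` are my own and do not appear in the reference implementation.

```python
from decimal import Decimal

COIN = 100_000_000  # Satoshi per bitcoin, as in amount.h


def btc_to_satoshi(btc):
    """Convert a BTC amount (string or Decimal) into an integer number of Satoshi."""
    satoshi = Decimal(btc) * COIN
    if satoshi != satoshi.to_integral_value():
        raise ValueError("amount is finer than one Satoshi")
    return int(satoshi)


def satoshi_to_btc(satoshi):
    """Convert an integer number of Satoshi into a Decimal BTC amount."""
    return Decimal(satoshi) / COIN


one_point_two = btc_to_satoshi("1.2")
one_and_a_half = satoshi_to_btc(150_000_000)
```

Passing the amount as a string rather than a float keeps values such as 1.2 exact.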
The definition of the transaction input is similar. Ignoring a feature called segregated witness, the relevant part is
    class CTxIn {
    public:
        COutPoint prevout;
        CScript scriptSig;
        uint32_t nSequence;
        // ...
    };
Here COutPoint is a class that refers to a previous transaction output – the spent transaction output – as expected from the picture above. It consists of the ID (i.e. hash value) of the transaction that contains the previous output as well as the index of the output in the vector of all transaction outputs of this transaction. The attribute scriptSig contains, roughly speaking, a signature of the entire transaction that has been produced with the private key that belongs to the public key referenced in the spent transaction output. Finally, the field nSequence is called the sequence number and can be used in combination with the locktime – again we will not get into details on this in this post yet.
The image above summarizes what we have learned so far about the structure of a bitcoin transaction. Time to take a look at a real transaction. The page blockchain.info offers an API to retrieve transactions from the bitcoin blockchain. So let us open a terminal and enter
$ curl https://blockchain.info/en/tx/ed70b8c66a4b064cfe992a097b3406fa81ff09641fe55a709e4266167ef47891?format=hex
0200000003620f7bc1087b0111f76978ef747001e3ae0a12f254cbfb858f028f891c40e5f6010000006a47304402207f5dfc2f7f7329b7cc731df605c83aa6f48ec2218495324bb4ab43376f313b840220020c769655e4bfcc54e55104f6adc723867d9d819266d27e755e098f646f689d0121038c2d1cbe4d731c69e67d16c52682e01cb70b046ead63e90bf793f52f541dafbdfefffffff15fe7d9e0815853738ce47deadee69339e027a1dfcfb6fa887cce3a72626e7b010000006a47304402203202e6c640c063989623fc782ac1c9dc3c6fcaed996d852ec876749ba63db63b02207ef86e262ad4b4bc9cebfadb609f52c35b0105e15d58a5ecbecc5e536d3a8cd8012103dc526ca188418ab128d998bf80942d66f1b3be585d0c89bd61c533bddbdaa729feffffff84e6431db86833897bab333d844486c183dd01e69862edea442e480c2d8cb549010000006a47304402200320bc83f35ceab4a7ef0f8181eedb5f54e3f617626826cc49c8c86efc9be0b302203705889d6aed50f716b81b0f3f5769d72d1b8a6b59d1b0b73bcf94245c283b8001210263591c21ce8ee0d96a617108d7c278e2e715ac6d8afd3fcd158bee472c590068feffffff02ca780a00000000001976a914811fb695e46e2386501bcd70e5c869fe6c0bb33988ac10f59600000000001976a9140f2408a811f6d24ab1833924d98d884c44ecee8888ac6fce0700
The curl command (you could also open the URL in a browser) retrieves a transaction in raw hexadecimal format, specifying the transaction ID as an input (the transaction ID is essentially a hash of the transaction, more on this later). The output is a hexadecimal string containing the requested transaction. This is called a serialized version of the transaction and is important for a number of reasons. First, this is the format that goes over the wire if two nodes in the bitcoin network exchange the transaction. Second, when a transaction is signed, included in a block or when its transaction ID is formed, this serialized representation is the basis. In the next post in this series, we will learn how to decode this representation to match it with the structure described above.
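Without decoding the whole string, we can already try two things in Python. The sketch below assumes a transaction without segregated witness data: in that case the transaction ID is the double SHA-256 hash of the serialized bytes, displayed in reversed byte order, the first four bytes hold the version and the last four bytes hold the locktime, both in little-endian order. The function names `txid_from_hex` and `peek_version_and_locktime` are made up for this illustration.

```python
import hashlib


def txid_from_hex(raw_hex):
    """Double SHA-256 of the serialized transaction, in reversed byte order."""
    raw = bytes.fromhex(raw_hex)
    digest = hashlib.sha256(hashlib.sha256(raw).digest()).digest()
    return digest[::-1].hex()


def peek_version_and_locktime(raw_hex):
    """Read the two fixed-position fields of a serialized transaction."""
    raw = bytes.fromhex(raw_hex)
    version = int.from_bytes(raw[:4], "little")
    lock_time = int.from_bytes(raw[-4:], "little")
    return version, lock_time


# A shortened placeholder, NOT a valid transaction: version, two count bytes, locktime.
sample = "02000000" + "0000" + "6fce0700"
version, lock_time = peek_version_and_locktime(sample)
```

If you paste the full output of the curl command into `txid_from_hex`, the result should reproduce the transaction ID used in the URL, provided the string was copied completely.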
Until then, you might want to play around with the transaction browser at blockchain.info – just remove the ?format=hex from the URL above and you will get a human readable version of the transaction which will allow you to locate some of the elements (inputs, outputs and amounts) discussed in this post.