Running and using Go-Ethereum

So far, we have mostly been using the Brownie development environment for our tests, and with it the Ganache Ethereum client that Brownie runs behind the scenes. For some applications, it is useful to have other clients at our disposal. The Go-Ethereum client (geth) is the most commonly used client at the time of writing, and today we take a slightly more detailed look at how to run and configure it.

Getting started

We have already installed the geth binary in a previous post, so I assume that it is still on your path. However, geth is evolving quickly – when I started this series, I used version 1.10.6, but in the meantime, 1.10.8 has been released, which contains an important bugfix (for this vulnerability which we have analyzed in depth in a previous post), so let us use this version going forward. Please head over to the download page, get the archive for version 1.10.8 for your platform (here is the link for Linux AMD64), extract the archive, copy the geth binary to a location on your path and make it executable.

Geth stores blockchain data and a keystore in a data directory which can be specified via a command line flag. We will use ~/.ethereum, and it is easier to follow along if you start over with a clean setup, so if you have used geth before, you might want to delete the contents of the data directory before proceeding. To actually start the client, we can then use the command

geth --dev \
     --datadir=$HOME/.ethereum \
     --http \
     --http.corsdomain="*" \
     --http.vhosts="*" \
     --http.addr="0.0.0.0" \
     --http.api="eth,web3"

For the sake of convenience, I have put that command into a script in the tools directory of my repository, so if you have a clone of the repository, you can simply run ./tools/run_geth. Let us go through the individual flags that we have used and try to understand what they imply.

First, there is the --dev flag. If you start geth with this flag, the client will not try to connect to any peers. Instead, it will create a local blockchain, with a genesis block which is created on the fly (here). In addition, geth will create a so-called developer account (or re-use an existing one). This account shows up in several places. It will be the first account in the keystore managed by geth and therefore the first account that the API method eth_accounts will return. This account (or rather the associated address) will also be used as the etherbase, i.e. as the address to which mined Ether will be credited. Finally, the genesis block contains an allocation of 2^256 – 9 Wei for this account (it will also contain allocations for the nine pre-compiled contracts).

The next flag is the data directory, which we have already discussed. The following couple of flags are more interesting and configure the HTTP endpoint. Geth offers APIs over three different channels – HTTP, WebSockets (WS) and a local Unix domain socket (IPC). Whereas the IPC endpoint is enabled by default, the other two are initially disabled, and the http flag enables the HTTP endpoint.

The next three flags are important, as they determine who is allowed to access this API. First, http.addr is the address on which the client will be listening. By default, this is the local host (i.e. 127.0.0.1), which implies that the client cannot be reached from the outside world. In particular, this will not work if you run geth inside a Docker container or virtual machine. Specifying 0.0.0.0 as in the example above allows everybody on the local network to connect to your client – this is of course not a particularly secure setup, so modify this if you are not located on a secure and private network.

In addition, geth also uses the concept of a virtual host to validate requests. Recall that RFC 7230 defines the HTTP header field Host, which is typically used to allow different domains to be served by one web server running on a single IP address. This field is added by web browsers to requests that are the result of a navigation, but also to requests generated by JavaScript code running in the browser. When serving an incoming request, geth will validate the content of this header field against a list configured via the http.vhosts flag. This flag defaults to “localhost”. Thus, if you want to serve requests from outside, maybe from a web browser running on a different machine in your local network, you have to set this flag, either to the domain name of your server or to the wildcard “*” to accept incoming requests regardless of the value of the Host header.

Finally, there is the CORS domain flag http.corsdomain. CORS is the abbreviation for cross-origin resource sharing and refers to the mechanism by which browsers control requests made by JavaScript code that was loaded from a different domain (the origin). Before sending such a request, browsers ask the server upfront whether it will accept the request by submitting a so-called pre-flight request. When we develop our frontend later on, we will need to make sure that this pre-flight request is successful, so we need to include the domain from which we will load our JavaScript code in the list that we configure here, or, alternatively, use a wildcard here as well. If you want to learn more about CORS, you might want to read this documentation on the Mozilla developer network.

The last flag that we use determines which of the APIs that geth offers will be made available via the HTTP endpoint. The most important API is the eth API, which contains all the methods that we need to submit and read transactions, get blocks or interact with smart contracts. In addition, it might be helpful to enable the debug API, which we will use a bit later when tracing transactions. There are also a few APIs which you almost never want to make available over the network like the personal API which allows you to access the accounts maintained by the client, or the admin API.
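To quickly verify that the endpoint is up and serves the APIs we have selected, we can already use the web3 Python library that we will meet again later in this series. This is a minimal sketch, assuming geth is running on the same machine with the flags shown above.

import web3

# connect to the HTTP endpoint that we have just enabled
w3 = web3.Web3(web3.HTTPProvider("http://127.0.0.1:8545"))
print(w3.isConnected())     # True if the HTTP endpoint is reachable
print(w3.clientVersion)     # served by the web3 API that we have enabled
print(w3.eth.chain_id)      # served by the eth API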

Using the geth console

We have just mentioned that there are some APIs which you would typically not want to make accessible via the HTTP endpoint. Instead, you usually access these APIs via the IPC endpoint and the geth console. This is a JavaScript-based interactive console that allows you to invoke API methods and thus interact with a running geth client. To start the console, make sure that the geth client is running, open a separate terminal window and enter

geth attach ~/.ethereum/geth.ipc

Note that the second argument is the Unix domain socket that geth will create in its data directory. To see how the console works, let us play a bit with accounts. At the prompt, enter the following commands.

eth.blockNumber
eth.accounts
eth.getBlockByNumber(0)

The first command will return the block number for the block at the head of the chain. Currently, this is zero – we only have the genesis block, no other blocks. The second command displays the list of accounts managed by the node. You should see one account, which is the developer account mentioned earlier. The third command displays the genesis block, and you will see that the extra data also contains the developer account.

The accounts managed by the node can be controlled using the personal API. An important functionality of this API is that accounts can be locked, so that they can no longer be used. As an example, let us try to lock the developer account.

dev = eth.accounts[0]
personal.lockAccount(dev)

Unlocking the account again is a bit tricky, as this is not allowed while the HTTP endpoint is being served. So to unlock the account again, you will have to shut down geth, start it again without the HTTP flags, attach again and execute the command

personal.unlockAccount(eth.accounts[0], "")

Note the second argument – this is the password that has been used to lock the account (at startup, geth creates the developer account with an empty passphrase; alternatively, a passphrase can be supplied using the --password command line flag).

Finally, let us see how to use the console to create additional accounts and transfer Ether to them.

dev = eth.accounts[0]
alice = personal.newAccount("secret")
value = web3.toWei(1, "ether")
gas = 21000
gasPrice = web3.toWei(1, "gwei")
txn = {
        from: dev, 
        to: alice, 
        gas: gas, 
        gasPrice: gasPrice, 
        value: value
}
hash = eth.sendTransaction(txn)
eth.getTransactionReceipt(hash)
eth.getBalance(alice)

You could now proceed like this and set up a couple of accounts, equipped with Ether, for testing purposes. To simplify this procedure, I have provided a script that sets up several test accounts – if you have cloned the repository, simply run it by typing

python3 tools/GethSetup.py

Geth and Brownie

What about Brownie? Do we have to say good-bye to our good old friend Brownie if we choose to work with geth? Fortunately, the answer is no – Brownie is smart enough to automatically detect a running geth (or, more generally, a running Ethereum client) at startup and use it instead of launching Ganache. Let us try this. Make sure that geth is running and start Brownie as usual.

brownie console

At this point, it is important that we have enabled the web3 API when starting geth, as Brownie uses the method web3_clientVersion to verify connectivity at startup. If everything works, Brownie will spit out a warning that the blockchain that it has detected has a non-zero length and greet you with the usual prompt.

Great, so let us transfer some Ether to a new account as we have done it before from the console to see that everything works.

dev = accounts[0]
bob = accounts.add()
dev.transfer(to=bob, amount=web3.toWei(1, "ether"))

Hmm…this does not look good. It appears that Brownie has created a transaction and sent it, but is now waiting for the receipt and does not receive it. To understand the problem, let us switch again to a different terminal and start the geth console again. At the console prompt, inspect the pending transactions by running

txpool

The output should show you that there is one pending transaction (which you can also inspect by using eth.getTransaction) which is not yet included in a block. If you look at this transaction for a second, you will find that there are two things that look suspicious. First, the gas price of the transaction is zero. Second, the gas limit is incredibly high – if you inspect the last block that has been mined, you will find that it is exactly the gas limit of that block.

Why is this a problem? The gas limit for a new block is determined by geth aiming at a certain target value. At the moment, this target value is lower than the gas limit of the genesis block, meaning that geth will try to decrease the gas limit with each new block (the exact algorithm is here). Thus the gas limit for the new block that the miner tries to assemble is lower than that for the previous one and therefore lower than the gas limit of our transaction, so that the transaction will not fit into the block and the miner will ignore it.
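To get a feeling for how fast this happens, here is a small back-of-the-envelope simulation in Python. Geth may only move the gas limit towards its target by a small fraction (at most roughly 1/1024 of the parent gas limit) per block; the concrete genesis limit and target value below are assumptions for the sake of illustration, not values taken from the geth source.

# illustrative only - assumed genesis gas limit and assumed miner target
limit = 11500000
target = 8000000
blocks = 0
while limit > target:
    # decrease by at most limit // 1024 per block, never below the target
    limit = max(target, limit - limit // 1024)
    blocks += 1
print(blocks, limit)   # number of blocks until the target is reached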

Let us try to fix this. First, we need to replace our pending transaction. The easiest way to do this is to use the geth console. What we need to do is to get the transaction from the pool of pending transactions, correct the gas limit and increase the gas price, so that the miner will pick up this transaction instead of the previous one. We also set the value to zero, so that the transaction will effectively be cancelled.

txn = eth.pendingTransactions[0]
txn.gas = eth.estimateGas(txn)
txn.gasPrice = web3.toWei(1, "gwei")
txn.value = 0
eth.sendTransaction(txn)

Note that we did not change the nonce, so our transaction replaces the pending one. After a few seconds, Brownie should note that the transaction has been dropped and stop waiting for a receipt.

The reason for our problem is related to the way Brownie determines the gas limit and gas price to be used for a transaction. When a transaction is created, Brownie tries to figure out a gas limit and gas price from the network configuration. For the gas limit, the default setting is “max”, which instructs Brownie to use the block gas limit of the latest block (which will be cached for one hour). For the gas price, the default is zero. To make Brownie work with geth, we need to adjust both settings. In the console, enter

network.gas_limit("auto")
network.gas_price("auto")

When you now repeat the commands above to transfer Ether, the transaction should go through. For versions of Brownie newer than 1.15.2, however, you will receive an error message saying that the sleep method is not implemented by geth. The transaction will still work (the error comes from this line of the code, which is executed in a separate thread initiated here), so the error message is merely annoying; however, you might want to downgrade to version 1.15.2 if you plan to work with Brownie and geth more often (it appears that the problem was introduced with this commit).

Note that the settings for the gas price and the gas limit that we have just made will be lost when we restart Brownie. In order to make these changes permanent, you can add them to the configuration file for Brownie. Specifically, Brownie will, upon startup, load configuration data from a file called brownie-config.yaml. To set the gas price and the gas limit, create such a file with the following content

networks:
    default: development
    development:
        gas_limit: auto
        gas_price: auto

Here we adjust the configuration for the network development which we also declare as the default network and set the gas limit and the gas price to “auto”, instructing Brownie to determine a good approximation at runtime.

This closes our post for today. We have learned how to run geth in a local development environment, discussed the most important configuration options and seen how we can still use Brownie to work with transactions and contracts. In the next post, we will start to design and build our NFT wallet application and first try to understand the overall architecture.

Implementing and testing an ERC721 contract

In the previous posts, we have discussed the ERC721 standard and how metadata and the actual asset behind a token are stored. With this, we have all the ingredients in place to tackle the actual implementation. Today, I will show you how an NFT contract can be implemented in Solidity and how to deploy and test a contract using Brownie. The code for this post can be found here.

Data structures

As for our sample implementation of an ERC20 contract, let us again start by discussing the data structures that we will need. First, we need a mapping from token ID to the current owner. In Solidity, this would look as follows.

mapping (uint256 => address) private _ownerOf;

Note that we declare this structure as private. This does not affect the functionality, but for a public data structure, Solidity would create a getter function, which blows up the contract size and thus makes deployment more expensive. So it is good practice to avoid public data structures unless you really need the getter.

Now mappings in Solidity have a few interesting properties. In contrast to programming languages like Java or Python, Solidity does not offer a way to enumerate all elements of a mapping – and even if it did, it would be dangerous to do so, as loops like these can increase the gas usage of your contract up to the point where the block limit is reached, rendering it unusable. Thus we cannot simply calculate the balance of an owner by going through all elements of the above mapping and filtering for a specific owner. Instead, we maintain a second data structure that only tracks balances.

mapping (address => uint256) private _balances;

Whenever we transfer a token, we also need to update this mapping to make sure that it is in sync with the first data structure.
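To make this bookkeeping concrete, here is a minimal sketch in Python – just a model of the two mappings, not the Solidity code from the repository, and the names are mine.

# token ID -> current owner, and owner -> number of tokens held
_owner_of = {}
_balances = {}

def transfer(token_id, new_owner):
    old_owner = _owner_of[token_id]
    # both structures need to be updated together to stay consistent
    _balances[old_owner] = _balances.get(old_owner, 0) - 1
    _balances[new_owner] = _balances.get(new_owner, 0) + 1
    _owner_of[token_id] = new_owner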

We also need a few additional mappings to track approvals and operators. For approvals, we again need to know which address is an approved recipient for a specific token ID, thus we need a mapping from token ID to address. For operators, the situation is a bit more complicated. We set up an operator for a specific address (the address on behalf of which the operator can act), and there can be more than one operator for a given address. Thus, we need a mapping that assigns to each address another mapping which in turn maps addresses to boolean values, where True indicates that this address is an operator for the address in the first mapping.

/// Keep track of approvals per tokenID
mapping (uint256 => address) private _approvals; 

/// Keep track of operators
mapping (address => mapping(address => bool)) private _isOperatorFor;

Thus the sender of a message is an operator for an address owner if and only if _isOperatorFor[owner][msg.sender] is true, and the sender of a message is authorized to withdraw a token if and only if _approvals[tokenID] == msg.sender.

Burning and minting a token is now straightforward. To mint, we first check that the token ID does not yet exist. We then increase the balance of the contract owner by one and set the owner of the newly minted token to the contract owner, before finally emitting an event. To burn, we reverse this process – we set the current owner to the zero address and decrease the balance of the current owner. We also reset all approvals for this token. Note that in our implementation, the contract owner can burn all tokens, regardless of the current owner. This is useful for testing, but of course you would not want to do this in production – as a token owner, you would probably not be very amused to see that the contract owner simply burns all your precious tokens. As an aside, if you really want to fire up your own token in production, you would probably want to take a look at one of the available audited and thoroughly tested sample implementations, for instance by the folks at OpenZeppelin.

Modifiers

The methods to approve and make transfers are rather straightforward (with the exception of a safe transfer, which we will discuss separately in a second). If you look at the code, however, you will spot a Solidity feature that we have not used before – modifiers. Essentially, a modifier is what Java programmers might know as an aspect – a piece of code that wraps around a function and is invoked before and after a function in your contract. Specifically, if you define a modifier and add this modifier to your function, the execution of the function will start off by running the modifier code up to the special symbol _ in the modifier source code. At this point, the body of the actual function is executed, and once the function completes, execution continues in the modifier again. Similar to aspects, modifiers are useful for validations that need to be done more than once. Here is an example.

/// Modifier to check that a token ID is valid
modifier isValidToken(uint256 _tokenID) {
    require(_ownerOf[_tokenID] != address(0), _invalidTokenID);
    _;
}

/// Actual function
function ownerOf(uint256 tokenID) external view isValidToken(tokenID) returns (address)  {
    return _ownerOf[tokenID];
}

Here, we declare a modifier isValidToken and add it to the function ownerOf. If now ownerOf is called, the code in isValidToken is run first and verifies the token ID. If the ID is valid, the actual function is executed, if not, we revert with an error.

Safe transfers and the code size

Another Solidity feature that we have not yet seen before is used in the function _isContract. This function is invoked when a safe transfer is requested. Recall from the standard that a safe transfer needs to check whether the recipient is a smart contract and, if so, invoke its onERC721Received method. Unfortunately, Solidity does not offer an operation to figure out whether an address is the address of a smart contract. We therefore need to use inline assembly to be able to directly run the EXTCODESIZE opcode. This opcode returns the size of the code of a given address. If this is different from zero, we know that the recipient is a smart contract.

Note, however, that if the code size is zero, the recipient might in fact still be a contract. To see why, suppose that a contract calls our NFT contract within its constructor. As the code is copied to its final location only after the constructor has executed, the code size is still zero at this point. In fact, there is no fully reliable way to figure out whether an address is that of a smart contract in all cases, and even the ERC-721 specification itself states that the check for the onERC721Received method should be done if the code size is different from zero, accepting this remaining uncertainty.
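As an aside, the same check can be done off-chain with web3.py – the RPC method eth_getCode returns the deployed bytecode at an address, which is empty for an externally owned account. This is just a rough sketch of the analogue of EXTCODESIZE, assuming a client is listening on the usual local port.

import web3

w3 = web3.Web3(web3.HTTPProvider("http://127.0.0.1:8545"))
# an externally owned account has no code, so this prints 0
code = w3.eth.get_code(w3.eth.get_accounts()[0])
print(len(code))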

Inline assembly is fully documented here. The code inside the assembly block is actually what is known as Yul – an intermediate, low-level language used by Solidity. Within the assembly code, you can access local variables, and you can use most EVM opcodes directly. Yul also offers loops, switches and some other high-level constructs, but we do not need any of this in our simple example.

Once we have the code size and know that our recipient is a smart contract, we have to call its onERC721Received method. The easiest way to do this in Solidity is to use an interface. As in other programming languages, an interface simply declares the methods of a contract, without providing an implementation. Interfaces cannot be instantiated directly. Given an address, however, we can convert this address to an instance of an interface, as in our example.

interface ERC721TokenReceiver
{
  function onERC721Received(address, address, uint256, bytes calldata) external returns(bytes4);
}

/// Once we have this, we can access a contract with this interface at 
/// address to
ERC721TokenReceiver erc721Receiver = ERC721TokenReceiver(to);
bytes4 retval = erc721Receiver.onERC721Received(operator, from, tokenID, data);

Here, we have an address to and assume that at this address, a contract implementing our interface is residing. We then convert this address to an instance of a contract implementing this interface, and can then access its methods.

Note that this is a pure compile-time feature – this code will not actually create a contract at the address, but will simply assume that a contract with that interface is present at the target location. Of course, we cannot know at compile time whether this is really the case. The compiler can, however, prepare a call with the correct function signature, and if this method is not implemented, we will most likely end up in the fallback function of the target contract. This is the reason why we also have to check the return value, as the fallback function might of course execute successfully even if the target contract does not implement onERC721Received.

Implementing the token URI method

The last part of the code which is not fully straightforward is the generation of the token URI. Recall that this is in fact the location of the token metadata for a given token ID. Most NFT contracts that I have seen build this URI from a base URI followed by the token ID, and I have adopted this approach as well. The base URI is specified when we deploy the contract, i.e. as a constructor argument. However, converting the token ID into a string is a bit tricky, because Solidity again does not offer a standard way to do this. So you either have to roll your own conversion or use one of the existing implementations. I have used the code from this OpenZeppelin library to do the conversion. The code is not difficult to read – we first determine the number of digits that our number has by dividing by ten until the result is less than one (and hence zero – recall that we are dealing with integers) and then go through the digits one by one and convert them individually.
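Translated into Python, the algorithm looks roughly like this – a sketch of the approach, not the Solidity code itself.

def uint_to_string(value):
    if value == 0:
        return "0"
    # first pass: count the digits by repeatedly dividing by ten
    digits = 0
    temp = value
    while temp != 0:
        digits += 1
        temp //= 10
    # second pass: fill the buffer digit by digit
    buffer = [""] * digits
    while value != 0:
        digits -= 1
        buffer[digits] = chr(ord("0") + value % 10)
        value //= 10
    return "".join(buffer)

print(uint_to_string(4711))   # '4711'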

Interfaces and the ERC165 standard

Our smart contract implements a couple of different interfaces – ERC-721 itself and the metadata extension. As mentioned above, interfaces are a compile-time feature. To improve type-safety at runtime, it would be nice to have a feature that allows a contract to figure out whether another contract implements a given interface. To solve this, EIP-165 has been introduced. This standard does two things.

First, it defines how a hash value can be assigned to an interface. The hash value of an interface is obtained by taking the 4-byte function selectors of each method that the interface implements and then XOR’ing these bytes. The result is a sequence of four bytes.

Second, it defines a method that each contract should implement that can be used to inquire whether a contract implements an interface. This method, supportsInterface, accepts the four-byte hash value of the requested interface as an argument and is supposed to return true if the interface is supported.

This can be used by a contract to check whether another contract implements a given interface. The ERC-721 standard actually mandates that a contract that implements the specification should also implement EIP-165. Our contract does this as well, and its supportsInterface method returns true if the requested interface ID is

  • 0x01ffc9a7, which corresponds to ERC-165 itself
  • 0x80ac58cd which is the hash value corresponding to ERC-721
  • 0x5b5e139f which is the hash value corresponding to the metadata extension
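If you want to double-check these values, you can recompute them in Python with the help of web3.py. The sketch below XORs the four-byte selectors of the interface functions; assuming I have listed the canonical ERC-721 signatures correctly, it should reproduce the IDs above.

from web3 import Web3

def selector(signature):
    # the 4-byte function selector is the first four bytes of the keccak hash
    return int.from_bytes(Web3.keccak(text=signature)[:4], "big")

def interface_id(signatures):
    # XOR all selectors, as defined by EIP-165
    result = 0
    for s in signatures:
        result ^= selector(s)
    return "0x" + format(result, "08x")

print(interface_id(["supportsInterface(bytes4)"]))   # 0x01ffc9a7
erc721 = [
    "balanceOf(address)",
    "ownerOf(uint256)",
    "safeTransferFrom(address,address,uint256,bytes)",
    "safeTransferFrom(address,address,uint256)",
    "transferFrom(address,address,uint256)",
    "approve(address,uint256)",
    "setApprovalForAll(address,bool)",
    "getApproved(uint256)",
    "isApprovedForAll(address,address)",
]
print(interface_id(erc721))                          # 0x80ac58cd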

Testing, deploying and running our contract

Let us now discuss how we can test, deploy and run our contract. First, there is of course unit testing. If you have read my post on Brownie, the unit tests will not be much of a surprise. There are only two remarks that might be in order.

First, when writing unit tests with Brownie and using fixtures to deploy the required smart contracts, we have a choice between two different approaches. One approach would be to declare the fixtures as function-scoped, so that they are run over and over again for each test case. This has the advantage that we start with a fresh copy of the contract for each test case, but it is of course slow – if you run 30 unit tests, you conduct 30 deployments. Alternatively, we can declare the fixture as session-scoped. It will then only be executed once per test session, so that every test case uses the same instance of the contract under test. If you do this, be careful to clean up after each test case. One disadvantage of this approach remains, though – if the execution of one test case fails, all test cases run after the failing one will most likely fail as well, because the cleanup is skipped for the failed test case. Be aware of this and do not panic if all of a sudden almost all of your test cases fail (the -x switch to Brownie could be your friend if this happens, so that Brownie exits as soon as the first test case fails).

A second remark concerns mocks. To test a safe transfer, we need a target contract with a predictable behavior. This contract should implement the onERC721Received method, be able to return a correct or an incorrect magic value and allow us to check whether it has been called. I have therefore included a mock that serves this purpose and which is also deployed via a fixture.

To run the unit tests that I have provided, simply clone my repository, make sure you are located in the root of the repository and run the tests via Brownie.

git clone https://github.com/christianb93/nft-bootcamp.git
cd nft-bootcamp
brownie test tests/test_NFT.py

Do not forget to first activate your Python virtual environment if you have installed Brownie or any of the libraries that it requires in a virtual environment.

Once the unit tests pass, we can start the Brownie console, which will, as we know, automatically compile all contracts in the contracts directory. To deploy the contract, run the following commands from the Brownie console.

owner = accounts[0]
# Deploy - the constructor argument is the base URI
nft = owner.deploy(NFT, "http://localhost:8080/")

Let us now run a few tests. We will mint a token with ID 1, pick a new account, transfer the token to this account, verify that the transfer works and finally get the token URI.

alice = accounts[1]
nft._mint(1)
assert(owner == nft.ownerOf(1))
nft.transferFrom(owner, alice, 1)
assert(alice == nft.ownerOf(1))
nft.tokenURI(1)

I invite you to play around a bit with the various functions that the NFT contract offers – declare an operator, approve a transfer, or maybe test some validations. In the next few posts, we will start to work towards a more convenient way to play with our NFT – a frontend written using React and web3.js. Before we are able to work on this, however, it is helpful to expand our development environment a bit by installing a copy of geth, and this is what the next post will be about. Hope to see you there.

Basic structure of a token and the ERC20 standard

What is a token? The short answer is that a token is a smart contract that records and manages ownership in a digital currency. The long answer is in this post.

Building a digital currency – our first attempt

Suppose for a moment you wanted to issue a digital currency and were thinking about the design of the required software package. Let us suppose further that you have never heard of a blockchain before. What would you probably come up with?

First, you would somehow need to record ownership. In other words, you will have to store balances somewhere and associate a balance with every participant or client in the system, represented by an account. In a traditional IT landscape, this would imply that somewhere, you fire up a database, maybe a relational database, that has a table with one row per account holding the current balance of this account.

Next, you would need a function that allows you to query the balance, something like balanceOf, to which you pass an account and which returns the balance of this account, looking up the value in the database. And finally, you would want to make transfers. So you would have a method transfer which the owner of an account can use to transfer a certain amount to another account. This method would of course have to verify that whoever calls it (say you expose it as an API) is the holder of the account from which the transfer is made, which could be done using certificates or digital signatures and is well possible with traditional technology. So your first design would be rather minimalistic.

This would probably work, but is not yet very powerful. Let us add a direct debit functionality, i.e. let us allow users to withdraw a pre-approved amount of money from an account. To realize this, you could come up with a couple of additional functions.

  • First, you would add a function approve() that the owner of an account can invoke to grant permission to someone else (identified again by an account) to withdraw a certain amount of currency
  • You would probably also want to store these approvals in the database and add a function approval() to read them out
  • Finally, you would add a second function – say transferFrom – to allow withdrawals

Updating your diagram, you would thus arrive at the following design for your digital currency.

This is nice, but it still has a major drawback – someone will eventually need to operate the database and the application, and could theoretically manipulate balances and allowances directly in the database, bypassing all authorizations. The same person could also try to manipulate the code, building in backdoors or hidden transfers. So this system only qualifies as an acceptable digital currency if it is embedded into a system of regulations and audits that tries to avoid these sort of manipulations.

The ERC20 token standard

Now suppose further that you are still sitting at your desk and scratching your head, thinking about this challenge, when someone walks into your office and tells you that a smart person has just invented a technology called blockchain that allows you to store data in a way that makes it extremely hard to manipulate and also allows you to store and run immutable programs called smart contracts. Chances are that this would sound like the perfect solution to you. You would dig into this new thing, write a smart contract that stores balances and approvals in its storage on the blockchain and whose methods implement the functions that appear in your design, and voila – you have implemented your first token.

Essentially, this is how a token works. A token is a smart contract that, in its storage, maintains a data structure mapping accounts to balances, and offers methods to transfer digital currency between accounts, thus realizing a digital currency on top of Ethereum. These “sub-currencies” were among the first applications of smart contracts, and attempts to standardize these contracts started as early as 2015, shortly after the Ethereum blockchain was launched (see for instance this paper cited by the later standard). The final standard is now known as ERC20.

I strongly advise you to take a look at the standard itself, which is actually quite readable. In addition to the functions that we have already discussed, it defines a few optional methods to read token metadata (like its name, a symbol and how to display decimals), a totalSupply method that returns the total number of tokens issued, and events that fire when approvals are made or tokens are transferred. Here is an overview of the components of the standard.

Note that it is up to the implementation whether the supply of tokens is fixed or whether tokens can be created (“minted”) or burnt. The standard does, however, specify that an implementation should emit an event if this happens.
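To see what this looks like in practice, here is a quick tour through these methods from the Brownie console – a sketch that assumes you have deployed the sample token from this post as token (as we will do further below) and that it implements the optional metadata methods.

token.name()           # optional metadata - the name of the token
token.symbol()         # the symbol, e.g. MTK
token.decimals()       # number of decimals used for display purposes
token.totalSupply()    # total number of tokens in existence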

Coding a token in Solidity

Let us now discuss how to implement a token according to the ERC20 standard in Solidity. The code for our token can be found here. Most of it is straightforward, but there are a couple of features of the Solidity language that we have not yet come across and that require explanation.

First, let us think about the data structures that we will need. Obviously, we somehow need to store balances per account. This calls for a mapping, i.e. a set of key-value pairs, where the keys are addresses and the value for a given address is the balance of this address, i.e. the number of tokens owned by this address. The value can be described by an integer, say a uint256. The address is not simply a string, as Solidity has a special data type called address. So the data structure to hold the balances is declared as follows.

mapping(address => uint256) private _balances;

Mappings in Solidity are a bit special. First, there is no way to visit all elements of a mapping, i.e. there is nothing like x.keys() to get a list of all keys that appear in the mapping. Solidity will also allow you to access an element of a mapping that has not been initialized; this will return the default value for the respective data type, i.e. zero in our case. Thus, logically, our mapping covers all possible addresses and assigns an initial balance of zero to them.

A similar mapping can be used to track allowances. This is a mapping whose values are again mappings. The first key (the key of the top-level mapping) is the owner of the account, the second key is the address authorized to perform a transfer (called the spender), and the value is the allowance.

mapping (address => mapping (address => uint256)) private _allowance;

The next mechanism that we have not yet seen is the constructor which will be called when the contract is deployed. We use it to initialize some variables that we will need later. First, we store the msg.sender which is one of the pre-defined variables in Solidity and is the address of the account that invoked the constructor, i.e. in our case the account that deployed the contract. Note that msg.sender refers to the address of the EOA or contract that is the immediate predecessor of the contract in the call chain. In contrast to this, tx.origin is the EOA that signed the transaction. In the constructor, we also set up the initial balance of the token owner.

The remaining methods are straightforward, with one exception – validations. Of course, we need to validate a couple of things, for instance that the balance is sufficient to make a transfer. Thus we would check a condition, and, depending on the boolean value of that condition, revert the transaction. This combination is so common that Solidity has a dedicated instruction to do this – require. This accepts a boolean expression and a string, and, if the expression evaluates to false, reverts using the string as return value. Unfortunately, it is currently not possible for an EOA to get access to the return value of a reverted transaction, as this data is not part of the transaction receipt (see EIP-658 and EIP-758 for some background on this), but this is possible if you make a call to the contract.
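As a quick illustration (a hedged sketch, not part of the repository): from the Brownie console, with the token deployed as below, you can trigger the same check as a call instead of a transaction and inspect the resulting error – the exact exception type depends on the Brownie version.

try:
    # a call (no transaction) from an account that holds no tokens
    token.transfer.call(accounts[0], 10**9, {"from": accounts[1]})
except Exception as err:
    print(err)   # should contain the revert reason, e.g. "Insufficient balance"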

This raises an interesting question. In some unit tests, for instance in this one that I have written to test my token implementation, we test whether a transaction reverts by assuming that submitting a transaction raises a Python exception. For instance, the following lines

with brownie.reverts("Insufficient balance"):
    token.transfer(alice.address, value, {"from": me.address});

verify that the transfer method of our token contract actually reverts with the expected message. Now we have just seen that the message is not part of the transaction receipt – how does Brownie know? It turns out that the handling of reverted transactions in the various frameworks is a bit tricky; in this case, it works because we do not provide a gas limit – I will dive a bit deeper into the mechanics of revert in a later post.

Testing the token using MetaMask

Let us now deploy and test our token. If you have not done so, clone my repository, set up a Brownie project directory, add symbolic links to the contracts and test cases and run the tests.

git clone https://github.com/christianb93/nft-bootcamp
cd nft-bootcamp
mkdir tmp
cd tmp
brownie init
cd contracts
ln -s ../../contracts/Token.sol .
cd ../tests
ln -s ../../tests/test_Token.py .
cd ..
brownie test 

Assuming that the unit tests complete successfully, we can use Brownie to deploy a copy of the token as usual (or any of the alternative methods discussed in previous posts)

brownie console
me = accounts[0]
token = Token.deploy({"from": me})
token.balanceOf(me)

At this point, the entire supply of tokens is allocated to the contract owner. To play with MetaMask, we need two additional accounts of which we know the private keys. Let us call them Alice and Bob. Enter the following commands to create these accounts and make sure to write down their addresses and private keys. We also transfer an initial supply of 1000 tokens to Alice and equip Alice with some Ether so that she is able to make transactions.

alice = accounts.add()
alice.private_key
alice.address
bob = accounts.add()
bob.private_key
bob.address
token.transfer(alice, 1000, {"from": me})
me.transfer(alice, web3.toWei(1, "ether"))
alice.balance()

Next, we will import our keys and token into MetaMask. If you have not done this yet, go ahead and install the MetaMask extension for your browser. You will be directed to the extension store for your browser (I have been using Chrome, but Firefox should work as well). Add the extension (you might want to use a separate profile for this). Then follow the instructions to create a new wallet. Set a password to protect your wallet and save the seed phrase somewhere.

You should now see the MetaMask main screen in front of you. At the right hand side at the top of the screen, you should see a switch to select a network. Click on it and select “Localhost 8545” to connect to the – still running – instance of Ganache. Then, click on the icon next to the network selector and import the account of Alice by entering the private key. You should now see a new account (for me, this was “Account 2”) with the balance of 1 ETH.

Next, we will add the token. Click on “Add Token”, collect the contract address from Brownie (token.address) and enter it. You should now see a balance of “10 MTK”. Note how MetaMask uses the decimals – Alice owns 1000 tokens, and we have set the decimals (the return value of token.decimals()) to two, so that MetaMask interprets the last two digits as decimals and displays ten.
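The arithmetic behind this is easy to check in the Brownie console.

raw = token.balanceOf(alice)        # 1000 - the balance in the smallest unit
decimals = token.decimals()         # 2 for our token
print(raw / 10**decimals)           # 10.0 - the value that MetaMask displays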

Now let us use MetaMask to make a transfer – we will send 100 token to Bob. Click on “Send” and enter the address of Bob. Now select 1 MTK (remember the decimals again). Confirm the transaction. After a few seconds, MetaMask should inform you that the transaction has been mined. You will also see that the balance of Alice in ETH has decreased slightly, as she needed to pay for the gas, and her MTK balance has been decreased. Finally, switch back to Brownie and run

token.balanceOf(bob)

to confirm that Bob is now the proud owner of 100 tokens.

Today, we have discussed the structure of a token contract, introduced you to the ERC20 standard, presented an implementation in Solidity and verified that this contract is able to interact with the MetaMask wallet as expected. In the next post, we will discuss a few of the things that can go terribly wrong if you implement smart contracts without thinking about the potential security implications of your design decisions.

Compiling and deploying a smart contract with geth and Python

In our last post, we have been cheating a bit – I have shown you how to use the web3 Python library to access an existing smart contract, but in order to compile and deploy, we have still been relying on Brownie. Time to learn how this can be done with web3 and the Python-Solidity compiler interface as well. Today, we will also use the Go-Ethereum client for the first time. This will be a short post and the last one about development tools before we then turn our attention to token standards.

Preparations

To follow this post, there are again a couple of preparatory steps. If you have read my previous posts, you might already have completed some of them, but I have decided to list them here once more, in case you are just joining us or start over with a fresh setup. First, you will have to install the web3 library (unless, of course, you have already done this before).

sudo apt-get install python3-pip python3-dev gcc
pip3 install web3

The next step is to install the Go-Ethereum (geth) client. As the client is written in Go, it comes as a single binary file, which you can simply extract from the distribution archive (which also contains the license) and copy to a location on your path. As we have already put the Brownie binary into .local/bin, I have decided to go with this as well.

cd /tmp
wget https://gethstore.blob.core.windows.net/builds/geth-linux-amd64-1.10.6-576681f2.tar.gz
gzip -d geth-linux-amd64-1.10.6-576681f2.tar.gz
tar -xvf  geth-linux-amd64-1.10.6-576681f2.tar
cp geth-linux-amd64-1.10.6-576681f2/geth ~/.local/bin/
chmod 700 ~/.local/bin/geth
export PATH=$PATH:$HOME/.local/bin

Once this has been done, it is time to start the client. We will talk more about the various options and switches in a later post, when we will actually use the client to connect to the Rinkeby testnet. For today, you can use the following command to start geth in development mode.

geth --dev --datadir=~/.ethereum --http

In this mode, geth will be listening on port 8545 of your local PC and bring up a local, single-node blockchain, quite similar to Ganache. New blocks will automatically be mined as needed, regardless of the gas price of your transactions, and one account will be created which is unlocked and at the same time the beneficiary of newly mined blocks (so do not worry, you have plenty of Ether at your disposal).

Compiling the contract

Next, we need to compile the contract. Of course, this comes down to running the Solidity compiler, so we could go ahead, download the compiler and run it. To do this with Python, we could of course invoke the compiler as a subprocess and collect its output, thus effectively wrapping the compiler into a Python class. Fortunately, someone else has already done all of the hard work and created such a wrapper – the py-solc-x library (a fork of a previous library called py-solc). To install it and to instruct it to download a specific version of the compiler, run the following commands (this will install the compiler in ~/.solcx)

pip3 install py-solc-x
python3 -m solcx.install v0.8.6
~/.solcx/solc-v0.8.6 --version

If the last command spits out the correct version, the binary is installed and we are ready to use it. Let us try this out. Of course, we need a contract – we will use the Counter contract from the previous posts again. So go ahead, grab a copy of my repository and bring up an interactive Python session.

git clone https://github.com/christianb93/nft-bootcamp
cd nft-bootcamp
ipython3

How do we actually use solcx? The wrapper offers a few functions to invoke the Solidity compiler. We will use the so-called JSON input-output interface. With this approach, we need to feed a JSON structure into the compiler, which contains information like the code we want to compile and the output we want the compiler to produce, and the compiler will spit out a similar structure containing the results. The solcx package offers a function compile_standard which wraps this interface. So we need to prepare the input (consult the Solidity documentation to better understand what the individual fields mean), call the wrapper and collect the output.

import solcx
source = "contracts/Counter.sol"
file = "Counter.sol"
spec = {
        "language": "Solidity",
        "sources": {
            file: {
                "urls": [
                    source
                ]
            }
        },
        "settings": {
            "optimizer": {
               "enabled": True
            },
            "outputSelection": {
                "*": {
                    "*": [
                        "metadata", "evm.bytecode", "abi"
                    ]
                }
            }
        }
    };
out = solcx.compile_standard(spec, allow_paths=".");

The output is actually a rather complex data structure. It is a dictionary that contains the contracts created as a result of the compilation as well as a reference to the source code. The contracts are again structured by source file and contract name. For each contract, we have the ABI, a structure called evm that contains the bytecode as well as the corresponding opcodes, and some metadata like the details of the compiler version used. Let us grab the ABI and the bytecode that we will need.

abi = out['contracts']['Counter.sol']['Counter']['abi']
bytecode = out['contracts']['Counter.sol']['Counter']['evm']['bytecode']['object']

Deploying the contract

Let us now deploy the contract. First, we will have to import web3 and establish a connection to our geth instance. We have done this before for Ganache, but there is a subtlety explained here – the PoA implementation that geth uses has extended the length of the extra data field of a block. Fortunately, web3 ships with a middleware that we can use to perform a mapping between this block layout and the standard.

import web3
w3 = web3.Web3(web3.HTTPProvider("http://127.0.0.1:8545"))
from web3.middleware import geth_poa_middleware
w3.middleware_onion.inject(geth_poa_middleware, layer=0)

Once the middleware is installed, we first get an account that we will use – this is the first and only account managed by geth in our setup, and it is the coinbase account with plenty of Ether in it. Now we want to create a transaction that deploys the smart contract. Theoretically, we know how to do this. We need a transaction that has the bytecode as data and the zero address as to address. We could probably prepare this manually, but things are a bit more tricky if the contract has a constructor which takes arguments (we will need this later when implementing our NFT). Instead of going through the process of encoding the arguments manually, there is a trick – we first build a local copy of the contract which is not yet deployed (and therefore has no address, so that calls to it will fail – try it), then call its constructor() method to obtain a ContractConstructor (this is where the arguments would go) and then invoke its method buildTransaction to get a transaction that we can use. We can then send this transaction (if, as in our case, the account we want to use is managed by the node) or sign and send it as demonstrated in the last post.

me = w3.eth.get_accounts()[0];
temp = w3.eth.contract(bytecode=bytecode, abi=abi)
txn = temp.constructor().buildTransaction({"from": me}); 
txn_hash = w3.eth.send_transaction(txn)
txn_receipt = w3.eth.wait_for_transaction_receipt(txn_hash)
address = txn_receipt['contractAddress']

Now we can interact with our contract. As the temp contract is of course not the deployed contract, we first need to get a reference to the actual contract as demonstrated in the previous post – which we can do, as we have the ABI and the address in our hands – and can then invoke its methods as usual. Here is an example.

counter = w3.eth.contract(address=address, abi=abi)
counter.functions.read().call()
txn_hash = counter.functions.increment().transact({"from": me});
w3.eth.wait_for_transaction_receipt(txn_hash)
counter.functions.read().call()

This completes our post for today. Looking back at what we have achieved in the last few posts, we are now proud owners of an entire arsenal of tools and methods to compile and deploy smart contracts and to interact with them. Time to turn our attention away from the simple counter that we have used so far to demonstrate this and towards more complex contracts. With the next post, we will actually get into one of the most exciting use cases of smart contracts – tokens. Hope to see you soon.

Using web3.py to interact with an Ethereum smart contract

In the previous post, we have seen how we can compile and deploy a smart contract using Brownie. Today, we will learn how to interact with our smart contract using Python and the Web3 framework which will also be essential for developing a frontend for our dApp.

Getting started with web3.py

In this section, we will learn how to install web3 and how to use it to talk to an existing smart contract. For that purpose, we will once more use Brownie to run a test client and to deploy an instance of our Counter contract to it. So please go ahead and repeat the steps from the previous post to make sure that an instance of Ganache is running (so do not close the Brownie console) and that there is a copy of the Counter smart contract deployed to it. Also write down the contract address which we will need later.

Of course, the first thing will again be to install the Python package web3, which is as simple as running pip3 install web3. Make sure, however, that you have GCC and the Python development package (python3-dev on Ubuntu) on your machine, otherwise the install will fail. Once this completes, type ipython3 to start an interactive Python session.

Before we can do anything with web3, we of course need to import the library. We can then make a connection to our Ganache server and verify that the connection is established by asking the server for its version string.

import web3
w3 = web3.Web3(web3.HTTPProvider('http://127.0.0.1:8545'))
w3.clientVersion

This is a bit confusing, with the word web3 occurring at no less than three points in one line of code, so let us dig a bit deeper. First, there is the module web3 that we have imported. Within that module, there is a class HTTPProvider. We create an instance of this class that connects to our Ganache server running on port 8545 of localhost. With this instance, we then call the constructor of another class, called Web3, which is again defined inside of the web3 module. This class is dynamically enriched at runtime, so that all namespaces of the API can be accessed via the resulting object w3. You can verify this by running dir(w3) – you should see attributes like net, eth or ens that represent the various namespaces of the JSON RPC API.

Next, let us look at accounts. We know from our previous post that Ganache has ten test accounts under its control. Let us grab one of them and check its balance. We can do this by using the w3 object that we have just created to invoke methods of the eth API, which then translate more or less directly into the corresponding RPC calls.

me = w3.eth.get_accounts()[0]
w3.eth.get_balance(me)

What about transactions? To see how transactions work, let us send 10 Ether to another address. As we plan to re-use this address later, it is a good idea to use an address with a known private key. In the last post, we have seen how Brownie can be used to create an account. There are other tools that do the same thing like clef that comes with geth. For the purpose of this post, I have created the following account.

Address:  0x7D72Df7F4C7072235523A8FEdcE9FF6D236595F3
Key:      0x5777ee3ba27ad814f984a36542d9862f652084e7ce366e2738ceaa0fb0fff350

Let us transfer Ether to this address. To create and send a transaction with web3, you first build a dictionary that contains the basic attributes of the transaction. You then invoke the API method send_transaction. As the key of the sender is controlled by the node, the node will then automatically sign the transaction. The return value is the hash of the transaction that has been generated. Having the hash, you can now wait for the transaction receipt, which is issued once the transaction has been included in a block and mined. In our test setup, this will happen immediately, but in reality, it could take some time. Finally, you can check the balance of the involved accounts to see that this worked.

alice = "0x7D72Df7F4C7072235523A8FEdcE9FF6D236595F3"
value = w3.toWei(10, "ether")
txn = {
  "from": me,
  "to": alice,
  "value": value,
  "gas": 21000,
  "gasPrice": 0
}
txn_hash = w3.eth.send_transaction(txn)
w3.eth.wait_for_transaction_receipt(txn_hash)
w3.eth.get_balance(me)
w3.eth.get_balance(alice)

Again, a few remarks are in order. First, we do not specify the nonce; this will be added automatically by the library. Second, this transaction, using a gas price, is a “pre-EIP-1559” or “pre-London” transaction. With the London hardfork, you would instead specify a maximum fee per gas and a priority fee per gas. As I started to work on this series before London became effective, I will stick to legacy transactions throughout this series. Of course, in a real network, you would also not use a gas price of zero.
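For reference, a post-London transaction would look roughly like this – a sketch only, and whether the node accepts it depends on its London support; the two fee fields take the place of the gasPrice.

txn_1559 = {
  "from": me,
  "to": alice,
  "value": w3.toWei(1, "ether"),
  "gas": 21000,
  "maxFeePerGas": w3.toWei(2, "gwei"),          # absolute ceiling per unit of gas
  "maxPriorityFeePerGas": w3.toWei(1, "gwei")   # tip for the miner
}
txn_hash = w3.eth.send_transaction(txn_1559)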

A second important point to be aware of is timing. When we call send_transaction, we hand the transaction over to the node, which signs it and publishes it on the network. At some point, the transaction is included in a block by a miner, and only then does a transaction receipt become available. This is why we call wait_for_transaction_receipt, which actively polls the node (at least when we are using an HTTP connection) until the receipt is available. There is also a method get_transaction_receipt that will return a transaction receipt directly, without waiting for it, and it is a common mistake to call this too early.

Also, note the conversion of the value. Within a transaction, values are always specified in Wei, and the library contains a few helper functions to easily convert from Wei into other units and back. Finally, note that the gas limit that we use is the standard gas usage of a simple transaction. If the target account is a smart contract and additional code is executed, this will not be sufficient.

Now let us try to get some Ether back from Alice. As the account is not managed by the node, we will now have to sign the transaction ourselves. The flow is very similar. We first build the transaction dictionary. We then use the helper class Account to sign the transaction. This will return a tuple consisting of the hash that was signed, the raw transaction itself, and the r, s and v values from the ECDSA signature algorithm. We can then pass the raw transaction to the eth.send_raw_transaction call.

nonce = w3.eth.get_transaction_count(alice)
refund = {
  "from": alice,
  "to": me,
  "value": value, 
  "gas": 21000,
  "gasPrice": 0,
  "nonce": nonce
}
key = "0x5777ee3ba27ad814f984a36542d9862f652084e7ce366e2738ceaa0fb0fff350"
signed_txn = w3.eth.account.sign_transaction(refund, key)
txn_hash = w3.eth.send_raw_transaction(signed_txn.rawTransaction)
w3.eth.wait_for_transaction_receipt(txn_hash)
w3.eth.get_balance(me)
w3.eth.get_balance(alice)

Note that this time, we need to include the nonce (as it is part of the data which is signed). We use the current nonce of the address of Alice, of course.

Interacting with a smart contract

So far, we have covered the basic functionality of the library – creating, signing and submitting transactions. Let us now turn to smart contracts. As stated above, I assume that you have fired up Brownie and deployed a version of our smart contract. The contract address that Brownie gave me is 0x3194cBDC3dbcd3E11a07892e7bA5c3394048Cc87; as the address only depends on the deploying account and its nonce, you should get the same result as long as the deployment is the first transaction that you have made after restarting Ganache.

To access a contract from web3, the library needs to know how the arguments and return values need to be encoded and decoded. For that purpose, you will have to specify the contract ABI. The ABI – in a JSON format – is generated by the compiler. When we deploy using Brownie, we can access it using the abi attribute of the resulting object. Here is the ABI in our case.

abi = [
    {
        'anonymous': False,
        'inputs': [
            {
                'indexed': True,
                'internalType': "address",
                'name': "sender",
                'type': "address"
            },
            {
                'indexed': False,
                'internalType': "uint256",
                'name': "oldValue",
                'type': "uint256"
            },
            {
                'indexed': False,
                'internalType': "uint256",
                'name': "newValue",
                'type': "uint256"
            }
        ],
        'name': "Increment",
        'type': "event"
    },
    {
        'inputs': [],
        'name': "increment",
        'outputs': [],
        'stateMutability': "nonpayable",
        'type': "function"
    },
    {
        'inputs': [],
        'name': "read",
        'outputs': [
            {
                'internalType': "uint256",
                'name': "",
                'type': "uint256"
            }
        ],
        'stateMutability': "view",
        'type': "function"
    }
]

This looks a bit intimidating, but is actually not so hard to read. The ABI is a list, and each entry describes either an event or a function. For both events and functions, the inputs, i.e. the parameters, are specified, and similarly the outputs are described. Every parameter has a type (Solidity distinguishes between the internal type and the type used for the encoding) and a name. For events, parameters can be indexed. In addition, there are some specifiers for functions, like the information whether a function is a view or not.

Let us start to work with the ABI. Run the command above to import the ABI into a variable abi in your ipython session. Having this, we can now instantiate an object that represents the contract within web3. To talk to a contract, the library needs to know the contract address and its ABI, and these are the parameters that we need to specify.

address = "0x3194cBDC3dbcd3E11a07892e7bA5c3394048Cc87"
counter = w3.eth.contract(address=address, abi=abi)

It is instructive to use dir and help to better understand the object that this call returns. It has an attribute called functions that is a container class for the functions of the contract. Each contract function shows up as a method of this object. Calling this method, however, does not invoke the contract yet, but instead returns an object of type ContractFunction. Once we have this object, we can either use it to make a call or a transaction (this two-step approach reminds me a bit of a prepared statement when using embedded SQL).

Let us see how this works – we will first read out the counter value, then increment by one and then read the value again.

counter.functions.read().call()
txn_hash = counter.functions.increment().transact({"from": me})
w3.eth.wait_for_transaction_receipt(txn_hash)
counter.functions.read().call()

Note how we pass the sender of the transaction to the transact method – we could as well include other parameters like the gas price, the gas limit or the nonce at this point. You can, however, not pass the data field, as the data will be set during the encoding.

Another important point is how parameters to the contract method need to be handled. Suppose we had a method add(uint256) which would allow us to increase the counter not by one, but by some provided value. To increase the counter by x, we would then have to run

counter.functions.add(x).transact({"from": me})

Thus the parameters of the contract method need to be part of the call that creates the ContractFunction, and not be included in the transaction.
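
As a side remark, this two-step approach also combines nicely with the raw transaction flow from the previous section. If the sender is an account that the node does not manage, like Alice, you could first let the library assemble the transaction, including the encoded data field, and then sign and submit it manually. Here is a sketch, using buildTransaction (the web3 v5 spelling) and the key of Alice defined earlier.

# Sketch: build a contract transaction for an account not managed by the node
nonce = w3.eth.get_transaction_count(alice)
txn = counter.functions.increment().buildTransaction({
    "from": alice,
    "nonce": nonce,
    "gas": 100000,
    "gasPrice": 0
})
signed_txn = w3.eth.account.sign_transaction(txn, key)
txn_hash = w3.eth.send_raw_transaction(signed_txn.rawTransaction)
w3.eth.wait_for_transaction_receipt(txn_hash)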

So far we have seen how we can connect to an RPC server, submit transactions, get access to already deployed smart contracts and invoke their functions. The web3 API has a bit more to offer, and I urge you to read the documentation and, in ipython, play around with the built-in help function to browse through the various objects that make up the library. In the next post, we will learn how to use web3 to not only talk to an existing smart contract, but also to compile and deploy a contract.

Fun with Solidity and Brownie

For me, choosing the featured image for a post is often the hardest part of writing it, but today, the choice was clear and I could not resist. But back to business – today we will learn how Brownie can be used to compile smart contracts, deploy them to a test chain, interact with the contract and test it.

Installing Brownie

As Brownie comes as a Python3 package (eth-brownie), installing it is rather straightforward. The only dependency that you have to take care of yourself, because it is not handled by the package manager, is Ganache, which Brownie uses as its built-in node. On Ubuntu 20.04, for instance, you would run

sudo apt-get update
sudo apt-get install python3-pip python3-dev npm
pip3 install eth-brownie
sudo npm install -g ganache-cli@6.12.1

Note that by default, Brownie will install itself in .local/bin in your home directory, so you will probably want to add this to your path.

export PATH=$PATH:$HOME/.local/bin

Setting up a Brownie project

To work correctly, Brownie expects to be executed at the root node of a directory tree that has a certain standard layout. To start from scratch, you can use the command brownie init to create such a tree (do not do this yet, but read on). Brownie will create the following directories.

  • contracts – this is where Brownie expects you to store all your smart contracts as Solidity source files
  • build – Brownie uses this directory to store the results of a compile
  • interfaces – this is similar to contracts, place any interface files that you want to use here (it will become clearer a bit later what an interface is)
  • reports – this directory is used by Brownie to store reports, for instance code coverage reports
  • scripts – used to store Python scripts, for instance for deployments
  • tests – this is where all the unit tests should go

As some of the items that Brownie maintains should not end up in my GitHub repository, I typically create a subdirectory in the repository that I add to the gitignore file, set up a project inside this subdirectory and then create symlinks to the contracts and tests that I actually want to use. If you want to follow this strategy, use the following commands to clone the repository for this series and set up the Brownie project.

git clone https://github.com/christianb93/nft-bootcamp
cd nft-bootcamp
mkdir tmp
cd tmp
brownie init
cd contracts
ln -s ../../contracts/* .
cd ../tests
ln -s ../../tests/* .
cd ..

Note that all further commands should be executed from the tmp directory, which is now the project root directory from Brownie's point of view.

Compiling and deploying a contract

As our first step, let us try to compile our counter. Brownie will happily compile all contracts that are stored in the project when you run

brownie compile

The first time when you execute this command, you will find that Brownie actually downloads a copy of the Solidity compiler that it then invokes behind the scenes. By default, Brownie will not recompile contracts that have not changed, but you can force a recompile via the --all flag.

Once the compile has been done, let us enter the Brownie console. This is actually the tool which you mostly use to interact with Brownie. Essentially, the console is an interactive Python console with the additional functionality of Brownie built into it.

brownie console

The first thing that Brownie will do when you start the console is to look for a running Ethereum client on your machine. Brownie expects this client to sit at port 8545 on localhost (we will learn later how this can be changed). If no such client is found, it will automatically start Ganache and, once the client is up and running, you will see a prompt. Let us now deploy our contract. At the prompt, simply enter

counter = Counter.deploy({"from": accounts[0]})

A couple of comments are in order. First, to make a deployment, we need to provide an account by which the deployment transaction that Brownie will create will be signed. As we will see in the next section, Ganache provides a set of standard accounts that we can use for that purpose. Brownie stores those in the accounts array, and we simply select the first one.

Second, the Counter object that we reference here is an object that Brownie creates on the fly. In fact, Brownie will create such an object for every contract that it finds in the project directory, using the same name as that of the contract. This is a container and does not yet reference a deployed contract, but it offers a method to deploy a contract, and this method returns another object which is now instantiated and points to the newly deployed contract. Brownie will also add methods to a contract object that correspond to the methods of the smart contract that it represents, so that we can simply invoke these methods to talk to the contract. In our case, running

dir(counter)

will show you that the newly created object has methods read and increment, corresponding to those of our contract. So to get the counter value, increment it by one and get the new value, we could simply do something like

# This should return zero
counter.read()
txn = counter.increment()
# This should now return one
counter.read()

Note that by default, Brownie uses the account that deployed the contract as the “from” account of the resulting transaction. This – and other attributes of the transaction – can be modified by including a dictionary with the transaction attributes to be used as the last parameter to the method, i.e. increment in our case.
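
For instance, to submit the increment from a different account and with an explicit gas price, something like the following should work (a sketch; gas_price is, to my understanding, the key that Brownie expects for the gas price).

txn = counter.increment({"from": accounts[1], "gas_price": 0})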

It is also instructive to look at the transaction object that the second statement has created. To read the logs, for instance, you can use

txn.logs

This will again show you the typical fields of a log entry – the address, the data and the topics. To read the interpreted version of the logs, i.e. the events, use

txn.events

The transaction (which is actually the transaction receipt) has many more interesting fields, like the gas limit, the gas used, the gas price, the block number and even a full trace of the execution on the level of single instructions (which seems to be what Geth calls a basic trace). To get a nicely formatted and comprehensive overview over some of these fields, run

txn.info()

Accounts in Brownie

Accounts can be very confusing when working with Ethereum clients, and this is a good point in time to shed some light on this. Obviously, when you want to submit a transaction, you will have to get access to the private key of the sender at some point to be able to sign it. There are essentially three different approaches how this can be done. The naming of these approaches is not standardized; here is the terminology that I prefer to use.

First, an account can be node-managed. This simply means that the node (i.e. the Ethereum client running on the node) maintains a secret store somewhere, typically on disk, and stores the private keys in this secret store. Obviously, clients will usually encrypt the private key and use a password or passphrase for that purpose. How exactly this is done is not formally standardized, but both Geth and Ganache implement an additional API with the personal namespace (see here for the documentation), and also OpenEthereum offers such an API, albeit with slightly different methods. Using this API, a user can

  • create a new account which will then be added to the key store by the client
  • get a list of managed accounts
  • sign a transaction with a given account
  • import an account, for instance by specifying its private key
  • lock an account, which stops the client from using it
  • unlock an account, which allows the client to use it again for a certain period of time

When you submit a transaction to a client using the eth_sendTransaction API method, the client will scan the key store to see whether it has the private key for the requested sender on file. If yes, it is able to sign the transaction and submit it (see for instance the source code here).
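
In web3.py, this API surfaces under the geth.personal namespace of a Web3 instance (such as the web3 object that Brownie exposes in the console). Here is a sketch of how it could be used; whether these calls actually succeed depends on the client and on whether the personal namespace has been enabled on the endpoint, which, for instance, the Geth setup from the first section does not do.

# Sketch only - requires the personal_* RPC methods to be exposed by the client
new_address = web3.geth.personal.new_account("secret")          # create a node-managed account
managed = web3.geth.personal.list_accounts()                    # list all managed accounts
web3.geth.personal.unlock_account(new_address, "secret", 60)    # unlock for 60 seconds
web3.geth.personal.lock_account(new_address)                    # lock it again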

Be very careful when using this mechanism. Even though this is great for development and testing environments, it implies that while the account is unlocked, everybody with access to the API can submit transactions on your behalf! In fact, there are systematic scans going on (see for instance this article) to detect unlocked accounts and exploit them, so do not say I have not warned you….

In Brownie, we can inspect the list of node-managed accounts (i.e. accounts managed by Ganache in our case) using either the accounts array or the corresponding API call.

web3.eth.get_accounts()
accounts

You will see the same list of ten accounts using both methods, with the difference that the accounts array contains objects that Brownie has built for you, while the first method only returns an array of strings.

Let us now turn to the second method – application-managed accounts. Here, the application that you use to access the blockchain (a program, a frontend or a tool like Brownie) is in charge of managing the accounts. It can do so by storing the accounts locally, protected again by some password, or in memory. When an application wants to send a transaction, it now has to sign the transaction using the private key, and would then use the eth_sendRawTransaction method to submit the signed transaction to the network.

Brownie supports this method as well. To illustrate this, enter the following sequence of commands in the Brownie console to create two new accounts, transfer 1000 Ether from one of the test accounts to the first of the newly created accounts and then prepare and sign a transaction that is transferring some Ether to the second account.

# Will spit out a mnemonic
me = accounts.add()
alice = accounts.add()
# Give me some Ether
accounts[0].transfer(to=me.address, amount=1000);
txn = {
  "from": me.address,
  "to": alice.address,
  "value": 10,
  "gas": 21000,
  "gasPrice": 0,
  "nonce": me.nonce
}
txn_signed = web3.eth.account.signTransaction(txn, me.private_key)
web3.eth.send_raw_transaction(txn_signed.rawTransaction)
alice.balance()

When you now run the accounts command again, you will find two new entries representing the two accounts that we have added. However, these entries are now of type LocalAccount, and web3.eth.get_accounts() will show that they have not been added to the node, but are managed by Brownie.

Note that Brownie will not store these accounts on disk unless you tell it to do so. By default, Brownie keeps each local account in a separate file. To save your account, enter

me.save("myAccount")

which will prompt you for a password and then store the account in a file called myAccount. When you now exit Brownie and start it again, you can load the account by running

me = accounts.load("myAccount")

You will then be prompted once more for the password, and assuming that you supply the correct password, the account will again be available.

More or less the same code would apply if you had chosen to go for the third method – user-managed accounts. In this approach, the private key is never stored by the application. The user is responsible for managing accounts, and only if a transaction is to be made, the private key is presented to the application, using a parameter or an input field. The application will never store the account or the private key (of course, it will have to reside in memory for some time), and the user has to enter the private key for every single transaction. We will see an example for this when we deploy a contract using Python and web3 in a later post.

Writing and running unit tests

Before closing this post, let us take a look at another nice feature of Brownie – unit testing. Brownie expects tests to be located in the corresponding subdirectory and to start with the prefix “test_”. Within each file, Brownie will then look for functions prefixed with “test_” as well and run them as unit tests, using pytest.

Let us look at an example. In my repository, you will find a file test_Counter.py (which should already be symlinked into the tests directory of the Brownie project tree if you have followed the instructions above to initialize the directory tree). If you have ever used pytest before, this file contains a few things that will look familiar to you – there are test methods, and there are some fixtures. Let us focus on those parts which are specific to the usage in combination with Brownie. First, there is the line

from brownie import Counter, accounts, exceptions

This imports a few objects and makes them available, similar to the Brownie console. The most important one is the Counter object itself, which will allow us to deploy instances of the counter and test them. Next, we need access to a deployed version of the contract. This is handled by a fixture which uses the Counter object that we have just imported.

@pytest.fixture
def counter():
    return accounts[0].deploy(Counter)

Here, we use the deploy method of an Account in Brownie to deploy, which is an alternative way to specify the account from which the deployment will be done. We can now use this counter object as if we were working in the console, i.e. we can invoke its methods to communicate with the underlying smart contract and check whether they behave as expected. We also have access to other features of Brownie; we can, for instance, inspect the transaction receipt that is returned by a method invocation that results in a transaction, and use it to get and verify events and logs.
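
To give an idea of what such a test could look like, here is a sketch (not the actual content of test_Counter.py) that uses the counter fixture and the imports shown above.

def test_increment(counter):
    # a freshly deployed counter should start at zero
    assert counter.read() == 0
    txn = counter.increment({"from": accounts[0]})
    # the value should have been incremented and an Increment event emitted
    assert counter.read() == 1
    assert "Increment" in txn.events
    assert txn.events["Increment"]["newValue"] == 1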

Once you have written your unit tests, you want to run them. Again, this is easy – leave the console using Ctrl-D, make sure you are still in the root directory of the project tree (i.e. the tmp directory) and run

brownie test tests/test_Counter.py

As you will see, this brings up a local Ganache server and executes your tests using pytest, with the usual pytest output. But you can generate much more information – run

brownie test tests/test_Counter.py --coverage --gas

to receive the output below.

In addition to the test results, you see a gas report, detailing the gas usage of every invoked method, and a coverage report. To get more details on the code, you can use the Brownie GUI as described here. Note, however, that this requires the tkinter package to be installed (on Ubuntu, you will have to use sudo apt-get install python3-tk).

You can also simply run brownie test to execute all tests, but the repository does already contain tests for future blog entries, so this might be a bit confusing (but should hopefully work).

This completes our short introduction. You should now be able to deploy smart contracts and to interact with them. In the next post, we will do the same thing using plain Python and the web3 framework, which will also prepare us for the usage of the web3.js framework that we will need when building our frontend.

Asynchronous I/O with Python part III – native coroutines and the event loop

In the previous post, we have seen how iterators and generators can be used in Python to implement coroutines. With this approach, a coroutine is simply a function that contains a yield statement somewhere. This is nice, but makes the code hard to read, as the function signature does not immediately give you a hint whether it is a generator function or not. Newer Python releases introduce a way to natively designate functions as asynchronous functions that behave similar to coroutines and can be waited for using the new async and await syntax.

Native coroutines in Python

We have seen that coroutines can be implemented in Python based on generators. A coroutine, then, is a generator function which runs until it is suspended using yield. At a later point in time, it can be resumed using send. If you know Javascript, this will sound familiar – in fact, with ES6, Javascript has introduced a new syntax to declare generator functions. However, most programmers will probably be more acquainted with the concept of asynchronous functions in Javascript and the corresponding await and async keywords.

Apparently partially motivated by this example and by the increasing popularity of asynchronous programming models, Python now has a similar concept that was added to the language with PEP-492 which introduces the same keywords into Python as well (as a side note: I find it interesting to see how these two languages have influenced each other over the last couple of years).

In this approach, a coroutine is a function marked with the async keyword. Similar to a generator-based coroutine which runs up to the next yield statement and then suspends, a native coroutine will run up to the next await statement and then suspend execution.

The argument to the await statement needs to be an awaitable object, i.e. one of the following three types:

  • another native coroutine
  • a wrapped generator-based coroutine
  • an object implementing the __await__ method

Let us look at each of these three options in a bit more detail.

Waiting for native coroutines

The easiest option is to use a native coroutine as target for the await statement. Similar to a yield from, this coroutine will then resume execution and run until it hits upon an await statement itself. An example for such a coroutine is asyncio.sleep(), which sleeps for the specified number of seconds. You can define your own native coroutine and await the sleep coroutine to suspend your coroutine until a certain time has passed.

async def coroutine():
    await asyncio.sleep(3)

Similar to yield from, this builds a chain of coroutines that hand over control to each other. A coroutine that has been “awaited” in this way can hand over execution to a second coroutine, which in turn waits for a third coroutine and so forth. Thus await statements in a typical asynchronous flow form a chain.

Now we have seen that a chain of yield from statements typically ends with a yield statement, returning a value or None. Based on that analogy, one might think that the end of a chain of await statements is an await statement with no argument. This, however, is not allowed and would also not appear to make sense, after all you wait “for something”. But if that does not work, where does the chain end?

Time to look at the source code of the sleep function that we have used in our example above. Here we need to distinguish two different cases. When the argument is zero, we immediately delegate to __sleep0, which is actually very short (we will look at the more general case later).

@types.coroutine
def __sleep0():
    yield

So this is a generator function as we have seen it in the last post, with an additional annotation, which turns it into a generator-based coroutine.
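
We can see this in action by driving a small chain of coroutines ending in asyncio.sleep(0) manually with send, without any event loop being involved (a minimal sketch).

import asyncio

async def inner():
    # the chain ends in asyncio.sleep(0), which boils down to a plain yield
    await asyncio.sleep(0)
    return 42

async def outer():
    # awaiting another native coroutine extends the chain
    return await inner()

coro = outer()
# the first send runs the chain up to the yield inside __sleep0 and suspends
coro.send(None)
try:
    # the second send resumes the chain; outer completes and returns 42
    coro.send(None)
except StopIteration as e:
    print("Result: %d" % e.value)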

Generator-based coroutines

PEP-492 emphasizes that native coroutines are different from generator-based coroutines, and also enforces this separation. It is, for instance, an error to execute a yield inside a native coroutine. However, there is some interoperability between these two worlds, provided by the decorator types.coroutine that we have seen in action above.

When we decorate a generator-based coroutine with this decorator, it becomes a native coroutine, which can be awaited. The behaviour is very similar to yield from, i.e. if a native coroutine A awaits a generator-based coroutine B and is run via send, then

  • if B yields a value, this value is directly returned to the caller of A.send() as the result of the send invocation
  • at this point, B suspends
  • if we call A.send again, this will resume B (!), and the yield inside B will evaluate to the argument of the send call
  • if B returns or raises a StopIteration, the return value respectively the value of the StopIteration will be visible inside A as the value of the await statement
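
To make these rules concrete, here is a minimal sketch that sends a value through a native coroutine into a generator-based coroutine.

import types

@types.coroutine
def legacy():
    # a generator-based coroutine - the yielded value goes straight to the caller of send
    received = yield "yielded by legacy"
    return "legacy returned: %s" % received

async def native():
    # the return value of legacy() becomes the value of the await expression
    result = await legacy()
    print(result)

coro = native()
print(coro.send(None))       # prints "yielded by legacy"
try:
    coro.send("hello")       # resumes legacy; native prints "legacy returned: hello"
except StopIteration:
    pass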

Thus in the example of asyncio.sleep(0), generator-based coroutines are the answer to our chicken-and-egg issue and provide the end point for the chain of await statements. If you go back to the code of sleep, however, and look at the more general case, you will find that this case is slightly more difficult, and we will only be able to understand it in the next post once we have discussed the event loop. What you can see, however, is that eventually, we wait for something called a future, so time to talk about this in a bit more detail.

Iterators as future-like objects

Going back to our list of things which can be waited for, we see that by now, we have touched on the first two – native coroutines and generator-based coroutines. A future (and the way it is implemented in Python) is a good example for the third case – objects that implement __await__.

Following the terminology used in PEP-492, any object that has an __await__ method is called a future-like object, and any such object can be the target of an await statement. Note that both a native coroutine as well as a generator-based coroutine have an __await__ method and are therefore future-like objects. The __await__ method is supposed to return an iterator, and when we wait for an object implementing __await__, this iterator will be run until it yields or returns.

To make this more tangible, let us see how we can use this mechanism to implement a simple future. Recall that a future is an object that is a placeholder for a value still to be determined by an asynchronous operation (if you have ever worked with Javascript, you might have heard of promises, which is a very similar concept). Suppose, for instance, we are building a HTTP library which has a method like fetch to asynchronously fetch some data from a server. This method should return immediately without blocking, even though the request is still ongoing. So it cannot yet return the result of the request. Instead, it can return a future. This future serves as a placeholder for the result which is still to come. A coroutine could use await to wait until the future is resolved, i.e. the result becomes available.

Of course we will not write an HTTP client today, but still, we can implement a simple future-like object which is initially pending and yields control if invoked. We can then set a value on this future (in reality, this would be done by a callback that triggers when the actual HTTP response arrives), and a waiting coroutine could then continue to run to retrieve the value. Here is the code.

class Future:

    def __await__(self):
        if not self._done:
            yield 
        else:
            return self._result

    def __init__(self):
        self._done = False

    def done(self, result):
        self._result = result
        self._done = True

When we initially create such an object, its status will be pending, i.e. the attribute _done will be set to false. Awaiting a future in that state will run the coroutine inside the __await__ method, which will immediately yield, so that control goes back to the caller. If now some other asynchronous task or callback calls done, the result is set and the status is updated. When the waiting coroutine is resumed and awaits the future again, it will receive the result.

To trigger this behaviour, we need to create an instance of our Future class and call await on it. Now using await is only possible from within a native coroutine, so let us write one.

async def waiting_coroutine(future):
    data = None
    while data is None:
        data = await future
    return data

Finally, we need to run the whole thing. As for generator-based coroutines, we can use send to advance the coroutine to the next suspension point. So we could do something like this.

future=Future()
coro = waiting_coroutine(future)
# Trigger a first iteration - this will suspend in await
assert(None == coro.send(None))
# Mark the future as done
future.done(25)
# Now the second iteration should complete the coroutine
try:
    coro.send(None)
except StopIteration as e:
    print("Got StopIteration with value %d" % e.value)

Let us see what is happening behind the scenes when this code runs. First, we create the future which will initially be pending. We then make a call to our waiting_coroutine. This will not yet start the process, but just build and return a native coroutine, which we store as coro.

Next, we call send on this coroutine. As for a generator-based coroutine, this will run the coroutine. We reach the point where our coroutine waits for the future. Here, control will be handed over to the coroutine declared in the __await__ method of the future, i.e. this coroutine will be created and run. As _done is not yet set, it will yield control, and our send statement returns with None as result.

Next, we change the state of the future and provide a value, i.e. we resolve the future. When we now call send once more, the coroutine is resumed. It picks up where it left off, i.e. in the loop, and calls await again on the future. This time, this returns a value (25). This value is returned, and thus the coroutine runs to completion. We therefore get a StopIteration which we catch and from which we can retrieve the value.

The event loop

So far, we have seen a few examples of coroutines, but always needed some synchronous code that uses send to advance the coroutine to the next yield. In a real application, we would probably have an entire collection of coroutines, representing various tasks that run asynchronously. We would then need a piece of logic that acts as a scheduler and periodically goes through all coroutines, calls send on them to advance them to the point at which they return control by yielding, and look at the result of the call to determine when the next attempt to resume the coroutine should be made.

To make this useful in a scenario where we wait for other asynchronous operations, like network traffic or other types of I/O operations, this scheduler would also need to check for pending I/O and to understand which coroutine is waiting for the result of a pending I/O operation. Again, if you know Javascript, this concept will sound familiar – this is more or less what the event loop built into every browser or the JS engine running in Node.js is doing. Python, however, does not come with a built-in event loop. Instead, you have to select one of the available libraries that implement such a loop, for instance the asyncio library which is distributed with CPython. Using this library, you define tasks which wrap native coroutines, schedule them for execution by the event loop and allow them to wait for e.g. the result of a network request represented by a future. In a nutshell, the asyncio event loop is doing exactly this

(Diagram: the scheduler, coroutines and futures)
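
As a small preview, here is a sketch that uses the high-level asyncio API, which we will discuss in more detail in the next post, to schedule two coroutines concurrently.

import asyncio

async def task(name, delay):
    # while one task is sleeping, the event loop runs the other one
    await asyncio.sleep(delay)
    print("Task %s done" % name)

async def main():
    # schedule both coroutines and wait until they have completed
    await asyncio.gather(task("A", 1), task("B", 1))

asyncio.run(main())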

In the next post, we will dig a bit deeper into the asyncio library and the implementation of the event loop.

Asynchronous I/O with Python part II – iterators and generators

As explained in my previous post, historically coroutines in Python have evolved from iterators and generators, and understanding generators is still vital to understanding native coroutines. In this post, we take a short tour through iterators in Python and how generators have traditionally been implemented.

Iterables and iterators

In Python (and in other programming languages), an iterator is an object that returns a sequence of values, one at a time. While in languages like Java, iterators are classes implementing a specific interface, Python iterators are simply classes that have a method __next__ which is supposed to either return the next element from the iterator or raise a StopIteration exception to signal that no further elements exist.

Iterators are typically not created explicitly, but are provided by factory classes called iterables. An iterable is simply a class with a method __iter__ which in turn returns an iterator. Behind the scenes, iterables and iterators are used when you run a for-loop in Python – Python will first invoke the __iter__ of the object to which you refer in the loop to get an iterator and then call the __next__ method of this iterator once for every iteration of the loop. The loop stops when a StopIteration is raised.

This might sound a bit confusing, so let us look at an example. Suppose you wanted to build an object which – like the range object – allows you to loop over all numbers from 0 to a certain limit. You would then first write a class that implements a method __next__ that returns the next value (so it has to remember the last returned value), and then implement an iterable returning an instance of this class.

class SampleIterator:

    def __init__(self, limit):
        self._position = 0
        self._limit = limit

    def __next__(self):
        if self._position < self._limit:
            self._position += 1
            return self._position - 1
        else:
            raise StopIteration

class SampleIterable:

    def __init__(self, limit):
        self._limit = limit

    def __iter__(self):
        return SampleIterator(self._limit)


myIterable = SampleIterable(10)
for i in myIterable:
    print("i = %d" % i)

Often, the same object will implement the __next__ method and the __iter__ method and therefore act as iterable and iterator at the same time.

Note that the iterator typically needs to maintain a state – it needs to remember the state after the last invocation of __next__ has completed. In our example, this is rather straightforward, but in more complex situations, programmatically managing this state can be tricky. With PEP-255, a new approach was introduced into Python which essentially allows a programmer to ask the Python interpreter to take over this state management – generators.

Generators in Python

The secret sauce behind generators in Python is the yield statement. This statement is a bit like return in that it returns a value and the flow of control to the caller, but with the important difference that the state of the currently executing function is saved by Python and the function can be resumed at a later point in time. A function that uses yield in this way is called a generator function.

Again, it is instructive to look at an example. The following code implements our simple loop using generators.

def my_generator(limit=5):
    _position = 0
    while _position < limit:
        yield _position 
        _position += 1

for i in my_generator(10):
    print("i = %d" % i)

We see that we define a new function my_generator which, at first glance, looks like an ordinary function. When we run this function for the first time, it will set a local variable holding its current position to zero. We then enter a loop to increase the position until we reach the limit. In each iteration, we then invoke yield to return the current position back to the caller.

In our main program, we first call my_generator() with an argument. As opposed to an ordinary function, this invocation does not execute the function. Instead, it evaluates the argument and builds and returns an object called a generator object. This object is an iterator, i.e. it has a __next__ method. When this method is called for the first time, the execution of our function body starts until it hits the first yield statement. At this point, the execution returns to the caller and whatever we yield is returned by the call to __next__. When now __next__ is invoked again, the Python interpreter will restore the current state of the function and resume its execution after the yield. We increase our internal position, enter the loop again, hit the next yield and so forth. This continues until the limit is reached. Then, the function returns, which is equivalent to raising a StopIteration and signals to the caller that the iterator is exhausted.

Instead of using the for loop, we can also go through the same steps manually to see how this works.

generator = my_generator(5)
while True:
    try:
        value = generator.__next__()
        print("Value: %d" % value)
    except StopIteration:
        break

This is already quite close to the programming model of a co-routine – we can start a coroutine, yield control back to the caller and resume execution at a later point in time. However, there are a few points that are still missing and that have been added to Python coroutines with additional PEPs.

Delegation to other coroutines

With PEP-380, the yield from statement was added to Python, which essentially allows a coroutine to delegate execution to another coroutine.

A yield from statement can delegate either to an ordinary iterable or to another generator.

What yield from is essentially doing is to retrieve an iterator from its argument and call the __next__ method of this iterator, thus – if the iterable is a generator – running the generator up to the next yield. Whatever this yield returns will then be yielded back to the caller of the generator containing the yield from statement.

When I looked at this first, I was initially under the impression that if a generator A delegates to generator B by doing yield from B, and B yields a value, control would go back to A, similar to a subroutine call. However, this is not the case. Instead of thinking of a yield from like a call, it is better to think of it like a jump. In fact, when B yields a value, this value will be returned directly to the caller of A. The yield from statement in A only returns when B either returns or raises a StopIteration (which is equivalent), and the return value of B will then be the value of the yield from statement. So you might think of the original caller and A as being connected through a pipe through which yielded values are sent back to the caller, and if A delegates to B, it also hands the end of the pipe over to B, where it remains until B returns (i.e. is exhausted in the sense of an iterator).
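
A short example makes this mechanism easier to follow (a minimal sketch).

def B():
    # whatever B yields goes directly to the caller of A, not to A itself
    x = yield "from B"
    return "B done: %s" % x

def A():
    # A suspends here until B is exhausted; B's return value becomes the result
    result = yield from B()
    yield result

a = A()
print(a.send(None))    # prints "from B" - yielded by B, passed through A
print(a.send("hi"))    # prints "B done: hi" - the value of the yield from in A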

(Diagram: delegation with yield from)

Passing values and exceptions into coroutines

We have seen that when a coroutine executes a yield, control goes back to the caller, i.e. to the code that triggered the coroutine using __next__, and when the coroutine is resumed, its execution continues at the first statement after the yield. Note that yield is a statement and takes an argument, so that the coroutine can hand data back to the caller, but not the other way round. With PEP-342, this was changed and yield became an expression, so that it actually returns a value. This allows the caller to pass a value back into the generator function. The method to do this is called send.

Doing a send is a bit like a __next__, with the difference that send takes an argument and this argument is delivered to the coroutine as result of the yield expression. When a coroutine runs for the first time, i.e. is not resumed at a yield, only send(None) is allowed, which, in general, is equivalent to __next__. Here is a version of our generator that uses this mechanism to be reset.

def my_generator(limit=5):
    _position = 0
    while _position < limit:
        cur = _position
        val = yield cur 
        if val is not None:
            # 
            # We have been resumed due to a send statement. 
            #
            _position = val
            yield val
        else:
            _position += 1

We can now retrieve a few values from the generator using __next__, then use send to set the position to a specific value and then continue to iterate through the generator.

generator = my_generator(20)
assert 0 == generator.__next__()
assert 1 == generator.__next__()
generator.send(7)
assert 7 == generator.__next__()

Instead of passing a value into a coroutine, we can also throw an exception into a coroutine. This is actually quite similar to the process of sending a value – if we send a value into a suspended coroutine, this value becomes visible inside the coroutine as the return value of the yield at which the coroutine is suspended, and if we throw an exception into it, the yield at which the coroutine is suspended will raise this exception. To throw an exception into a coroutine, use the throw method, like

generator = my_generator(20)
assert 0 == generator.__next__()
generator.throw(BaseException())

If you run this code and look at the resulting stack trace, you will see that in fact, the behavior is exactly as if the yield statement had raised the exception inside the coroutine.

The generator has a choice whether it wants to catch and handle the exception or not. If the generator handles the exception, processing continues as normal, and the value of the next yield will be returned as the result of throw(). If, however, the generator decides not to handle the exception or to raise another exception, this exception will be passed through and will show up in the calling code as if it had been raised by throw. So in general, both send and throw should be enclosed in a try-block as they might raise exceptions.
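
Here is a small sketch of a generator that handles a thrown exception and keeps running.

def resilient_generator():
    while True:
        try:
            yield "ok"
        except ValueError:
            # handle the exception - the value of this yield becomes the result of throw()
            yield "recovered"

g = resilient_generator()
print(g.__next__())              # prints "ok"
print(g.throw(ValueError()))     # prints "recovered"
print(g.__next__())              # prints "ok" again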

Speaking of exceptions, there are a few exceptions that are specific for generators. We have already seen the StopIteration exception which is thrown if an iterator or generator is exhausted. A similar exception is GeneratorExit which can be thrown into a generator to signal that the generator should complete. A generator function should re-raise this exception or raise a StopIteration so that its execution stops, and the caller needs to handle the exception. There is even a special method close that can be used to close a coroutine which essentially does exactly this – it throws a GeneratorExit into the coroutine and expects the generator to re-raise it or to replace it by a StopIteration exception which is then handled. If a generator is garbage-collected, the Python interpreter will execute this method.

This completes our discussion of the “old-style” coroutines in Python using generator functions and yielding. In the next post, we will move on to discuss the new syntax for native coroutines introduced with Python 3.5 in 2015.

Asynchronous I/O with Python part I – the basics

Though not really new, a programming model commonly known as asynchronous I/O has been attracting a lot of attention over the last couple of years and even influenced the development of languages like Java, Go or Kotlin. In this and the next few posts, we will take a closer look at this model and how it can be implemented using Python.

What is asynchronous I/O?

The basic ideas of asynchronous I/O are maybe explained best using an example from the world of networking, which is at the same time the area where the approach excels. Suppose you are building a REST gateway that accepts incoming connections and forwards them to a couple of microservices. When a new client connects, you will have to make a connection to a service, send a request, wait for the response and finally deliver the response back to the client.

Doing this, you will most likely have to wait at some points. If, for instance, you build a TCP connection to the target service, this involves a handshake during which you have to wait for network messages from the downstream server. Similarly, when you have established the connection and sent the request, it might take some time for the response to arrive. While this entire process is in progress, you will have to maintain some state, for instance the connection to the client which you need at the end to send the reply back.

If you do all this sequentially, your entire gateway will block while a request is being processed – not a good idea. The traditional way to deal with this problem has been to use threads. Every time a new request comes in, you spawn a thread. While you have to wait for the downstream server, this thread will block, and the scheduler (the OS scheduler if you use OS-level threads or some other mechanism) will suspend the thread, yield the CPU to some other thread and thus allow the gateway to serve other requests in the meantime. When the response from the downstream server arrives, the thread is woken up, and, having saved the state, the processing of the client’s request can be completed.

This approach works, but, depending on the implementation, creating and running threads can create significant overhead. In addition to keeping the state, concurrently managing a large number of threads typically involves a lot of scheduling, locking, handling of concurrent memory access and kernel calls. This is why you might try a different implementation that entirely uses user-space mechanisms.

You could, for instance, implement some user-space scheduler mechanism. When a connection is being made, you would read the incoming request, send a connection request (a TCP SYN) to the downstream server and then voluntarily return control to the scheduler. The scheduler would then monitor (maybe in a tight polling loop) all currently open network connections to downstream servers. Once the connection is made, it would execute a callback function which triggers the next steps of the processing and send a request to the downstream server. Then, control would be returned to the scheduler which would invoke another callback when the response arrives and so forth.

With this approach, you would still have to store some state, for instance the involved connections, but otherwise the processing would be based on a sequence of individual functions or methods tied together by a central scheduler and a series of callbacks. This is likely to be very efficient, as switching between “threads” only involves an ordinary function call which is much cheaper than a switch between two different threads. In addition, each “thread” would only return control to the scheduler voluntarily, implementing a form of cooperative multitasking, and can not be preempted at unexpected points. This of course makes synchronization much easier and avoids most if not all locking, which again removes some overhead. Thus such a model is likely to be fast and efficient.

On the downside, without support from the used programming language for such a model, you will easily end up with a complex set of small functions and callbacks, sometimes turning into a phenomenon known as callback hell. To avoid this, more and more programming languages offer a programming model which supports this approach with libraries and language primitives, and so does Python.

Coroutines and futures

The model which we have described is not exactly new and has been described many years ago. In this model, processing takes place in a set of coroutines. Coroutines are subroutines or functions which have the ability to deliberately suspend their own execution – a process known as yielding. This will save the current state of the coroutine and return control to some central scheduler. The scheduler can later resume the execution of the coroutine which will pick up the state and continue to run until it either completes or yields again (yes, this is cooperative multitasking, and this is where the name – cooperative routines – comes from).

Coroutines can also wait for the result of a computation which is not yet available. Such a result is encapsulated in an object called a future. If, for instance, a coroutine sends a query to a downstream server, it would send the HTTP request over the network, create a future representing the reply and then yield and wait for the completion of this future. Thus the scheduler would gain back control and could run other coroutines. At the same time, the scheduler would have to monitor open network connections, and, when the response arrives, complete the future, i.e. provide a value, and reschedule the corresponding coroutine.

(Diagram: the scheduler, coroutines and futures)

Finally, some additional features would be desirable. To support modularization, it would be nice if coroutines could somehow call each other, i.e. if a coroutine could delegate a part of its work to another coroutine and wait for its completion. We would probably also want to see some model of exception handling. If, for instance, a coroutine has made a request and the response signals an error, we would like to see a way for the coroutine to learn about this error by being woken up with an exception. And finally, being able to pass data into an already running coroutine could be beneficial. We will later see that the programming model that Python implements for coroutines supports all of these features.

Organisation of this series

Coroutines in Python have a long history – they started as support for iterators, evolved into what is today known as generator-based coroutines and finally turned into the native coroutines that Python supports today. In addition, the asyncio library provides a framework to schedule coroutines and integrate them with asynchronous I/O operations.

Even today, the implementation of coroutines in Python is still internally based on iterators and generators, and therefore it is still helpful to understand these concepts, even if we are mainly interested in the “modern” native coroutines. To reflect this, the remaining posts in this series will cover the following topics.

  • Iterators and generator-based coroutines
  • Native coroutines
  • The main building blocks of the low-level asyncio API – tasks, futures and the event loop
  • Asynchronous I/O and servers
  • Building an asynchronous HTTP server from scratch

To follow the programming examples, you will need a comparatively new version of Python, specifically you will need Python 3.7 or above. In case you have an older version, either get the latest version from the Python download page and build it from source, or (easier) try to get a more recent package for your OS (for Ubuntu, for instance, there is the deadsnake PPA that you can use for that purpose).

Learning Kafka with Python – retries and idempotent writes

In the past few posts, we have discussed approaches to implement at-least-once processing on the consumer side, i.e. mechanisms that make sure that every record in the partition is processed at least once. Today, we will look at a similar problem on the producer side – how can we make sure that every record is written into the partition only once? This sounds easy, but can be tricky if we need to retry failed messages without knowing the exact error that has occurred.

The retry problem

In the sample producer that we have looked at in a previous post, we missed an important point – error handling. The most important error that a reliable producer needs to handle is an error when handing over a new record to the broker.

In general, Kafka differentiates between retriable errors, i.e. transient errors like individual packets being lost on the network, and non-retriable errors, i.e. errors like an invalid authorization for which a retry does not make sense. For most transient errors, the client will – under the hood – automatically attempt a retry if a record could not be sent.

Let us take a short look at the Java producer as an example. When a batch of records has been sent to the broker as a ProduceRequest, the response is handled in the method handleProduceResponse. Here, a decision is made whether an automatic retry should be initiated, in which case the batch of records will simply be added to the queue of batches to be sent again. The logic to decide when a retry should be attempted is contained in the method canRetry, and in the absence of transactions (see the last section of this post), it will decide to retry if the batch has not yet timed out (i.e. was created less than delivery.timeout.ms ago), the error is retriable and the number of allowed retries (set via the parameter retries) has not yet been reached. Examples for retriable exceptions are exceptions due to a low number of in-sync replicas, timeouts, connection failures and so forth.

This is nice, but there is a significant problem when using automated retries. If, for instance, a produce request times out, it might very well be that this is only due to a network issue and in the background, the broker has actually stored the record in the partition log. If we retry, we will simply send the same batch of records again, which could lead to duplicate records in the partition log. As these records will have different offsets, there is no way for a consumer to detect this duplicate. Depending on the type of application, this can be a major issue.

If you wanted to solve this on the application level, you would probably set retries to zero, implement your own retry logic and use a sequence number to allow the consumer to detect duplicates. A similar logic, referred to as idempotent writes, has been added to Kafka with KIP-98 and was implemented in release 0.11.

What are idempotent writes?

Essentially, idempotent writes use a sequence number which is added to each record by the producer to allow the broker to detect duplicates due to automated retries. This sequence number is added to a record shortly before it is sent (more precisely, a batch of records receives a base sequence number, and the sequence number of a record is the base sequence number plus its index in the batch), and if an automated retry is made, the exact same batch with the same sequence number is sent again. The broker keeps track of the highest sequence number received, and will not store any records with a sequence number smaller than or equal to the currently highest processed sequence number.

To allow all followers to maintain this information as well, the sequence number is actually added to the partition log and therefore made available to all followers replicating the partitions, so that this data survives the election of a new partition leader.

In a bit more detail, the implementation is slightly more complicated than this. First, it would imply a high overhead to maintain a globally unique sequence number across all producers and partitions. Instead, the sequence number is maintained per producer and per partition. To make this work, producers will be assigned a unique ID called the producer ID. In fact, when a producer that uses idempotent writes starts, it will send an InitPidRequest to the broker. The broker will then assign a producer ID and return it in the response. The producer stores the producer ID in memory and adds it to all records being sent, so that the broker knows from which producer a record originates. Similar to the sequence number, this information is added to the records in the partition log. Note, however, that neither the producer ID nor the sequence number are passed to a consumer by the consumer API.

How does the broker determine the producer ID to be assigned? This depends on whether idempotent writes are used in combination with transactions. If transactions are used, we will learn in the next post that applications need to define an ID called the transaction ID that is supposed to uniquely identify a producer. In this case, the broker will assign a producer ID to each transaction ID, so that the producer ID is effectively persisted across restarts. If, however, idempotent writes are used stand-alone, the broker uses a ZooKeeper sequence to generate producer IDs, and if a producer is either restarted or (for instance due to some programming error) sends another InitPidRequest, it will receive a new producer ID. For each new partition assigned to a producer not using transactions, the sequence number will start again at zero, so that the sequence number is only unique per partition and producer ID (which is good enough for our purpose).

Another useful feature of idempotent writes is that a Kafka broker is now able to detect record batches arriving in the wrong order. In fact, if a record arrives whose sequence number is higher than the previously seen sequence number plus one, the broker assumes that records got lost in flight or we see an ordering issue due to a retry and raises an error. Thus ordering is now guaranteed even if we allow more than one in-flight batch.

Trying it out

Time again to try all this out. Unfortunately, the Kafka Python client that we have used so far does not (yet) support KIP-98. We could of course use a Java or Go client, but to stay true to the idea of this little series and use Python, let us instead employ the Python client provided by Confluent.

To install this client, use

pip3 install confluent-kafka==1.4.1

Here I am using version 1.4.1, which was the most recent version at the time when this post was written, so you might want to use the same version. Using the package is actually straightforward. Again, we first create a configuration, then a producer and then send records to the broker asynchronously; a short sketch follows the list below. Compared to the Kafka Python library used so far, there are a few differences which are worth noting.

  • Similar to the Kafka Python library, sends are done asynchronously. However, you do not receive a future when sending, as is the case with the Kafka Python library; instead, you pass a callback directly when producing a record
  • To make sure that the callback is invoked, you have to call the poll method of the producer on a regular basis
  • When you are done producing, you have to explicitly call flush to make sure that all buffered messages are sent
  • The configuration parameters of the client follow the Java naming conventions. So the bootstrap servers, for instance, are defined by a configuration parameter called bootstrap.servers instead of bootstrap_servers, and the parameter itself is not a Python list but a comma-separated list passed as a string
  • The base producer class accepts bytes as values and does not invoke a serializer (there is a derived class doing this, but this class is flagged as not yet stable in the API documentation so I decided not to use it)
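
To illustrate these points, here is a minimal sketch of what producing records with the Confluent client could look like. The broker address is a placeholder that you will have to adapt to your setup, and the actual test client in python/idempotent_writes.py might differ in the details.

import json
from confluent_kafka import Producer

# note the Java-style configuration key; several brokers would be passed
# as a single comma-separated string
config = {"bootstrap.servers": "broker1:9092"}
producer = Producer(config)

def delivery_callback(err, msg):
    # invoked from poll() or flush() once the broker has acknowledged the record
    if err is not None:
        print("Delivery failed: %s" % err)
    else:
        print("Delivered to %s [%d] at offset %d"
              % (msg.topic(), msg.partition(), msg.offset()))

for i in range(10):
    # the base producer class expects bytes, so we serialize ourselves
    value = json.dumps({"msg_count": i}).encode("utf-8")
    producer.produce("test", value=value, callback=delivery_callback)
    # serve delivery callbacks for records sent so far
    producer.poll(0)

# block until all buffered records have been sent and acknowledged
producer.flush()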

To turn on idempotent writes, there are a couple of parameters that need to be set in the producer configuration (a sample configuration follows the list below).

  • enable.idempotence needs to be 1 to turn on the feature
  • acks needs to be set to “all”, i.e. -1
  • max.in.flight should be set to one
  • retries needs to be positive (after all, idempotent writes are designed to make automated retries safe)
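
Putting this together, a producer configuration enabling idempotent writes could look like the following sketch. Again, the broker address is only a placeholder, and the exact value for retries is a reasonable choice rather than the only valid one.

from confluent_kafka import Producer

config = {
    "bootstrap.servers": "broker1:9092",  # placeholder, adjust to your setup
    "enable.idempotence": True,           # turn on idempotent writes
    "acks": "all",                        # equivalent to -1, wait for all in-sync replicas
    "max.in.flight": 1,                   # at most one in-flight batch
    "retries": 5,                         # automated retries are now safe
}
producer = Producer(config)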

Using these instructions, it is now straightforward to put together a little test client that uses idempotent writes to a “test” topic. To try this, bring up the Kafka cluster as in the previous posts, create a topic called “test” with three replicas, navigate to the root of the repository and run

python3 python/idempotent_writes.py

You should see a couple of messages showing the configuration used and indicating that ten records have been written. To verify that these records do actually contain a producer ID and a sequence number, we need to dump the log file on one of the brokers.

vagrant ssh broker1
/opt/kafka/kafka_2.13-2.4.1/bin/kafka-dump-log.sh \
  --print-data-log \
  --files /opt/kafka/logs/test-0/00000000000000000000.log

The output should look similar to the following sample output.

Dumping /opt/kafka/logs/test-0/00000000000000000000.log
Starting offset: 0
baseOffset: 0 lastOffset: 9 count: 10 baseSequence: 0 lastSequence: 9 producerId: 3001 producerEpoch: 0 partitionLeaderEpoch: 0 isTransactional: false isControl: false position: 0 CreateTime: 1589818655781 size: 291 magic: 2 compresscodec: NONE crc: 307611005 isvalid: true
| offset: 0 CreateTime: 1589818655780 keysize: -1 valuesize: 16 sequence: 0 headerKeys: [] payload: {"msg_count": 0}
| offset: 1 CreateTime: 1589818655780 keysize: -1 valuesize: 16 sequence: 1 headerKeys: [] payload: {"msg_count": 1}
| offset: 2 CreateTime: 1589818655780 keysize: -1 valuesize: 16 sequence: 2 headerKeys: [] payload: {"msg_count": 2}
| offset: 3 CreateTime: 1589818655780 keysize: -1 valuesize: 16 sequence: 3 headerKeys: [] payload: {"msg_count": 3}
| offset: 4 CreateTime: 1589818655780 keysize: -1 valuesize: 16 sequence: 4 headerKeys: [] payload: {"msg_count": 4}
| offset: 5 CreateTime: 1589818655780 keysize: -1 valuesize: 16 sequence: 5 headerKeys: [] payload: {"msg_count": 5}
| offset: 6 CreateTime: 1589818655780 keysize: -1 valuesize: 16 sequence: 6 headerKeys: [] payload: {"msg_count": 6}
| offset: 7 CreateTime: 1589818655780 keysize: -1 valuesize: 16 sequence: 7 headerKeys: [] payload: {"msg_count": 7}
| offset: 8 CreateTime: 1589818655780 keysize: -1 valuesize: 16 sequence: 8 headerKeys: [] payload: {"msg_count": 8}
| offset: 9 CreateTime: 1589818655781 keysize: -1 valuesize: 16 sequence: 9 headerKeys: [] payload: {"msg_count": 9}

Here, the third line contains the header of the entire record batch. We see that the batch contains ten records, and we find a producer ID (3001). In each of the records, we also see a sequence number, ranging from 0 to 9.

Transactions

When you read KIP-98, the Kafka improvement proposal with which idempotent writes were introduced, you realize that the main objective of this KIP is not just to provide idempotent writes, but to be able to handle transactions in Kafka. Here, handling transactions does not mean that Kafka somehow acts as a distributed transaction manager, joining transactions of a relational database. It does, however, mean that writes and reads in Kafka are transactional in the sense that a producer can write records within a transaction, and consumers will either see all of the records written as part of this transaction or none of them.

This makes it possible to model scenarios that occur quite often in business applications. Suppose, for instance, you are putting together an application handling security deposits. When you sell securities, you produce one record which will trigger the delivery of the securities to the buyer, and a second record that will trigger the payment that you receive for them. Now suppose that the first record is written, and then something goes wrong, so that the second record cannot be written. Without transactions, the first record would be in the log and consumers would pick it up, so that the security side of the transaction would still be processed. With transactions, you can abort the transaction, and the record triggering the security transfer will not become visible to consumers.
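
Even though we will not discuss transactions in detail here, the following sketch gives an idea of how such an atomic write of two records could look like with the Confluent client, which supports the transactional API as of version 1.4. The broker address, the topic names and the transactional.id are of course made up for the purpose of illustration.

from confluent_kafka import Producer, KafkaException

producer = Producer({
    "bootstrap.servers": "broker1:9092",        # placeholder, adjust to your setup
    "transactional.id": "deposits-producer-1",  # made-up ID identifying this producer
})

producer.init_transactions()
producer.begin_transaction()
try:
    # either both records become visible to consumers, or neither does
    producer.produce("security-transfers", value=b'{"quantity": 100}')
    producer.produce("payments", value=b'{"amount": 10000}')
    producer.commit_transaction()
except KafkaException:
    producer.abort_transaction()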

We will not go into details about transactions in this post, but KIP-98 is actually quite readable. I also recommend that you take a look at this well written blog post on the Confluent pages that provides some more background and additional links.

With that, it is time to close this short series on Kafka and Python. I hope I was able to give you a good introduction into the architecture and operations of a Kafka cluster and a good starting point for your own projects. Happy hacking!