Building an NFT wallet app – implementation

In the previous post in this series, I presented a simple architecture for an NFT wallet app written in ReactJS. Today, I will dive a bit deeper into the implementation. The source code for this post can be found here. As most of it is fairly standard ReactJS code, I will not discuss all of it, but only look at a few specific points.

I assume that you are familiar with React. In addition to React, we will also use the web3.js library to interact with our smart contract. The idea of this post is not to provide a systematic introduction to this library. In fact, this is not even necessary, as the documentation is quite good, and – if you have read my previous posts on the corresponding Python library web3.py – you will soon realize that the structure of the API is quite similar to the Python API that we have used before. There are, however, a few notable differences on which we will focus.

Handling the wallet

One part of web3.js which has no direct counterpart in web3.py is the wallet functionality, available under web3.eth.accounts.wallet. This wallet is essentially an in-memory list of accounts to which you can add accounts and from which you can remove accounts. In addition, the web3 library allows you to store accounts locally, more specifically in browser local storage, encrypted using a password that the caller has to provide.

The wallet component of our sample application loads a wallet from local storage into memory when the page is rendered for the first time (this is realized using a React hook). Once the wallet has been loaded, the list of accounts stored in the wallet is copied to the application state, i.e. the Redux store. When a user adds an account to the wallet or removes an account, the updated wallet is saved again and the wallet is re-synchronized with the list of accounts kept in the state store.

Getting the token image and token data

Another crucial difference between the Python and the JavaScript web3 API becomes apparent if we look at the code that the token details component uses to retrieve the token URI from the smart contract and load the token image. Here is a simplified version of the code.

nft.methods.tokenURI(tokenID).call().then((uri) => {
    axios.get(uri).then((response) => {
        setDescription(response.data.description);
        setImage(response.data.image);
        setName(response.data.name);
    }).catch((err) => {
        console.log(err);
    });
}).catch((err) => {
    console.log(err);
});

Let us try to understand this code. At the point in time when it executes, the variable nft holds a web3.eth.Contract object that we can use to interact with the smart contract. We then call the method tokenURI of the smart contract, passing, as an argument, the ID of the token. We know that this returns the URI of the token metadata. In the JavaScript code, however, the object that we get back is not yet the URI, but a promise. So internally, web3.js submits the request asynchronously, and the promise resolves when the reply is received.

Once this happens, we use Axios to submit an HTTP GET request for this URI. This again returns a promise, and once this promise resolves, we can extract the token description and the URL of the token image from the metadata returned as a response. When assembling our page, we can then embed the image itself using the image URL, so that the image is only loaded once the page is rendered.

Of course, both promises can reject with an error, so we need two catch blocks in which we log the respective error. Also note that this code is placed in a hook, and therefore the data is not yet available upon first rendering. We thus need to make sure that we update the state with the received data to trigger a re-rendering, and that the JSX code we use to build the page also works if the data we need, like the image URL, is not yet present.

Submitting and tracking transactions

The code that we use to actually submit a transaction, i.e. sell a token, has a very similar structure. Again, let us look at a simplified version of it.

nft.methods.safeTransferFrom(...).estimateGas(...).then((gas) => {
  nft.methods.safeTransferFrom(...).send(...)
    .on("transactionHash", (hash) => {
       // add transaction to transaction list
    })
    .on("receipt", (receipt) => {
       // update transaction status
    })
    .then(() => {
       // success, clear errors
    })
    .catch((err) => {
       // handle error during transaction
    });
}).catch((err) => {
   // handle error during gas estimation
});

Let us again go through this line by line to see how it works. First, we access the method safeTransferFrom which, as we know, is the recommended way to trigger a transfer. We could now immediately call the send method of the resulting method object, in order to send a transaction to the Ethereum node (by the way: we do not have to sign this transaction manually, as the account that we use is part of the wallet managed by web3.js). The problem with this approach, however, is that we need to provide a gas limit. Of course we could guess, but this would not be very efficient. Instead, we first run a gas estimate, which will result in a first call to the server.

In JavaScript, this call is handled via a corresponding promise. Once this promise resolves, we know that the gas estimation was successful and we also know the gas limit – time to call the send method to trigger the actual transaction. What this returns is what the web3.js documentation calls a PromiEvent. This is a promise with some mixed-in methods of an event emitter, which allows our code to react to events like the availability of a transaction hash or a transaction receipt in a promise-like notation. When we first receive the transaction hash, we add the transaction to the transaction list and force a re-rendering of the list. Similarly, when we receive the receipt, we update the status of the transaction to “mined”. At this point, the promise will resolve and the transaction is complete. Note that we do not wait for confirmations, i.e. we do not wait until, in addition to the block containing our transaction (for which we receive the receipt), a few more blocks have been mined so that we can be confident that the transaction is also part of the canonical chain.

Running the sample app

To try out the sample app, clone my repository, then switch to the frontend directory and use npm to install all required packages.

git clone https://github.com/christianb93/nft-bootcamp.git
cd nft-bootcamp/frontend
npm install

This might of course take a while, depending on the speed of your network connection, as it will download and install all required JavaScript packages and their dependencies. Once the last command completes, you can start the frontend by running

npm start

To use the frontend, you will of course need a running Ethereum node and a copy of the smart contract deployed. To this end, start a copy of geth by executing ./tools/run_geth from a second terminal (assuming that you have installed geth as described in one of my previous posts), open a third terminal, navigate to the root directory of the repository and run

python3 tools/GethSetup.py
python3 install/deployAndMintNFT.py

This should set up a few test accounts with sufficient balance, deploy a copy of the smart contract, print out the contract address and mint five tokens. Having this in place, go back to the welcome page of the wallet app, which should look as follows.

Pick a password, enter it in the input field right above the “Refresh” button and hit that button. Next you will have to set up a few test accounts. For starters, enter the private key 0xc65f2e9b1c360d44070ede41d5e999d30c23657e2c5889d3d03cef39289cea7c in the input field next to the button “Add account” and press that button. This should result in the account 0xFC2a2b9A68514E3315f0Bd2a29e900DC1a815a1D being added to the wallet. Mark this account as primary account by activating the radio button next to it. Then, enter the private key 0x2d7bdb58c65480ac5aee00b20d3558fb18a916810d298ed97174cc01bb809cdd to create a second account.

Next, we will add a token to the watch list. Click on “Token Watchlist”, enter the contract address that the deployment script printed together with token ID 1, and click on “Add Token”. Repeat this process for the remaining token IDs 2 to 5.

The setup is now complete, and you can start playing with the app. You can, for instance, hit any of the “Details” buttons to display an individual token or to sell a token (make sure to have selected the correct primary account in the wallet first, which needs to match the current owner to be able to sell). This should also populate the list of transactions, with the transactions showing up as mined immediately.

Have fun! There are probably still tons of bugs in the code, after all this is only a bit more than a quick hack to demonstrate how to use the involved libraries – still, if you find one and wish to contribute, I am happy to accept pull requests.

Implementing and testing an ERC721 contract

In the previous posts, we have discussed the ERC721 standard and how metadata and the actual asset behind a token are stored. With this, we have all the ingredients in place to tackle the actual implementation. Today, I will show you how an NFT contract can be implemented in Solidity and how to deploy and test a contract using Brownie. The code for this post can be found here.

Data structures

As for our sample implementation of an ERC20 contract, let us again start by discussing the data structures that we will need. First, we need a mapping from token ID to the current owner. In Solidity, this would look as follows.

mapping (uint256 => address) private _ownerOf;

Note that we declare this structure as private. This does not affect the functionality, but for a public data structure, Solidity would create a getter function, which blows up the contract size and thus makes deployment more expensive. So it is good practice to avoid public data structures unless you really need them.

Now mappings in Solidity have a few interesting properties. In contrast to programming languages like Java or Python, Solidity does not offer a way to enumerate all elements of a mapping – and even if it did, it would be dangerous to use this, as loops like this can increase the gas usage of your contract up to the point where the block limit is reached, rendering it unusable. Thus we cannot simply calculate the balance of an owner by going through all elements of the above mapping and filtering for a specific owner. Instead, we maintain a second data structure that only tracks balances.

mapping (address => uint256) private _balances;

Whenever we transfer a token, we also need to update this mapping to make sure that it is in sync with the first data structure.
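
To keep the two mappings in sync, the bookkeeping during a transfer boils down to a few lines. Here is a minimal sketch (the helper name _doTransfer is made up for illustration; the actual implementation also verifies that the sender is authorized, validates the token ID and clears any existing approval, and the Transfer event is the one mandated by ERC-721).

/// Minimal bookkeeping for a transfer - sketch only
function _doTransfer(address from, address to, uint256 tokenID) internal {
    _balances[from] -= 1;
    _balances[to] += 1;
    _ownerOf[tokenID] = to;
    emit Transfer(from, to, tokenID);
}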

We also need a few additional mappings to track approvals and operators. For approvals, we again need to know which address is an approved recipient for a specific token ID, thus we need a mapping from token ID to address. For operators, the situation is a bit more complicated. We set up an operator for a specific address (the address on behalf of which the operator can act), and there can be more than one operator for a given address. Thus, we need a mapping that assigns to each address another mapping which in turn maps addresses to boolean values, where True indicates that this address is an operator for the address in the first mapping.

/// Keep track of approvals per tokenID
mapping (uint256 => address) private _approvals; 

/// Keep track of operators
mapping (address => mapping(address => bool)) private _isOperatorFor;

Thus the sender of a message is an operator for an address owner if and only if _isOperatorFor[owner][msg.sender] is true, and the sender of a message is authorized to withdraw a token if and only if _approvals[tokenID] == msg.sender.

Burning and minting a token is now straightforward. To mint, we first check that the token ID does not yet exist. We then increase the balance of the contract owner by one and set the owner of the newly minted token to the contract owner, before finally emitting an event. To burn, we reverse this process – we set the current owner to the zero address and decrease the balance of the current owner. We also reset all approvals for this token. Note that in our implementation, the contract owner can burn all tokens, regardless of the current owner. This is useful for testing, but of course you would not want to do this in production – as a token owner, you would probably not be very amused to see the contract owner simply burn all your precious tokens. As an aside, if you really want to fire up your own token in production, you would probably want to take a look at one of the available audited and thoroughly tested sample implementations, for instance by the folks at OpenZeppelin.
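
In code, a stripped-down version of this logic could look as follows. This is only a sketch – the variable _contractOwner and the function names are assumptions, and the real implementation adds access control and further checks.

/// Mint a new token - sketch only, the actual contract restricts this to the contract owner
function _mint(uint256 tokenID) internal {
    require(_ownerOf[tokenID] == address(0), "Token ID already exists");
    _balances[_contractOwner] += 1;
    _ownerOf[tokenID] = _contractOwner;
    emit Transfer(address(0), _contractOwner, tokenID);
}

/// Burn a token - reset approvals, adjust the balance and assign the token to the zero address
function _burn(uint256 tokenID) internal {
    address currentOwner = _ownerOf[tokenID];
    _approvals[tokenID] = address(0);
    _balances[currentOwner] -= 1;
    _ownerOf[tokenID] = address(0);
    emit Transfer(currentOwner, address(0), tokenID);
}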

Modifiers

The methods to approve and make transfers are rather straightforward (with the exception of a safe transfer, which we will discuss separately in a second). If you look at the code, however, you will spot a Solidity feature that we have not used before – modifiers. Essentially, a modifier is what Java programmers might know as an aspect – a piece of code that wraps around a function and is invoked before and after the function in your contract. Specifically, if you define a modifier and add it to your function, the execution of the function will start off by running the modifier code until it reaches the special symbol _ in the modifier source code (which the compiler replaces with the body of the actual function). At this point, the code of the actual function is executed, and once the function completes, execution continues in the modifier again. Similar to aspects, modifiers are useful for validations that need to be done more than once. Here is an example.

/// Modifier to check that a token ID is valid
modifier isValidToken(uint256 _tokenID) {
    require(_ownerOf[_tokenID] != address(0), _invalidTokenID);
    _;
}

/// Actual function
function ownerOf(uint256 tokenID) external view isValidToken(tokenID) returns (address)  {
    return _ownerOf[tokenID];
}

Here, we declare a modifier isValidToken and add it to the function ownerOf. If now ownerOf is called, the code in isValidToken is run first and verifies the token ID. If the ID is valid, the actual function is executed, if not, we revert with an error.

Safe transfers and the code size

Another Solidity feature that we have not seen before is used in the function _isContract. This function is invoked when a safe transfer is requested. Recall from the standard that a safe transfer needs to check whether the recipient is a smart contract and, if yes, try to invoke its onERC721Received method. Unfortunately, Solidity does not offer an operation to figure out whether an address is the address of a smart contract. We therefore need to use inline assembly to directly run the EXTCODESIZE opcode. This opcode returns the size of the code stored at a given address. If this is different from zero, we know that the recipient is a smart contract.
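
A sketch of such a check could look as follows (within inline assembly, the opcode is available as extcodesize).

/// Check whether an address contains code - see the caveat below regarding
/// contracts that call us from within their constructor
function _isContract(address account) internal view returns (bool) {
    uint256 size;
    assembly {
        size := extcodesize(account)
    }
    return (size > 0);
}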

Note, however, that if the code size is zero, the recipient might in fact still be a contract. To see why, suppose that a contract calls our NFT contract from within its constructor. As the code is only copied to its final location after the constructor has executed, the code size is still zero at this point. In fact, there is no fully reliable way to figure out whether an address belongs to a smart contract in all cases, and even the ERC-721 specification itself states that the check for the onERC721Received method should only be done if the code size is different from zero, accepting this remaining uncertainty.

Inline assembly is fully documented here. The code inside the assembly block is actually what is known as Yul – an intermediate, low-level language used by Solidity. Within the assembly code, you can access local variables, and you can use most EVM opcodes directly. Yul also offers loops, switches and some other high-level constructs, but we do not need any of this in our simple example.

Once we have the code size and know that our recipient is a smart contract, we have to call its onERC721Received method. The easiest way to do this in Solidity is to use an interface. As in other programming languages, an interface simply declares the methods of a contract, without providing an implementation. Interfaces cannot be instantiated directly. Given an address, however, we can convert this address to an instance of an interface, as in our example.

interface ERC721TokenReceiver
{
  function onERC721Received(address, address, uint256, bytes calldata) external returns(bytes4);
}

/// Once we have this, we can access a contract with this interface at 
/// address to
ERC721TokenReceiver erc721Receiver = ERC721TokenReceiver(to);
bytes4 retval = erc721Receiver.onERC721Received(operator, from, tokenID, data);

Here, we have an address to and assume that at this address, a contract implementing our interface is residing. We then convert this address to an instance of a contract implementing this interface, and can then access its methods.

Note that this is a pure compile-time feature – this code will not actually create a contract at the address, but will simply assume that a contract with that interface is present at the target location. Of course, at compile time, we cannot know whether this is really the case. The compiler can, however, prepare a call with the correct function signature, and if this method is not implemented, we will most likely end up in the fallback function of the target contract. This is the reason why we also have to check the return value, as the fallback function might of course execute successfully even if the target contract does not implement onERC721Received.
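
According to the standard, the transfer should only be accepted if the returned value equals the magic value, i.e. the selector of onERC721Received, so the call above needs to be followed by a check along these lines (the error message is of course arbitrary).

require(
    retval == ERC721TokenReceiver.onERC721Received.selector,
    "Receiver did not return expected magic value"
);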

Implementing the token URI method

The last part of the code which is not fully straightforward is the generation of the token URI. Recall that this is in fact the location of the token metadata for a given token ID. Most NFT contracts that I have seen build this URI from a base URI followed by the token ID, and I have adopted this approach as well. The base URI is specified when we deploy the contract, i.e. as a constructor argument. However, converting the token ID into a string is a bit tricky, because Solidity again does not offer a standard way to do this. So you either have to roll your own conversion or use one of the existing implementations. I have used the code from this OpenZeppelin library to do the conversion. The code is not difficult to read – we first determine the number of digits of our number by dividing by ten until the result is less than one (and hence zero – recall that we are dealing with integers) and then convert the digits one by one.
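
For reference, here is what such a conversion could look like, together with a tokenURI method that prepends the base URI. This is a sketch along the lines of the OpenZeppelin code mentioned above – the variable name _baseURI is an assumption, and the actual contract also validates the token ID first.

/// Convert a number into its decimal string representation
function _toString(uint256 value) internal pure returns (string memory) {
    if (value == 0) {
        return "0";
    }
    // First determine the number of digits
    uint256 digits = 0;
    uint256 temp = value;
    while (temp != 0) {
        digits++;
        temp /= 10;
    }
    // Then fill a buffer with the individual digits, starting with the last one
    bytes memory buffer = new bytes(digits);
    while (value != 0) {
        digits -= 1;
        buffer[digits] = bytes1(uint8(48 + (value % 10)));
        value /= 10;
    }
    return string(buffer);
}

/// Build the token URI as base URI plus token ID
function tokenURI(uint256 tokenID) external view returns (string memory) {
    return string(abi.encodePacked(_baseURI, _toString(tokenID)));
}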

Interfaces and the ERC165 standard

Our smart contract implements a couple of different interfaces – ERC-721 itself and the metadata extension. As mentioned above, interfaces are a compile-time feature. To improve type-safety at runtime, it would be nice to have a feature that allows a contract to figure out whether another contract implements a given interface. To solve this, EIP-165 has been introduced. This standard does two things.

First, it defines how a hash value can be assigned to an interface. The hash value of an interface is obtained by taking the 4-byte function selectors of each method that the interface implements and then XOR’ing these bytes. The result is a sequence of four bytes.

Second, it defines a method that each contract should implement that can be used to inquire whether a contract implements an interface. This method, supportsInterface, accepts the four-byte hash value of the requested interface as an argument and is supposed to return true if the interface is supported.

This can be used by a contract to check whether another contract implements a given interface. The ERC-721 standard actually mandates that a contract that implements the specification should also implement EIP-165. Our contract does this as well (a simplified sketch follows the list below), and its supportsInterface method returns true if the requested interface ID is

  • 0x01ffc9a7, which corresponds to ERC-165 itself
  • 0x80ac58cd which is the hash value corresponding to ERC-721
  • 0x5b5e139f which is the hash value corresponding to the metadata extension
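
A simplified version of such a method could therefore look as follows (the actual contract might structure this slightly differently).

function supportsInterface(bytes4 interfaceID) external pure returns (bool) {
    return (
        interfaceID == 0x01ffc9a7     // ERC-165 itself
        || interfaceID == 0x80ac58cd  // ERC-721
        || interfaceID == 0x5b5e139f  // ERC-721 metadata extension
    );
}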

Testing, deploying and running our contract

Let us now discuss how we can test, deploy and run our contract. First, there is of course unit testing. If you have read my post on Brownie, the unit tests will not be much of a surprise. There are only two remarks that might be in order.

First, when writing unit tests with Brownie and using fixtures to deploy the required smart contracts, we have a choice between two different approaches. One approach would be to declare the fixtures as function-scoped, so that they are run over and over again for each test case. This has the advantage that we start with a fresh copy of the contract for each test case, but it is of course slow – if you run 30 unit tests, you conduct 30 deployments. Alternatively, we can declare the fixture as session-scoped. It will then be executed only once per test session, so that every test case uses the same instance of the contract under test. If you do this, be careful to clean up after each test case. A disadvantage of this approach remains, though – if the execution of one test case fails, all test cases run after the failing test case will most likely fail as well, because the cleanup is skipped for the failed test case. Be aware of this and do not panic if all of a sudden almost all of your test cases fail (the -x switch to Brownie can be your friend if this happens, so that Brownie exits as soon as the first test case fails).

A second remark concerns mocks. To test a safe transfer, we need a target contract with predictable behavior. This contract should implement the onERC721Received method, be able to return either the correct or an incorrect magic value, and allow us to check whether it has been called. For this purpose, I have included a mock, which is also deployed via a fixture.
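
To give you an idea of what such a mock does, here is a hypothetical version – the names and details of the actual mock in the repository will differ.

/// A mock implementing the ERC-721 receiver interface, for use in unit tests only
contract ERC721ReceiverMock {

    /// Set to true once onERC721Received has been called
    bool public invoked = false;

    /// Controls whether we return the correct magic value
    bool private _returnCorrectValue = true;

    function setReturnCorrectValue(bool value) external {
        _returnCorrectValue = value;
    }

    function onERC721Received(address, address, uint256, bytes calldata) external returns (bytes4) {
        invoked = true;
        if (_returnCorrectValue) {
            return this.onERC721Received.selector;
        }
        return 0xffffffff;
    }
}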

To run the unit tests that I have provided, simply clone my repository, make sure you are located in the root of the repository and run the tests via Brownie.

git clone https://github.com/christianb93/nft-bootcamp.git
cd nft-bootcamp
brownie test tests/test_NFT.py

Do not forget to first activate your Python virtual environment if you have installed Brownie or any of the libraries that it requires in a virtual environment.

Once the unit tests pass, we can start the Brownie console which will, as we know, automatically compile all contracts in the contract directory. To deploy the contract, run the following commands from the Brownie console.

owner = accounts[0]
# Deploy - the constructor argument is the base URI
nft = owner.deploy(NFT, "http://localhost:8080/")

Let us now run a few tests. We will mint a token with ID 1, pick a new account, transfer the token to this account, verify that the transfer works and finally get the token URI.

alice = accounts[1]
nft._mint(1)
assert(owner == nft.ownerOf(1))
nft.transferFrom(owner, alice, 1)
assert(alice == nft.ownerOf(1))
nft.tokenURI(1)

I invite you to play around a bit with the various functions that the NFT contract offers – declare an operator, approve a transfer, or maybe test some validations. In the next few posts, we will start to work towards a more convenient way to play with our NFT – a frontend written using React and web3.js. Before we are able to work on this, however, it is helpful to expand our development environment a bit by installing a copy of geth, and this is what the next post will be about. Hope to see you there.

Understanding the Ethereum virtual machine – part III

Having found our way through the mechanics of the Ethereum virtual machine in the last post, we are now in a position to better understand what exactly goes on when a smart contract hands over control to another smart contract or transfers Ether.

Calls and static calls

Towards the end of the previous post in this series, we have already taken a glimpse at how an ordinary CALL opcode is being processed. As a reminder, here is the diagram that displays how the EVM and the EVM interpreter interact to run a smart contract.

In our case – the processing of a CALL – this specifically implies that the following steps will be carried out (we ignore gas processing for the time being, as this is a bit more complicated and will be discussed in depth in a separate section).

  • the interpreter hits upon the CALL opcode
  • it performs a lookup in the jump table and determines that the function opCall needs to be run
  • it gets the parameters from the stack, in particular the address of the contract to be called (the second stack item)
  • it then extracts the input data from the memory of the currently executing code
  • we then invoke the Call method of the EVM, using the contract address and the input data as arguments
  • as we have learned, this will result in a new execution context (i.e. a new Contract object, a new stack and a freshly initialized memory) in which the code of the target contract will be executed
  • at the end, we get the returned data (an array of bytes) back
  • if everything went fine, we copy the returned data back into the memory of the currently executing contract

It is important to understand that this comes with a full context switch – the target contract will execute with its own stack and memory, state changes made by the target contract will refer to the state of the target contract, and Ether transferred with the call is credited to the target contract.

Also note that there are actually two ways how the result of the call is made available to the caller. First, the result of the call (a pointer to a byte array) will be copied to the memory of the calling contract. In addition, the return value is also returned by opCall, and there it is copied once more, this time to a special buffer called the return data buffer. The caller can copy the data stored in this buffer and determine its length using the RETURNDATACOPY and RETURNDATASIZE opcodes introduced with EIP-211 (in order to make it easier to pass back return data whose length is not yet known when the call is made).

In summary, the called contract is executed essentially as if it were the initial contract execution of the underlying transaction. Calls can of course be nested, so we now see that a transaction should be considered as the top-level call, which can be followed by a number of nested calls (actually, this number is limited, for instance by the limited depth of the call stack).

Of course, executing an unknown contract can be a significant security risk. We have seen an example in our post on smart contract security, where a malicious contract calling back into your own contract can cause double spending. Therefore, it is natural to somehow try to restrict what a called contract can do. One of the first restrictions of this type was the introduction of the STATICCALL opcode with EIP-214. A static call is very much like an ordinary call, except that the called contract is not allowed to make any state changes; in particular, no value transfer is possible as part of a static call.
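
As an aside, this is also what you see on the Solidity level – since Solidity 0.5, calling a view or pure function of another contract compiles to a STATICCALL, and there is a low-level variant as well. Here is a sketch (the target address and the function signature are made up for illustration).

/// Perform a low-level static call - the call reverts inside the callee if the
/// called code attempts to change the state
(bool ok, bytes memory data) = target.staticcall(
    abi.encodeWithSignature("balanceOf(address)", account)
);
require(ok, "Static call failed");
uint256 balance = abi.decode(data, (uint256));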

The function opStaticCall realizing this is actually very similar to the processing of an ordinary call. There are two essential differences. First, there is no value and therefore one parameter less that needs to be taken from the stack. Second, the method of the EVM that is eventually invoked is not Call but StaticCall. The structure of this function is very similar to that of an ordinary call, so let us focus on the differences. Here is a short snippet (leaving out some parts to focus on the differences) of the Call method.

evm.Context.Transfer(evm.StateDB, caller.Address(), addr, value)
code := evm.StateDB.GetCode(addr)
contract := NewContract(caller, AccountRef(addrCopy), value, gas)
contract.SetCallCode(&addrCopy, evm.StateDB.GetCodeHash(addrCopy), code)
ret, err = evm.interpreter.Run(contract, input, false)

And here is the corresponding code for a static call (again, I have made a few changes to better highlight the differences).

addrCopy := addr
code := evm.StateDB.GetCode(addr)
contract := NewContract(caller, AccountRef(addrCopy), new(big.Int), gas)
contract.SetCallCode(&addrCopy, evm.StateDB.GetCodeHash(addrCopy), code)
ret, err = evm.interpreter.Run(contract, input, true)

So we see that there are three essential differences. First, in a static call, there is no value transfer – this is as expected, as a static call is not allowed to make a value transfer, which would represent a change to the state. Second, when we build the contract, the third parameter is zero – again, this is related to the fact that there is no value transfer, as this parameter determines the value that, for instance, the opcode CALLVALUE returns. Finally, we set the third parameter of the Run function to true. In our discussion of the Run method in the previous post, we have already seen that this disallows all instructions which are marked as state changing.

Delegation and the proxy pattern

Apart from calls and static calls, there is a third way to invoke another contract, namely a delegate call. Roughly speaking, a delegate call implies that instead of executing the code of the called contract within the context of the called contract, we execute the code within the context of the caller. Thus, we essentially run the code of the called contract as if it were part of the caller's code, much like a library (and in fact, this is how Solidity invokes deployed libraries at runtime – only libraries consisting solely of internal functions are simply inlined into the contract at build time).

In the EVM, a delegate call is done using the opcode DELEGATECALL (well, that did probably not come as a real surprise). Similar to a static call, there is no value transfer for this call and correspondingly no value parameter on the stack. Going through the same analysis as for a static call, we find that execution of the opcode delegates to the method DelegateCall() of the EVM. Let us again look at the parts of the code that differ from an ordinary call.

addrCopy := addr
code := evm.StateDB.GetCode(addr)
contract := NewContract(caller, AccountRef(caller.Address()), nil, gas).AsDelegate()
contract.SetCallCode(&addrCopy, evm.StateDB.GetCodeHash(addrCopy), code) 
ret, err = evm.interpreter.Run(contract, input, false)

Looking at this, we spot three differences compared to an ordinary call. First, the second parameter used for the creation of the new contract (which is the parameter that will determine the self field of the new contract and with that the address used to read and change state during contract execution) is not set to the target contract, but to the address of the caller, i.e. the currently executing contract, while the address used to determine the code to be run is still that of the target contract. Thus, as promised, we execute the code of the target contract within the context of the currently executing contract.

A second difference is the third argument used for contract creation, which is the value transferred with this call. Again, this is zero (even nil). Finally, after creating the contract, we execute its AsDelegate() method. This changes the attributes CallerAddress and value of the contract to those of the currently executing contract. Thus, whenever we execute the opcodes CALLVALUE or CALLER, we get the same values as in the context of the currently executing contract, as promised by EIP-7, the EIP which introduced delegate calls.

One of the motivations behind introducing this possibility was that it allows for a pattern known as proxy pattern. In this pattern, there are two contracts involved. First, there is the proxy contract. The proxy contract accepts a call or transaction and is responsible for holding the state. It does, however, not contain any non-trivial logic. Instead, it uses a delegate call to invoke the logic residing in a second contract, the logic contract.

Why would you want to do this? There are, in fact, a couple of interesting use cases for this pattern. First, it allows you to build an upgradeable contract. Recall that – at least until the CREATE2 opcode was introduced – it was not possible to change a smart contract after it has been deployed. Even though this is of course by intention and increases trust in a smart contract (it will be the same, no matter when you interact with it), it also implies a couple of challenges, most notably that it makes it impossible to add features to a smart contract over time or to fix a security issue. The proxy pattern, however, does allow you to do this. You could, for instance, store the address of the logic contract in the proxy contract instead of hard-coding it, and then add a method to the proxy that allows you to change that address. You can then deploy a new version of the logic to a new address and update the address stored in the proxy contract to migrate from the old version to the new version. As the state is part of the proxy contract, which stays at its current location, the state will be untouched, and as the address that the users interact with does not change, the users might not even notice the change. Needless to say, this is very useful in some cases, but it can also be abused by tricking a user into trusting a contract and then changing its functionality, so be careful when interacting with a smart contract that performs delegation.
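
To make this a bit more tangible, here is a hypothetical, stripped-down proxy. This is for illustration only and not production-ready – in particular, it stores the implementation address in an ordinary storage slot, whereas a real implementation would follow one of the standards discussed below (like EIP-1967) to avoid storage clashes with the logic contract.

contract Proxy {

    /// Address of the logic contract and of the admin who may upgrade it
    address public implementation;
    address public admin;

    constructor(address _implementation) {
        implementation = _implementation;
        admin = msg.sender;
    }

    /// Point the proxy to a new version of the logic
    function upgradeTo(address _implementation) external {
        require(msg.sender == admin, "Only the admin may upgrade");
        implementation = _implementation;
    }

    /// Forward all other calls to the logic contract via DELEGATECALL
    fallback() external payable {
        address impl = implementation;
        assembly {
            calldatacopy(0, 0, calldatasize())
            let result := delegatecall(gas(), impl, 0, calldatasize(), 0, 0)
            returndatacopy(0, 0, returndatasize())
            switch result
            case 0 { revert(0, returndatasize()) }
            default { return(0, returndatasize()) }
        }
    }
}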

A second use case is related to re-use. As an example, suppose you have developed a smart contract that implements some useful wallet-like functionality, maybe time-triggered transfers. You want to make this available to others. Now you could of course allow anybody to deploy your smart contract, but this would lead to many addresses on the blockchain containing exactly the same code. Alternatively, you could store the logic in one logic contract and then only distribute the code for the proxy. A new user would then simply deploy a proxy, so each proxy would act as a wallet with an individual state and balance, but all of them would run the same logic. Again, it goes without saying that this implies that your users trust you and your contract – if, for instance, your logic contract is able to remove itself (“self-destruct” using the corresponding opcode), then this would of course render all deployed proxies useless, and the balance stored in them would be lost forever.

Finally (and this apparently was one of the motivations behind EIP-7), you could have a large contract whose deployment consumes more gas than the gas limit of a block allows. You could then split the logic into several smaller logic contracts and use a proxy to tie them together into a common interface.

There are several ongoing attempts to standardize this pattern and in particular upgradeable contracts. EIP-897, for instance, proposes a standard to expose the address to which a proxy is pointing. EIP-1967 addresses an interesting problem that the pattern has – the logic contract and the proxy contract share a common state, and thus the proxy contract needs to find a way to store the address of the logic contract without conflicting with the storage layout of the logic contract. Finally, EIP-1822 proposes a standard for upgradeable contracts. It is instructive to read through these EIPs and I highly advise you to do so and also have a look at the implementations described or linked in them.

Gas handling during a call

Let us now turn to gas handling during a call. We have already seen that, as for every instruction, there is a constant gas cost and a dynamic gas cost. In addition, there are two special contributions which are not present for other instructions – a refund and a stipend.

The constant gas cost is easy – it is simply a constant value of (currently) 700 units of gas, increased from previously 40 with EIP-150. The dynamic gas cost is already a bit more complicated and itself consists of four components. The first three components are rather straightforward:

  • first, there is a fee of 9000 units of gas when a non-zero value is transferred as part of the call
  • second, there is an account creation fee of 25000 whenever a non-zero value is transferred to a non-existing account as part of the call
  • third, there is the usual gas fee for memory expansion, as for many other instructions

The fourth contribution to the dynamic gas cost is a bit more tricky. The problem we are facing at this point is that the called contract will of course consume gas as well, but at this point in time, we do not know how much this is going to be. To solve this, a quantity called the gas cap is used. Initially, this gas cap was simply the first stack item, i.e. the first argument to the CALL instruction, which specifies the gas limit for the contract to be executed, i.e. the part of our remaining gas that we want to pass on to the called contract. We could now simply use this number as additional gas cost and then, once the called contract returns, see how much of that is still unused and refund that amount.

This is indeed how the gas payment for a call worked before EIP-150 was introduced. This EIP was drafted to address denial-of-service attacks that exploited the fact that the cost of some instructions, among them making a call, no longer reflected the actual computational cost on the client. As a counter-measure, the cost for a call was increased from previously 40 to the still valid 700. This, however, caused problems with existing contracts that tried to calculate the amount of gas they would make available to the called contract by taking the currently remaining gas (inquired via the GAS opcode) and subtracting the constant fee of 40 units of gas. To avoid this, the developers came up with a mechanism that allows a contract to make “almost all” remaining gas available to the callee, without having to hard-code gas fees. More precisely, “almost all” means that the following algorithm is applied to calculate the gas cap.

  • Determine the gas which is currently still available, after having deducted the constant gas cost already
  • Determine the base fee, i.e. the dynamic gas cost for the call calculated so far (memory fee, transfer fee and creation fee)
  • Subtract this from the remaining gas to determine the gas which will still be available after paying for all other gas cost contributions (“projected available gas”)
  • Read out the first value from the stack (the first parameter of the CALL instruction), i.e. the requested gas limit
  • Determine a gas cap as 63 / 64 times the projected available gas
  • If the requested gas limit is higher than the gas cap, return the gas cap, otherwise return the requested gas limit

Thus a contract can effectively pass almost all of the remaining gas to the callee by providing a very large requested gas limit as first argument to the CALL instruction, so that the requested gas limit is definitely smaller than the calculated cap. The factor of 63 / 64 has been put in as an additional protection against recursive calls. The outcome of this algorithm is then used for two purposes – as an upfront payment to cover the maximum amount of gas that the callee might need, and as the gas supply that the callee actually obtains for its execution.
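
Expressed as code, the cap calculation boils down to something like the following function – this is just an illustration of the rule, not the actual geth code.

/// Illustration of the EIP-150 "all but one 64th" rule
function callGasCap(uint256 projectedAvailableGas, uint256 requestedGasLimit) internal pure returns (uint256) {
    uint256 cap = projectedAvailableGas - projectedAvailableGas / 64;
    return requestedGasLimit > cap ? cap : requestedGasLimit;
}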

Now, I have been cheating a bit, as there are two components in the diagram above that we have not yet discussed. First, I have just told you that the outcome of the EIP-150 algorithm is passed as available gas to the callee. This, however, is only true if the call does not transfer any Ether. If it does, there is an additional stipend of 2300 units of gas which is added to the gas made available to the callee before actually making the call. Note that this stipend does not count against the gas cost of the call, as it is not part of the dynamic gas cost, so it effectively has two implications – it reduces the cost of the call by 2300 units of gas and, at the same time, it makes sure that even if the caller specified zero as the gas limit for the call, the callee has at least 2300 units of gas available. The motivation for this is that a call with a non-zero value typically triggers the receive function or fallback function of the called contract, and a call with a gas supply of zero would let this function fail. Thus the gas stipend serves as a safeguard to reduce the risk of a value transfer failing because the recipient is a smart contract and its receive or fallback function runs out of gas.

Finally, there is the refund, which happens here and simply amounts to adding the gas that the callee has not consumed to the available gas of the current execution context again.

The gas stipend and transfers in Solidity

The gas stipend is one of the less well documented features of smart contracts, and part of the confusion that I have seen around this topic (which, in fact, was the main motivation for the slightly more elaborate treatment in this post) comes from the fact that a gas stipend exists in the EVM as well as in Solidity.

As explained above, the EVM adds the gas stipend depending on the value transferred with the call – in fact, the stipend only applies to calls with a non-zero value. In addition to this, Solidity applies the same logic, but only if the value is zero. To see this, you might want to use a simple contract like this one.

contract Transfer {

    uint256 value;

    function doIt() public {
        payable(msg.sender).transfer(value);
    }
}

If you compile this, for instance in Remix, and take a look at the generated bytecode, you will see that eventually, the transfer translates into a CALL instruction. The preparation of the stack preceding this instruction is a bit involved, but if you go through this carefully and wait until the dust has settled, you will find that the top of the stack looks as follows.

(value == 0) * 2300 | sender | value |

Thus the first value, which specifies the gas to be made available to the called contract, is 2300 (the gas stipend) if the value is zero, and zero otherwise. In the first case, the EVM will not add anything, in the second case, the EVM will add its own gas stipend. Thus, regardless of the value, the net effect will be that the gas stipend of 2300 units of gas always applies for a transfer. You might also want to look at this snippet in the Solidity source code that creates the corresponding code (at least if I interpret the code correctly).

What this analysis tells us as well is that there is no way to instruct the compiler to increase the gas limit of the transfer. As the 2300 units of gas will only be sufficient for very simple functions, you need a different approach when invoking contracts with a more complex receive function. When we discuss NFTs in a later post in this series, we will see how you can use interfaces in Solidity to easily call functions of a target contract. Alternatively, to simply invoke the fallback function or the receive function with a higher gas limit, you can use a low-level call. To see this in action, change the transfer in the above sample code to

(bool success, ) = 
     payable(msg.sender).call{value: value}("");

When you now compile again, take a look at the resulting bytecode and locate the CALL instruction, you will see that immediately before we do the CALL, we execute the GAS opcode. As we know, this pushes the remaining available gas onto the stack. Thus the first argument to the CALL is the remaining gas. As, by the EIP-150 algorithm above, this is in every case more than the calculated cap, the result is that the cap will be used, i.e. almost all remaining gas will be made available to the called contract. Be sure, however, to check the return value and handle any errors that might have occurred in the called contract, as Solidity does not add extra code to make sure that we revert upon errors. Note that there is an ongoing discussion to extend the functionality of transfer in Solidity to allow a transfer to explicitly pass on all the remaining gas, see this thread.
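
In other words, the pattern should rather look like this (the error message is arbitrary, and you might prefer a different way to handle a failure).

(bool success, ) = payable(msg.sender).call{value: value}("");
require(success, "Transfer failed");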

With this, we have reached the end of our post for today. In this and the previous two posts, we have taken a deep-dive into how the Ethereum virtual machine actually works, guided by the yellow paper and the code of the Go-Ethereum client. In the next post, we will move on and start to explore one of the currently “hottest” applications of smart contract – non-fungible token. Hope to see you soon!

Understanding the Ethereum virtual machine – part II

In today's post, we will complete our understanding of how the EVM executes a smart contract. We will investigate the actual interpreter loop, discuss gas handling and have a short look at pre-compiled contracts.

The jump table and the main loop

In the last post, we have seen that the entry point to the actual code execution is the Run method of the EVM interpreter. This method essentially goes through the bytecode step by step and, for each opcode, looks up the opcode in a data structure known as the jump table. Among other things, this table contains a reference to a Go function that is to be executed to process the instruction. More specifically, an entry in the jump table contains the following fields, which partially refer to other tables in other source code files.

  • First, there is a Go function which is invoked to process the operation
  • Next, there is a gas value which is known as the constant gas cost of the operation. The idea behind this is that the gas cost for the execution of an instruction typically has two parts – a static part which is independent of the parameters and a dynamic part which depends on parameters like the memory consumption or other parameters. This field represents the static part
  • The third field is again a function that can be used to derive the dynamic part of the gas cost
  • The fourth field – minStack – is the number of stack items that this operation expects
  • The next field – maxStack – is the maximum size of the stack that will still allow this operation to work without overflowing the stack. For most operations, this is simply the maximum stack size minus the number of items that the operation pops from the stack plus the number of items that it adds to the stack
  • The next field, memorySize, specifies how much memory the opcode needs to execute. Again, this is a function, as the result could depend on parameters
  • The remaining fields are a couple of flags that describe the type of operation. The flag halts is set if the operation ends the execution of the code. At the time of writing, this is set for the opcodes STOP, RETURN and SELFDESTRUCT.
  • Similarly, the reverts flag indicates whether this opcode explicitly reverts the execution and is currently only set for the REVERT opcode itself
  • The return flag indicates whether this opcode returns any data. This is the case for the call operations STATICCALL, DELEGATECALL, CALL, and CALLCODE, but also for REVERT and contract creation via CREATE and CREATE2
  • The writes flag indicates whether the operation modifies the state and is set for operations like SSTORE
  • Finally, the jumps flag indicates whether the operation is a jump instruction and therefore modifies the program counter

Another data structure that will be important for the execution of the code is a set of fields known as the call context. This refers to a set of variables that make up the current state of the interpreter and that are reset every time a new execution starts – the memory, the stack and the contract object.

Let us now go through the Run method step by step and try to understand what it does. First, it increments the call stack depth which will be decremented again at the end of the function. It also sets the read only flag of the interpreter if not yet done and resets the return data field. Next, we initialize the call context and set the program counter to zero before we eventually enter a loop called the main loop.

Within this loop, we first check every 1000th step whether the abort flag is set. If yes, we stop execution (my understanding is that this feature is primarily used to cancel running EVM operations that were started as part of an API call). Next, we use the current value of the program counter to read the next opcode that we need to process, and look up that operation in the jump table (raising an error if there is no entry, which indicates an invalid opcode).

Once we have the jump table entry in our hands, we can now check the current stack size against the minimum and maximum stack size of the instruction and make sure that we raise an error if we try to process an operation in read-only mode that potentially updates the state.

We then calculate the gas required to perform the operation. As already explained, the gas consumption has two parts – a static part and a dynamic part. For each of these two contributions, we invoke the method UseGas() of the contract object, which will reduce the gas left that the contract tracks and also raise an error if we are running out of gas.

We then execute the operation by invoking the Go function to which it is mapped. This function will typically get some data from the stack, perform some calculations and push data back to the stack, but can also modify the state and perform more complex operations. Most if not all operations are contained in instructions.go, and it is instructive to scan the file and look at a few operations to get a feeling for how this works (we will go through a more complex example, the CALL operation, in a later post).

Once the instruction completes, we check the returns flag of the jump table entry to see whether the instruction returns any data, and if yes, we copy this data to the returnData field of the interpreter so that it is available for the next instruction. We then decide whether the execution is complete and we need to return to leave the main loop, or whether we need to continue execution with an updated program counter.

So the main loop is actually rather straightforward, and, together with our discussion of the Call() method in the previous post, we now have a fairly complete picture of how contract execution works.

Handling gas consumption

Let us leverage this end-to-end view to put together the various bits and pieces to understand how gas consumption is handled. We start our discussion on the level of an entire block. In one of the previous posts, we have already seen that when a block is processed here, two gas related variables are maintained. First, the processing keeps track of the gas used for all transactions in this block, which corresponds to the gasUsed field of a block header. In addition, there is a block gas pool, which is simply a counter initialized with the current block gas limit and used to keep track of the gas which is still available without violating this limit.

When we now process a single transaction contained in the block, we invoke the function applyTransaction. In this function, we increase the used gas counter on the block level by the gas consumed by the transaction and use that information to create the transaction receipt, that contains both the gas used by the transaction and the current value of the cumulative gas usage on the block level. This is done based on the return value of the ApplyMessage function, which itself immediately delegates to the TransitionDB method of a newly created state transition object.

The state transition object contains two additional gas counters. The first counter (st.gas) keeps track of the gas still available for this transaction, and is initialized with the gas limit of the transaction, so this is the equivalent of the gas pool on the block level. The second counter is the initial value of this field and only used to be able to calculate the gas actually used later on.

When we now process the transaction, we go through the following steps.

  • First, we initialize the gas counters
  • Then, we deduct the upfront payment from the sender's balance. The upfront payment is the gas price times the gas limit and therefore the maximum amount of Ether that the sender might have to pay for this transaction
  • Similarly, we reduce the block gas limit by the gas limit of the transaction
  • Next, we calculate the intrinsic gas for the transaction. This is the amount of gas just for executing the plain transaction, still without taking any contract execution into account. It is calculated (ignoring contract creations) by taking a flat fee of currently 21000 units of gas per transaction, plus a fee for every byte of the transaction input (which is actually different for zero bytes and non-zero bytes). In addition, there is a fee for each entry in the access list (this is basically a list of addresses and storage keys for which a discount applies when accessing them, see EIP-2930). In the yellow paper, the intrinsic gas is called g0 and defined in section 6.2 – see also the sketch after this list
  • We then reduce the remaining gas by the intrinsic gas cost (again according to what section 6.2 of the yellow paper prescribes) and invoke Call(), using the remaining gas counter st.gas as the parameter which determines the gas available for this execution. Thus the gas available to the contract execution is the gas limit minus the intrinsic gas cost. We have already seen that this creates a Contract containing another gas counter which keeps track of the gas consumed during the execution. Within the interpreter main loop, we calculate static and dynamic gas cost for each opcode and reduce the counter accordingly. At the end, the remaining gas is returned
  • We update the remaining gas counter st.gas with the value returned by Call(). We then perform a refund, i.e. we credit the remaining gas times the gas price back to the sender and also put the remaining gas back into the gas pool on the block level

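To make the intrinsic gas calculation a bit more concrete, here is the formula spelled out as code, using the values valid at the time of writing (i.e. after the Istanbul and Berlin hard forks). This is only an illustration, not the actual geth code.

/// Intrinsic gas for a plain transaction - contract creations pay an additional fee
function intrinsicGas(
    uint256 zeroBytes,
    uint256 nonZeroBytes,
    uint256 accessListAddresses,
    uint256 accessListStorageKeys
) internal pure returns (uint256) {
    return 21000                           // flat fee per transaction
        + 4 * zeroBytes                    // fee per zero byte of input data
        + 16 * nonZeroBytes                // fee per non-zero byte (EIP-2028)
        + 2400 * accessListAddresses       // fee per access list address (EIP-2930)
        + 1900 * accessListStorageKeys;    // fee per access list storage key (EIP-2930)
}
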
This has a few interesting consequences. First, it demonstrates that the total gas cost of executing a transaction does actually consist of two parts – the intrinsic gas for the transaction and the cost of executing the opcodes of the smart contract (if any). Both of these components have a static part (the 21000 base fee for the intrinsic gas cost and the static fee per opcode for the code execution) and a dynamic part, which depends on the transaction.

The second thing that you want to remember is that in order to make sure that a transaction is processed, it is not sufficient to have enough Ether to pay for the gas actually used. Instead, you need to have at least the gas limit times the gas price, otherwise the upfront payment will fail. Similarly, you need to make sure that the gas limit of your transaction is lower than the block gas limit, otherwise the block will not be mined.

Pre-compiled contracts

There is a special case of calling a contract that we have ignored so far – pre-compiled contracts. Before diving down into the code once more, let me quickly explain what pre-compiled contracts are and why they are useful.

Suppose you wanted to develop a smart contract that needs to calculate a hash value. The EVM has a built-in opcode SHA3 to calculate the Keccak hash, but what about other hashing algorithms? Of course, as the EVM is Turing-complete, you could develop a contract that does this, but this would blow up your contract considerably and, in addition, would probably be extremely slow as this would mean executing complex mathematical operations in the EVM. As an alternative, the Ethereum designers came up with the idea of a pre-compiled contract. Roughly speaking, this is a kind of extension of the instruction set of the EVM, realized as contracts located at pre-defined addresses. The contract at address 0x2, for instance, calculates an SHA256 hash, and the contract at address 0x3 a RIPEMD-160 hash. These contracts are, however, not really placed on the blockchain – if you look at the code at this address using for instance the JSON API method eth_getCode, you will not get anything back. Instead, these pre-defined contracts are handled by the EVM. If the EVM processes a CALL targeting one of these addresses, it does not actually call a contract at this address, but simply runs a native Go function that performs the required calculation.

We have already seen where in the code this happens – when we initialize the target contract in the Call() method of the EVM, we check whether the target address is a pre-compiled contract and, if yes, execute the associated Go function instead of running the interpreter. The return values are essentially the same as for an ordinary call – return data, an error and the gas used for this operation.

The pre-compiled contracts as well as the gas cost for executing them are defined in the file contracts.go. At the time of writing, there are nine pre-compiled contracts, residing (virtually) at the addresses 0x1 to 0x9:

  • EC recover algorithm, which can be used to determine the public key of the signer of a transaction
  • SHA256 hash function
  • RIPEMD-160 hash function
  • the data copy function, which simply returns the input as output and can be used to copy large chunks of memory more efficiently than by using the built-in opcodes
  • modular exponentiation, i.e. exponentiation modulo some number M
  • three elliptic curve operations to support zero-knowledge proofs (see the EIPs 196 and 197)
  • the BLAKE2b F compression function (see EIP-152)

Here is the final flow diagram for the smart contract execution that now also reflects the special case of a pre-compiled contract.

With this, we close our post for today. In the next post, we will take a closer look at the CALL opcode and its variations to understand how a smart contract can invoke another contract.

Understanding the Ethereum virtual machine – part I

In today’s post, we will shed some light on how the Ethereum virtual machine (EVM) actually works under the hood. We start with an overview of the most relevant data structures and methods and explain the big picture before we look at the interpreter main loop in the next post.

The Go-Ethereum EVM – an overview

To be able to analyze in depth what really happens if a specific opcode is executed, it is helpful to take a look at both the yellow paper and the source code of the Go-Ethereum (geth) client implementing what the yellow paper describes. The code for the EVM is in this folder (I have used version 1.10.6 for the analysis, but the structure should be rather stable across releases).

Let us first try to understand the data structures involved. The diagram below shows the most important classes, attributes and methods that we need to understand.

First, there is the block context. This class is straightforward – it contains some data fields that represent attributes of the block in which the transaction is located and is used to realize opcodes like NUMBER or DIFFICULTY. Similarly, the transaction context (TxContext) holds some fields of the transaction as part of which we execute the smart contract.

Let us now turn to the Contract class. The name of this class is a bit misleading, as it does in fact not represent a smart contract, but the execution of a smart contract, either as the result of a transaction or, more generally, of a message call. Its most important attributes (at least for our present discussion) are

  • The code, i.e. the smart contract code which is actually executed
  • the input provided
  • the gas available for the execution
  • the address at which the smart contract resides (self)
  • the address of the caller (caller and CallerAddress)

It is important to understand the meaning of the various addresses contained in this structure. First, there is the self attribute, which is the contract address, i.e. the address at which the contract itself resides. This is the address which is called Ia in the yellow paper, which is returned by the ADDRESS opcode and which is the address holding the state manipulated by the code, for instance when we run an SSTORE operation. This is also the address returned by the Address() method of the contract.

Next, there is the caller and the CallerAddress. In most cases, these two addresses are identical and represent the source of the message call, i.e. what is called the sender Is in the yellow paper. There are cases, however, namely so called delegate calls, where these addresses are not identical. We will come back to this in the next post.

The contract object also maintains the gas available for the execution. This field is initialized when the execution starts and can then be reduced by calling UseGas() to consume a certain amount of gas.

Next, there is the EVM itself. The EVM refers to a state (StateDB), a transaction context and a block context. It also holds an attribute abort which can be set to abort the execution, and a field callGasTemp which is used to hold the gas value in some cases – we will see this field in action later.

Finally, there is the EVM interpreter. The interpreter is doing all the hard work of running a piece of code. For that purpose, it references a jump table which is essentially a list of opcodes together with references to corresponding Go functions that need to be run whenever this opcode is encountered. The interpreter also maintains the scope context which is a structure bundling the data that is refreshed with every execution of a smart contract – the content of the memory, the content of the stack and the contract execution, represented by a contract object.
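
To make the idea of a jump table a bit more concrete, here is a deliberately oversimplified sketch in Python. This is of course not geth’s code (which is written in Go and, among other things, also handles gas accounting), but it illustrates the dispatch mechanism of a stack machine driven by a table of opcode handlers:

# toy jump table: opcode -> handler
def op_add(stack):
    a, b = stack.pop(), stack.pop()
    stack.append((a + b) % 2**256)        # EVM words are 256 bit

def op_push1(stack, code, pc):
    stack.append(code[pc + 1])            # the argument is the byte following the opcode
    return pc + 2                         # PUSH1 consumes two bytes of code

JUMP_TABLE = {0x01: op_add, 0x60: op_push1}

def run(code):
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op == 0x60:                    # PUSH opcodes need access to the code itself
            pc = JUMP_TABLE[op](stack, code, pc)
        else:
            JUMP_TABLE[op](stack)
            pc += 1
    return stack

print(run(bytes([0x60, 0x03, 0x60, 0x05, 0x01])))   # PUSH1 3, PUSH1 5, ADD, leaves [8]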

Code execution in the yellow paper

Before we move on to understand how the code execution actually works, let us take a short look at the yellow paper, in particular sections 6, 8 and 9 describing contract execution, and try to map the data structures and functions described there to the part of the source code that we have just explored.

The central function that describes the execution of a contract code in the yellow paper is a function denoted by a capital Theta (Θ) in the yellow paper. This function has the following arguments.

  • the state on which the code operates
  • the sender of the message call or transaction
  • the origin of the transaction (which is always an EOA and the address which signed the transaction)
  • the recipient of the message call
  • the address at which the code to be executed is located (this is typically the same as the recipient, but might again differ in the case of delegated calls)
  • the gas available for the execution
  • the gas price
  • the value to be transferred as part of the message call (again, there is a subtlety for delegate calls that we postpone to the next post)
  • the input data of the message call
  • the depth of the call stack
  • a flag that can be used to prevent the transaction from making any changes to the state (this is required for the STATICCALL functionality)

If you compare this list with the data structures displayed above, you will find that this is essentially the combination of the EVM attributes, the transaction context, the scope context and the contract execution object. All this data is tied together in the EVM class, so it is natural to assume that the function Θ itself is realized by a method of this class – in fact, this is the Call method that we will look at in the next section.

The output of Θ is the updated state, the remaining gas, an object known as accrued substate that contains touched and destroyed accounts, the logs generated during the execution and the gas to be refunded.

The inner workings of Θ are described in section 8 of the yellow paper. First, the value to be transferred is deducted from the balance of the sender and added to the balance of the recipient. Then, the actual code is executed – this happens by calling another function denoted by Ξ (a capital greek xi) – again, there is an exception to this rule for pre-compiled contracts that we discuss in the next post. If the execution is not successful, then the state is reset to its previous value; if it is successful, the state returned by Ξ is used. The function Ξ is again not terribly difficult to identify in the source code – it is the method Run() of the EVM interpreter which will be the subject of the next post.

The call method of the EVM

Let us now take a closer look at the method Call() of the EVM which implements what the yellow paper calls Θ. The source code for this method can be found here. For today, I will ignore pre-compiled contracts completely which we will discuss in the next post.

The method starts by running a few checks, like making sure that we do not exceed the call depth limit (which is defined to be 1024 at the moment) or that we do not attempt to transfer more than the available balance.

The next step is to take a snapshot of the current state. Internally, Go-Ethereum uses revisions to keep track of different versions of the state, and taking a snapshot simply amounts to remembering a revision to which we can revert later if needed.

Next, we check whether the contract address already exists. This might be a bit confusing, as it does not seem to make sense to call a contract at a non-existing address, or, more precisely, at an address not yet initialized in the state DB. Note, however, that “calling” means, a bit more generally, “sending a message to an account”, which also happens if you simply want to transfer Ether to an account. Sending a message to a non-contract account is perfectly valid, and it might even be that this account has never been used before and is therefore not part of the cached state.

The next step is to actually perform the transfer of any Ether involved in the message call, i.e. we send value Wei from the sender to the recipient. We then get into the actual bytecode execution by performing the following steps.

  • get the code associated with the contract address (i.e. the runtime bytecode) from the state
  • if the length of the code is zero, return – there is nothing left to be done
  • initialize a new Contract object that represents the current execution.
  • initialize the contract code
  • call the Run method of the interpreter

We then collect the return value from the Run method and a potential error code and set gas to contract.Gas – this represents the gas still remaining after executing the code. Based on this, we determine the final return values according to the following logic (summarized in the sketch after the list).

  • If Run did not result in an error, return the return value, error code and remaining gas just assembled
  • If Run returned a special error code indicating that the execution was reverted, reset the state to the previously created snapshot
  • If the error code returned by Run is not a reverted execution, also fall back to the snapshot but in addition, set the remaining gas to zero, i.e. such an error will consume all the available gas
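
Putting the three cases together, here is a rough, Python-flavoured sketch of this decision logic. This is not geth’s Go code; run and revert_to_snapshot are just stand-ins for the interpreter and the snapshot mechanism:

class ExecutionReverted(Exception):
    """stands in for the EVM error signalling an explicit revert"""

def finish_call(run, snapshot, revert_to_snapshot, remaining_gas):
    try:
        ret = run()                               # execute the contract code
        return ret, remaining_gas, None           # success: keep state changes and remaining gas
    except ExecutionReverted as err:
        revert_to_snapshot(snapshot)              # roll back the state, but keep the remaining gas
        return None, remaining_gas, err
    except Exception as err:
        revert_to_snapshot(snapshot)              # roll back the state...
        return None, 0, err                       # ...and consume all available gas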

Invocations of the call method

Having understood how Call works, we are now left with two tasks. First, we need to understand how the EVM interpreter’s Run method works, which will be the topic of our next post. Second, we have to learn where Call is actually invoked within the Go-Ethereum source code.

Not quite surprisingly, this happens at several points (ignoring tests). First, in a previous post, I have already shown you that the EVM’s Call method is invoked whenever a transaction is processed as part of a state transition. This call happens here, and the parameters are as we would expect – the caller is the sender of the transaction, the contract address is the recipient, and the input data, gas and value are taken from the StateTransition object. The remaining gas returned is again stored in the state transition object and used as a basis for computing the gas refunded to the sender. Note that this entry point is (via the ApplyMessage function) also used by the JSON API when the eth_call method or the eth_estimateGas method are requested.

However, this is not the only point in the code where we find a reference to the Call method. A second point is actually the EVM interpreter itself, more precisely the function opCall in instructions.go. The background of this is that, in addition to a call due to a transaction, i.e. a call initiated by an EOA, we can of course also call a smart contract from another smart contract using the CALL opcode. This opcode is implemented by the opCall function, and it turns out that it uses the EVM Call method as well. In this case, the parameters are taken from the stack respectively from the memory locations referenced by the stack items.

  • the topmost item on the stack is the gas that is made available (as we will see in the next post, this is not exactly true, but almost)
  • the next item on the stack is the target address
  • the third item is the value to be transferred
  • the next two items determine offset and length of the input data which is taken from memory
  • the last two items similarly determine offset and length of the return data area in memory

It is interesting to look at how the returned error code is handled. First, it is used to determine the status code that is returned. If there was an error, the status code is set to zero, otherwise it is set to one. Then, the returned data is stored in memory in case the execution was successful or explicitly reverted; for other errors, no return data is passed. Finally, the unused gas is again returned to the currently executing contract.

This has an important consequence – there is no automatic propagation of errors in the EVM! If a contract A calls a contract B, and contract B reverts (either explicitly or due to another error), then the call will technically go through, and contract A does not automatically revert as well. Instead, you will have to explicitly check the status code that the CALL opcode puts on the stack and handle the case that contract B fails somehow. Not doing this will make your contract vulnerable to the “King of the Ether” problem that we have discussed in my previous post on contract security.

Finally, scanning the code will reveal that there is a third point where the Call method is invoked – the EVM utility that allows you to run a specified bytecode outside of the Go-Ethereum client from the command line. It is fun to play with this; here is an example for its usage to invoke the sayHello method of our sample contract (again, assuming that you have cloned my repository for this series and are working in the root directory of the repository). Note that in order to install the evm utility, you will have to download the full geth archive, containing all the tools, and make the evm executable available in a folder in your path.

VERSION=$(python3 -c 'import solcx ; print(solcx.get_solc_version())')
DIR=$(python3 -c 'import solcx ; print(solcx.get_solcx_install_folder())')
SOLC="$DIR/solc-v$VERSION"
CODE=$($SOLC contracts/Hello.sol --bin-runtime   | grep "6080")
evm \
  --code $CODE\
  --input 0xef5fb05b \
  --debug run
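
In case you are wondering where the value of the --input parameter comes from: it is the function selector, i.e. the first four bytes of the Keccak hash of the function signature, which the dispatching code in the runtime bytecode compares against the call data. You can reproduce it with a few lines of Python (this sketch uses the eth_utils package, which is installed as a dependency of web3.py):

from eth_utils import keccak

selector = keccak(text="sayHello()")[:4]
print("0x" + selector.hex())     # should print 0xef5fb05b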

This little experiment completes this post. In the next post, we will try to fill in the missing parts that we have not yet studied – how the code execution, i.e. the Run method, actually works, what pre-compiled contracts are and how gas is handled during the execution. We will also take a closer look at contract-to-contract calls and their variations.

A deep-dive into Solidity – function selectors, encoding and state variables

In the last post, we have seen how the Solidity compiler creates code – the init bytecode – to prepare and deploy the actual bytecode executed at runtime. Today, we will look at a few standard patterns that we find when looking at this runtime bytecode.

Some useful tools

While analyzing the init bytecode in the last post, we have mainly worked with the output of the Solidity compiler known as opcode listing – the output generated when we supply the –opcodes switch. One major drawback of this representation of the bytecode is that we had to manually count instructions to determine the target of a JUMP instruction. Before going deeper into the runtime bytecode of our sample contract, let us collect a few tools that can help us with this.

First, there is the Solidity compiler itself. In addition to the bytecode and the opcodes, it can also generate an enriched output known as assembly output when the –asm switch is used. To do this for our sample contract, run

VERSION=$(python3 -c 'import solcx ; print(solcx.get_solc_version())')
DIR=$(python3 -c 'import solcx ; print(solcx.get_solcx_install_folder())')
SOLC="$DIR/solc-v$VERSION"
$SOLC contracts/Hello.sol --asm --optimize

The output is a mixture of opcodes and statements combining several opcodes into one. The snippet

PUSH1 0x40
PUSH1 0x80
MSTORE

for instance, is displayed as

mstore(0x40, 0x80)

In addition, and that makes this representation very useful, offsets are tagged, so that it becomes much easier to identify jump targets.

Brownie also offers some useful features to display the opcodes of a smart contract. When Brownie compiles a contract, it stores build data in the build subdirectory, and the data in this subdirectory can also be accessed using Python code. In particular, we can access the full bytecode and the runtime bytecode of a compiled contract, like this.

# including init bytecode
project.TmpProject._build.get("Hello")['bytecode']
# runtime bytecode only
project.TmpProject._build.get("Hello")['deployedBytecode']

Alternatively, we can access the bytecode from the deployed contract.

me = accounts[0]
hello = Hello.deploy({"from": me})
# runtime bytecode
hello.bytecode
# full bytecode (input of deployment transaction)
hello.tx.input

In addition to the plain bytecode, Brownie also offers a data structure which contains the opcodes along with offsets and some additional useful information – the pcMap. This is a hash map where the keys are the offsets of the opcodes into the runtime bytecode (the pcMap contains only the runtime bytecode) and the values are again hash maps containing the name of the Solidity function to which the code belongs, the opcode itself and arguments to the opcode as far as applicable. To print this map in a readable format, you can use the following statements.

pcMap = project.TmpProject._build.get("Hello")['pcMap']
for i in sorted(pcMap.keys()):
  print(i, "-->", pcMap[i])

The pcMap is particularly useful if we combine it with another feature that Brownie has to offer – tracing transactions. A transaction trace contains the exact opcodes executed as part of the transaction. Here is an example.

tx = hello.sayHello()
tx.call_trace()
tx.trace

So the call trace is just a stack trace, while the trace is an array whose entries represent the opcodes that have actually been executed, along with information like the gas cost of the step, the memory content before the step was executed and the stack and storage content before the step was executed. Using tx.source(), we can even get the source code that belongs to a trace step.

The Remix IDE has a similar capability. Once a transaction has been executed and is displayed on the main screen, you can click on the blue “Debug” icon next to the transaction, and a debugger window will open on the left of the screen. You can now step forward and back, inspect opcodes, stack, memory and storage and even set breakpoints. In the Remix IDE, you can even debug deployment transactions, which is not possible in Brownie.

Function selectors

Having all these tools at our disposal, it is now not terribly difficult to understand the actual runtime bytecode. Here is a list of the opcodes, along with a few comments and tags.

// This is the start of the runtime bytecode
// initialize free memory pointer
PUSH1 0x80 
PUSH1 0x40 
MSTORE 
// Repeat the check for a non-zero value
CALLVALUE 
DUP1 
ISZERO 
PUSH1 0xF
// conditionally jump to target 1 
JUMPI 
PUSH1 0x0 
DUP1 
REVERT 
// This is jump target 1. We get here only
// if the value is zero
JUMPDEST 
POP 
PUSH1 0x4 
CALLDATASIZE 
LT 
PUSH1 0x28 
JUMPI // conditional jump to jump target 2
// We only get here if we have at least four bytes
// of data
PUSH1 0x0 
CALLDATALOAD 
PUSH1 0xE0 
SHR 
DUP1 
PUSH4 0xEF5FB05B 
EQ 
PUSH1 0x2D 
JUMPI 
// This is jump target 2
JUMPDEST 
PUSH1 0x0 
DUP1 
REVERT 
// This is jump target 3, here we enter
// the sayHello function
JUMPDEST 
PUSH1 0x33  // offset of jump target 4
PUSH1 0x35  // offset of jump target 5
JUMP 
// This is jump target 4
JUMPDEST 
STOP 
// This is jump target 5 
// The code starting here is the actual sayHello function
JUMPDEST 
PUSH1 0x40 
MLOAD 
PUSH32 0x3ACB315082DEA2F72DFEEC435F2B0E4DD95A4FD423E89C8CB51DC75FA38D7961 
SWAP1
PUSH1 0x0
SWAP1
LOG1 
JUMP 

I have stripped off a few opcodes at the end which we will talk about a bit later. Let us go through the code line by line and try to understand what it does.

The first three lines are familiar – we again initialize the free memory pointer which Solidity stores at memory address 0x40 to its initial value 0x80. Similarly, we have already seen the next lines, starting with CALLVALUE, while analyzing the init bytecode. This code again checks that the value of the transaction is zero and reverts if this is not the case, reflecting the fact that our contract does not have a payable function. If the value is zero, the processing continues at the point in the code that I have called jump target 1.

Here, we first clean up the stack by popping the last value. We then push four onto the stack, followed by the output of CALLDATASIZE, which is the length of the transaction input field. The LT opcode compares these two values and pushes the result of the comparison onto the stack. If the result of the comparison is true, i.e. if we have less than four bytes in the input field, we jump to jump target 2, where we again revert.

To understand why this code makes sense, recall that the first four bytes of the input field are supposed to be the hash of the signature of the function we want to call. If we have less than four bytes, the call is not targeting a function, and as we do not have a fallback function, we revert.

If we have at least four bytes of data, we continue at the next line, where we first push zero onto the stack and then run CALLDATALOAD, which loads the first full 32 byte word of the call data onto the stack (the zero that we have just pushed is the offset). We then execute the set of instructions

PUSH1 0xE0 // 0xE0 is 224 
SHR
DUP1
PUSH4 0xEF5FB05B
EQ

This looks a bit mysterious, but is actually not too difficult to understand. After the first push, our stack looks as follows.

| 224 | first 32 bytes of transaction input |

When we then execute SHR, which is a shift operation to the right, we shift the second item on the stack by the number of bits specified by the first item, so we shift the 32 bytes, i.e. 256 bits, by 224 bits to the right. This amounts to moving the first four bytes to the rightmost position, so that what we now have on the stack are the first four bytes of the input data, i.e. exactly those four bytes that contain the hash of the function signature. We then push four bytes on the stack, so that our stack is now

| 0xEF5FB05B | first four bytes of the function signature |

and use EQ to compare them, so that the item at the top of the stack is now

first four bytes of function signature == 0xEF5FB05B

Now open Brownie and run

web3.keccak(text="sayHello()")[:4]

to convince yourself that the four bytes to which we compare are exactly the hash of “sayHello()”. Thus, we execute the conditional jump that comes next only if the first four bytes of the input data indicate that we want to call this method, otherwise we continue and hit upon the revert at jump target 2.

The code that we have just seen therefore realizes the function selection. If your contract contains more than one function, you will see more than one comparison, and the upshot is that we either jump into the function that corresponds to the signature hash or revert (unless we have a fallback function).

This also tells us that in our case, the execution of sayHello() starts at jump target 3. The code that we see here is also typical. We push two values on the stack – first a return offset and then a jump target. We then jump, execute some code and eventually execute another jump. This second jump will then take its target from the stack, so it returns to the first offset that we have pushed onto the stack. In our case, we jump to target 5, execute the code there, and then jump back to target 4. This approach – pushing return addresses onto the stack – mimics the way local function calls are handled in other programming languages like C. In our case, jump target 4 simply executes the STOP opcode, which completes the execution without a return value.

Finally, let us take a look at the code at jump target 5, which is therefore the body of sayHello(). Here, we first run MLOAD to get the value of the free memory pointer. We then put a full 32 byte word onto the stack, namely the hash of the string “SayHello()”, i.e. the signature of the event that we emit. We then swap the first two elements on the stack, push zero and swap once more. Our stack now looks as follows.

| 0x80 | 0x0 |  hash(event signature) | return address  |

Now we execute LOG1. Again, the yellow paper is our friend and tells us that the first entry on the stack is the offset of the log data, the second entry is the length and the third entry is the first (and, in this case, the only) topic. So we log an event with no data and topic being the hash of the event signature, as expected. The log statement will consume the first three stack items, and when we now jump, we therefore end up at jump target 4, where we execute the STOP opcode to complete the transaction.
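
If you want to double-check the 32 byte constant that is pushed onto the stack here, you can reproduce it in the Brownie console. Assuming that the event in our sample contract is declared as SayHello() with no arguments, the following line should yield exactly the value used in the PUSH32 above.

web3.keccak(text="SayHello()")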

Encoding and state variables

We have now completed the analysis of our sample contract. A natural next step is to add more functionality to the contract and see how this changes the output of the compiler. As an example, let us add some state to our contract. In the body of the contract code, add the line

uint256 value;

and the method

function store(uint256 _value) public {
    value = _value;
}

Let us now run the compiler again, this time with a few more flags that request additional output (the reason for this will become clear in a minute).

$SOLC contracts/Hello.sol \
       --asm \
       --optimize \
       --storage-layout \
       --combined-json generated-sources-runtime

Here is a listing of the relevant code that is newly added to our contract by the changes we have made. Again, I have added some comments and labeled the jump destinations from A to E.

PUSH1 0x47     // address of label B 
PUSH1 0x42     // address of label A 
CALLDATASIZE 
PUSH1 0x4 
PUSH1 0x76     // address of label D 
JUMP 
// Label A - this is at offset 0x42 
JUMPDEST 
PUSH1 0x0 
SSTORE 
JUMP 
// Label B - this is at offset 0x47
JUMPDEST 
STOP 
// Label C - this is at offset 0x48
// I have removed the code in this  section
// which we have already looked at before
// it logs the event and then jumps to label B
// where we STOP
// Label D - this is at offset 0x76 
JUMPDEST 
PUSH1 0x0 
PUSH1 0x20 
DUP3 
DUP5 
SUB 
SLT         
ISZERO 
PUSH1 0x87     // address of label E
JUMPI          // conditional jump to label E
PUSH1 0x0 
DUP1 
REVERT 
// Label E - this is at offset 0x87 
JUMPDEST 
POP 
CALLDATALOAD 
SWAP2 
SWAP1 
POP 
JUMP

The first few lines are again easy to interpret – we prepare a jump, which is an internal function call, i.e. we place a return address and, in this case, arguments on the stack and then jump to label D. When we get there, our stack looks as follows (recall that CALLDATASIZE puts the size of the calldata, i.e. the length of the transaction input in bytes, onto the stack).

4 | len(tx.input) | label A | label B

At label D, we put a few additional items on the stack. If you go through the instructions, you will find that when we reach the SUB opcode, the stack looks as follows.

len(tx.input) | 4 | 32 | 0 | 4 | len(tx.input) | A | B

Now we execute the SUB opcode, which will pop the first two items off the stack and push their difference. Thus, after completing this opcode, our stack will be

len(tx.input) - 4 | 32 | 0 | 4 | len(tx.input) | A | B

The next instruction, SLT, is a signed version of the less-than instruction that we have already seen. Together with the subsequent ISZERO which is a simple logical inversion, its impact is to provide the following stack.

!(len(tx.input) - 4 < 32) | 0 | 4 | len(tx.input) | A | B

To get an idea what this is supposed to do, looking at the assembler output helps. In the comments that Solidity has generated, we find a hint – utility.yul. As the Solidity documentation explains, this means that the code we are looking at is part of a library of utility functions, written in the Yul language (an intermediate language that Solidity uses internally). However, these utility functions are not stored anywhere in a file with this name, but are actually generated on the fly by the compiler (in our case, this happens here). The additional flag generated-sources-runtime that we have used when running Solidity instructs the compiler to print out a Yul representation of the utility functions. The Yul code, the name of the function and the source code of the Solidity compiler that I have linked above solve the puzzle – the code we are looking at is supposed to decode the transaction input and to extract the argument (which is called _value in the source code of our contract).

Now the Solidity ABI demands that the argument be stored in the transaction input as a 256-bit, i.e. 32 byte word, directly after the four bytes containing the function signature. What the code that we are analyzing is doing is to check that the total length of the transaction input is at least those four bytes plus the 32 bytes. If this is not the case, we continue and revert. If this is the case, i.e. if the validation is successful, we perform a conditional jump and end up at label E. When we get there, our stack is

0 | 4 | len(tx.input) | A | B

We now remove the first item on the stack, use CALLDATALOAD to load a full 32 byte word starting at byte 4 of the transaction input onto the stack (i.e. the 32 byte word that is supposed to contain our parameter), and use two swaps and a pop operation to produce the following stack.

A  | _value | B

The jump at the end of this sequence will therefore take us to label A again, with the _value parameter at the top of the stack. Here, we push zero onto the stack and perform an SSTORE. This will store _value at position zero of the storage and leave us with the address of label B on the stack. The following jump will therefore take us to the STOP opcode, and the transaction completes.

So, the content at offset zero of the storage seems to represent the stored value. Here, we could easily derive this from the code, but in general, this can be more difficult. To help us map the state variables declared in the source code to storage locations, Solidity creates a storage map which we have included in our output using the –storage-layout switch. The storage layout is an array, where each entry represents one state variable. For each variable, there is a slot and an offset. As indicated in the documentation, the slot is the address in the storage area, but one slot can contain more than one item (if an item is smaller than 32 bytes), and in this case, the offset is the offset within the slot. For dynamic data types, the layout is more complicated; for mappings, for instance, the actual slot is determined as a hash of the key.
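
For a mapping with a key of a value type, the Solidity documentation specifies that the value for a key k of a mapping residing at slot p is stored at keccak256(pad32(k) . pad32(p)). Here is a small sketch that computes such a location in the Brownie console. The mapping declaration, the slot number and the address used as key are hypothetical and only serve to illustrate the calculation.

# hypothetical example: mapping(address => uint256) balances; declared as the
# second state variable of a contract, i.e. residing at slot 1
key  = bytes.fromhex("FC2a2b9A68514E3315f0Bd2a29e900DC1a815a1D")   # an address used as key
slot = 1
location = web3.keccak(key.rjust(32, b"\x00") + slot.to_bytes(32, "big"))
print(location.hex())
# the content of that slot could then be inspected with
# web3.eth.get_storage_at(contract_address, int(location.hex(), 16))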

Metadata and hashes

If you have followed the analysis carefully, you might have noted that the last few opcodes do not seem to be executed at all. In fact, they do not even make sense, starting already with an invalid opcode 0xFE. Again, the assembler output helps to interpret this – it designates this part of the bytecode as “auxdata”, which does in fact not contain valid bytecode, but the IPFS hash of the contract metadata (more precisely, a CBOR encoded structure which contains the IPFS hash as a key).

The contract metadata, which can be produced using the –metadata compiler switch, is a JSON structure that contains, among other things

  • the contract ABI
  • the Keccak hash of the source code
  • the IPFS hash of the source code
  • the exact compiler version
  • the compiler settings used to produce the bytecode

The idea behind this is that a developer can store the metadata and the contract source in IPFS. A user who finds the contract on the blockchain can then use the last few bytes – the IPFS hash of the metadata – to retrieve that document from the IPFS network. As the metadata document contains the IPFS hash of the source, a user could now retrieve the source as well. This mechanism therefore allows you to link the source code to the contract and to prove that the contract bytecode has been created using the source code and a given set of compiler settings. Within the Solidity source code, all this happens here.
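
To poke at this from the Brownie console, here is a small sketch. It assumes that hello still refers to the contract deployed earlier and that, as with current Solidity versions, the last two bytes of the runtime bytecode encode the length of the CBOR section.

# strip a possible 0x prefix and decode the hex string into bytes
raw = hello.bytecode
code = bytes.fromhex(raw[2:] if raw.startswith("0x") else raw)

cbor_len = int.from_bytes(code[-2:], "big")      # last two bytes: length of the CBOR part
cbor_blob = code[-2 - cbor_len:-2]
print(cbor_blob.hex())
# with the cbor2 package installed, cbor2.loads(cbor_blob) should reveal a map
# containing the IPFS hash of the metadata and the compiler version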

We have seen that the metadata hash and the runtime bytecode are separated by the invalid opcode 0xFE. This byte appears at another location in the full bytecode – the end of the init bytecode. In both cases, the motivation is the same – we want to avoid that, due to an error, the execution can continue past these boundaries. So we now realize that the full bytecode consists of three sections, separated by the invalid opcode 0xFE.

This closes our post for today. Of course, you could now add additional features to our contract, maybe return values or mappings, and see how this affects the generated bytecode. In the next post, however, we will turn to another topic which is central to understanding smart contracts – how the Ethereum virtual machine actually operates.

A deep-dive into Solidity – contract creation and the init code

In some of the previous posts in this series, we have already touched upon contract creation and referred to the fact that during contract creation, an init bytecode is sent as part of a transaction which is supposed to return the actual bytecode of the smart contract. In this and the next post, we will look at this in a bit more detail and, along the way, learn how to decipher the EVM bytecode for a simpler contract.

Contract creation – an overview

Before diving into details, let us first make sure we understand the contract creation process in Solidity. A good starting point is section 7 of the Ethereum yellow paper.

A transaction will create a contract if the recipient address of the transaction is empty, i.e. if the to field of the transaction is not set at all. A creation operation can contain a value, which is then credited to the address of the newly created contract (even though in Solidity, this requires a payable constructor). Then, the initialisation bytecode, i.e. the content of the init field of the transaction, is executed, and the returned array of bytes is stored as the bytecode of the newly created contract. Thus there are in fact two different types of bytecode involved during the creation of a smart contract – the runtime bytecode which is the code executed when the contract is invoked after its initial creation, and the init bytecode which is responsible for preparing the contract and returning the runtime bytecode.

To understand what “returning the runtime bytecode” actually means, we need to consult the definition of the RETURN opcode in appendix H. Here, the return value function Hreturn is specified, which is referenced in section 9 and defines the output of a bytecode execution. It takes a moment to get familiar with the notation, but what the definition actually says is that the output is placed in the virtual machine memory, where the offset is determined by the top of the stack and the length is determined by the second element on the stack. Thus the init bytecode needs to

  • make any changes to the state of the contract address needed (maybe initialize some state variables)
  • place the runtime bytecode somewhere in memory
  • push the length of the runtime bytecode onto the stack
  • push the offset of the runtime bytecode (i.e. the address in memory where it starts) onto the stack
  • execute the RETURN statement

To make this a bit more tangible, let us again use Brownie to see how this works in practice. We will use a simple sample contract which does nothing except logging an event when its sayHello method is invoked. So make sure that you have a Brownie project directory containing this contract (if you have cloned my repository, I recommend creating a tmp subdirectory and linking the contract there, as described here), and open the Brownie console. Then, we deploy a copy of the contract and inspect the transaction that Brownie has used to do this.

me = accounts[0]
hello = Hello.deploy({"from": me})
tx = web3.eth.get_transaction(hello.tx.txid)  
tx
hello.balance()

You should see that the value of the transaction is zero, the recipient is None and the input is an array of bytes, starting with 0x60806040. This is the init bytecode, which we will study in the remaining part of the post. You can also see that the initial balance of the contract is zero.

Reading EVM bytecode – the basics

Before we dive into the init bytecode, we first have to collect some basic facts about how the Ethereum virtual machine (EVM) works. Recall that the bytecode is simply an array of bytes, and each byte will be interpreted as an operation. More precisely, appendix H of the yellow paper contains a list of opcodes each of which represents a certain operation that the machine can perform, and during execution, the EVM basically goes through the bytecode, tries to interpret each byte as an opcode and executes the corresponding operation.

The EVM is what computer scientists call a stack machine, meaning that virtually all operations somehow manipulate the stack – they take arguments from the stack, perform an operation and put the resulting value onto the stack again. Note that most operations actually consume values from the stack, i.e. pop them. As an example, let us take the ADD operation, which has bytecode 0x1. This operation takes the first two values from the stack, adds them and places the result on the stack again. So if the stack held 3 and 5 before the operation was executed, it will hold 8 after the operation has completed.

Even though most operations take their input from the stack, there are a few notable exceptions. First, there are the PUSH operations, which are needed to prepare the stack in the first place and cannot take their arguments from the stack, as this would create an obvious chicken-and-egg challenge. Instead, the push operation takes its argument from the code, i.e. pushes the byte or the sequence of bytes immediately following the instruction. There is one push operation for each byte length from 1 to 32, so PUSH1 pushes the byte in the code immediately following the instruction, PUSH2 pushes the next two bytes and so forth. It is important to understand that even PUSH32 will only place one item on the stack, as each stack item is a 32 byte word, using big endian notation.

The init bytecode

Armed with this understanding, let us now start to analyze the init bytecode. We have seen that the init bytecode is stored in the transaction input, which we can, after deployment, also access as hello.tx.input. The first few bytes are (using Solidity 0.8.6, this might change in future versions)

0x6080604052

Let us try to understand this. First, we can look up the opcode 0x60 in the yellow paper and find that it is the opcode of PUSH1. Therefore, the next byte in the code is the argument to PUSH1. Then, we see the same opcode again, this time with argument 0x40. And finally, 0x52 is the opcode for MSTORE, which stores the second stack item in memory at the address given by the first stack item. Thus, in an opcode notation, this first piece of the bytecode would be

PUSH1 0x80
PUSH1 0x40
MSTORE

and would result in the value 0x80 being written to address 0x40 in memory. This looks a bit mysterious, but most if not all Solidity programs start with this sequence of bytes. The reason for this is how Solidity organizes its memory internally. In fact, Solidity uses the memory area between address zero and address 0x7F for internal purposes, and stores data starting at address 0x80. So initially, free memory starts at 0x80. To keep track of which memory can still be used and which memory areas are already in use, Solidity stores this free memory pointer in the 32 bytes starting at memory address 0x40. This is why a typical Solidity program will start by initializing this pointer to 0x80.

We could now continue to analyze the remaining bytecode in this way, manually looking up opcodes in the yellow paper, but this is of course not terribly efficient. Instead, let us ask the Solidity compiler to spit out the opcodes for us, instead of the plain bytecode. We do not even have to download and install Solidity, because we have already done this when installing the py-solcx module. So let us politely ask Python to spit out the location and version number of the solc binary and invoke it to compile our contract to opcodes.

VERSION=$(python3 -c 'import solcx ; print(solcx.get_solc_version())')
DIR=$(python3 -c 'import solcx ; print(solcx.get_solcx_install_folder())')
SOLC="$DIR/solc-v$VERSION"
$SOLC contracts/Hello.sol --opcodes

As a result, you should see something like this (I have added linebreaks to make this more readable and only reproduced the first few opcodes).

====== contracts/Hello.sol:Hello =======
Opcodes:
PUSH1 0x80 
PUSH1 0x40 
MSTORE 
CALLVALUE 
DUP1 
ISZERO 
PUSH1 0xF 
JUMPI 
PUSH1 0x0 
DUP1 
REVERT 
JUMPDEST               <---- Marker A
POP 
PUSH1 0x99 
DUP1 
PUSH2 0x1E 
PUSH1 0x0 
CODECOPY 
PUSH1 0x0 
RETURN 
INVALID 
PUSH1 0x80             <--- Marker B
PUSH1 0x40 
MSTORE

This is much better (in fact, Solidity can actually produce a number of different output formats – as we go deeper into the actual runtime bytecode in the next post, we will find –asm useful as well). I have also added two markers manually to the output that we will need when discussing the code.

We have already analyzed the first three lines, so let us look at the next section of the code, starting at CALLVALUE. Again, we can consult the yellow paper to figure out what this instruction does – it gets the value of the transaction and stores it on the stack. We then duplicate this value on the stack, so that the stack now looks like this

| value | value |

and invoke the ISZERO operation. This operation takes the first stack item and replaces it by one if it is zero or by zero otherwise. Next, we push 0xF, so our stack now looks like this

| 0xF | value == 0 | value |

The next instruction is JUMPI. This is a conditional jump which is only executed if the second stack item is non-zero, and in this case, we jump to the point in the bytecode designated by the first stack item. Thus, if the value of the transaction is zero, we jump to the offset 0xF, otherwise we continue.

Let us suppose for a moment we include a non-zero value with our transaction. Then, we continue with the next statement after the JUMPI, push zero onto the stack, duplicate and REVERT. Consulting the yellow paper once more, we find that the two topmost items on the stack that are present when we do a revert are used to define the return value – the rule is the same as for RETURN, meaning that the first item on the stack is an offset, the second item is the length. Thus with two zeroes on the stack, we do not return anything. Summarizing, we revert the transaction if the contract creation transaction has a non-zero value, and Solidity generates this code because we have not declared a payable constructor.

Let us now see how the execution proceeds if the value is zero. To be able to do this, we have to figure out the instruction at offset 0xF (15). So let us count – every instruction consumes one byte, and the additional arguments to PUSH1 also consume one byte each. Thus, we find that the execution continues at the JUMPDEST instruction that I have called marker A. The JUMPDEST opcode does not actually do anything, it is simply a marker byte that the EVM uses to make sure that a jump points to a valid location. So we now enter the part of the code that reads like this.

JUMPDEST               <---- Marker A
POP 
PUSH1 0x99 
DUP1 
PUSH2 0x1E 
PUSH1 0x0 
CODECOPY 
PUSH1 0x0 
RETURN 
INVALID 
PUSH1 0x80             <--- Marker B

Note that at this point, we still have the transaction value on the stack, which we remove with the first POP statement. We then push 153, duplicate this, push 30 and zero, so the stack now looks like this

| 0 | 30 | 153 | 153 |

The next instruction is CODECOPY. This copies code of the currently running contract to memory. It consumes three parameters from the stack. The element at the top of the stack defines the target address (i.e. offset) in memory. The second parameter defines the source offset in the code, and the third parameter defines the number of bytes to copy.

Counting once more, we see that the code we copy is 153 bytes long and starts at the point that I have called marker B. The code starting there will therefore be copied to address zero in memory, and after that has been done, our stack contains 153. We then push 0, so that the stack now looks like

| 0 | 153 | 

Finally, we RETURN. Now recalling how the return value of a contract execution is defined, we see that the return value of executing all of this is the bytearray of length 153 stored at address zero in memory, which, as we have just seen, are the 153 bytes of code starting at marker B. So the upshot is that this is the runtime bytecode, and the code we have just analyzed does nothing but (after making sure that the transaction value is zero) copying this bytecode into memory and returning it (by the way – if you want to see where exactly in the Solidity source code this happens, this link might be a good entry point for your research).

That’s it – we have successfully deciphered the initialization procedure of a very simple smart contract. Note that if the contract had a constructor, it would be executed first, before copying the runtime bytecode and returning (you might want to add a simple constructor and repeat the analysis). In the next post, we will learn a few additional tricks to obtain useful representations of the runtime bytecode and then dive into how the runtime bytecode works. See you!

Smart contract security – some known issues

Smart contracts are essentially immutable programs that are designed to handle valuable assets on the blockchain and are mostly written in a programming language that has been around for a bit more than five years and is still rapidly evolving. That sounds a bit dangerous, and in fact the short history of the Ethereum blockchain is full of notable examples that demonstrate how smart contracts can be exploited to steal money. In this post, we will look at a few known security issues that you should try to avoid.

Background – receiving payments with Solidity

Not quite surprisingly, most exploits targeting smart contracts that we have seen are somehow related to those parts of a contract that make payments, so let us first try to make sure that we understand how payments are handled in Solidity.

First, recall that as any other address, a contract address has a balance and can therefore receive and transfer Ether. On the level of the Ethereum virtual machine (EVM), this is rather straightforward. Whenever a smart contract is called, be it from an EOA or another contract, the message call specifies a value. This is the amount of Ether (in Wei) that should be transferred as part of the contract execution. Very early in the processing, before any contract code is actually executed, this amount is transferred from the balance of the caller to the balance of the callee (unless, of course, the balance of the caller is not sufficient).

In Solidity, the situation is a bit more complicated. To see why, let us first imagine that you write and deploy a smart contract and then someone transfers Ether to the contract address. That amount is then added to the balance of the contract, and to access it, you would either need to submit a transaction signed with the private key of the contract address, or the contract itself needs to implement a function that can transfer the Ether to some other address, preferably an EOA address. Now, a smart contract address has no associated private key – it is the result of a calculation at the time the contract is created, not a key generation process. So the only way to use Ether that is held by a contract is to invoke a function of the contract that transfers it out of the contract again. Thus if you accidentally transfer Ether to a smart contract which does not have such a function, maybe because it was never designed to receive Ether, the Ether is lost forever.

To avoid this, the designers of Solidity have decided that contract functions that can receive Ether need to be clearly marked as being able to handle Ether by declaring them as payable. In fact, if a contract method is not marked as being payable, the compiler will generate code that, if that method is called, checks if the message call specifies a non-zero value, i.e. if Ether should be transferred as part of the call. If yes, this code will revert the execution so that the transfer will fail.

Apart from an ordinary function call, there are special cases that we need to handle. First, it might of course happen that a smart contract is invoked without specifying a method at all. This happens if someone simply sends Ether to a smart contract (maybe without even knowing that the target address is a smart contract) and leaves the data field in the transaction (which, as we know, contains a hash of the target function to be called) empty. To handle this case, Solidity defines a special function receive. If this function is present in a contract, and the contract is called without specifying a target function, this method will be executed.

A similar mechanism exists to cover the case that a contract is invoked with a target function that does not exist or is invoked with no target function and no receive function exists. This special function is called the fallback function (in previous versions of Solidity, fallback and receive functions were identical). If none of these fallback functions is present, the execution will fail.

Send and transfer

Having discussed how a smart contract can receive Ether, let us now discuss how a smart contract can actually send Ether. Solidity offers different ways to do this. First, there is the send method. This is a method of an address object in Solidity and can be used to transfer a certain amount of Ether from the contract address to an arbitrary address. So you could do something like

address payable receiver =  payable(address(0xFC2a2b9A68514E3315f0Bd2a29e900DC1a815a1D));        
// Be careful, do NOT do this!
receiver.send(100);

to send 100 Wei to the target address receiver (note that in recent versions of Solidity, an address to which you want to send Ether needs to be marked as payable). However, this code already contains a major issue – it does not check the return value of send!

In fact, send does return true if the transfer was successful and false if the transfer failed (for instance because the current balance is not sufficient, or because the target is a smart contract without a receive or fallback function, or if the target is a contract with a receive function, but this function runs out of gas). If, as in this example, you do not check the return code, a failed transfer will go unnoticed. As an illustration, let us consider a famous example where exactly this happened – the King of the Ether contract. The idea of this contract was that by paying a certain amount of Ether, you could claim a virtual throne and be appointed “King of the Ether”. If someone else now pays the amount that you have paid times a certain factor, this person becomes the new King, and you receive the amount that you invested minus a fee. In the source code of v0.4 of the contract, the broken section looks as follows (I have added a few comments not present in the original source code to make it easier to read the snippet without having the full context).

// we get to this point in the code if someone has paid enough to
// become the new king
// valuePaid is the Ether paid by the current king
// wizardCommission is a fee that remains in the account
// of the contract and can be claimed by the contract owner (wizard) 
uint compensation = valuePaid - wizardCommission;

// In its initial state, the current monarch is the wizard
// so we check for this
if (currentMonarch.etherAddress != wizardAddress) {
  // here we send the Ether minus the fees back 
  // to the current king
  currentMonarch.etherAddress.send(compensation);
} else {
  // When the throne is vacant, the fee accumulates for the wizard.
}

Note how send is used without checking the return code. What actually happened is that some people who held the throne did apparently use what is called a contract based wallet, i.e. a wallet that manages your Ether in a smart contract. Thus, the address of the current king (currentMonarch) was actually a smart contract. If a smart contract receives Ether, then, as we have seen above, it will execute a function. Now send only makes a very small amount of gas (2300 to be precise) available to the called contract (this is called the gas stipend, and we will dive into this and how a call actually works under the hood in a later post), which was not sufficient to run the code. So the called contract failed, but, as the return value was not checked, the calling contract continued, effectively stealing the compensation instead of paying it out.

The withdrawal pattern

It is interesting to discuss how this can be fixed. The obvious idea might be to check the return value and revert the transaction if it is false. Alternatively, one can use the second method that Solidity offers to transfer Ether – the transfer method, which will revert if the transfer fails. This, however, results in a new problem, as it allows for a denial-of-service attack.

To see this, suppose that a contract successfully claims the throne, and then someone else tries to become the new king, resulting in the execution of the code above. Suppose we use transfer instead of send. Now the contract which is the current king might be a malicious contract with a receive function that always reverts, or no receive function at all. Then, any attempt to become the new king will be reverted, and the contract is stuck forever.

This is a very general problem that you will face whenever a method of a smart contract calls another contract – you cannot rely on the other contract to cooperate, and it is dangerous to assume that the call will be successful. Therefore, the Solidity documentation recommends a pattern known as the withdrawal pattern. In our specific case, this would work as follows. Instead of immediately paying out the compensation, you would store the claim in the contract state and allow the previous king to call a withdraw method that does the transfer, maybe like this.

// this replaces currentKing.send(compensation)
claims[currentKing] += compensation;
// code goes on...


// a new function that allows the current king to collect the compensation
function withdraw() public {
  uint256 claim = claims[msg.sender];
  if (claim > 0) {
    claims[msg.sender] = 0;
    payable(msg.sender).transfer(claim);
  }
  else {
    revert("Unjustified claim");
  }
}

Why would this help? Suppose an attacker implements a contract that reverts if Ether is sent to it. If this contract is the current king and someone else claims the throne, enthroning the new king will work, because the transfer is contained in the separate function withdraw. If the attacker now invokes this function, it will still revert, but this will not impact the functionality of the contract for other users, so no denial of service (impacting anyone except the attacker) will result.

Reentrancy attacks and TheDAO

Let us suppose for a moment that in the code snippet above, we had chosen a slightly different order of the statements, and, in addition, had decided to use a low-level call to transfer the money, like this

(bool success, bytes memory data) = msg.sender.call{value: claim}("");
require(success, "Failed to send Ether");
claims[msg.sender] = 0;

Here, we use the call method of an address, which has the advantage over transfer that it does not only make the minimum of 2300 units of gas available to the called contract, but the full gas remaining at this point. This makes the contract less vulnerable to errors resulting from non-trivial receive functions, which is the reason why it is sometimes recommended to use this approach instead of transfer.

This would in fact make our contract again vulnerable, this time to a class of attacks known as re-entrancy attacks. To exploit this vulnerability, an attacker would have to prepare a malicious contract that enthrones itself and whose receive function calls the withdraw function again (but with a depth of at most one). If now someone else has claimed the throne and the malicious contract calls withdraw, the following things would happen.

  1. The malicious contract calls withdraw for the first time
  2. withdraw initiates the transfer of the current claim to the malicious contract
  3. the receive function of the malicious contract is invoked
  4. the receive function calls withdraw once more
  5. at this point in time, the variable claims[msg.sender] still has its original, non-zero value
  6. so the same transfer is made again
  7. both transfers succeed, and only then is the claim set to zero (twice)

As a result, the claim is transferred twice to the malicious contract (assuming, of course, that the King of the Ether contract has a sufficient balance). Of course, instead of invoking the function twice, you can let the receive function call back into the contract several times, thus multiplying the amount transferred by the number of calls, limited only by the stack size and the available gas. This sort of vulnerability was the root cause of the famous TheDAO hack, which eventually led to a fork of the Ethereum blockchain.

Note that in this case, using transfer instead of call would actually protect against this sort of attack, as the second call into the King of the Ether contract would require more gas than transfer makes available.

Create2 and the illusion of immutable contracts

Smart contracts are immutable – are they? Well, actually no – there are several ways to change the behaviour of a smart contract after it has been deployed. First, you could of course build a switch into your contract that only the owner can control. A bit more advanced, a contract can act as a proxy, delegating calls to another contract, and the contract owner could change the address of the target contract while keeping the address of the proxy the same.

An additional option has been created with EIP-1014. This proposal, which went live with the Constantinople hard fork in 2019, introduced a new opcode CREATE2 which allows for the creation of a contract with a predictable address. Recall that when a contract is created in the ordinary way, the contract address is derived from the address of the creating account and its current nonce. This makes it difficult to predict the address of the contract in advance, unless you use a dedicated account for contract creation whose nonce is kept stable. When using CREATE2 instead, the contract address is taken to be the hash value of a combination of the sender address, a salt and the init code of the contract to be created.
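
To make this concrete, here is a small sketch (in Python, which we use elsewhere in this series) of how a CREATE2 address can be computed offline, following the formula from EIP-1014: the address is the last 20 bytes of keccak256(0xff ++ sender ++ salt ++ keccak256(init_code)). The sender, salt and init code below are arbitrary examples, and the sketch relies on the eth_utils package that comes along with web3.

from eth_utils import keccak, to_bytes

def create2_address(sender, salt, init_code):
    # last 20 bytes of keccak256(0xff ++ sender ++ salt ++ keccak256(init_code))
    preimage = b"\xff" + to_bytes(hexstr=sender) + salt + keccak(init_code)
    return "0x" + keccak(preimage)[-20:].hex()

print(create2_address(
    "0x7D72Df7F4C7072235523A8FEdcE9FF6D236595F3",    # deploying account or contract (example)
    bytes(32),                                        # a 32 byte salt, here all zeros
    bytes.fromhex("600a600c600039600a6000f3")         # some dummy init bytecode
))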

The problem with this is, however, that the init code does not fully determine the runtime bytecode. Recall that the init code is bytecode that is executed at deployment time, and whose return value will be stored and used as the actual contract code executed at runtime (we will see this in action in the next post). The init code could, for instance, retrieve the actual runtime bytecode by calling into another contract. If the state of this contract is changed to return a different bytecode, the init code will still be the same. Thus, by using CREATE2 repeatedly with the same init code and salt, different versions of a contract could be stored at the same address.

To avoid this, the creators of EIP-1014 introduced a safeguard – if the target address already contains code or has a non-zero nonce, the invocation will fail. However, there is a loophole, which works as follows.

  1. Prepare an init bytecode that gets the actual runtime bytecode from a different contract, as outlined above
  2. Use CREATE2 to deploy this runtime bytecode to a specific address
  3. In the runtime bytecode, include a method that executes the SELFDESTRUCT opcode (protected by the condition that it only executes if the sender address is an address that you control). This is an opcode that will effectively wipe out the code of a contract and set the nonce of the contract address back to zero
  4. Motivate people to deposit something of value in your contract, maybe Ether or tokens
  5. At any point in time, you could now use this method to remove the existing contract. At this point, the nonce and code are both zero. You could now invoke CREATE2 once more to deploy a new contract to the same address with a different runtime bytecode, which maybe steals whatever assets have been deposited in the old contract

In this way, the functionality of a smart contract can be changed without anyone noticing it. Of course, this only works under specific conditions, the most important one being that the contract needs to contain the SELFDESTRUCT opcode. The only real protection is to have a look at the contract source code (or even at the runtime bytecode) before trusting it, and to be alerted if the contract contains a SELFDESTRUCT (or uses an instruction like DELEGATECALL to invoke code that contains a SELFDESTRUCT). It seems that Etherscan is now able to track contract recreation using CREATE2; here is an example from the Ropsten test network (note the “Reinit” flag being displayed on the contract tab), and here is an example from mainnet.
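
As a very rough illustration of what such a check could look like, you can fetch the runtime bytecode of a deployed contract with the web3 Python API that we use elsewhere in this series and scan it for the SELFDESTRUCT opcode (0xff). This is only a sketch – the byte 0xff can also appear inside push data, so a serious analysis would have to decode the opcodes properly – and the address below is just a placeholder.

import web3
w3 = web3.Web3(web3.HTTPProvider("http://127.0.0.1:8545"))
# placeholder address - replace by the contract that you want to inspect
code = w3.eth.get_code("0x7D72Df7F4C7072235523A8FEdcE9FF6D236595F3")
# naive scan: 0xff is the SELFDESTRUCT opcode, but can also occur inside push data
print("contains the SELFDESTRUCT byte:", b"\xff" in bytes(code))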

This concludes our post for today. There are many more security considerations and pitfalls that you should be aware of whenever you develop a smart contract that is going to be used on a real network with real money being involved. In the next section, I have listed a few references that you might want to consult to learn more about smart contract security. I hope you found this interesting and see you again in the next post, in which we will take a closer look at how Solidity translates your source code into EVM bytecode.

References

Here is a list of references that I found useful while collecting the material for this post.

  1. OpenZeppelin has a rather comprehensive list of post-mortems on its web site
  2. Consensys maintains a collection of best practices for smart contracts that explain common vulnerabilities and how to protect against them
  3. The Solidity documentation contains a section on security considerations
  4. This paper contains a classification of common vulnerabilities and discusses which of them can be avoided by using Vyper instead of Solidity as a smart contract language
  5. A similar list can be found in this conference paper
  6. The implications of the CREATE2 opcode have been discussed in detail here
  7. Finally, the documentation on ethereum.org contains a section on security considerations as well

Compiling and deploying a smart contract with geth and Python

In our last post, we have been cheating a bit – I have shown you how to use the web3 Python library to access an existing smart contract, but in order to compile and deploy, we have still been relying on Brownie. Time to learn how this can be done with web3 and the Python-Solidity compiler interface as well. Today, we will also use the Go-Ethereum client for the first time. This will be a short post and the last one about development tools before we then turn our attention to token standards.

Preparations

To follow this post, there are again a couple of preparatory steps. If you have read my previous posts, you might already have completed some of them, but I have decided to list them here once more, in case you are just joining us or are starting over with a fresh setup. First, you will have to install the web3 library (unless, of course, you have already done this before).

sudo apt-get install python3-pip python3-dev gcc
pip3 install web3

The next step is to install the Go-Ethereum (geth) client. As the client is written in Go, it comes as a single binary file, which you can simply extract from the distribution archive (which also contains the license) and copy to a location on your path. As we have already put the Brownie binary into .local/bin, I have decided to go with this as well.

cd /tmp
wget https://gethstore.blob.core.windows.net/builds/geth-linux-amd64-1.10.6-576681f2.tar.gz
gzip -d geth-linux-amd64-1.10.6-576681f2.tar.gz
tar -xvf  geth-linux-amd64-1.10.6-576681f2.tar
cp geth-linux-amd64-1.10.6-576681f2/geth ~/.local/bin/
chmod 700 ~/.local/bin/geth
export PATH=$PATH:$HOME/.local/bin

Once this has been done, it is time to start the client. We will talk more about the various options and switches in a later post, when we will actually use the client to connect to the Rinkeby testnet. For today, you can use the following command to start geth in development mode.

geth --dev --datadir=~/.ethereum --http

In this mode, geth will be listening on port 8545 of your local PC and bring up a local, single-node blockchain, quite similar to Ganache. New blocks will automatically be mined as needed, regardless of the gas price of your transactions, and one account will be created which is unlocked and is at the same time the beneficiary of newly mined blocks (so do not worry, you have plenty of Ether at your disposal).
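
If you want to quickly verify that the node is really up before proceeding, you can already point the web3 library at it. This is purely optional, as we will set up the connection properly further below, and it assumes the default HTTP endpoint on port 8545.

import web3
w3 = web3.Web3(web3.HTTPProvider("http://127.0.0.1:8545"))
print(w3.clientVersion)          # something like Geth/v1.10.6-stable-...
print(w3.eth.get_accounts())     # the single, pre-funded developer account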

Compiling the contract

Next, we need to compile the contract. This comes down to running the Solidity compiler, so we could go ahead, download the compiler and run it. To do this from Python, we could invoke the compiler as a subprocess and collect its output, thus effectively wrapping the compiler in a Python class. Fortunately, someone else has already done all of the hard work and created such a wrapper – the py-solc-x library (a fork of a previous library called py-solc). To install it and to instruct it to download a specific version of the compiler, run the following commands (this will install the compiler in ~/.solcx)

pip3 install py-solc-x
python3 -m solcx.install v0.8.6
~/.solcx/solc-v0.8.6 --version

If the last command spits out the correct version, the binary is installed and we are ready to use it. Let us try this out. Of course, we need a contract – we will use the Counter contract from the previous posts again. So go ahead, grab a copy of my repository and bring up an interactive Python session.

git clone https://github.com/christianb93/nft-bootcamp
cd nft-bootcamp
ipython3

How do we actually use solcx? The wrapper offers a few functions to invoke the Solidity compiler. We will use the so-called JSON input-output interface. With this approach, we need to feed a JSON structure into the compiler, which contains information like the code we want to compile and the output we want the compiler to produce, and the compiler will spit out a similar structure containing the results. The solcx package offers a function compile_standard which wraps this interface. So we need to prepare the input (consult the Solidity documentation to better understand what the individual fields mean), call the wrapper and collect the output.

import solcx
source = "contracts/Counter.sol"
file = "Counter.sol"
spec = {
        "language": "Solidity",
        "sources": {
            file: {
                "urls": [
                    source
                ]
            }
        },
        "settings": {
            "optimizer": {
               "enabled": True
            },
            "outputSelection": {
                "*": {
                    "*": [
                        "metadata", "evm.bytecode", "abi"
                    ]
                }
            }
        }
    };
out = solcx.compile_standard(spec, allow_paths=".");

The output is actually a rather complex data structure. It is a dictionary that contains the contracts created as a result of the compilation as well as a reference to the source code. The contracts are again structured by source file and contract name. For each contract, we have the ABI, a structure called evm that contains the bytecode as well as the corresponding opcodes, and some metadata like the details of the compiler version used. Let us grab the ABI and the bytecode that we will need.

abi = out['contracts']['Counter.sol']['Counter']['abi']
bytecode = out['contracts']['Counter.sol']['Counter']['evm']['bytecode']['object']
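
If you want to convince yourself of this structure, it can be instructive to poke around in the output interactively. The following lines (purely optional, run in the same ipython session) are one way to do that.

out['contracts']['Counter.sol'].keys()                        # contracts found in Counter.sol
out['contracts']['Counter.sol']['Counter'].keys()             # abi, evm and metadata
out['contracts']['Counter.sol']['Counter']['metadata'][:200]  # metadata is a JSON string
bytecode[:32]                                                 # first bytes of the deployment bytecode, as a hex string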

Deploying the contract

Let us now deploy the contract. First, we will have to import web3 and establish a connection to our geth instance. We have done this before for Ganache, but there is a subtlety explained here – the PoA implementation that geth uses has extended the length of the extra data field of a block. Fortunately, web3 ships with a middleware that we can use to perform a mapping between this block layout and the standard.

import web3
w3 = web3.Web3(web3.HTTPProvider("http://127.0.0.1:8545"))
from web3.middleware import geth_poa_middleware
w3.middleware_onion.inject(geth_poa_middleware, layer=0)

Once the middleware is installed, we first get an account that we will use – this is the first and only account managed by geth in our setup, and it is the coinbase account with plenty of Ether in it. Now, we want to create a transaction that deploys the smart contract. Theoretically, we know how to do this: we need a transaction that has the bytecode as data and leaves the to field empty. We could probably prepare this manually, but things are a bit more tricky if the contract has a constructor which takes arguments (we will need this later when implementing our NFT). Instead of going through the process of encoding the arguments manually, there is a trick. We first build a local copy of the contract which is not yet deployed (and therefore has no address, so that calls to it will fail – try it), then call its constructor() method to obtain a ContractConstructor (this is where the arguments would go), and then invoke its method buildTransaction to get a transaction that we can use. We can then send this transaction (if, as in our case, the account we want to use is managed by the node) or sign and send it as demonstrated in the last post.

me = w3.eth.get_accounts()[0];
temp = w3.eth.contract(bytecode=bytecode, abi=abi)
txn = temp.constructor().buildTransaction({"from": me}); 
txn_hash = w3.eth.send_transaction(txn)
txn_receipt = w3.eth.wait_for_transaction_receipt(txn_hash)
address = txn_receipt['contractAddress']

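As an aside, if the contract had a constructor with arguments (as our NFT later will), the arguments would simply be passed to the constructor() call, and web3 would take care of the ABI encoding. A hypothetical example (our Counter has no constructor arguments, so this is only meant to illustrate the pattern):

# hypothetical: a contract whose constructor takes a string and a uint256
txn = temp.constructor("some name", 100).buildTransaction({"from": me})
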
Now we can interact with our contract. As the temp contract is of course not the deployed contract, we first need to get a reference to the actual contract as demonstrated in the previous post – which we can do, as we have the ABI and the address in our hands – and can then invoke its methods as usual. Here is an example.

counter = w3.eth.contract(address=address, abi=abi)
counter.functions.read().call()
txn_hash = counter.functions.increment().transact({"from": me});
w3.eth.wait_for_transaction_receipt(txn_hash)
counter.functions.read().call()

This completes our post for today. Looking back at what we have achieved in the last few posts, we are now proud owners of an entire arsenal of tools and methods to compile and deploy smart contracts and to interact with them. Time to turn our attention away from the simple counter that we have used so far to demonstrate this and towards more complex contracts. With the next post, we will actually get into one of the most exciting use cases of smart contracts – tokens. Hope to see you soon.

Using web3.py to interact with an Ethereum smart contract

In the previous post, we have seen how we can compile and deploy a smart contract using Brownie. Today, we will learn how to interact with our smart contract using Python and the Web3 framework which will also be essential for developing a frontend for our dApp.

Getting started with web3.py

In this section, we will learn how to install web3 and how to use it to talk to an existing smart contract. For that purpose, we will once more use Brownie to run a test client and to deploy an instance of our Counter contract to it. So please go ahead and repeat the steps from the previous post to make sure that an instance of Ganache is running (so do not close the Brownie console) and that there is a copy of the Counter smart contract deployed to it. Also write down the contract address which we will need later.

Of course, the first thing will again be to install the Python package web3, which is as simple as running pip3 install web3. Make sure, however, that you have GCC and the Python development package (python3-dev on Ubuntu) on your machine, otherwise the install will fail. Once this completes, type ipython3 to start an interactive Python session.

Before we can do anything with web3, we of course need to import the library. We can then make a connection to our Ganache server and verify that the connection is established by asking the server for its version string.

import web3
w3 = web3.Web3(web3.HTTPProvider('http://127.0.0.1:8545'))
w3.clientVersion

This is a bit confusing, with the word web3 occurring at no less than three points in one line of code, so let us dig a bit deeper. First, there is the module web3 that we have imported. Within that module, there is a class HTTPProvider. We create an instance of this class that connects to our Ganache server running on port 8545 of localhost. With this instance, we then call the constructor of another class, called Web3, which is again defined inside of the web3 module. This class is dynamically enriched at runtime, so that all namespaces of the API can be accessed via the resulting object w3. You can verify this by running dir(w3) – you should see attributes like net, eth or ens that represent the various namespaces of the JSON RPC API.
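
If you want to double-check that the connection is really established before moving on, the w3 object also offers a convenience method for this (the camelCase name reflects the web3 version used throughout this series).

w3.isConnected()    # should return True if the provider can reach Ganache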

Next, let us look at accounts. We know from our previous post that Ganache has ten test accounts under its control. Let us grab one of them and check its balance. We can do this by using the w3 object that we have just created to invoke methods of the eth API, which then translate more or less directly into the corresponding RPC calls.

me = w3.eth.get_accounts()[0]
w3.eth.get_balance(me)

What about transactions? To see how transactions work, let us send 10 Ether to another address. As we plan to re-use this address later, it is a good idea to use an address with a known private key. In the last post, we have seen how Brownie can be used to create an account. There are other tools that do the same thing like clef that comes with geth. For the purpose of this post, I have created the following account.

Address:  0x7D72Df7F4C7072235523A8FEdcE9FF6D236595F3
Key:      0x5777ee3ba27ad814f984a36542d9862f652084e7ce366e2738ceaa0fb0fff350

Let us transfer Ether to this address. To create and send a transaction with web3, you first build a dictionary that contains the basic attributes of the transaction. You then invoke the API method send_transaction. As the key of the sender is controlled by the node, the node will then automatically sign the transaction. The return value is the hash of the transaction that has been generated. Having the hash, you can now wait for the transaction receipt, which is issued once the transaction has been included in a block and mined. In our test setup, this will happen immediately, but in reality, it could take some time. Finally, you can check the balance of the involved accounts to see that this worked.

alice = "0x7D72Df7F4C7072235523A8FEdcE9FF6D236595F3"
value = w3.toWei(10, "ether")
txn = {
  "from": me,
  "to": alice,
  "value": value,
  "gas": 21000,
  "gasPrice": 0
}
txn_hash = w3.eth.send_transaction(txn)
w3.eth.wait_for_transaction_receipt(txn_hash)
w3.eth.get_balance(me)
w3.eth.get_balance(alice)

Again, a few remarks are in order. First, we do not specify the nonce; this will be added automatically by the library. Second, this transaction, using a gas price, is a “pre-EIP-1559” or “pre-London” transaction. With the London hard fork, you would instead specify a maximum fee per gas and a priority fee per gas. As I started to work on this series before London became effective, I will stick to legacy transactions throughout this series. Of course, in a real network, you would also not use a gas price of zero.

A second important point to be aware of is timing. When we call send_transaction, we hand the transaction over to the node, which signs it and publishes it on the network. At some point, the transaction is included in a block by a miner, and only then does a transaction receipt become available. This is why we call wait_for_transaction_receipt, which actively polls the node (at least when we are using an HTTP connection) until the receipt is available. There is also a method get_transaction_receipt that will return a transaction receipt directly, without waiting for it, and it is a common mistake to call this too early.
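
As a small extra safeguard, wait_for_transaction_receipt also accepts a timeout in seconds and raises web3.exceptions.TimeExhausted if the receipt does not show up in time. A short sketch:

# wait at most two minutes for the receipt, then give up
receipt = w3.eth.wait_for_transaction_receipt(txn_hash, timeout=120)
print(receipt["blockNumber"], receipt["gasUsed"])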

Also, note the conversion of the value. Within a transaction, values are always specified in Wei, and the library contains a few helper functions to easily convert from Wei into other units and back. Finally, note that the gas limit that we use is the standard gas usage of a simple transaction. If the target account is a smart contract and additional code is executed, this will not be sufficient.
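
The conversion helpers are worth playing with for a moment – a few examples (using the camelCase names of the web3 version used in this series):

w3.toWei(10, "ether")                 # 10000000000000000000
w3.toWei(1, "gwei")                   # 1000000000
w3.fromWei(21000 * 10**9, "ether")    # fee of a simple transfer at a gas price of 1 gwei, as a Decimal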

Now let us try to get some Ether back from Alice. As the account is not managed by the node, we will now have to sign the transaction ourselves. The flow is very similar. We first build the transaction dictionary. We then use the helper class Account to sign the transaction. This will return a tuple consisting of the hash that was signed, the raw transaction itself, and the r, s and v values from the ECDSA signature algorithm. We can then pass the raw transaction to the eth.send_raw_transaction call.

nonce = w3.eth.get_transaction_count(alice)
refund = {
  "from": alice,
  "to": me,
  "value": value, 
  "gas": 21000,
  "gasPrice": 0,
  "nonce": nonce
}
key = "0x5777ee3ba27ad814f984a36542d9862f652084e7ce366e2738ceaa0fb0fff350"
signed_txn = w3.eth.account.sign_transaction(refund, key)
txn_hash = w3.eth.send_raw_transaction(signed_txn.rawTransaction)
w3.eth.wait_for_transaction_receipt(txn_hash)
w3.eth.get_balance(me)
w3.eth.get_balance(alice)

Note that this time, we need to include the nonce (as it is part of the data which is signed). We use the current nonce of the address of Alice, of course.

Interacting with a smart contract

So far, we have covered the basic functionality of the library – creating, signing and submitting transactions. Let us now turn to smart contracts. As stated above, I assume that you have fired up Brownie and deployed a version of our smart contract. The contract address that Brownie gave me is 0x3194cBDC3dbcd3E11a07892e7bA5c3394048Cc87, which should be identical to your result, as it only depends on the account and its nonce, so it will match as long as the deployment is the first transaction you have made after restarting Ganache.
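
If you are curious why the address only depends on the account and the nonce: for an ordinary deployment, the contract address is the last 20 bytes of keccak256(rlp([sender, nonce])). Here is a small sketch that you can use to verify this. It relies on the rlp and eth_utils packages that come along with web3, the address below is a placeholder for your own Ganache account, and the result is printed without checksum casing.

import rlp
from eth_utils import keccak, to_bytes

def contract_address(sender, nonce):
    # keccak256(rlp([sender, nonce])), of which the last 20 bytes form the address
    return "0x" + keccak(rlp.encode([to_bytes(hexstr=sender), nonce]))[-20:].hex()

# placeholder: use the account that did the deployment and the nonce it had at that time
print(contract_address("0x7D72Df7F4C7072235523A8FEdcE9FF6D236595F3", 0))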

To access a contract from web3, the library needs to know how the arguments and return values need to be encoded and decoded. For that purpose, you will have to specify the contract ABI. The ABI – in a JSON format – is generated by the compiler. When we deploy using Brownie, we can access it using the abi attribute of the resulting object. Here is the ABI in our case.

abi = [
    {
        'anonymous': False,
        'inputs': [
            {
                'indexed': True,
                'internalType': "address",
                'name': "sender",
                'type': "address"
            },
            {
                'indexed': False,
                'internalType': "uint256",
                'name': "oldValue",
                'type': "uint256"
            },
            {
                'indexed': False,
                'internalType': "uint256",
                'name': "newValue",
                'type': "uint256"
            }
        ],
        'name': "Increment",
        'type': "event"
    },
    {
        'inputs': [],
        'name': "increment",
        'outputs': [],
        'stateMutability': "nonpayable",
        'type': "function"
    },
    {
        'inputs': [],
        'name': "read",
        'outputs': [
            {
                'internalType': "uint256",
                'name': "",
                'type': "uint256"
            }
        ],
        'stateMutability': "view",
        'type': "function"
    }
]

This looks a bit intimidating, but is actually not so hard to read. The ABI is a list, and each entry describes either an event or a function. For both events and functions, the inputs are specified, i.e. the parameters, and similarly the outputs are described. Every parameter has types (Solidity distinguishes between the internal type and the type used for encoding) and a name. For events, parameters can be indexed. In addition, there are some specifiers for functions, like the information whether it is a view or not.
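
To get a feeling for the structure, you can also let Python summarize the ABI for you. Assuming that you have already pasted the abi definition above into your session, a short optional snippet would be:

for entry in abi:
    if entry["type"] == "function":
        print("function", entry["name"], "- state mutability:", entry["stateMutability"])
    elif entry["type"] == "event":
        print("event", entry["name"], "with", len(entry["inputs"]), "parameters")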

Let us start to work with the ABI. Run the command above to import the ABI into a variable abi in your ipython session. Having this, we can now instantiate an object that represents the contract within web3. To talk to a contract, the library needs to know the contract address and its ABI, and these are the parameters that we need to specify.

address = "0x3194cBDC3dbcd3E11a07892e7bA5c3394048Cc87"
counter = w3.eth.contract(address=address, abi=abi)

It is instructive to use dir and help to better understand the object that this call returns. It has an attribute called functions that is a container class for the functions of the contract. Each contract function shows up as a method of this object. Calling this method, however, does not invoke the contract yet, but instead returns an object of type ContractFunction. Once we have this object, we can either use it to make a call or a transaction (this two-step approach reminds me a bit of a prepared statement when using embedded SQL).

Let us see how this works – we will first read out the counter value, then increment by one and then read the value again.

counter.functions.read().call()
txn_hash = counter.functions.increment().transact({"from": me})
w3.eth.wait_for_transaction_receipt(txn_hash)
counter.functions.read().call()

Note how we pass the sender of the transaction to the transact method – we could as well include other parameters like the gas price, the gas limit or the nonce at this point. You cannot, however, pass the data field, as the data will be set during the encoding.

Another important point is how parameters to the contract method need to be handled. Suppose we had a method add(uint256) which would allow us to increase the counter not by one, but by some provided value. To increase the counter by x, we would then have to run

counter.functions.add(x).transact({"from": me})

Thus, the parameters of the contract method need to be part of the call that creates the ContractFunction, and must not be included in the transaction.
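
To see what web3 actually does with these parameters, you can also build a transaction without sending it and inspect the data field, which contains the four-byte function selector followed by the ABI-encoded arguments. A small sketch, using the increment function of our counter:

# build (but do not send) the transaction and look at the encoded call data
txn = counter.functions.increment().buildTransaction({"from": me, "gasPrice": 0})
print(txn["data"])    # the four-byte selector of increment(), with no arguments to encode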

So far we have seen how we can connect to an RPC server, submit transactions, get access to already deployed smart contracts and invoke their functions. The web3 API has a bit more to offer, and I urge you to read the documentation and, in ipython, play around with the built-in help function to browse through the various objects that make up the library. In the next post, we will learn how to use web3 to not only talk to an existing smart contract, but also to compile and deploy a contract.