Understanding the Ethereum virtual machine – part III

Having found our way through the mechanics of the Ethereum virtual machine in the last post, we are now in a position to better understand what exactly goes on when a smart contract hands over control to another smart contract or transfers Ether.

Calls and static calls

Towards the end of the previous post in this series, we have already taken a glimpse at how an ordinary CALL opcode is processed. As a reminder, here is the diagram that displays how the EVM and the EVM interpreter interact to run a smart contract.

In our case – the processing of a CALL – this specifically implies that the following steps will be carried out (we ignore gas processing for the time being, as this is a bit more complicated and will be discussed in depth in a separate section).

  • the interpreter hits upon the CALL opcode
  • it performs a lookup in the jump table and determines that the function opCall needs to be run
  • it gets the parameters from the stack, in particular the address of the contract to be called (the second stack item)
  • it then extracts the input data from the memory of the currently executing code
  • we then invoke the Call method of the EVM, using the contract address and the input data as arguments
  • as we have learned, this will result in a new execution context (i.e. a new Contract object, a new stack and a freshly initialized memory) in which the code of the target contract will be executed
  • at the end, we get the returned data (an array of bytes) back
  • if everything went fine, we copy the returned data back into the memory of the currently executing contract

It is important to understand that this comes with a full context switch – the target contract will execute with its own stack and memory, state changes made by the target contract will refer to the state of the target contract, and Ether transferred with the call is credited to the target contract.

Also note that there are actually two ways in which the result of the call is made available to the caller. First, the result of the call (a pointer to a byte array) will be copied to the memory of the calling contract. In addition, the return value is also returned by opCall, and from there it is copied once more, this time to a special buffer called the return data buffer. The caller can copy the data stored in this buffer and determine its length using the RETURNDATACOPY and RETURNDATASIZE opcodes introduced with EIP-211 (in order to make it easier to pass back return data whose length is not yet known when the call is made).
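
To tie the steps listed above together, here is a toy version of opCall. All types are heavily simplified stand-ins – the real function in core/vm/instructions.go works with 256-bit stack items, a dedicated Memory object and the gas forwarding rules that we will discuss further below – so treat this as an illustration of the flow rather than the actual implementation.

package toycall

// stack and memory are simplified stand-ins; the real EVM stack holds 256-bit
// words and memory is a dedicated object with its own expansion rules.
type stack struct{ items []uint64 }

func (s *stack) pop() uint64 {
    v := s.items[len(s.items)-1]
    s.items = s.items[:len(s.items)-1]
    return v
}

func (s *stack) push(v uint64) { s.items = append(s.items, v) }

type memory []byte

// callFn stands in for the Call method of the EVM: execute the code at addr in
// a fresh context and return the output and the unused gas.
type callFn func(addr, gas, value uint64, input []byte) (ret []byte, leftOverGas uint64, err error)

func opCall(call callFn, s *stack, mem memory) []byte {
    gas := s.pop()                         // requested gas (subject to the cap discussed below)
    addr := s.pop()                        // address of the contract to call
    value := s.pop()                       // Wei to transfer
    inOffset, inSize := s.pop(), s.pop()   // input data taken from our memory
    retOffset, retSize := s.pop(), s.pop() // where to place the returned data

    input := mem[inOffset : inOffset+inSize]
    ret, _, err := call(addr, gas, value, input)
    if err == nil {
        n := retSize
        if uint64(len(ret)) < n {
            n = uint64(len(ret))
        }
        copy(mem[retOffset:retOffset+retSize], ret[:n]) // copy the result into our memory
        s.push(1)                                       // success flag for the caller
    } else {
        s.push(0) // failure - note that errors are not propagated automatically
    }
    // the full return data also ends up in the return data buffer (RETURNDATASIZE / RETURNDATACOPY)
    return ret
}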

In summary, the called contract is executed essentially as if it were the initial contract execution of the underlying transaction. Calls can of course be nested, so we now see that a transaction should be considered as the top-level call, which can give rise to a number of nested calls (this number is limited, for instance by the maximum depth of the call stack).

Of course, executing an unknown contract can be a significant security risk. We have seen an example in our post on smart contract security, where a malicious contract calling back into your own contract can cause a double spend. Therefore, it is natural to somehow try to restrict what a called contract can do. One of the first restrictions of this type was the introduction of the STATICCALL opcode with EIP-214. A static call is very much like an ordinary call, except that the called contract is not allowed to make any state changes; in particular, no value transfer is possible as part of a static call.

The function opStaticCall realizing this is actually very similar to the processing of an ordinary call. There are two essential differences. First, there is no value and therefore one parameter less that needs to be taken from the stack. Second, the method of the EVM that is eventually invoked is not Call but StaticCall. The structure of this function is very similar to that of an ordinary call, so let us focus on the differences. Here is a short snippet (leaving out some parts to focus on the differences) of the Call method.

evm.Context.Transfer(evm.StateDB, caller.Address(), addr, value)
code := evm.StateDB.GetCode(addr)
contract := NewContract(caller, AccountRef(addrCopy), value, gas)
contract.SetCallCode(&addrCopy, evm.StateDB.GetCodeHash(addrCopy), code)
ret, err = evm.interpreter.Run(contract, input, false)

And here is the corresponding code for a static call (again, I have made a few changes to better highlight the differences).

addrCopy := addr
code := evm.StateDB.GetCode(addr)
contract := NewContract(caller, AccountRef(addrCopy), new(big.Int), gas)
contract.SetCallCode(&addrCopy, evm.StateDB.GetCodeHash(addrCopy), code)
ret, err = evm.interpreter.Run(contract, input, true)

So we see that there are three essential differences. First, in a static call, there is no value transfer – this is as expected, as a static call is not allowed to transfer value, which would represent a change to the state. Second, when we build the contract, the third parameter is zero – again, this is related to the fact that there is no value transfer, as this parameter determines the value that, for instance, the opcode CALLVALUE returns. Finally, we set the third parameter of the Run function to true. In our discussion of the Run method in the previous post, we have already seen that this disallows all instructions which are marked as state changing.

Delegation and the proxy pattern

Apart from calls and static calls, there is a third way to invoke another contract, namely a delegate call. Roughly speaking, a delegate call implies that instead of executing the code of the called contract within the context of the called contract, we execute the code within the context of the caller. Thus, we essentially run the code of the called contract as if it were a part of the caller code, much like you would use a library (note, though, that this is not always how libraries are realized in Solidity – internal library functions are simply compiled into the calling contract at build time, while deployed libraries with external functions are in fact invoked via a delegate call).

In the EVM, a delegate call is done using the opcode DELEGATECALL (well, that did probably not come as a real surprise). Similar to a static call, there is no value transfer for this call and correspondingly no value parameter on the stack. Going through the same analysis as for a static call, we find that execution of the opcode delegates to the method DelegateCall() of the EVM. Let us again look at the parts of the code that differ from an ordinary call.

addrCopy := addr
code := evm.StateDB.GetCode(addr)
contract := NewContract(caller, AccountRef(caller.Address()), nil, gas).AsDelegate()
contract.SetCallCode(&addrCopy, evm.StateDB.GetCodeHash(addrCopy), code) 
ret, err = evm.interpreter.Run(contract, input, false)

Looking at this, we spot three differences compared to an ordinary call. First, the second parameter used for the creation of the new contract – the parameter that determines the self field of the new contract and with that the address used to read and change state during contract execution – is not set to the target contract, but to the address of the caller, i.e. the currently executing contract, while the address used to determine the code to be run is still that of the target contract. Thus, as promised, we execute the code of the target contract within the context of the currently executing contract.

A second difference is the third argument used for contract creation, which would normally be the value transferred with this call – here, it is simply nil. Finally, after creating the contract, we execute its AsDelegate() method. This changes the attributes CallerAddress and value of the contract to those of the currently executing contract. Thus, whenever we execute the opcodes CALLVALUE or CALLER, we get the same values as in the context of the currently executing contract, as promised by EIP-7, the EIP which introduced delegate calls.
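
To make this a bit more tangible, here is a sketch of what AsDelegate does – this is close to the actual geth method, but the surrounding Contract struct is heavily stripped down (in the real code, the value is a *big.Int and the caller is stored as a ContractRef).

package delegate

type Address [20]byte

type Contract struct {
    CallerAddress Address
    value         uint64
    caller        interface{} // the parent execution, itself a *Contract in this case
    // ... code, gas, self address ...
}

// AsDelegate copies the caller address and the value from the parent execution,
// so that CALLER and CALLVALUE report the same values as in the calling contract.
func (c *Contract) AsDelegate() *Contract {
    parent := c.caller.(*Contract)
    c.CallerAddress = parent.CallerAddress
    c.value = parent.value
    return c
}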

One of the motivations behind introducing this possibility was that it allows for a pattern known as proxy pattern. In this pattern, there are two contracts involved. First, there is the proxy contract. The proxy contract accepts a call or transaction and is responsible for holding the state. It does, however, not contain any non-trivial logic. Instead, it uses a delegate call to invoke the logic residing in a second contract, the logic contract.

Why would you want to do this? There are, in fact, a couple of interesting use cases for this pattern. First, it allows you to build an upgradeable contract. Recall that – at least until the CREATE2 opcode was introduced – it was not possible to change a smart contract after it has been deployed. Even though this is of course by intention and increases trust in a smart contract (it will be the same, no matter when you interact with it), it also implies a couple of challenges, most notably that it makes it impossible to add features to a smart contract over time or to fix a security issue. The proxy pattern, however, does allow you to do this. You could, for instance, store the address of the logic contract in the proxy contract instead of hard-coding it, and then add a method to the proxy that allows you to change that address. You can then deploy a new version of the logic to a new address and then update the address stored in the proxy contract to migrate from the old version to the new version. As the state is part of the proxy contract which stays at its current location, the state will be untouched, and as the address that the users interact with does not change, the users might not even notice the change. Needless to say, this is very useful in some cases, but it can also be abused by tricking a user into trusting a contract and then changing its functionality, so be careful when interacting with a smart contract that performs delegation.

A second use case is related to re-use. As an example, suppose you have developed a smart contract that implements some useful wallet-like functionality, maybe time-triggered transfers. You want to make this available to others. Now you could of course allow anybody to deploy your smart contract, but this would lead to many addresses on the blockchain containing exactly the same code. Alternatively, you could store the logic in one logic contract and then only distribute the code for the proxy. A new user would then simply deploy a proxy, so each proxy would act as a wallet with an individual state and balance, but all of them would run the same logic. Again, it goes without saying that this implies that your users trust you and your contract – if, for instance, your logic contract is able to remove itself (“self-destruct” using the corresponding opcode), then this would of course render all deployed proxies useless and the balance stored in them would be lost forever.

Finally (and this apparently was one of the motivations behind EIP-7) you could have a large contract whose deployment consumes more gas than the gas limit of a block allows. You could then split the logic into several smaller logic contracts and use a proxy to tie them together into a common interface.

There are several ongoing attempts to standardize this pattern and in particular upgradeable contracts. EIP-897, for instance, proposes a standard to expose the address to which a proxy is pointing. EIP-1967 addresses an interesting problem that the pattern has – the logic contract and the proxy contract share a common state, and thus the proxy contract needs to find a way to store the address of the logic contract without conflicting with the storage layout of the logic contract. Finally, EIP-1822 proposes a standard for upgradeable contracts. It is instructive to read through these EIPs and I highly advise you to do so and also have a look at the implementations described or linked in them.

Gas handling during a call

Let us now turn to gas handling during a call. We have already seen that, as for every instruction, there is a constant gas cost and a dynamic gas cost. In addition, there are two special contributions which are not present for other instructions – a refund and a stipend.

The constant gas cost is simple – this is a fixed value of (currently) 700 units of gas, increased from previously 40 with EIP-150. The dynamic gas cost is already a bit more complicated and itself consists of four components. The first three components are rather straightforward

  • first, there is a fee of 9000 units of gas when a non-zero value is transferred as part of the call
  • second, there is an account creation fee of 25000 whenever a non-zero value is transferred to a non-existing account as part of the call
  • third, there is the usual gas fee for memory expansion, as for many other instructions

The fourth contribution to the dynamic gas cost is a bit more tricky. The problem we are facing at this point is that the contract which is called will of course consume gas as well, but at this point in time, we do not know how much this is going to be. To solve this, a quantity called the gas cap is used. Initially, this gas cap was simply the first stack item, i.e. the first argument to the CALL instruction, which specifies the gas limit for the contract to be executed, i.e. the part of our remaining gas that we want to pass on to the called contract. We could now simply use this number as additional gas cost and then, once the called contract returns, see how much of that is still unused and refund that amount.

This is indeed how the gas payment for a call worked before EIP-150 was introduced. This EIP was drafted to address denial-of-service attacks that utilized the fact that the costs for some instructions, among them making a call, were no longer reflecting the actual computing cost on the client. As a counter-measure, the cost for a call was increased from previously 40 to the still valid 700. This, however, caused problems with existing contracts that tried to calculate the amount of gas they would make available to a called contract by taking the currently remaining gas (inquired via the GAS opcode) and subtracting the constant fee of 40 units of gas. To avoid this, the developers thought about coming up with a mechanism which allowed a contract to make “almost all” remaining gas available to the callee, without having to hard-code gas fees. More precisely, “almost all” means that the following algorithm is applied to calculate the gas cap.

  • determine the gas which is currently still available, after having deducted the constant gas cost already
  • determine the base fee, i.e. the dynamic gas cost for the call calculated so far (memory fee, transfer fee and creation fee)
  • subtract this from the remaining gas to determine the gas which will still be available after paying for all other gas cost contributions (“projected available gas”)
  • read out the first value from the stack (the first parameter of the CALL instruction), i.e. the requested gas limit
  • determine the gas cap as 63 / 64 times the projected available gas
  • if the requested gas limit is higher than the gas cap, return the gas cap, otherwise return the requested gas limit

Thus a contract can effectively pass almost all of the remaining gas to the callee by providing a very large requested gas limit as first argument to the CALL instruction, so that the requested gas limit is definitely larger than the calculated cap and the cap will be used. The factor of 63 / 64 has been put in as an additional protection against deeply nested recursive calls. The outcome of this algorithm is then used for two purposes – as an upfront payment to cover the maximum amount of gas that the callee might need, and as the gas supply that the callee actually obtains for its execution.
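
In code, the rule looks roughly like this – the names are mine, and the actual geth implementation (callGas in core/vm/gas.go) operates on 256-bit stack values, but the arithmetic is the same.

package gascap

// callGasCap returns the gas that will be charged for (and forwarded to) the callee.
// availableGas is the gas left after the constant cost of the CALL has been deducted,
// baseCost is the dynamic gas cost computed so far (memory expansion, transfer fee,
// account creation fee), and requested is the gas limit taken from the stack.
func callGasCap(availableGas, baseCost, requested uint64) uint64 {
    projected := availableGas - baseCost // gas still available after paying all other costs
    gasCap := projected - projected/64   // 63 / 64 of the projected available gas
    if requested > gasCap {
        return gasCap
    }
    return requested
}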

Now, I have been cheating a bit, as there are two components in the diagram above that we have not yet discussed. First, I have just told you that the outcome of the EIP-150 algorithm is passed as available gas to the callee. This, however, is only true if the call does not transfer any Ether. If it does, there is an additional stipend of 2300 units of gas which is added to the gas made available to the callee before actually making the call. Note that this stipend is not part of the dynamic gas cost, so it effectively has two implications – the caller does not pay for these 2300 units of gas, and, at the same time, it makes sure that even if the caller specified zero as gas limit for the call, the callee has at least 2300 units of gas available. The motivation for this is that a call with a non-zero value typically triggers the receive function or fallback function of the called contract, and a call with a gas supply of zero would let this function fail. Thus the gas stipend serves as a safeguard to reduce the risk of a value transfer failing because the recipient is a smart contract and its receive or fallback function runs out of gas.
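
Expressed as code, the stipend logic is tiny – the constant 2300 is CallStipend in the geth params package, the function name is again mine.

package stipend

const callStipend = 2300

// gasForCallee is what the callee actually receives: the capped gas that the caller
// pays for, plus the free stipend if Ether is transferred with the call.
func gasForCallee(eip150Cap uint64, transfersValue bool) uint64 {
    if transfersValue {
        return eip150Cap + callStipend
    }
    return eip150Cap
}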

Finally, there is the refund, which happens here and simply amounts to adding the gas that the callee has not consumed to the available gas of the current execution context again.

The gas stipend and transfers in Solidity

The gas stipend is one of the less documented features of smart contracts, and a part of the confusion that I have seen around this topic (which, in fact, was the main motivation for the slightly more elaborate treatment in this post) comes from the fact that a gas stipend exists in the EVM as well as in Solidity.

As explained above, the EVM adds the gas stipend depending on the value transferred with the call – in fact, the stipend only applies to calls with a non-zero value. In addition to this, Solidity applies the same logic, but only if the value is zero. To see this, you might want to use a simple contract like this one.

contract Transfer {

    uint256 value;

    function doIt() public {
        payable(msg.sender).transfer(value);
    }
}

If you compile this, for instance in Remix, and take a look at the generated bytecode, you will see that eventually, the transfer translates into a CALL instruction. The preparation of the stack preceding this instruction is a bit involved, but if you go through this carefully and wait until the dust has settled, you will find that the top of the stack looks as follows.

(value == 0) * 2300 | sender | value |

Thus the first value, which specifies the gas to be made available for the subcall, is 2300 (the gas stipend) if the value is zero, and zero otherwise. In the first case, the EVM will not add anything, in the second case, the EVM will add its own gas stipend. Thus, regardless of the value, the net effect will be that the gas stipend of 2300 units of gas always applies for a transfer. You might also want to look at this snippet in the Solidity source code that creates the corresponding code (at least if I interpret the code correctly).

What this analysis tells us as well is that there is no way to instruct the compiler to increase the gas limit of the transfer. As the 2300 units of gas will only be sufficient for very simple functions, you need a different approach when invoking contracts with a more complex receive function. When we discuss NFTs in a later post in this series, we will see how you can use interfaces in Solidity to easily call functions of a target contract. Alternatively, to simply invoke the fallback function or the receive function with a higher gas limit, you can use a low-level call. To see this in action, change the transfer in the above sample code to

(bool success, ) = 
     payable(msg.sender).call{value: value}("");

When you now compile again, take a look at the resulting bytecode and locate the CALL instruction, you will see that immediately before we do the CALL, we execute the GAS opcode. As we know, this pushes the remaining available gas onto the stack. Thus the first argument to the CALL is the remaining gas. As, by the EIP-150 algorithm above, this is in every case more than the calculated cap, the result is that the cap will be used, i.e. almost all remaining gas will be made available to the called contract. Be sure, however, to check the return value and handle any errors that might have occurred in the called contract, as Solidity does not add extra code to make sure that we revert upon errors. Note that there is an ongoing discussion to extend the functionality of transfer in Solidity to allow a transfer to explicitly pass on all the remaining gas, see this thread.

With this, we have reached the end of our post for today. In this and the previous two posts, we have taken a deep-dive into how the Ethereum virtual machine actually works, guided by the yellow paper and the code of the Go-Ethereum client. In the next post, we will move on and start to explore one of the currently “hottest” applications of smart contracts – non-fungible tokens. Hope to see you soon!

Understanding the Ethereum virtual machine – part II

In today's post, we will complete our understanding of how the EVM executes a smart contract. We will investigate the actual interpreter loop, discuss gas handling and have a short look at pre-compiled contracts.

The jump table and the main loop

In the last post, we have seen that the entry point to the actual code execution is the Run method of the EVM interpreter. This method is essentially going through the bytecode step by step and, for each opcode, looking up the opcode in a data structure known as the jump table. Among other things, this table contains a reference to a Go function that is to be executed to process the instruction. More specifically, an entry in the jump table contains the following fields, which partially refer to other tables in other source code files (a sketch of such an entry follows the list).

  • First, there is a Go function which is invoked to process the operation
  • Next, there is a gas value which is known as the constant gas cost of the operation. The idea behind this is that the gas cost for the execution of an instruction typically has two parts – a static part which is independent of the parameters and a dynamic part which depends, for instance, on the memory consumption. This field represents the static part
  • The third field is again a function that can be used to derive the dynamic part of the gas cost
  • The fourth field – minStack – is the number of stack items that this operation expects
  • The next field – maxStack – is the maximum size of the stack that will still allow this operation to work without overflowing the stack. For most operations, this is simply the maximum stack size minus the number of items that the operation pops from the stack plus the number of items that it adds to the stack
  • The next field, memorySize, specifies how much memory the opcode needs to execute. Again, this is a function, as the result could depend on parameters
  • The remaining fields are a couple of flags that describe the type of operation. The flag halts is set if the operation ends the execution of the code. At the time of writing, this is set for the opcodes STOP, RETURN and SELFDESTRUCT.
  • Similarly, the reverts flag indicates whether this opcode explicitly reverts the execution and is currently only set for the REVERT opcode itself
  • The returns flag indicates whether this opcode returns any data. This is the case for the call operations STATICCALL, DELEGATECALL, CALL, and CALLCODE, but also for REVERT and contract creation via CREATE and CREATE2
  • The writes flag indicates whether the operation modifies the state and is set for operations like SSTORE
  • Finally, the jumps flag indicates whether the operation is a jump instruction and therefore modifies the program counter
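
Putting these fields together, a jump table entry looks roughly like the following struct. The field names follow geth's core/vm/jump_table.go, but the function types are simplified stand-ins.

package jumptable

type (
    executionFunc  func( /* pc, interpreter, call context */ ) ([]byte, error)
    gasFunc        func( /* evm, contract, stack, memory, requested size */ ) (uint64, error)
    memorySizeFunc func( /* stack */ ) (size uint64, overflow bool)
)

type operation struct {
    execute     executionFunc  // the Go function implementing the opcode
    constantGas uint64         // static part of the gas cost
    dynamicGas  gasFunc        // computes the dynamic part of the gas cost
    minStack    int            // minimum number of stack items required
    maxStack    int            // maximum stack size that still allows execution
    memorySize  memorySizeFunc // memory required by the operation

    halts   bool // ends the execution (STOP, RETURN, SELFDESTRUCT)
    jumps   bool // modifies the program counter
    writes  bool // modifies the state (e.g. SSTORE)
    reverts bool // explicitly reverts (REVERT)
    returns bool // returns data (the call opcodes, CREATE/CREATE2, REVERT)
}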

Another data structure that will be important for the execution of the code is a set of fields known as the call context. This refers to a set of variables that make up the current state of the interpreter and are reset every time a new execution starts, like memory, stack and the contract object.

Let us now go through the Run method step by step and try to understand what it does. First, it increments the call stack depth which will be decremented again at the end of the function. It also sets the read only flag of the interpreter if not yet done and resets the return data field. Next, we initialize the call context and set the program counter to zero before we eventually enter a loop called the main loop.

Within this loop, we first check every 1000th step whether the abort flag is set. If yes, we stop execution (my understanding is that this feature is primarily used to cancel running EVM operations that were started as part of an API call). Next, we use the current value of the program counter to read the next opcode that we need to process, and look up that operation in the jump table (raising an error if there is no entry, which indicates an invalid opcode).

Once we have the jump table entry in our hands, we can now check the current stack size against the minimum and maximum stack size of the instruction and make sure that we raise an error if we try to process an operation in read-only mode that potentially updates the state.

We then calculate the gas required to perform the operation. As already explained, the gas consumption has two parts – a static part and a dynamic part. For each of these two contributions, we invoke the method UseGas() of the contract object, which will reduce the gas left that the contract tracks and also raise an error if we are running out of gas.

We then execute the operation by invoking the Go function to which it is mapped. This function will typically get some data from the stack, perform some calculations and push data back to the stack, but can also modify the state and perform more complex operations. Most if not all operations are contained in instructions.go, and it is instructive to scan the file and look at a few operations to get a feeling for how this works (we will go through a more complex example, the CALL operation, in a later post).

Once the instruction completes, we check the returns flag of the jump table entry to see whether the instruction returns any data, and if yes, we copy this data to the returnData field of the interpreter so that it is available for the next instruction. We then decide whether the execution is complete and we need to return to leave the main loop, or whether we need to continue execution with an updated program counter.

So the main loop is actually rather straightforward, and, together with our discussion of the Call() method in the previous post, we now have a fairly complete picture of how contract execution works.
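
To summarize the control flow, here is a toy version of the main loop. All types are simplified stand-ins for the real geth structures, and stack checks, dynamic gas, the read-only check and tracing are omitted.

package toyloop

import "errors"

type operation struct {
    execute     func(vm *toyVM) ([]byte, error)
    constantGas uint64
    halts       bool
    jumps       bool
}

type toyVM struct {
    code      []byte
    gas       uint64
    pc        uint64
    jumpTable [256]*operation
}

var (
    errOutOfGas      = errors.New("out of gas")
    errInvalidOpcode = errors.New("invalid opcode")
)

func (vm *toyVM) run() ([]byte, error) {
    for {
        if vm.pc >= uint64(len(vm.code)) {
            return nil, nil // running off the end behaves like STOP
        }
        op := vm.code[vm.pc]      // fetch the next opcode
        entry := vm.jumpTable[op] // look it up in the jump table
        if entry == nil {
            return nil, errInvalidOpcode
        }
        if vm.gas < entry.constantGas { // charge the static gas cost
            return nil, errOutOfGas
        }
        vm.gas -= entry.constantGas
        ret, err := entry.execute(vm) // run the associated Go function
        if err != nil || entry.halts {
            return ret, err
        }
        if !entry.jumps { // jump instructions update the program counter themselves
            vm.pc++
        }
    }
}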

Handling gas consumption

Let us leverage this end-to-end view to put together the various bits and pieces to understand how gas consumption is handled. We start our discussion on the level of an entire block. In one of the previous posts, we have already seen that when a block is processed here, two gas related variables are maintained. First, the processing keeps track of the gas used for all transactions in this block, which corresponds to the gasUsed field of a block header. In addition, there is a block gas pool, which is simply a counter initialized with the current block gas limit and used to keep track of the gas which is still available without violating this limit.

When we now process a single transaction contained in the block, we invoke the function applyTransaction. In this function, we increase the used gas counter on the block level by the gas consumed by the transaction and use that information to create the transaction receipt, that contains both the gas used by the transaction and the current value of the cumulative gas usage on the block level. This is done based on the return value of the ApplyMessage function, which itself immediately delegates to the TransitionDB method of a newly created state transition object.

The state transition object contains two additional gas counters. The first counter (st.gas) keeps track of the gas still available for this transaction, and is initialized with the gas limit of the transaction, so this is the equivalent of the gas pool on the block level. The second counter is the initial value of this field and only used to be able to calculate the gas actually used later on.

When we now process the transaction, we go through the following steps.

  • First, we initialize the gas counters
  • Then, we deduct the upfront payment from the sender's balance. The upfront payment is the gas price times the gas limit and therefore the maximum amount of Ether that the sender might have to pay for this transaction
  • Similarly, we reduce the block gas pool by the gas limit of the transaction
  • Next, we calculate the intrinsic gas for the transaction. This is the amount of gas just for executing the plain transaction, still without taking any contract execution into account. It is calculated (ignoring contract creations) by taking a flat fee of currently 21000 units of gas per transaction, plus a fee for every byte of the transaction input (which is actually different for zero bytes and non-zero bytes). In addition, there is a fee for each entry in the access list (this is basically a list of accounts and addresses for which a discount applies when accessing them, see EIP-2930). In the yellow paper, the intrinsic gas is called g0 and defined in section 6.2
  • We then reduce the remaining gas by the intrinsic gas cost (again according to what section 6.2 of the yellow paper prescribes) and invoke Call(), using the remaining gas counter st.gas as the parameter which determines the gas available for this execution. Thus the gas available to the contract execution is the gas limit minus the intrinsic gas cost. We have already seen that this creates a Contract containing another gas counter which keeps track of the gas consumed during the execution. Within the interpreter main loop, we calculate static and dynamic gas cost for each opcode and reduce the counter accordingly. At the end, the remaining gas is returned
  • We update the remaining gas counter st.gas with the value returned by Call(). We then perform a refund, i.e. we transfer the remaining gas times the gas price back to the sender and also put the remaining gas back into the gas pool on block level

This has a few interesting consequences. First, it demonstrates that the total gas cost of executing a transaction does actually consist of two parts – the intrinsic gas for the transaction and the cost of executing the opcodes of the smart contract (if any). Both of these components have a static part (the 21000 base fee for the intrinsic gas cost and the static fee per opcode for the code execution) and a dynamic part, which depends on the transaction.

The second thing that you want to remember is that in order to make sure that a transaction is processed, it is not sufficient to have enough Ether to pay for the gas actually used. Instead, you need to have at least the gas limit times the gas price, otherwise the upfront payment will fail. Similarly, you need to make sure that the gas limit of your transaction is lower than the block gas limit, otherwise the transaction will not be included in a block.
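
To make this concrete, here is a small back-of-the-envelope calculation. The numbers are made up for illustration; 21000 is the flat intrinsic fee, and 16 units of gas per non-zero byte of call data is the post-Istanbul value.

package main

import "fmt"

func main() {
    gasLimit := uint64(100_000)
    gasPrice := uint64(2_000_000_000) // 2 GWei per unit of gas, in Wei

    // upfront payment deducted from the sender's balance before execution starts
    upfront := gasLimit * gasPrice

    // intrinsic gas: flat fee plus call data fee (say, four non-zero bytes of input)
    intrinsic := uint64(21_000) + 4*16

    // assume that the contract execution itself consumed 30_000 units of gas
    executionGas := uint64(30_000)
    remaining := gasLimit - intrinsic - executionGas

    fmt.Println("upfront payment (Wei):", upfront)
    fmt.Println("gas used:", gasLimit-remaining)
    fmt.Println("refund (Wei):", remaining*gasPrice)
}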

Pre-compiled contracts

There is a special case of calling a contract that we have ignored so far – pre-compiled contracts. Before diving down into the code once more, let me quickly explain what pre-compiled contracts are and why they are useful.

Suppose you wanted to develop a smart contract that needs to calculate a hash value. The EVM has a built-in opcode SHA3 to calculate the Keccak hash, but what about other hashing algorithms? Of course, as the EVM is Turing-complete, you could develop a contract that does this, but this would blow up your contract considerably and, in addition, would probably be extremely slow as this would mean executing complex mathematical operations in the EVM. As an alternative, the Ethereum designers came up with the idea of a pre-compiled contract. Roughly speaking, this is a kind of extension of the instruction set of the EVM, realized as contracts located at pre-defined addresses. The contract at address 0x2, for instance, calculates an SHA256 hash, and the contract at address 0x3 a RIPEMD-160 hash. These contracts are, however, not really placed on the blockchain – if you look at the code at this address using for instance the JSON API method eth_getCode, you will not get anything back. Instead, these pre-defined contracts are handled by the EVM. If the EVM processes a CALL targeting one of these addresses, it does not actually call a contract at this address, but simply runs a native Go function that performs the required calculation.

We have already seen where in the code this happens – when we initialize the target contract in the Call() method of the EVM, we check whether the target address is a pre-compiled contract and, if yes, execute the associated Go function instead of running the interpreter. The return values are essentially the same as for an ordinary call – return data, an error and the gas used for this operation.
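
To get an idea of what such a native implementation looks like, here is a toy version of the SHA256 precompile at address 0x2. It mirrors the structure used in geth – a required-gas calculation plus a Run function – and the gas formula of 60 plus 12 units per 32-byte word of input is the one from the yellow paper, but the code below is a simplified sketch, not the actual geth source.

package precompile

import "crypto/sha256"

// precompiledContract mirrors the idea behind geth's PrecompiledContract interface.
type precompiledContract interface {
    RequiredGas(input []byte) uint64
    Run(input []byte) ([]byte, error)
}

// sha256hash is a toy version of the precompile residing at address 0x2.
type sha256hash struct{}

func (sha256hash) RequiredGas(input []byte) uint64 {
    words := (uint64(len(input)) + 31) / 32
    return 60 + 12*words
}

func (sha256hash) Run(input []byte) ([]byte, error) {
    h := sha256.Sum256(input)
    return h[:], nil
}

// make sure the toy type actually satisfies the interface
var _ precompiledContract = sha256hash{}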

The pre-compiled contracts as well as the gas cost for executing them are defined in the file contracts.go. At the time of writing, there are nine pre-compiled contracts, residing (virtually) at the addresses 0x1 to 0x9:

  • EC recover algorithm, which can be used to determine the public key of the signer of a transaction
  • SHA256 hash function
  • RIPEMD-160 hash function
  • the data copy function, which simply returns the input as output and can be used to copy large chunks of memory more efficiently than by using the built-in opcodes
  • exponentiation modulo some number M (modexp)
  • three elliptic curve operations to support zero-knowledge proofs (see the EIPs 196 and 197)
  • the F compression function of the BLAKE2b hash function (see EIP-152)

Here is the final flow diagram for the smart contract execution that now also reflects the special case of a pre-compiled contract.

With this, we close our post for today. In the next post, we will take a closer look at the CALL opcode and its variations to understand how a smart contract can invoke another contract.

Understanding the Ethereum virtual machine – part I

In today's post, we will shed some light on how the Ethereum virtual machine (EVM) actually works under the hood. We start with an overview of the most relevant data structures and methods and explain the big picture before we look at the interpreter main loop in the next post.

The Go-Ethereum EVM – an overview

To be able to analyze in depth what really happens if a specific opcode is executed, it is helpful to take a look at both the yellow paper and the source code of the Go-Ethereum (geth) client implementing what the yellow paper describes. The code for the EVM is in this folder (I have used version 1.10.6 for the analysis, but the structure should be rather stable across releases).

Let us first try to understand the data structures involved. The diagram below shows the most important classes, attributes and methods that we need to understand.

First, there is the block context. This class is simple – it just contains some data fields that represent attributes of the block in which the transaction is located and is used to realize opcodes like NUMBER or DIFFICULTY. Similarly, the transaction context (TxContext) holds some fields of the transaction as part of which we execute the smart contract.

Let us now turn to the Contract class. The name of this class is a bit misleading, as it does in fact not represent a smart contract, but the execution of a smart contract, either as the result of a transaction or, more generally, of a message call. Its most important attributes (at least for our present discussion) are

  • The code, i.e. the smart contract code which is actually executed
  • the input provided
  • the gas available for the execution
  • the address at which the smart contract resides (self)
  • the address of the caller (caller and CallerAddress)

It is important to understand the meanings of the various addresses contained in this structure. First, there is the self attribute, which is the contract address, i.e. the address at which the contract itself resides. This is the address which is called Ia in the yellow paper, which is returned by the ADDRESS opcode and which is the address holding the state manipulated by the code, for instance when we run an SSTORE operation. This is also the address returned by the Address() method of the contract.

Next, there is the caller and the callerAddress. In most cases, these two addresses are identical and represent the source of the message call, i.e. what is called the sender Is in the yellow paper. There are cases, however, namely so-called delegate calls, where these addresses are not identical. We will come back to this in the next post.

The contract object also maintains the gas available for the execution. This field is initialized when the execution starts and can then be reduced by calling UseGas() to consume a certain amount of gas.
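
The bookkeeping behind UseGas() is as simple as it sounds – the following sketch is close to the actual geth method, with the rest of the struct stripped down.

package contractgas

type Contract struct {
    Gas uint64
    // ... code, input, caller and contract addresses ...
}

// UseGas deducts gas from the remaining budget and reports whether enough gas was left.
func (c *Contract) UseGas(gas uint64) bool {
    if c.Gas < gas {
        return false
    }
    c.Gas -= gas
    return true
}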

Next, there is the EVM itself. The EVM refers to a state (StateDB), a transaction context and a block context. It also holds an attribute abort which can be set to abort the execution, and a field callGasTemp which is used to hold a gas value in some cases – we will see this field in action later.

Finally, there is the EVM interpreter. The interpreter is doing all the hard work of running a piece of code. For that purpose, it references a jump table which is essentially a list of opcodes together with references to corresponding Go functions that need to be run whenever this opcode is encountered. The interpreter also maintains the scope context which is a structure bundling the data that is refreshed with every execution of a smart contract – the content of the memory, the content of the stack and the contract execution, represented by a contract object.

Code execution in the yellow paper

Before we move on to understand how the code execution actually works, let us take a short look at the yellow paper, in particular sections 6, 8 and 9 describing contract execution, and try to map the data structures and functions described there to the part of the source code that we have just explored.

The central function that describes the execution of contract code in the yellow paper is denoted by a capital Theta (Θ). This function has the following arguments.

  • the state on which the code operates
  • the sender of the message call or transaction
  • the origin of the transaction (which is always an EOA and the address which signed the transaction)
  • the recipient of the message call
  • the address at which the code to be executed is located (this is typically the same as the recipient, but might again differ in the case of delegate calls)
  • the gas available for the execution
  • the gas price
  • the value to be transferred as part of the message call (again, there is a subtlety for delegate calls that we postpone to the next post)
  • the input data of the message call
  • the depth of the call stack
  • a flag that can be used to prevent the transaction from making any changes to the state (this is required for the STATICCALL functionality)

If you compare this list with the data structures displayed above, you will find that this is essentially the combination of the EVM attributes, the transaction context, the scope context and the contract execution object. All this data is tied together in the EVM class, so it is natural to assume that the function Θ itself is realized by a method of this class – in fact, this is the Call method that we will look at in the next section.

The output of Θ is the updated state, the remaining gas, an object known as accrued substate that contains touched and destroyed accounts, the logs generated during the execution and the gas to be refunded.

The inner workings of Θ are described in section 8 of the yellow paper. First, the value to be transferred is deducted from the balance of the sender and added to the balance of the recipient. Then, the actual code is executed – this happens by calling another function denoted by Ξ (a capital Greek xi) – again, there is an exception to this rule for pre-compiled contracts that we discuss in the next post. If the execution is not successful, then the state is reset to its previous value; if it is successful, the state returned by Ξ is used. The function Ξ is again not terribly difficult to identify in the source code – it is the method Run() of the EVM interpreter which will be the subject of the next post.

The call method of the EVM

Let us now take a closer look at the method Call() of the EVM which implements what the yellow paper calls Θ. The source code for this method can be found here. For today, I will ignore pre-compiled contracts completely – we will discuss them in the next post.

The method starts by running a few checks, like making sure that we do not exceed the call depth limit (which is defined to be 1024 at the moment) or that we do not attempt to transfer more than the available balance.

The next step is to take a snapshot of the current state. Internally, Go-Ethereum uses revisions to keep track of different versions of the state, and taking a snapshot simply amounts to remembering a revision to which we can revert later if needed.

Next, we check whether the contract address already exists. This might be a bit confusing, as it does not seem to make sense to call a contract at a non-existing address, or, more precisely, at an address not yet initialized in the state DB. Note, however, that “calling” means, a bit more generally, “sending a message to an account”, which is also done if you simply want to transfer Ether to an account. Sending a message to a non-contract account is perfectly valid, and it might even be that this account has never been used before and is therefore not part of the cached state.

The next step is to actually perform the transfer of any Ether involved in the message call, i.e. we send value Wei from the sender to the recipient. We then get into the actual bytecode execution by performing the following steps.

  • get the code associated with the contract address (i.e. the runtime bytecode) from the state
  • if the length of the code is zero, return – there is nothing left to be done
  • initialize a new Contract object that represents the current execution.
  • initialize the contract code
  • call the Run method of the interpreter

We then collect the return value from the Run method and a potential error code and set gas to contract.Gas – this represents the gas still remaining after executing the code. Based on this, we determine the final return values according to the following logic (sketched in code after the list).

  • If Run did not result in an error, return the return value, error code and remaining gas just assembled
  • If Run returned a special error code indicating that the execution was reverted, reset the state to the previously created snapshot
  • If the error code returned by Run is not a reverted execution, also fall back to the snapshot but in addition, set the remaining gas to zero, i.e. such an error will consume all the available gas
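
Expressed as a small standalone helper, the decision logic looks like this. The identifiers are mine – in the real code, this happens inline at the end of Call, and the rollback is done via StateDB.RevertToSnapshot.

package callresult

import "errors"

var errExecutionReverted = errors.New("execution reverted")

// settle decides what happens to the state snapshot and the remaining gas once Run
// has returned. revert rolls the state back to the snapshot taken before the call.
func settle(runErr error, remainingGas uint64, revert func()) (uint64, error) {
    if runErr == nil {
        return remainingGas, nil // keep the new state and the unused gas
    }
    revert()
    if errors.Is(runErr, errExecutionReverted) {
        return remainingGas, runErr // an explicit REVERT keeps the unused gas
    }
    return 0, runErr // any other error consumes all the remaining gas
}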

Invocations of the call method

Having understood how Call works, we are now left with two tasks. First, we need to understand how the EVM interpreters Run method works, which will be the topic of our next post. Second, we have to learn where Call is actually invoked within the Go-Ethereum source code.

Not quite surprisingly, this happens at several points (ignoring tests). First, in a previous post, I have already shown you that the EVM’s Call method is invoked whenever a transaction is processed as part of a state transition. This call happens here, and the parameters are as we would expect – the caller is the sender of the transaction, the contract address is the recipient, and the input data, gas and value are taken from the StateTransition object. The remaining gas returned is again stored in the state transition object and used as a basis for computing the gas refunded to the sender. Note that this entry point is (via the ApplyMessage function) also used by the JSON API when the eth_call method or the eth_estimateGas method are requested.

However, this is not the only point in the code where we find a reference to the Call method. A second point is actually the EVM interpreter itself, more precisely the function opCall in instructions.go. The background of this is that in addition to a call due to a transaction, i.e. a call initiated by an EOA, we can of course also call a smart contract from another smart contract using the CALL opcode. This opcode is implemented by the opCall function, and it turns out that it uses the EVM Call method as well. In this case, the parameters are taken from the stack or from the memory locations referenced by the stack items.

  • the top item on the stack is the gas that is made available (as we will see in the next post, this is not exactly true, but almost)
  • the next item on the stack is the target address
  • the third item is the value to be transferred
  • the next two items determine offset and length of the input data which is taken from memory
  • the last two items similarly determine offset and length of the return data area in memory

It is interesting to compare the handling of the returned error code. First, it is used to determine the status code that the CALL opcode pushes onto the stack. If there was an error, the status code is set to zero, otherwise it is set to one. Then, the returned data is stored in memory in case the execution was successful or explicitly reverted; for other errors, no return data is passed. Finally, the unused gas is returned to the currently executing contract.

This has an important consequence – there is no automatic propagation of errors in the EVM! If a contract A calls a contract B, and contract B reverts (either explicitly or due to another error), then the call will technically go through, and contract A does not automatically revert as well. Instead, you will have to explicitly check the status code that the CALL opcode puts on the stack and handle the case that contract B fails somehow. Not doing this will make your contract vulnerable to the “King of the Ether” problem that we have discussed in my previous post on contract security.

Finally, scanning the code will reveal that there is a third point where the Call method is invoked – the EVM utility that allows you to run a specified bytecode outside of the Go-Ethereum client from the command line. It is fun to play with this – here is an example of how to use it to invoke the sayHello method of our sample contract (again, assuming that you have cloned my repository for this series and are working in the root directory of the repository). Note that in order to install the evm utility, you will have to download the full geth archive, containing all the tools, and make the evm executable available in a folder in your path.

VERSION=$(python3 -c 'import solcx ; print(solcx.get_solc_version())')
DIR=$(python3 -c 'import solcx ; print(solcx.get_solcx_install_folder())')
SOLC="$DIR/solc-v$VERSION"
CODE=$($SOLC contracts/Hello.sol --bin-runtime   | grep "6080")
evm \
  --code $CODE\
  --input 0xef5fb05b \
  --debug run

This little experiment completes this post. In the next post, we will try to fill up the missing parts that we have not yet studied – how the code execution, i.e. the Run method, actually works, what pre-compiled contracts are and how gas is handled during the execution. We will also take a closer look at contract-to-contract calls and their variations.

A deep-dive into Solidity – function selectors, encoding and state variables

In the last post, we have seen how the Solidity compiler creates code – the init bytecode – to prepare and deploy the actual bytecode executed at runtime. Today, we will look at a few standard patterns that we find when looking at this runtime bytecode.

Some useful tools

While analyzing the init bytecode in the last post, we have mainly worked with the output of the Solidity compiler known as opcode listing – the output generated when we supply the --opcodes switch. One major drawback of this representation of the bytecode is that we had to manually count instructions to determine the target of a JUMP instruction. Before going deeper into the runtime bytecode of our sample contract, let us collect a few tools that can help us with this.

First, there is the Solidity compiler itself. In addition to the bytecode and the opcodes, it can also generate an enriched output known as assembly output when the --asm switch is used. To do this for our sample contract, run

VERSION=$(python3 -c 'import solcx ; print(solcx.get_solc_version())')
DIR=$(python3 -c 'import solcx ; print(solcx.get_solcx_install_folder())')
SOLC="$DIR/solc-v$VERSION"
$SOLC contracts/Hello.sol --asm --optimize

The output is a mixture of opcodes and statements combining several opcodes into one. The snippet

PUSH1 0x40
PUSH1 0x80
MSTORE

for instance, is displayed as

mstore(0x40, 0x80)

In addition, and that makes this representation very useful, offsets are tagged, so that it becomes much easier to identify jump targets.

Brownie also offers some useful features to display the opcodes of a smart contract. When Brownie compiles a contract, it stores build data in the build subdirectory, and the data in this subdirectory can also be accessed using Python code. In particular, we can access the full bytecode and the runtime bytecode of a compiled contract, like this.

# including init bytecode
project.TmpProject._build.get("Hello")['bytecode']
# runtime bytecode only
project.TmpProject._build.get("Hello")['deployedBytecode']

Alternatively, we can access the bytecode from the deployed contract.

me = accounts[0]
hello = Hello.deploy({"from": me})
# runtime bytecode
hello.bytecode
# full bytecode (input of deployment transaction)
hello.tx.input

In addition to the plain bytecode, Brownie also offers a data structure which contains the opcodes along with offsets and some additional useful information – the pcMap. This is a hash map where the keys are the offsets of the opcodes into the runtime bytecode (the pcMap contains only the runtime bytecode) and the values are again hash maps containing the name of the Solidity function to which the code belongs, the opcode itself and arguments to the opcode as far as applicable. To print this map in a readable format, you can use the following statements.

pcMap = project.TmpProject._build.get("Hello")['pcMap']
for i in sorted(pcMap.keys()):
  print(i, "-->", pcMap[i])

The pcMap is particularly useful if we combine it with another feature that Brownie has to offer – tracing transactions. A transaction trace contains the exact opcodes executed as part of the transaction. Here is an example.

tx = hello.sayHello()
tx.call_trace()
tx.trace

So the call trace is just a stack trace, while the trace is an array whose entries represent the opcodes that have actually been executed, along with information like the gas cost of the step, the memory content before the step was executed and the stack and storage content before the step was executed. Using tx.source(), we can even get the source code that belongs to a trace step.

The Remix IDE has a similar capability. Once a transaction has been executed and is displayed on the main screen, you can click on the blue “Debug” icon next to the transaction, and a debugger window will open on the left of the screen. You can now step forward and back, inspect opcodes, stack, memory and storage and even set breakpoints. In the Remix IDE, you can even debug deployment transactions, which is not possible in Brownie.

Function selectors

Having all these tools at our disposal, it is now not terribly difficult to understand the actual runtime bytecode. Here is a list of the opcodes, along with a few comments and tags.

// This is the start of the runtime bytecode
// initialize free memory pointer
PUSH1 0x80 
PUSH1 0x40 
MSTORE 
// Repeat the check for a non-zero value
CALLVALUE 
DUP1 
ISZERO 
PUSH1 0xF
// conditionally jump to target 1 
JUMPI 
PUSH1 0x0 
DUP1 
REVERT 
// This is jump target 1. We get here only
// if the value is zero
JUMPDEST 
POP 
PUSH1 0x4 
CALLDATASIZE 
LT 
PUSH1 0x28 
JUMPI // conditional jump to jump target 2
// We only get here if we have at least four bytes
// of data
PUSH1 0x0 
CALLDATALOAD 
PUSH1 0xE0 
SHR 
DUP1 
PUSH4 0xEF5FB05B
EQ 
PUSH1 0x2D 
JUMPI 
// This is jump target 2
JUMPDEST 
PUSH1 0x0 
DUP1 
REVERT 
// This is jump target 3, here we enter
// the sayHello function
JUMPDEST 
PUSH1 0x33  // offset of jump target 4
PUSH1 0x35  // offset of jump target 5
JUMP 
// This is jump target 4
JUMPDEST 
STOP 
// This is jump target 5 
// The code starting here is the actual sayHello function
JUMPDEST 
PUSH1 0x40 
MLOAD 
PUSH32 0x3ACB315082DEA2F72DFEEC435F2B0E4DD95A4FD423E89C8CB51DC75FA38D7961 
SWAP1
PUSH1 0x0
SWAP1
LOG1 
JUMP 

I have stripped off a few opcodes at the end which we will talk about a bit later. Let us go through the code line by line and try to understand what it does.

The first three lines are familiar – we again initialize the free memory pointer which Solidity stores at memory address 0x40 to its initial value 0x80. Similarly, we have already seen the next lines, starting with CALLVALUE, while analyzing the init bytecode. This code again checks that the value of the transaction is zero and reverts if this is not the case, reflecting the fact that our contract does not have a payable function. If the value is zero, the processing continues at the point in the code that I have called jump target 1.

Here, we first clean up the stack by popping the last value. We then push four onto the stack, followed by the output of CALLDATASIZE, which is the length of the transaction input field. The LT opcode compares these two values and pushes the result of the comparison onto the stack. If the result of the comparison is true, i.e. if we have less than four bytes in the input field, we jump to jump target 2, where we again revert.

To understand why this code makes sense, recall that the first four bytes of the input field are supposed to be the hash of the signature of the function we want to call. If we have less than four bytes, the call is not targeting a function, and as we do not have a fallback function, we revert.

If we have at least four bytes of data, we continue at the next line, where we first push zero onto the stack and then run CALLDATALOAD, which loads the first full 32 byte word of the call data onto the stack (the zero that we have just pushed is the offset). We then execute the set of instructions

PUSH1 0xE0 // 0xE0 is 224 
SHR
DUP1
PUSH4 0xEF5FB05B
EQ

This looks a bit mysterious, but is actually not too difficult to understand. After the first push, our stack looks as follows.

| 224 | first 32 bytes of transaction input |

When we then execute SHR, which is a shift operation to the right, we shift the second item on the stack by the number of bits specified by the first item, so we shift the 32 bytes, i.e. 256 bits, by 224 bits to the right. This amounts to moving the first four bytes to the rightmost position, so that what we now have on the stack are the first four bytes of the input data, i.e. exactly those four bytes that contain the hash of the function signature. We then push the four-byte constant 0xEF5FB05B onto the stack, so that (ignoring the copy of these four bytes that the DUP1 keeps on the stack) our stack is now

| 0xEF5FB05B | first four bytes of the call data |

and use EQ to compare them, so that the item at the top of the stack is now

first four bytes of the call data == 0xEF5FB05B

Now open Brownie and run

web3.keccak(text="sayHello()")[:4]

to convince yourself that the four bytes to which we compare are exactly the first four bytes of the hash of “sayHello()”. Thus, we execute the conditional jump that comes next only if the first four bytes of the input data indicate that we want to call this method; otherwise we continue and hit upon the revert statement at jump target 2.

The code that we have just seen therefore realizes the function selection. If your contract contains more than one function, you will see more than one comparison, and the upshot is that we either jump into the function that corresponds to the signature hash or revert (unless we have a fallback function).

This also tells us that in our case, the execution of sayHello() starts at jump target 3. The code that we see here is also typical. We push two values on the stack – first a return offset and then a jump target. We then jump, execute some code and eventually execute another jump. This second jump will then take its target from the stack, so it returns to the first offset that we have pushed onto the stack. In our case, we jump to target 5, execute the code there, and then jump back to target 4. This approach – pushing return addresses onto the stack – mimics the way local function calls are handled in other programming languages like C. In our case, jump target 4 simply executes the STOP opcode which completes the execution without a return value.

Finally, let us take a look at the code at jump target 5, which is therefore the body of sayHello(). Here, we first run MLOAD to get the value of the free memory pointer. We then put a full 32 byte word onto the stack, namely the hash of the string "SayHello()", i.e. the signature of the event that we emit. We then swap the first two elements on the stack, push zero and swap once more. Our stack now looks as follows.

| 0x80 | 0x0 |  hash(event signature) | return address  |

Now we execute LOG1. Again, the yellow paper is our friend and tells us that the first entry on the stack is the offset of the log data, the second entry is the length and the third entry is the first (and, in this case, the only) topic. So we log an event with no data and with the hash of the event signature as its topic, as expected. The log statement consumes the first three stack items, and when we now jump, we therefore end up at jump target 4, where we execute the STOP opcode to complete the transaction.
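
If you want to double-check this from the outside, you can do so from the Brownie console. The snippet below is only a sketch – it assumes that the contract is still deployed as hello, as in the deployment walkthrough of this series, and that the event is indeed declared as SayHello() in the source code.

tx = hello.sayHello({"from": accounts[0]})
receipt = web3.eth.get_transaction_receipt(tx.txid)
# the single log entry should carry no data and exactly one topic,
# namely the hash of the event signature
receipt.logs[0]["topics"][0] == web3.keccak(text="SayHello()")
receipt.logs[0]["data"]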

Encoding and state variables

We have now completed the analysis of our sample contract. A natural next step is to add more functionality to the contract and see how this changes the output of the compiler. As an example, let us add some state to our contract. In the body of the contract code, add the line

uint256 value;

and the method

function store(uint256 _value) public {
    value = _value;
}

Let us now run the compiler again, this time with a few more flags that request additional output (the reason for this will become clear in a minute).

$SOLC contracts/Hello.sol \
       --asm \
       --optimize \
       --storage-layout \
       --combined-json generated-sources-runtime

Here is a listing of the relevant code that is newly added to our contract by the changes we have made. Again, I have added some comments and labeled the jump destinations from A to E.

PUSH1 0x47     // address of label B 
PUSH1 0x42     // address of label A 
CALLDATASIZE 
PUSH1 0x4 
PUSH1 0x76     // address of label D 
JUMP 
// Label A - this is at offset 0x42 
JUMPDEST 
PUSH1 0x0 
SSTORE 
JUMP 
// Label B - this is at offset 0x47
JUMPDEST 
STOP 
// Label C - this is at offset 0x48
// I have removed the code in this  section
// which we have already looked at before
// it logs the event and then jumps to label B
// where we STOP
// Label D - this is at offset 0x76 
JUMPDEST 
PUSH1 0x0 
PUSH1 0x20 
DUP3 
DUP5 
SUB 
SLT         
ISZERO 
PUSH1 0x87     // address of label E
JUMPI          // conditional jump to label E
PUSH1 0x0 
DUP1 
REVERT 
// Label E - this is at offset 0x87 
JUMPDEST 
POP 
CALLDATALOAD 
SWAP2 
SWAP1 
POP 
JUMP

The first few lines are again easy to interpret – we prepare a jump, which is an internal function call, i.e. we place a return address and, in this case, arguments on the stack and then jump to label D. When we get there, our stack looks as follows (recall that CALLDATASIZE puts the size of the calldata, i.e. the length of the transaction input in bytes, onto the stack).

4 | len(tx.input) | label A | label B

At label D, we put a few additional items on the stack. If you go through the instructions, you will find that when we reach the SUB opcode, the stack looks as follows.

len(tx.input) | 4 | 32 | 0 | 4 | len(tx.input) | A | B

Now we execute the SUB opcode, which will pop the first two items off the stack and push their difference. Thus, after completing this opcode, our stack will be

len(tx.input) - 4 | 32 | 0 | 4 | len(tx.input) | A | B

The next instruction, SLT, is a signed version of the less-than instruction that we have already seen. Together with the subsequent ISZERO which is a simple logical inversion, its impact is to provide the following stack.

!(len(tx.input) - 4 < 32) | 0 | 4 | len(tx.input) | A | B

To get an idea of what this is supposed to do, looking at the assembler output helps. In the comments that Solidity has generated, we find a hint – utility.yul. As the Solidity documentation explains, this means that the code we are looking at is part of a library of utility functions, written in the Yul language (an intermediate language that Solidity uses internally). However, these utility functions are not stored anywhere in a file with this name, but are actually generated on the fly by the compiler (in our case, this happens here). The additional output selector generated-sources-runtime that we have passed to --combined-json when running Solidity instructs the compiler to print out a Yul representation of the utility functions. The Yul code, the name of the function and the source code of the Solidity compiler that I have linked above solve the puzzle – the code we are looking at is supposed to decode the transaction input and to extract the argument (which is called _value in the source code of our contract).

Now the Solidity ABI demands that the argument be stored in the transaction input as a 256-bit, i.e. 32 byte, word directly after the four bytes containing the function signature. What the code that we are analyzing does is to check that the total length of the transaction input is at least those four bytes plus 32 bytes. If this is not the case, we continue and revert. If this is the case, i.e. if the validation is successful, we perform the conditional jump and end up at label E. When we get there, our stack is

0 | 4 | len(tx.input) | A | B

We now remove the first item on the stack, use CALLDATALOAD to load a full 32 byte word starting at byte 4 of the transaction input onto the stack (i.e. the 32 byte word that is supposed to contain our parameter), and use two swaps and a pop operation to produce the following stack.

A  | _value | B

The jump will therefore take us to label A again, with the _value parameter at the top of the stack. Here, we push zero onto the stack and perform an SSTORE. This will store _value at position zero of the storage and leave us with the address of label B on the stack. The following jump will therefore take us to the STOP opcode, and the transaction completes.

So, the content of slot zero of the storage seems to represent the stored value. Here, we could easily derive this from the code, but in general, this can be more difficult. To help us map the state variables declared in the source code to storage locations, Solidity creates a storage layout which we have included in our output using the --storage-layout switch. The storage layout is an array, where each entry represents one state variable. For each variable, there is a slot and an offset. As indicated in the documentation, the slot is the address in the storage area, but one slot can contain more than one item (if an item is smaller than 32 bytes), and in this case, the offset is the offset within the slot. For dynamic data types, the layout is much more complicated – for mappings, for instance, the actual slot is determined as a hash value of the key.
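
To see both the calldata layout and the storage slot in action, you can run a small experiment from the Brownie console. This is only a sketch – it assumes that you have redeployed the extended contract (the version with the store function) and that it is available as hello; the other names are the ones used throughout this series.

# build the calldata by hand: 4 selector bytes followed by the 32 byte argument
selector = web3.keccak(text="store(uint256)")[:4]
calldata = selector + (42).to_bytes(32, "big")
txn_hash = web3.eth.send_transaction({
    "from": accounts[0].address,
    "to": hello.address,
    "data": web3.toHex(calldata)
})
web3.eth.wait_for_transaction_receipt(txn_hash)
# slot zero of the contract storage should now hold 42 (0x2a)
web3.eth.get_storage_at(hello.address, 0)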

Metadata and hashes

If you have followed the analysis carefully, you might have noted that the last few opcodes do not seem to be executed at all. In fact, they do not even make sense, starting already with an invalid opcode 0xFE. Again, the assembler output helps to interpret this – it designates this part of the bytecode as "auxdata", which does in fact not contain valid bytecode, but the IPFS hash of the contract metadata (more precisely a CBOR encoded structure which contains the IPFS hash as a key).

The contract metadata, which can be produced using the --metadata compiler switch, is a JSON structure that contains, among other things

  • the contract ABI
  • the Keccak hash of the source code
  • the IPFS hash of the source code
  • the exact compiler version
  • the compiler settings used to produce the bytecode

The idea behind this is that a developer can store the metadata and the contract source in IPFS. A user who finds the contract on the blockchain can then use the last few bytes – the IPFS hash of the metadata – to retrieve that document from the IPFS network. As the metadata document contains the IPFS hash of the source, a user could now retrieve the source as well. This mechanism therefore allows you to link the source code to the contract and to prove that the contract bytecode has been created using the source code and a given set of compiler settings. Within the Solidity source code, all this happens here.
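
If you want to take a look at this structure yourself, you can decode the tail of the deployed bytecode with a CBOR library. The following sketch assumes that the contract is deployed as hello in a Brownie console and that the cbor2 package is available (it is not part of our setup so far, so you would have to install it with pip3 install cbor2); the exact keys can vary with the compiler settings.

import cbor2
code = bytes(web3.eth.get_code(hello.address))
# the last two bytes encode the length of the CBOR encoded metadata section
length = int.from_bytes(code[-2:], "big")
metadata = cbor2.loads(code[-2 - length:-2])
metadata.keys()          # typically dict_keys(['ipfs', 'solc'])
metadata["ipfs"].hex()   # the IPFS hash (a multihash) pointing to the metadata file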

We have seen that the metadata hash and the runtime bytecode are separated by the invalid opcode 0xFE. This byte appears at another location in the full bytecode – the end of the init bytecode. In both cases, the motivation is the same – we want to avoid that, due to an error, the execution can continue past these boundaries. So we now realize that the full bytecode consists of three sections, separated by the invalid opcode 0xFE.

This closes our post for today. Of course, you could now add additional features to our contract, maybe return values or mappings, and see how this affects the generated bytecode. In the next post, however, we will turn to another topic which is central to understanding smart contracts – how the Ethereum virtual machine actually operates.

A deep-dive into Solidity – contract creation and the init code

In some of the previous posts in this series, we have already touched upon contract creation and referred to the fact that during contract creation, an init bytecode is sent as part of a transaction which is supposed to return the actual bytecode of the smart contract. In this and the next post, we will look at this in a bit more detail and, along the way, learn how to decipher the EVM bytecode of a simple contract.

Contract creation – an overview

Before diving into details, let us first make sure we understand the contract creation process in Solidity. A good starting point is section 7 of the Ethereum yellow paper.

A transaction will create a contract if the recipient address of the transaction is empty (i.e. technically the zero address). A creation operation can contain a value, which is then credited to the address of the newly created contract (even though in Solidity, this requires a payable constructor). Then, the initialisation bytecode, i.e. the content of the init field of the transaction, is executed, and the returned array of bytes is stored as the bytecode of the newly created contract. Thus there are in fact two different types of bytecode involved during the creation of a smart contract – the runtime bytecode which is the code executed when the contract is invoked after its initial creation, and the init bytecode which is responsible for preparing the contract and returning the runtime bytecode.

To understand what “returning the runtime bytecode” actually means, we need to consult the definition of the RETURN opcode in appendix H. Here, the return value function Hreturn is specified, which is referenced in section 9 and defines the output of a bytecode execution. It takes a moment to get familiar with the notation, but what the definition actually says is that the output is placed in the virtual machine memory, where the offset is determined by the top of the stack and the length is determined by the second element on the stack. Thus the init bytecode needs to

  • make any changes to the state of the contract address needed (maybe initialize some state variables)
  • place the runtime bytecode somewhere in memory
  • push the length of the runtime bytecode onto the stack
  • push the offset of the runtime bytecode (i.e. the address in memory where it starts) onto the stack
  • execute the RETURN statement (a minimal hand-assembled sketch of this recipe follows below)
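
Before we look at what the compiler actually generates, here is the promised sketch – a hand-assembled init code, written down as a few bytes in Python, that does nothing but return a meaningless dummy runtime code according to this recipe. This is purely illustrative and not what solc emits.

# dummy runtime code: PUSH1 1, PUSH1 1, ADD - just a placeholder
runtime = bytes.fromhex("6001600101")
init = bytes([
    0x60, len(runtime),   # PUSH1 <length of the runtime code>
    0x80,                 # DUP1
    0x60, 0x0b,           # PUSH1 11 - offset of the runtime code within this init code
    0x60, 0x00,           # PUSH1 0
    0x39,                 # CODECOPY - copy the runtime code to memory address 0
    0x60, 0x00,           # PUSH1 0
    0xf3,                 # RETURN - return memory[0:len(runtime)]
]) + runtime
print(init.hex())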

To make this a bit more tangible, let us again use Brownie to see how this works in practice. We will use a simple sample contract which does nothing except logging an event when its sayHello method is invoked. So make sure that you have a Brownie project directory containing this contract (if you have cloned my repository, I recommend creating a tmp subdirectory and linking the contract there, as described here), and open the Brownie console. Then, we deploy a copy of the contract and inspect the transaction that Brownie has used to do this.

me = accounts[0]
hello = Hello.deploy({"from": me})
tx = web3.eth.get_transaction(hello.tx.txid)  
tx
hello.balance()

You should see that the value of the transaction is zero, the recipient is None and the input is an array of bytes, starting with 0x60806040. This is the init bytecode, which we will study in the remaining part of the post. You can also see that the initial balance of the contract is zero.

Reading EVM bytecode – the basics

Before we dive into the init bytecode, we first have to collect some basic facts about how the Ethereum virtual machine (EVM) works. Recall that the bytecode is simply an array of bytes, and each byte will be interpreted as an operation. More precisely, appendix H of the yellow paper contains a list of opcodes each of which represents a certain operation that the machine can perform, and during execution, the EVM basically goes through the bytecode, tries to interpret each byte as an opcode and executes the corresponding operation.

The EVM is what computer scientists call a stack machine, meaning that virtually all operations somehow manipulate the stack – they take arguments from the stack, perform an operation and put the resulting value onto the stack again. Note that most operations actually consume values from the stack, i.e. pop them. As an example, let us take the ADD operation, which has bytecode 0x1. This operation takes the first two values from the stack, adds them and places the result on the stack again. So if the stack held 3 and 5 before the operation was executed, it will hold 8 after the operation has completed.

Even though most operations take their input from the stack, there are a few notable exceptions. First, there are the PUSH operations, which are needed to prepare the stack in the first place and cannot take their arguments from the stack, as this would create an obvious chicken-and-egg challenge. Instead, the push operation takes its argument from the code, i.e. pushes the byte or the sequence of bytes immediately following the instruction. There is one push operation for each byte length from 1 to 32, so PUSH1 pushes the byte in the code immediately following the instruction, PUSH2 pushes the next two bytes and so forth. It is important to understand that even PUSH32 will only place one item on the stack, as each stack item is a 32 byte word, using big endian notation.

The init bytecode

Armed with this understanding, let us now start to analyze the init bytecode. We have seen that the init bytecode is stored in the transaction input, which we can, after deployment, also access as hello.tx.input. The first few bytes are (using Solidity 0.8.6, this might change in future versions)

0x6080604052

Let us try to understand this. First, we can look up the opcode 0x60 in the yellow paper and find that it is the opcode of PUSH1. Therefore, the next byte in the code is the argument to PUSH1. Then, we see the same opcode again, this time with argument 0x40. And finally, 0x52 is the opcode for MSTORE, which stores the second stack item in memory at the address given by the first stack item. Thus, in an opcode notation, this first piece of the bytecode would be

PUSH1 0x80
PUSH1 0x40
MSTORE

and would result in the value 0x80 being written to address 0x40 in memory. This looks a bit mysterious, but most if not all Solidity programs start with this sequence of bytes. The reason for this is how Solidity organizes its memory internally. In fact, Solidity uses the memory area between address zero and address 0x7F for internal purposes, and stores data starting at address 0x80. So initially, free memory starts at 0x80. To keep track of which memory can still be used and which memory areas are already in use, Solidity stores this free memory pointer in the 32 bytes starting at memory address 0x40. This is why a typical Solidity program will start by initializing this pointer to 0x80.

We could now continue to analyze the remaining bytecode in this way, manually looking up opcodes in the yellow paper, but this is of course not terribly efficient. Instead, let us ask the Solidity compiler to spit out the opcodes for us, instead of the plain bytecode. We do not even have to download and install Solidity, because we have already done this when installing the py-solc-x module. So let us politely ask Python to spit out the location and version number of the solc binary and invoke it to compile our contract to opcodes.

VERSION=$(python3 -c 'import solcx ; print(solcx.get_solc_version())')
DIR=$(python3 -c 'import solcx ; print(solcx.get_solcx_install_folder())')
SOLC="$DIR/solc-v$VERSION"
$SOLC contracts/Hello.sol --opcodes

As a result, you should see something like this (I have added linebreaks to make this more readable and only reproduced the first few opcodes).

====== contracts/Hello.sol:Hello =======
Opcodes:
PUSH1 0x80 
PUSH1 0x40 
MSTORE 
CALLVALUE 
DUP1 
ISZERO 
PUSH1 0xF 
JUMPI 
PUSH1 0x0 
DUP1 
REVERT 
JUMPDEST               <---- Marker A
POP 
PUSH1 0x99 
DUP1 
PUSH2 0x1E 
PUSH1 0x0 
CODECOPY 
PUSH1 0x0 
RETURN 
INVALID 
PUSH1 0x80             <--- Marker B
PUSH1 0x40 
MSTORE

This is much better (in fact, Solidity can actually produce a number of different output formats – as we go deeper into the actual runtime bytecode in the next post, we will find --asm useful as well). I have also added two markers manually to the output that we will need when discussing the code.

We have already analyzed the first three lines, so let us look at the next section of the code, starting at CALLVALUE. Again, we can consult the yellow paper to figure out what this instruction does – it gets the value of the transaction and stores it on the stack. We then duplicate this value on the stack, so that the stack now looks like this

| value | value |

and invoke the ISZERO operation. This operation takes the first stack item and replaces it by one if it is zero or by zero otherwise. Next, we push 0xF, so our stack now looks like this

| 0xF | value == 0 | value |

The next instruction is JUMPI. This is a conditional jump which is only executed if the second stack item is non-zero, and in this case, we jump to the point in the bytecode designated by the first stack item. Thus, if the value of the transaction is zero, we jump to the offset 0xF, otherwise we continue.

Let us suppose for a moment we include a non-zero value with our transaction. Then, we continue with the next statement after the JUMPI, push zero onto the stack, duplicate and REVERT. Consulting the yellow paper once more, we find that the two topmost items on the stack that are present when we do a revert are used to define the return value – the rule is the same as for RETURN, meaning that the first item on the stack is an offset, the second item is the length. Thus with two zeroes on the stack, we do not return anything. Summarizing, we revert the transaction if the contract creation transaction has a non-zero value, and Solidity generates this code because we have not declared a payable constructor.

Let us now see how the execution proceeds if the value is zero. To be able to do this, we have to figure out the instruction at offset 0xF (15). So let us count – every instruction consumes one byte, and the additional arguments to PUSH1 also consume one byte each. Thus, we find that the execution continues at the JUMPDEST instruction that I have called marker A. The JUMPDEST opcode does not actually do anything, it is simply a marker byte that the EVM uses to make sure that a jump points to a valid location. So we now enter the part of the code that reads like this.

JUMPDEST               <---- Marker A
POP 
PUSH1 0x99 
DUP1 
PUSH2 0x1E 
PUSH1 0x0 
CODECOPY 
PUSH1 0x0 
RETURN 
INVALID 
PUSH1 0x80             <--- Marker B

Note that at this point, we still have the transaction value on the stack, which we remove with the first POP statement. We then push 153, duplicate this, push 30 and zero, so the stack now looks like this

| 0 | 30 | 153 | 153 |

The next instruction is CODECOPY. This copies code of the currently running contract to memory. It consumes three parameters from the stack. The element at the top of the stack defines the target address (i.e. offset) in memory. The second parameter defines the source offset in the code, and the third parameter defines the number of bytes to copy.

Counting once more, we see that the code we copy is 153 bytes long and starts at the point that I have called marker B. The code starting there will therefore be copied to address zero in memory, and after that has been done, our stack contains 153. We then push 0, so that the stack now looks like

| 0 | 153 | 

Finally, we RETURN. Now recalling how the return value of a contract execution is defined, we see that the return value of executing all of this is the bytearray of length 153 stored at address zero in memory, which, as we have just seen, are the 153 bytes of code starting at marker B. So the upshot is that this is the runtime bytecode, and the code we have just analyzed does nothing but (after making sure that the transaction value is zero) copying this bytecode into memory and returning it (by the way – if you want to see where exactly in the Solidity source code this happens, this link might be a good entry point for your research).
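
You can verify this directly in the Brownie console. The following check is only a sketch – the offsets 0x1E (marker B) and 0x99 (the length) are taken from the specific compiler output above and will differ for other compiler versions or contracts.

init_code = bytes.fromhex(hello.tx.input[2:])          # strip the leading 0x
runtime_code = bytes(web3.eth.get_code(hello.address))
# the 0x99 = 153 bytes starting at marker B should be exactly the deployed code
init_code[0x1E:0x1E + 0x99] == runtime_code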

That's it – we have successfully deciphered the initialization procedure of a very simple smart contract. Note that if the contract had a constructor, it would be executed first, before copying the runtime bytecode and returning (you might want to add a simple constructor and repeat the analysis). In the next post, we will learn a few additional tricks to obtain useful representations of the runtime bytecode and then dive into how the runtime bytecode works. See you!

Smart contract security – some known issues

Smart contracts are essentially immutable programs that are designed to handle valuable assets on the blockchain and are mostly written in a programming language that has been around for a bit more than five years and is still rapidly evolving. That sounds a bit dangerous, and in fact the short history of the Ethereum blockchain is full of notable examples that demonstrate how smart contracts can be exploited to steal money. In this post, we will look at a few known security issues that you should try to avoid.

Background – receiving payments with Solidity

Not quite surprisingly, most exploits that we have seen that target smart contracts are somehow related to those parts of a contract that make payments, so let us first try to make sure that we understand how payments are handled in Solidity.

First, recall that as any other address, a contract address has a balance and can therefore receive and transfer Ether. On the level of the Ethereum virtual machine (EVM), this is rather straightforward. Whenever a smart contract is called, be it from an EOA or another contract, the message call specifies a value. This is the amount of Ether (in Wei) that should be transferred as part of the contract execution. Very early in the processing, before any contract code is actually executed, this amount is transferred from the balance of the caller to the balance of the callee (unless, of course, the balance of the caller is not sufficient).

In Solidity, the situation is a bit more complicated. To see why, let us first imagine that you write and deploy a smart contract and then someone transfers Ether to the contract address. That amount is then added to the balance of the contract, and to access it, you would either need to submit a transaction signed with the private key of the contract address, or the contract itself needs to implement a function that can transfer the Ether to some other address, preferably an EOA address. Now, a smart contract address has no associated private key – it is the result of a calculation at the time the contract is created, not a key generation process. So the only way to use Ether that is held by a contract is to invoke a function of the contract that transfers it out of the contract again. Thus if you accidentally transfer Ether to a smart contract which does not have such a function, maybe because it was never designed to receive Ether, the Ether is lost forever.

To avoid this, the designers of Solidity have decided that contract functions that can receive Ether need to be clearly marked as being able to handle Ether by declaring them as payable. In fact, if a contract method is not marked as being payable, the compiler will generate code that, if that method is called, checks if the message call specifies a non-zero value, i.e. if Ether should be transferred as part of the call. If yes, this code will revert the execution so that the transfer will fail.

Apart from an ordinary function call, there are special cases that we need to handle. First, it might of course happen that a smart contract is invoked without specifying a method at all. This happens if someone simply sends Ether to a smart contract (maybe without even knowing that the target address is a smart contract) and leaves the data field in the transaction (which, as we know, contains a hash of the target function to be called) empty. To handle this case, Solidity defines a special function receive. If this function is present in a contract, and the contract is called without specifying a target function, this method will be executed.

A similar mechanism exists to cover the case that a contract is invoked with a target function that does not exist, or is invoked with no target function and no receive function exists. This special function is called the fallback function (in previous versions of Solidity, fallback and receive functions were identical). If neither of these functions is present, the execution will fail.

Send and transfer

Having discussed how a smart contract can receive Ether, let us now discuss how a smart contract can actually send Ether. Solidity offers different ways to do this. First, there is the send method. This is a method of an address object in Solidity and can be used to transfer a certain amount of Ether from the contract address to an arbitrary address. So you could do something like

address payable receiver =  payable(address(0xFC2a2b9A68514E3315f0Bd2a29e900DC1a815a1D));        
// Be careful, do NOT do this!
receiver.send(100);

to send 100 Wei to the target address receiver (note that in recent versions of Solidity, an address to which you want to send Ether needs to be marked as payable). However, this code already contains a major issue – it does not check the return value of send!

In fact, send does return true if the transfer was successful and false if the transfer failed (for instance because the current balance is not sufficient, or because the target is a smart contract without a receive or fallback function, or if the target is a contract with a receive function, but this function runs out of gas). If, as in this example, you do not check the return code, a failed transfer will go unnoticed. As an illustration, let us consider a famous example where exactly this happened – the King of the Ether contract. The idea of this contract was that by paying a certain amount of Ether, you could claim a virtual throne and be appointed "King of the Ether". If someone else now pays an amount equal to what you have paid times a certain factor, this person becomes the new King, and you receive the amount that you invested minus a fee. In the source code of v0.4 of the contract, the broken section looks as follows (I have added a few comments not present in the original source code to make it easier to read the snippet without having the full context)

// we get to this point in the code if someone has paid enough to
// become the new king
// valuePaid is the Ether paid by the current king
// wizardCommission is a fee that remains in the account
// of the contract and can be claimed by the contract owner (wizard) 
uint compensation = valuePaid - wizardCommission;

// In its initial state, the current monarch is the wizard
// so we check for this
if (currentMonarch.etherAddress != wizardAddress) {
  // here we send the Ether minus the fees back 
  // to the current king
  currentMonarch.etherAddress.send(compensation);
} else {
  // When the throne is vacant, the fee accumulates for the wizard.
}

Note how send is used without checking the return code. What actually happened is that some people who held the throne did apparently use what is called a contract based wallet, i.e. a wallet that manages your Ether in a smart contract. Thus, the address of the current king (currentMonarch) was actually a smart contract. If a smart contract receives Ether, then, as we have seen above, it will execute a function. Now send only makes a very small amount of gas (2300 to be precise) available to the called contract (this is called the gas stipend, and we will dive into this and how a call actually works under the hood in a later post), which was not sufficient to run the code. So the called contract failed, but, as the return value was not checked, the calling contract continued, effectively stealing the compensation instead of paying it out.

The withdrawal pattern

It is interesting to discuss how this can be fixed. The obvious idea might be to check the return value and revert the transaction if it is false. Alternatively, one can use the second method that Solidity offers to transfer Ether – the transfer method, which will revert if the transfer fails. This, however, results in a new problem, as it allows for a denial-of-service attack.

To see this, suppose that a contract successfully claims the throne, and then someone else tries to become the new king, resulting in the execution of the code above. Suppose we use transfer instead of send. Now the contract which is the current king might be a malicious contract with a receive function that always reverts, or no receive function at all. Then, any attempt to become the new king will be reverted, and the contract is stuck forever.

This is a very general problem that you will face whenever a method of a smart contract calls another contract – you cannot rely on the other contract to cooperate, and it is dangerous to assume that the call will be successful. Therefore, the Solidity documentation recommends a pattern known as the withdrawal pattern. In our specific case, this would work as follows. Instead of immediately paying out the compensation, you would store the claim in the contract state and allow the previous king to call a withdraw method that does the transfer, maybe like this.

// this replaces currentKing.send(compensation)
claims[currentKing] += compensation;
// code goes on...


// a new function that allows the current king to collect the compensation
function withdraw() public {
  uint256 claim = claims[msg.sender];
  if (claim > 0) {
    claims[msg.sender] = 0;
    payable(msg.sender).transfer(claim);
  }
  else {
    revert("Unjustified claim");
  }
}

Why would this help? Suppose an attacker implements a contract that reverts if Ether is sent to it. If this contract is the current king and someone else claims the throne, enthroning the new king will work, because the transfer is contained in the separate function withdraw. If the attacker now invokes this function, it will still revert, but this will not impact the functionality of the contract for other users, so no denial of service (impacting anyone except the attacker) will result.

Reentrancy attacks and TheDAO

Let us suppose for a moment that in the code snippet above, we had chosen a slightly different order of the statements, and, in addition, had decided to use a low-level call to transfer the money, like this

(bool success, bytes memory data) = msg.sender.call{value: claim}("");
require(success, "Failed to send Ether");
claims[msg.sender] = 0;

Here, we use the call method of an address, which has the advantage over transfer that it does not only make the minimum of 2300 units of gas available to the called contract, but all gas remaining at this point. This makes the contract less vulnerable to errors resulting from non-trivial receive functions, which is the reason why it is sometimes recommended to use this approach instead of transfer.

This would in fact make our contract vulnerable again, this time to a class of attacks known as re-entrancy attacks. To exploit this vulnerability, an attacker would have to prepare a malicious contract that enthrones itself and whose receive function calls the withdraw function again (but with a depth of at most one). If now someone else has claimed the throne and the malicious contract calls withdraw, the following things would happen.

  1. The malicious contract calls withdraw for the first time
  2. withdraw initiates the transfer of the current claim to the malicious contract
  3. the receive function of the malicious contract is invoked
  4. the receive function calls withdraw once more
  5. at this point in time, the variable claims[msg.sender] still has its original, non-zero value
  6. so the same transfer is made again
  7. both transfers succeed, and the claim is overwritten by zero twice

As a result, the claim is transferred twice to the malicious contract (assuming, of course, that the King of the Ether contract has a sufficient balance). Of course, instead of invoking the function twice, you can let the receive function call back into the contract several times, thus multiplying the amount transferred by the number of calls, limited only by the stack size and the available gas. This sort of vulnerability was the root cause of the famous TheDAO hack, which eventually led to a fork of the Ethereum blockchain.

Note that in this case, using transfer instead of call would actually protect against this sort of attack, as the second call into the King of the Ether contract would require more gas than transfer makes available.

Create2 and the illusion of immutable contracts

Smart contracts are immutable – are they? Well, actually no – there are several ways to change the behaviour of a smart contract after it has been deployed. First, you could of course build a switch into your contract that only the owner can control. A bit more advanced, a contract can act as a proxy, delegating calls to another contract, and the contract owner could change the address of the target contract while keeping the address of the proxy the same.

An additional option has been created with EIP-1014. This proposal, which went live with the Constantinople hard fork in 2019, introduced a new opcode CREATE2 which allows for the creation of a contract with a predictable address. Recall that when a contract is created, the contract address is determined from the address of the creator and its current nonce. This makes it difficult to predict the address of the contract, unless you use an account for contract creation whose nonce is kept stable. When using CREATE2 instead, the contract address is taken to be the hash value of a combination of the sender address, a salt and the init code of the contract to be created.
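
The address derivation defined in EIP-1014 is simple enough to reproduce in a few lines of Python. Here is a sketch using the eth_utils package that comes with web3 – the deployer address is just the sample address used earlier in this series, and the salt and init code are arbitrary placeholders.

from eth_utils import keccak, to_checksum_address

def create2_address(deployer, salt, init_code):
    # address = last 20 bytes of keccak256(0xff ++ deployer ++ salt ++ keccak256(init_code))
    preimage = b"\xff" + bytes.fromhex(deployer[2:]) + salt + keccak(init_code)
    return to_checksum_address(keccak(preimage)[12:])

create2_address(
    "0x7D72Df7F4C7072235523A8FEdcE9FF6D236595F3",   # deployer
    (0).to_bytes(32, "big"),                        # 32 byte salt
    bytes.fromhex("6080604052")                     # some init code
)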

The problem with this is, however, that the init code does not fully determine the runtime bytecode. Recall that the init code is bytecode that is executed at deployment time, and whose return value will be stored and used as the actual contract code executed at runtime (we will see this in action in the next post). The init code could, for instance, retrieve the actual runtime bytecode by calling into another contract. If the state of this contract is changed to return a different bytecode, the init code will still be the same. Thus, by using CREATE2 repeatedly with the same init code and salt, different versions of a contract could be stored at the same address.

To avoid this, the creators of EIP-1014 introduced a safeguard – if the target address already contains code or has a non-zero nonce, the invocation will fail. However, there is a loophole, which works as follows.

  1. Prepare an init bytecode that gets the actual runtime bytecode from a different contract, as outlined above
  2. Use CREATE2 with this init bytecode to deploy the contract to a specific, predictable address
  3. In the runtime bytecode, include a method that executes the SELFDESTRUCT opcode (protected by the condition that it only executes if the sender address is an address that you control). This is an opcode that will effectively wipe out the code of a contract and set the nonce of the contract address back to zero
  4. Motivate people to deposit something of value in your contract, maybe Ether or token
  5. At any point in time, you could now use this method to remove the existing contract. At this point, the nonce and code are both zero. You could now invoke CREATE2 once more to deploy a new contract to the same address with a different runtime bytecode, which maybe steals whatever assets have been deposited in the old contract

In this way, the functionality of a smart contract can be changed without anyone noticing it. Of course, this only works under specific conditions, the most important one being that the contract needs to contain the SELFDESTRUCT opcode. The only real protection is to have a look at the contract source code (or even at the runtime bytecode) before trusting it and to be alerted if the contract has a SELFDESTRUCT in it (or uses an instruction like DELEGATECALL to invoke code that contains a SELFDESTRUCT). It seems that Etherscan is now able to track contract recreation using CREATE2. Here is an example from the Ropsten test network (note the "Reinit" flag being displayed on the contract tab), and here is an example from mainnet.

This concludes our post for today. There are many more security considerations and pitfalls that you should be aware of whenever you develop a smart contract that is going to be used on a real network with real money being involved. In the next section, I have listed a few references that you might want to consult to learn more about smart contract security. I hope you found this interesting and see you again in the next post, in which we will take a closer look at how Solidity translates your source code into EVM bytecode.

References

Here is a list of references that I found useful while collecting the material for this post.

  1. OpenZeppelin has a rather comprehensive list of post-mortems on its web site
  2. Consensys maintains a collection of best practices for smart contracts that explain common vulnerabilities and how to protect against them
  3. The Solidity documentation contains a section on security considerations
  4. This paper contains a classification of common vulnerabilities and discusses which of them can be avoided by using Vyper instead of Solidity as a smart contract language
  5. A similar list can be found in this conference paper
  6. The implications of the CREATE2 opcode have been discussed in detail here
  7. Finally, the documentation on ethereum.org contains a section on security considerations as well

Compiling and deploying a smart contract with geth and Python

In our last post, we have been cheating a bit – I have shown you how to use the web3 Python library to access an existing smart contract, but in order to compile and deploy, we have still been relying on Brownie. Time to learn how this can be done with web3 and the Python-Solidity compiler interface as well. Today, we will also use the Go-Ethereum client for the first time. This will be a short post and the last one about development tools before we then turn our attention to token standards.

Preparations

To follow this post, there are again a couple of preparatory steps. If you have read my previous posts, you might already have completed some of them, but I have decided to list them here once more, in case you are just joining us or start over with a fresh setup. First, you will have to install the web3 library (unless, of course, you have already done this before).

sudo apt-get install python3-pip python3-dev gcc
pip3 install web3

The next step is to install the Go-Ethereum (geth) client. As the client is written in Go, it comes as a single binary file, which you can simply extract from the distribution archive (which also contains the license) and copy to a location on your path. As we have already put the Brownie binary into .local/bin, I have decided to go with this as well.

cd /tmp
wget https://gethstore.blob.core.windows.net/builds/geth-linux-amd64-1.10.6-576681f2.tar.gz
gzip -d geth-linux-amd64-1.10.6-576681f2.tar.gz
tar -xvf  geth-linux-amd64-1.10.6-576681f2.tar
cp geth-linux-amd64-1.10.6-576681f2/geth ~/.local/bin/
chmod 700 ~/.local/bin/geth
export PATH=$PATH:$HOME/.local/bin

Once this has been done, it is time to start the client. We will talk more about the various options and switches in a later post, when we will actually use the client to connect to the Rinkeby testnet. For today, you can use the following command to start geth in development mode.

geth --dev --datadir=~/.ethereum --http

In this mode, geth will be listening on port 8545 of your local PC and bring up a local, single-node blockchain, quite similar to Ganache. New blocks will automatically be mined as needed, regardless of the gas price of your transactions, and one account will be created which is unlocked and at the same time the beneficiary of newly mined blocks (so do not worry, you have plenty of Ether at your disposal).

Compiling the contract

Next, we need to compile the contract. Of course, this comes down to running the Solidity compiler, so we could go ahead, download the compiler and run it. To do this with Python, we could of course invoke the compiler as a subprocess and collect its output, thus effectively wrapping the compiler into a Python class. Fortunately, someone else has already done all of the hard work and created such a wrapper – the py-solc-x library (a fork of a previous library called py-solc). To install it and to instruct it to download a specific version of the compiler, run the following commands (this will install the compiler in ~/.solcx)

pip3 install py-solc-x
python3 -m solcx.install v0.8.6
~/.solcx/solc-v0.8.6 --version

If the last command spits out the correct version, the binary is installed and we are ready to use it. Let us try this out. Of course, we need a contract – we will use the Counter contract from the previous posts again. So go ahead, grab a copy of my repository and bring up an interactive Python session.

git clone https://github.com/christianb93/nft-bootcamp
cd nft-bootcamp
ipython3

How do we actually use solcx? The wrapper offers a few functions to invoke the Solidity compiler. We will use the so-called JSON input-output interface. With this approach, we need to feed a JSON structure into the compiler, which contains information like the code we want to compile and the output we want the compiler to produce, and the compiler will spit out a similar structure containing the results. The solcx package offers a function compile_standard which wraps this interface. So we need to prepare the input (consult the Solidity documentation to better understand what the individual fields mean), call the wrapper and collect the output.

import solcx
source = "contracts/Counter.sol"
file = "Counter.sol"
spec = {
        "language": "Solidity",
        "sources": {
            file: {
                "urls": [
                    source
                ]
            }
        },
        "settings": {
            "optimizer": {
               "enabled": True
            },
            "outputSelection": {
                "*": {
                    "*": [
                        "metadata", "evm.bytecode", "abi"
                    ]
                }
            }
        }
    };
out = solcx.compile_standard(spec, allow_paths=".");

The output is actually a rather complex data structure. It is a dictionary that contains the contracts created as a result of the compilation as well as a reference to the source code. The contracts are again structured by source file and contract name. For each contract, we have the ABI, a structure called evm that contains the bytecode as well as the corresponding opcodes, and some metadata like the details of the compiler version used. Let us grab the ABI and the bytecode that we will need.

abi = out['contracts']['Counter.sol']['Counter']['abi']
bytecode = out['contracts']['Counter.sol']['Counter']['evm']['bytecode']['object']

Deploying the contract

Let us now deploy the contract. First, we will have to import web3 and establish a connection to our geth instance. We have done this before for Ganache, but there is a subtlety explained here – the PoA implementation that geth uses has extended the length of the extra data field of a block. Fortunately, web3 ships with a middleware that we can use to perform a mapping between this block layout and the standard.

import web3
w3 = web3.Web3(web3.HTTPProvider("http://127.0.0.1:8545"))
from web3.middleware import geth_poa_middleware
w3.middleware_onion.inject(geth_poa_middleware, layer=0)

Once the middleware is installed, we first get an account that we will use – this is the first and only account managed by geth in our setup, and is the coinbase account with plenty of Ether in it. Now, we want to create a transaction that deploys the smart contract. Theoretically, we know how to do this. We need a transaction that has the bytecode as data and the zero address as to address. We could probably prepare this manually, but things are a bit more tricky if the contract has a constructor which takes arguments (we will need this later when implementing our NFT). Instead of going through the process of encoding the arguments manually, there is a trick – we first build a local copy of the contract which is not yet deployed (and therefore has no address, so that calls to it will fail – try it), then call its constructor() method to obtain a ContractConstructor (this is where the arguments would go) and finally invoke its method buildTransaction to get a transaction that we can use. We can then send this transaction (if, as in our case, the account we want to use is managed by the node) or sign and send it as demonstrated in the last post.

me = w3.eth.get_accounts()[0];
temp = w3.eth.contract(bytecode=bytecode, abi=abi)
txn = temp.constructor().buildTransaction({"from": me}); 
txn_hash = w3.eth.send_transaction(txn)
txn_receipt = w3.eth.wait_for_transaction_receipt(txn_hash)
address = txn_receipt['contractAddress']

Now we can interact with our contract. As the temp contract is of course not the deployed contract, we first need to get a reference to the actual contract as demonstrated in the previous post – which we can do, as we have the ABI and the address in our hands – and can then invoke its methods as usual. Here is an example.

counter = w3.eth.contract(address=address, abi=abi)
counter.functions.read().call()
txn_hash = counter.functions.increment().transact({"from": me});
w3.eth.wait_for_transaction_receipt(txn_hash)
counter.functions.read().call()

This completes our post for today. Looking back at what we have achieved in the last few posts, we are now proud owners of an entire arsenal of tools and methods to compile and deploy smart contracts and to interact with them. Time to turn our attention away from the simple counter that we have used so far to demonstrate this and towards more complex contracts. With the next post, we will actually get into one of the most exciting use cases of smart contracts – tokens. Hope to see you soon.

Using web3.py to interact with an Ethereum smart contract

In the previous post, we have seen how we can compile and deploy a smart contract using Brownie. Today, we will learn how to interact with our smart contract using Python and the Web3 framework which will also be essential for developing a frontend for our dApp.

Getting started with web3.py

In this section, we will learn how to install web3 and how to use it to talk to an existing smart contract. For that purpose, we will once more use Brownie to run a test client and to deploy an instance of our Counter contract to it. So please go ahead and repeat the steps from the previous post to make sure that an instance of Ganache is running (so do not close the Brownie console) and that there is a copy of the Counter smart contract deployed to it. Also write down the contract address which we will need later.

Of course, the first thing will again be to install the Python package web3, which is as simple as running pip3 install web3. Make sure, however, that you have GCC and the Python development package (python3-dev on Ubuntu) on your machine, otherwise the install will fail. Once this completes, type ipython3 to start an interactive Python session.

Before we can do anything with web3, we of course need to import the library. We can then make a connection to our Ganache server and verify that the connection is established by asking the server for its version string.

import web3
w3 = web3.Web3(web3.HTTPProvider('http://127.0.0.1:8545'))
w3.clientVersion

This is a bit confusing, with the word web3 occurring at no less than three points in one line of code, so let us dig a bit deeper. First, there is the module web3 that we have imported. Within that module, there is a class HTTPProvider. We create an instance of this class that connects to our Ganache server running on port 8545 of localhost. With this instance, we then call the constructor of another class, called Web3, which is again defined inside of the web3 module. This class is dynamically enriched at runtime, so that all namespaces of the API can be accessed via the resulting object w3. You can verify this by running dir(w3) – you should see attributes like net, eth or ens that represent the various namespaces of the JSON RPC API.

Next, let us look at accounts. We know from our previous post that Ganache has ten test accounts under its control. Let us grab one of them and check its balance. We can do this by using the w3 object that we have just created to invoke methods of the eth API, which then translate more or less directly into the corresponding RPC calls.

me = w3.eth.get_accounts()[0]
w3.eth.get_balance(me)

What about transactions? To see how transactions work, let us send 10 Ether to another address. As we plan to re-use this address later, it is a good idea to use an address with a known private key. In the last post, we have seen how Brownie can be used to create an account. There are other tools that do the same thing like clef that comes with geth. For the purpose of this post, I have created the following account.

Address:  0x7D72Df7F4C7072235523A8FEdcE9FF6D236595F3
Key:      0x5777ee3ba27ad814f984a36542d9862f652084e7ce366e2738ceaa0fb0fff350

Let us transfer Ether to this address. To create and send a transaction with web3, you first build a dictionary that contains the basic attributes of the transaction. You then invoke the API method send_transaction. As the key of the sender is controlled by the node, the node will then automatically sign the transaction. The return value is the hash of the transaction that has been generated. Having the hash, you can now wait for the transaction receipt, which is issued once the transaction has been included in a block and mined. In our test setup, this will happen immediately, but in reality, it could take some time. Finally, you can check the balance of the involved accounts to see that this worked.

alice = "0x7D72Df7F4C7072235523A8FEdcE9FF6D236595F3"
value = w3.toWei(10, "ether")
txn = {
  "from": me,
  "to": alice,
  "value": value,
  "gas": 21000,
  "gasPrice": 0
}
txn_hash = w3.eth.send_transaction(txn)
w3.eth.wait_for_transaction_receipt(txn_hash)
w3.eth.get_balance(me)
w3.eth.get_balance(alice)

Again, a few remarks are in order. First, we do not specify the nonce; this will be added automatically by the library. Second, this transaction, using a gas price, is a "pre-EIP-1559" or "pre-London" transaction. With the London hardfork, you would instead specify a maximum fee per gas and a priority fee per gas. As I started to work on this series before London became effective, I will stick to legacy transactions throughout this series. Of course, in a real network, you would also not use a gas price of zero.

A second important point to be aware of is timing. When we call send_transaction, we hand the transaction over to the node which signs it and publishes it on the network. At some point, the transaction is included in a block by a miner, and only then, a transaction receipt becomes available. This is why we call wait_for_transaction_receipt which actively polls the node (at least when we are using a HTTP connection) until the receipt is available. There is also a method get_transaction_receipt that will return a transaction receipt directly, without waiting for it, and it is a common mistake to call this too early.

Also, note the conversion of the value. Within a transaction, values are always specified in Wei, and the library contains a few helper functions to easily convert from Wei into other units and back. Finally, note that the gas limit that we use is the standard gas usage of a simple transaction. If the target account is a smart contract and additional code is executed, this will not be sufficient.
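
As a quick illustration of these conversion helpers, you can convert balances back into Ether for display – a small sketch using the accounts from above.

w3.fromWei(w3.eth.get_balance(alice), "ether")   # Decimal('10')
w3.toWei(1, "gwei")                              # 1000000000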

Now let us try to get some Ether back from Alice. As the account is not managed by the node, we will now have to sign the transaction ourselves. The flow is very similar. We first build the transaction dictionary. We then use the helper class Account to sign the transaction. This will return a tuple consisting of the hash that was signed, the raw transaction itself, and the r, s and v values from the ECDSA signature algorithm. We can then pass the raw transaction to the eth.send_raw_transaction call.

nonce = w3.eth.get_transaction_count(alice)
refund = {
  "from": alice,
  "to": me,
  "value": value, 
  "gas": 21000,
  "gasPrice": 0,
  "nonce": nonce
}
key = "0x5777ee3ba27ad814f984a36542d9862f652084e7ce366e2738ceaa0fb0fff350"
signed_txn = w3.eth.account.sign_transaction(refund, key)
txn_hash = w3.eth.send_raw_transaction(signed_txn.rawTransaction)
w3.eth.wait_for_transaction_receipt(txn_hash)
w3.eth.get_balance(me)
w3.eth.get_balance(alice)

Note that this time, we need to include the nonce (as it is part of the data which is signed). We use the current nonce of the address of Alice, of course.

Interacting with a smart contract

So far, we have covered the basic functionality of the library – creating, signing and submitting transactions. Let us now turn to smart contracts. As stated above, I assume that you have fired up Brownie and deployed a version of our smart contract. The contract address that Brownie gave me is 0x3194cBDC3dbcd3E11a07892e7bA5c3394048Cc87, which should be identical to your result as it only depends on the nonce and the account, so it should be the same as long as the deployment is the first transaction that you have done after restarting Ganache.

To access a contract from web3, the library needs to know how the arguments and return values need to be encoded and decoded. For that purpose, you will have to specify the contract ABI. The ABI – in a JSON format – is generated by the compiler. When we deploy using Brownie, we can access it using the abi attribute of the resulting object. Here is the ABI in our case.

abi = [
    {
        'anonymous': False,
        'inputs': [
            {
                'indexed': True,
                'internalType': "address",
                'name': "sender",
                'type': "address"
            },
            {
                'indexed': False,
                'internalType': "uint256",
                'name': "oldValue",
                'type': "uint256"
            },
            {
                'indexed': False,
                'internalType': "uint256",
                'name': "newValue",
                'type': "uint256"
            }
        ],
        'name': "Increment",
        'type': "event"
    },
    {
        'inputs': [],
        'name': "increment",
        'outputs': [],
        'stateMutability': "nonpayable",
        'type': "function"
    },
    {
        'inputs': [],
        'name': "read",
        'outputs': [
            {
                'internalType': "uint256",
                'name': "",
                'type': "uint256"
            }
        ],
        'stateMutability': "view",
        'type': "function"
    }
]

This looks a bit intimidating, but is actually not so hard to read. The ABI is a list, and each entry describes either an event or a function. For both events and functions, the inputs, i.e. the parameters, are specified, and similarly the outputs are described. Every parameter has a type (Solidity distinguishes between the internal type and the type used for encoding) and a name. For events, the parameters can be indexed. In addition, there are some specifiers for functions, like the information whether it is a view or not.

Let us start to work with the ABI. Run the assignment above (or load the ABI from the artifact) to get the ABI into a variable abi in your ipython session. Having this, we can now instantiate an object that represents the contract within web3. To talk to a contract, the library needs to know the contract address and its ABI, and these are the parameters that we need to specify.

address = "0x3194cBDC3dbcd3E11a07892e7bA5c3394048Cc87"
counter = w3.eth.contract(address=address, abi=abi)

It is instructive to use dir and help to better understand the object that this call returns. It has an attribute called functions that is a container class for the functions of the contract. Each contract function shows up as a method of this object. Calling this method, however, does not invoke the contract yet, but instead returns an object of type ContractFunction. Once we have this object, we can either use it to make a call or a transaction (this two-step approach reminds me a bit of a prepared statement in embedded SQL).

Let us see how this works – we will first read out the counter value, then increment by one and then read the value again.

counter.functions.read().call()
txn_hash = counter.functions.increment().transact({"from": me})
w3.eth.wait_for_transaction_receipt(txn_hash)
counter.functions.read().call()

Note how we pass the sender of the transaction to the transact method – we could just as well include other parameters like the gas price, the gas limit or the nonce at this point. You cannot, however, pass the data field, as it will be filled by the library when it encodes the call.

Another important point is how parameters to the contract method need to be handled. Suppose we had a method add(uint256) which would allow us to increase the counter not by one, but by some provided value. To increase the counter by x, we would then have to run

counter.functions.add(x).transact({"from": me})

Thus the parameters of the contract method need to be part of the call that creates the ContractFunction, and not be included in the transaction.
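If the sender is an account that the node does not manage, we can combine this with what we have learned about raw transactions. A ContractFunction can build a transaction dictionary for us, which we then sign and submit ourselves. The following sketch assumes the hypothetical add method from above and web3.py v5, where the method is called buildTransaction (newer versions use build_transaction instead).

# build the transaction for the (hypothetical) add method, sign it with Alice's key and submit it
nonce = w3.eth.get_transaction_count(alice)
txn = counter.functions.add(5).buildTransaction({
    "from": alice,
    "gas": 100000,
    "gasPrice": 0,
    "nonce": nonce
})
signed_txn = w3.eth.account.sign_transaction(txn, key)
txn_hash = w3.eth.send_raw_transaction(signed_txn.rawTransaction)
w3.eth.wait_for_transaction_receipt(txn_hash)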

So far we have seen how we can connect to an RPC server, submit transactions, get access to already deployed smart contracts and invoke their functions. The web3 API has a bit more to offer, and I urge you to read the documentation and, in ipython, play around with the built-in help function to browse through the various objects that make up the library. In the next post, we will learn how to use web3 to not only talk to an existing smart contract, but also to compile and deploy a contract.

Fun with Solidity and Brownie

For me, choosing the featured image for a post is often the hardest part of writing it, but today, the choice was clear and I could not resist. But back to business – today we will learn how Brownie can be used to compile smart contracts, deploy them to a test chain, interact with the contract and test it.

Installing Brownie

As Brownie comes as a Python3 package (eth-brownie), installing it is rather straightforward. The only dependency not handled by the package manager is Ganache, which Brownie uses as its built-in node. On Ubuntu 20.04, for instance, you would run

sudo apt-get update
sudo apt-get install python3-pip python3-dev npm
pip3 install eth-brownie
sudo npm install -g ganache-cli@6.12.1

Note that by default, Brownie will install itself in .local/bin in your home directory, so you will probably want to add this to your path.

export PATH=$PATH:$HOME/.local/bin

Setting up a Brownie project

To work correctly, Brownie expects to be executed at the root of a directory tree that has a certain standard layout. To start from scratch, you can use the command brownie init to create such a tree (do not do this yet but read on). Brownie will create the following directories.

  • contracts – this is where Brownie expects you to store all your smart contracts as Solidity source files
  • build – Brownie uses this directory to store the results of a compile
  • interfaces – this is similar to contracts, place any interface files that you want to use here (it will become clearer a bit later what an interface is)
  • reports – this directory is used by Brownie to store reports, for instance code coverage reports
  • scripts – used to store Python scripts, for instance for deployments
  • tests – this is where all the unit tests should go

As some of the items that Brownie maintains should not end up in my GitHub repository, I typically create a subdirectory in the repository that I add to the gitignore file, set up a project inside this subdirectory and then create symlinks to the contracts and tests that I actually want to use. If you want to follow this strategy, use the following commands to clone the repository for this series and set up the Brownie project.

git clone https://github.com/christianb93/nft-bootcamp
cd nft-bootcamp
mkdir tmp
cd tmp
brownie init
cd contracts
ln -s ../../contracts/* .
cd ../tests
ln -s ../../tests/* .
cd ..

Note that all further commands should be executed from the tmp directory, which is now the project root directory from Brownie's point of view.

Compiling and deploying a contract

As our first step, let us try to compile our counter. Brownie will happily compile all contracts that are stored in the project when you run

brownie compile

The first time you execute this command, you will find that Brownie actually downloads a copy of the Solidity compiler which it then invokes behind the scenes. By default, Brownie will not recompile contracts that have not changed, but you can force a recompile via the --all flag.

Once the compile has been done, let us enter the Brownie console. This is actually the tool which you mostly use to interact with Brownie. Essentially, the console is an interactive Python console with the additional functionality of Brownie built into it.

brownie console

The first thing that Brownie will do when you start the console is to look for a running Ethereum client on your machine. Brownie expects this client to sit at port 8545 on localhost (we will learn later how this can be changed). If no such client is found, it will automatically start Ganache and, once the client is up and running, you will see a prompt. Let us now deploy our contract. At the prompt, simply enter

counter = Counter.deploy({"from": accounts[0]});

There are a couple of comments in order. First, to make a deployment, we need to provide an account with which the deployment transaction that Brownie creates will be signed. As we will see in the next section, Ganache provides a set of standard accounts that we can use for that purpose. Brownie stores those in the accounts array, and we simply select the first one.

Second, the Counter object that we reference here is an object that Brownie creates on the fly. In fact, Brownie will create such an object for every contract that it finds in the project directory, using the same name as that of the contract. This is a container and does not yet reference a deployed contract, but it offers a method to deploy a contract, and this method returns another object that points to the newly deployed contract. Brownie will also add methods to this object that correspond to the methods of the smart contract it represents, so that we can simply invoke these methods to talk to the contract. In our case, running

dir(counter)

will show you that the newly created object has methods read and increment, corresponding to those of our contract. So to get the counter value, increment it by one and get the new value, we could simply do something like

# This should return zero
counter.read()
txn = counter.increment()
# This should now return one
counter.read()

Note that by default, Brownie uses the account that deployed the contract as the “from” account of the resulting transaction. This – and other attributes of the transaction – can be modified by passing a dictionary with the transaction attributes to be used as the last parameter to the method, i.e. to increment in our case.
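For instance, to submit the increment transaction from a different test account and with an explicit gas limit, you could do something like the following sketch (the exact set of supported keys is described in the Brownie documentation).

# submit the transaction from the second test account with an explicit gas limit
txn = counter.increment({"from": accounts[1], "gas_limit": 100000})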

It is also instructive to look at the transaction object that the second statement has created. To read the logs, for instance, you can use

txn.logs

This will again show you the typical fields of a log entry – the address, the data and the topics. To read the interpreted version of the logs, i.e. the events, use

txn.events

The transaction (which is actually the transaction receipt) has many more interesting fields, like the gas limit, the gas used, the gas price, the block number and even a full trace of the execution on the level of single instructions (which seems to be what Geth calls a basic trace). To get a nicely formatted and comprehensive overview over some of these fields, run

txn.info()
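If you want to access individual fields programmatically instead, they are available as attributes of the transaction object. The attribute names below are those used by current Brownie versions, so treat this as a sketch.

txn.gas_used       # gas actually consumed by the transaction
txn.gas_limit      # gas limit of the transaction
txn.block_number   # block in which the transaction was included
txn.trace          # instruction-level trace (retrieved lazily)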

Accounts in Brownie

Accounts can be very confusing when working with Ethereum clients, and this is a good point in time to shed some light on this. Obviously, when you want to submit a transaction, you will have to get access to the private key of the sender at some point to be able to sign it. There are essentially three different approaches to how this can be done. The naming of these approaches is not standardized; here is the terminology that I prefer to use.

First, an account can be node-managed. This simply means that the node (i.e. the Ethereum client running on the node) maintains a secret store somewhere, typically on disk, and stores the private keys in this secret store. Obviously, clients will usually encrypt the private key and use a password or passphrase for that purpose. How exactly this is done is not formally standardized, but both Geth and Ganache implement an additional API with the personal namespace (see here for the documentation), and also OpenEthereum offers such an API, albeit with slightly different methods. Using this API, a user can

  • create a new account which will then be added to the key store by the client
  • get a list of managed accounts
  • sign a transaction with a given account
  • import an account, for instance by specifying its private key
  • lock an account, which stops the client from using it
  • unlock an account, which allows the client to use it again for a certain period of time

When you submit a transaction to a client using the eth_sendTransaction API method, the client will scan the key store to see whether it has the private key for the requested sender on file. If yes, it is able to sign the transaction and submit it (see for instance the source code here).
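To see this in action, you can talk to the personal namespace directly from the Brownie console using the geth module of web3.py. This is only a sketch; it assumes that the client you are connected to really implements these methods, and the method names are those of web3.py v5.

# create a new node-managed account, protected by a passphrase
new_account = web3.geth.personal.new_account("secret")
# list the accounts that the node manages
web3.geth.personal.list_accounts()
# unlock the account for 60 seconds so that the node can sign with it
web3.geth.personal.unlock_account(new_account, "secret", 60)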

Be very careful when using this mechanism. Even though this is great for development and testing environments, it implies that while the account is unlocked, everybody with access to the API can submit transactions on your behalf! In fact, there are systematic scans going on (see for instance this article) to detect unlocked accounts and exploit them, so do not say I have not warned you….

In Brownie, we can inspect the list of node-managed accounts (i.e. accounts managed by Ganache in our case) using either the accounts array or the corresponding API call.

web3.eth.get_accounts()
accounts

You will see the same list of ten accounts using both methods, with the difference that the accounts array contains objects that Brownie has built for you, while the first method only returns an array of strings.

Let us now turn to the second method – application-managed accounts. Here, the application that you use to access the blockchain (a program, a frontend or a tool like Brownie) is in charge of managing the accounts. It can do so by storing the accounts locally, protected again by some password, or in memory. When an application wants to send a transaction, it now has to sign the transaction using the private key, and would then use the eth_sendRawTransaction method to submit the signed transaction to the network.

Brownie supports this method as well. To illustrate this, enter the following sequence of commands in the Brownie console to create two new accounts, transfer some Ether (1000 wei, to be precise, as the amount argument of transfer is interpreted in wei) from one of the test accounts to the first of the newly created accounts, and then prepare and sign a transaction that transfers a few wei to the second account.

# Will spit out a mnemonic
me = accounts.add()
alice = accounts.add()
# Give me some Ether
accounts[0].transfer(to=me.address, amount=1000);
txn = {
  "from": me.address,
  "to": alice.address,
  "value": 10,
  "gas": 21000,
  "gasPrice": 0,
  "nonce": me.nonce
}
txn_signed = web3.eth.account.sign_transaction(txn, me.private_key)
web3.eth.send_raw_transaction(txn_signed.rawTransaction)
alice.balance()

When you now run the accounts command again, you will find two new entries representing the two accounts that we have added. However, these entries are now of type LocalAccount, and web3.eth.get_accounts() will show that they have not been added to the node, but are managed by Brownie.

Note that Brownie will not store these accounts on disk unless you tell it to do so. By default, Brownie keeps each local account in a separate file. To save your account, enter

me.save("myAccount")

which will prompt you for a password and then store the account in a file called myAccount. When you now exit Brownie and start it again, you can load the account by running

me = accounts.load("myAccount")

You will then be prompted once more for the password, and assuming that you supply the correct password, the account will again be available.

More or less the same code would apply if you had chosen to go for the third method – user-managed accounts. In this approach, the private key is never stored by the application. The user is responsible for managing accounts, and the private key is only presented to the application, via a parameter or an input field, when a transaction is to be made. The application will never store the account or the private key (of course, it has to reside in memory for some time), and the user has to enter the private key for every single transaction. We will see an example of this when we deploy a contract using Python and web3 in a later post.
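To get an idea of what this looks like, here is a sketch in which the private key is requested from the user via a prompt right before signing and is never persisted by the application. The use of getpass and the reuse of the addresses from the example above are just illustrative assumptions.

import getpass
# the application only knows the address; the key is entered by the user on demand
sender = me.address   # reusing the address from above just for illustration
private_key = getpass.getpass("Private key: ")
txn = {
  "from": sender,
  "to": alice.address,
  "value": 10,
  "gas": 21000,
  "gasPrice": 0,
  "nonce": web3.eth.get_transaction_count(sender)
}
signed = web3.eth.account.sign_transaction(txn, private_key)
web3.eth.send_raw_transaction(signed.rawTransaction)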

Writing and running unit tests

Before closing this post, let us take a look at another nice feature of Brownie – unit testing. Brownie expects tests to be located in the tests subdirectory, in files whose names start with the prefix “test_”. Within such a file, Brownie will then look for functions prefixed with “test_” as well and run them as unit tests, using pytest.

Let us look at an example. In my repository, you will find a file test_Counter.py (which should already be symlinked into the tests directory of the Brownie project tree if you have followed the instructions above to initialize the directory tree). If you have ever used pytest before, this file contains a few things that will look familiar to you – there are test methods, and there are some fixtures. Let us focus on those parts which are specific to the use in combination with Brownie. First, there is the line

from brownie import Counter, accounts, exceptions;

This imports a few objects and makes them available, similar to the Brownie console. The most important one is the Counter object itself, which will allow us to deploy instances of the counter and test them. Next, we need access to a deployed version of the contract. This is handled by a fixture which uses the Counter object that we have just imported.

@pytest.fixture
def counter():
    return accounts[0].deploy(Counter);

Here, we use the deploy method of a Brownie Account object, which is an alternative way to specify the account from which the deployment will be done. We can now use this counter object as if we were working in the console, i.e. we can invoke its methods to communicate with the underlying smart contract and check whether they behave as expected. We also have access to other features of Brownie; we can, for instance, inspect the transaction receipt that is returned by a method invocation resulting in a transaction, and use it to verify events and logs.
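To make this concrete, here is a sketch of what a test function using this fixture could look like; the actual tests in the repository might differ in the details.

def test_increment(counter):
    # a freshly deployed counter should start at zero
    assert counter.read() == 0
    # incrementing should raise the value by one and emit an Increment event
    txn = counter.increment({"from": accounts[0]})
    assert counter.read() == 1
    assert "Increment" in txn.events
    assert txn.events["Increment"]["newValue"] == 1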

Once you have written your unit tests, you want to run them. Again, this is easy – leave the console using Ctrl-D, make sure you are still in the root directory of the project tree (i.e. the tmp directory) and run

brownie test tests/test_Counter.py

As you will see, this brings up a local Ganache server and executes your tests using pytest, with the usual pytest output. But you can generate much more information – run

brownie test tests/test_Counter.py --coverage --gas

to receive the output below.

In addition to the test results, you see a gas report, detailing the gas usage of every invoked method, and a coverage report. To get more details on the code, you can use the Brownie GUI as described here. Note, however, that this requires the tkinter package to be installed (on Ubuntu, you will have to use sudo apt-get install python3-tk).

You can also simply run brownie test to execute all tests, but the repository does already contain tests for future blog entries, so this might be a bit confusing (but should hopefully work).

This completes our short introduction. You should now be able to deploy smart contracts and to interact with them. In the next post, we will do the same thing using plain Python and the web3 framework, which will also prepare us for the usage of the web3.js framework that we will need when building our frontend.

Writing a smart contract in Solidity using the Remix IDE

Today, we will actually write our first smart contract, compile and deploy it and discuss some of the features of the Solidity programming language. To be able to start without any installation, we will use the Remix IDE, which is a fully browser-based development environment for Ethereum smart contracts.

Getting started with the Remix IDE

Without further ado, let us go ahead and work on our first contract. Open your browser and point it to http://remix.ethereum.org. The initial load might take a few seconds, but once the load is complete, you should see a screen with the layout of a typical IDE. On the left-hand side of the screen, there is a navigation bar with the following items (from top to bottom)

  • Home
  • Explorer
  • Compile
  • Deploy
  • Debug

Between the navigation bar and the main editor area, there is – initially, as we start in the explorer view – a directory tree. Remix pre-populates this tree with some sample contracts and saves them in your browser storage, so if you add, change or remove a file, close the browser and reopen it, your changes will still be visible. It is also possible to synchronize with a GitHub account or with the local machine. Finally, the bottom area of the screen is the terminal, which provides some additional output and also allows you to interact directly with Remix using the JS API.

Initially, the contracts directory already contains three smart contracts, called 1_Storage.sol, 2_Owner.sol and 3_Ballot.sol. Delete all those files, and create a new file Counter.sol with the same content as this contract in my GitHub repository. Open the contract in the explorer.

Next, click on the “Compile” icon on the navigation bar (you will have to select the contract in the explorer first). Hit the button “Compile Counter.sol”. After some seconds, the compile should complete. Next, select the “Deploy” icon in the navigation bar. Stick to the standard selection of the environment (this is the built-in virtual machine that is running in your browser) and click “Deploy”. When the deployment succeeds, you should see an entry in the section “Deployed contracts”.

Expanding this entry should reveal two buttons called “increment” and “read”. These buttons allow you to invoke the methods of the contract that we have defined. First, click on “read” once to get the current value of the contract, which should be zero. Then, hit “increment” once and click “read” again.

If everything worked, reading after incrementing should show that the value has increased by one. In the screenshot above, I have hit “increment” three times, so the return value of the read method is three (this is a bit confusing – the zero in front of the variable type is the index of the return value, in our case the first return value). Congratulations, we have just tested our first smart contract!

Structure of a smart contract

This is fun, but let us now go back and try to understand what we have actually done. For that purpose, let us go once through the source code. This is not meant to be a systematic introduction to Solidity, and you might want to consult the actually quite readable documentation in parallel, but it should give you a first idea of what a contract looks like (we will in fact need and study a few more advanced features of the language as we progress). The first two lines are not executed, but are more or less metadata.

// SPDX-License-Identifier: GPL-3.0

pragma solidity >=0.8.0 <0.9.0;

The first line is a comment. By convention, a Solidity contract contains a license identifier; I have chosen the GPL license for this example. The second line is a pragma. Like in C or C++, a pragma is an additional instruction for the compiler. In this case, we restrict the Solidity version to somewhere between 0.8 and 0.9 (there is a reason why I have chosen 0.8 as minimum version – this is the version of Solidity which introduced overflow checks, and our counter can of course overflow – at least theoretically, as you will probably not find the time to hit the increment button 2^256 times…), and the compiler will refuse to compile the contract if its version does not fall into this range.

The next block declares our actual contract. In Solidity, a contract is a bit similar to a class in other object-oriented languages. We give our contract a name (Counter) and, inside the curly braces, define its events, methods and attributes.

contract Counter {

    event Increment (
        address indexed sender,
        uint256 oldValue,
        uint256 newValue
    );

    uint256 private counter;

    function increment() public {
        emit Increment(msg.sender,counter,counter + 1);
        counter++;
    }

    function read() public view returns (uint256) {
        return counter;
    }
  
}

Let us ignore the event that we define for a second, we will discuss this in the next section. After the event declaration, we first declare an attribute counter. This will be a 256-bit integer, which is the native datatype of Solidity, i.e. a 32 byte number. We also declare the attribute as private, i.e. not visible for other contracts, but of course it is not really private – after all, everybody with access to a blockchain explorer can inspect the storage and look at its value.

Speaking of storage, Solidity distinguishes between different data locations. For some data types, like integers, there is a default location; for others, you will have to specify the location explicitly as an additional qualifier called the data location. The three data locations are

  • Memory – this is linear storage that is cleaned up and reinitialized with every new transaction. Thus you can use memory inside an invocation of your contract, but as soon as the invocation finishes, the data is lost
  • Storage – this instructs Solidity to place the variable in the storage part of the blockchain state of the contract address. Thus, variables which are located in storage are actually persisted in the blockchain. This is expensive, but the only way to store information across invocations
  • Calldata – this refers to read-only memory which is used to store parameters for invocations of your contract from an EOA or other contracts and can also be used for function parameters and return values

In our case, we do not have to explicitly define a data location, as our counter is a state variable (i.e. defined on the contract level) and, being an integer, it will automatically be stored in the contract storage. Next, we define two functions, i.e. methods of the contract. We first define the increment function, which does not accept any argument and does not return anything. Note the additional public keyword which declares that our method can be called from outside the contract. Inside the function, we first emit an event (more on this later), and then simply increment the counter by one.

The next function is the function to read out the counter value. This function does not have an argument either, but a return value of type uint256. The interesting part of this function is the additional keyword view. A function can be declared as a view if it does not alter the state (which, in our case, means that the function does not alter the counter value).

What is the point of this? Recall that a change of state can only happen as part of a transaction. Thus a Solidity function can usually only be called as part of a transaction that has an associated gas cost and is executed by all nodes while validating the corresponding transaction. For many cases, this is an enormous overhead. Suppose, for instance, that you build a contract that manages a coin or a token. You would probably not want to burn gas every time you simply want to read out a balance. Here, a view function comes to the rescue. As such a function does not alter the state, but essentially only reads and processes state, it can be executed locally, outside of a transaction, and the execution will not cost you any gas at all.

Invoking a smart contract

To better understand how a function marked as view differs from an ordinary function, it is helpful to understand how a smart contract is actually invoked by an application. There are actually two different ways how this can be done.

First, you can send a transaction to the contract address, using the method eth_sendTransaction or eth_sendRawTransaction of the JSON RPC API. To pass parameters to the contract, include them in the transaction data field. This field is also used to select the method of the smart contract you want to invoke. By convention, the first four bytes of the data field are filled with the first four bytes of the keccak hash of the signature of the method that you want to invoke (the so-called function selector). Recall that the EVM bytecode does not know anything about methods or arguments – contract execution always starts at address zero. Therefore, the compiler will generate code that reads the data field to first determine the method that needs to be invoked, then gets the parameters from the data field, copies them to memory and finally jumps to the code representing the correct method (we will dive deeper into this in a later post).

The transaction is then executed, thereby running the bytecode, included in a block and persisted. This is an asynchronous process, and therefore, a contract invocation done via a transaction cannot return a value to the caller. In addition, it consumes gas. These are two good reasons why an alternative is helpful.

This alternative is the eth_call method of the API. This method also executes the bytecode at the target address, but differs in an important point – it is only executed in the local node (i.e. the node to which you direct the API call). No transaction is generated, which implies that no state update can be done, but also that no gas is consumed. The execution is synchronous and can return a value.

Theoretically, you can also make a call to a method that modifies the state. This will again run locally and can be thought of as a simulation, without making any permanent changes to the state. Similarly, you could use a transaction to read the counter value, but this would cost you gas and therefore Ether, and it would not even allow you to consume the returned value, as a transaction produces a receipt which does not contain anything like a return value. IDEs like Remix or tools like Brownie or Truffle use a description of the methods of a contract, emitted by the compiler and known as the ABI, to determine whether it makes sense to invoke something via a call or a transaction (we will see how this works a bit later when we use Python to compile and deploy a smart contract).
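To see the mechanics at work, here is a sketch of how the read method of our counter could be invoked via a plain eth_call from a web3.py session, as in the earlier sections, bypassing the contract object entirely. The variable address is assumed to hold the contract address.

# the first four bytes of the keccak hash of the signature select the method
selector = w3.keccak(text="read()")[:4]
# eth_call executes the code locally and returns the raw return data
raw = w3.eth.call({"to": address, "data": selector})
# the return value is a single uint256
int.from_bytes(raw, "big")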

It is instructive to run the increment method in Remix and inspect the resulting transaction. You might already have noted that every time you hit the “increment” button, a new transaction is indicated in the terminal (the tab at the bottom of the screen). At the right hand side of a transaction, you should see a blue “Debug button” (this is also fun, but we resist the temptation for a moment) and a little arrow. Click on the arrow to expand the transaction.

What Remix shows you is actually a mix of data contained in the transaction and the transaction receipt. You see a few fields that we have already discussed in one of the previous posts like the “to” field, the gas limit and the value. You also see a field called “input” which is actually the transaction data and should be

0xd09de08a

This is a four-byte value, and using an online hash calculator (just google for “keccak online”), you can easily verify that these are the first four bytes of the keccak hash of the string “increment()”, i.e. the signature of the method that you invoke.
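If you have a Python session at hand, you can also compute the selector with web3.py instead of an online tool; here is a short sketch.

from web3 import Web3
# the first four bytes of the hash, i.e. 0xd09de08a, form the selector
Web3.keccak(text="increment()").hex()[:10]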

Logs, transaction receipts and events

It should now be rather obvious what our contract is doing, but we have skipped a point that is a bit more complex – events and logs. If you look at the transaction that we have generated using the increment method, you will see that it also contains a field called logs. The idea of logs is to provide a facility that allows a smart contract to log some data without having to write it into the expensive storage. For that purpose, the EVM offers a set of instructions called LOG0 to LOG4.

The logs of a transaction are part of the transaction receipt that is created by a client when a transaction is processed. As the transaction receipts can be derived from the history of all transactions, they are not actually stored on the blockchain itself (i.e. they are not part of a block), but are maintained by each node individually. A block header does, however, contain a hash of the transaction receipts, so that transaction receipts and therefore logs can be validated.

To structure logs, a log entry can be tagged with up to four topics. More precisely, to produce a log entry with no topic, you would use the LOG0 instruction, to produce a log entry with one topic, you would use LOG1, and so forth up to LOG4. In addition to the topics, a log entry contains a data field of arbitrary length and the address of the contract that emitted the log. So a full entry consists of

  • the sender, i.e. the contract creating the log
  • between zero and four topics
  • a data field

In Solidity, logs show up as events. Events are defined as part of a contract, as is the case in our example contract above. By convention, Solidity always uses the first topic to hold the hash of the event signature. The parameters of the event can be indexed (then Solidity stores them in the remaining topics) or non-indexed (then Solidity places them in the data field). Thus, in our example, an event corresponds to a log entry where the sender is the contract address and the first topic is the keccak hash of the signature string “Increment(address,uint256,uint256)” (note that all white space and all parameter names are removed), which is 0x64f50d594c2a739c7088f9fc6785e1934030e17b52f1a894baec61b98633a59f. The second topic is the (indexed) sender, and the data field contains the old and the new value of the counter. You can inspect the transaction receipt, and with it the log entries of the transaction that we have created, by typing

web3.eth.getTransactionReceipt("0x35325d4aac18a862380ee6dc4a6b7ed1e8eb9c4035dc7d24b35ad847b19deadf")

into the console, where you have to replace the argument with the hash value of the generated transaction. The logs will appear as an array, and each entry consists of the address of the creating contract, the data, and the list of topics.
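If you work against a regular node with web3.py instead (as we did in the earlier sections), the contract object can decode these raw log entries into events for you. Here is a sketch, assuming that counter is the web3.py contract object and txn_hash the hash of an increment transaction; in web3.py v6 the method is called process_receipt instead.

receipt = w3.eth.get_transaction_receipt(txn_hash)
# decode the raw logs in the receipt into Increment events
events = counter.events.Increment().processReceipt(receipt)
events[0]["args"]["newValue"]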

Why do we need topics? The idea of logs and events is that an application can register for certain events and take action if an event is observed. To make this work, we need an efficient way to scan the chain for specific events. Now going through all transactions and looking at all log entries is of course not an efficient approach, and we need a way to quickly determine whether a certain block has produced log entries which are of interest for us. For that purpose, Ethereum employs a structure known as Bloom filter, which is a data structure designed to quickly figure out whether a data set holds a specific entry. Each block contains the Bloom filter of the addresses and topics of the log entries produced by the transactions in the block, and thus a client can quickly scan all blocks for log entries that are associated with specific topics. The JSON RPC API allows you to define filters for new log entries, specifying the contract address, the block range and the topics, so that you can retrieve only those logs which are of interest for your application.
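With web3.py, such a query could be set up as in the following sketch, which retrieves all Increment log entries emitted by our contract; the address variable is again assumed to hold the contract address.

# the first topic is the keccak hash of the event signature
topic = w3.keccak(text="Increment(address,uint256,uint256)").hex()
# fetch all matching log entries emitted by the contract
logs = w3.eth.get_logs({
    "fromBlock": 0,
    "toBlock": "latest",
    "address": address,
    "topics": [topic]
})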

Today, we have actually written, compiled and executed our first smart contract, and we have managed to understand a bit of how this works and what happens during contract execution. In the next post, I will show you how to do the same thing with an editor of your choice and the Brownie development environment.