Understanding the Ethereum virtual machine – part III

Having found our way through the mechanics of the Ethereum virtual machine in the last post, we are now in a position to better understand what exactly goes on when a smart contracts hand over control to another smart contract or transfers Ether.

Calls and static calls

Towards the end of the previous post in this series, we have already taken a glimpse at how an ordinary CALL opcode is being processed. As a reminder, here is the diagram that displays how the EVM and the EVM interpreter interact to run smart contract

In our case – the processing of a CALL – this specifically implies that the following steps will be carried out (we ignore gas processing for the time being, as this is a bit more complicated and will be discussed in depth in a separate section).

  • the interpreter hits upon the CALL opcode
  • it performs a look up in the jump table and determines that the function opCall needs to be run
  • it gets the parameters from the stack, in particular the address of the contract to be called (the second stack item)
  • it then extracts the input data from the memory of the currently executing code
  • we then invoke the Call method of the EVM, using the contract address and the input data as arguments
  • as we have learned, this will result in a new execution context (i.e. a new Contract object, a new stack and a freshly initialized memory) in which the code of the target contract will be executed
  • at the end, we get the returned data (an array of bytes) back
  • if everything went fine, we copy the returned data back into the memory of the currently executing contract

It is important to understand that this comes with a full context switch – the target contract will execute with its own stack and memory, state changes made by the target contract will refer to the state of the target contract, and Ether transferred with the call is credited to the target contract.

Also note that there are actually two ways how the result of the call is made available to the caller. First, the result of the call (a pointer to a byte array) will be copied to the memory of the calling contract. In addition, the return value is also returned by opCall and there it is copied once more, this time to a special buffer called the return data buffer. The caller can copy the data stored in this buffer and determine its length using the RETURNDATACOPY and RETURNDATALENGTH opcodes introduced with EIP-211 (in order to make it easier to pass back return data whose length is not yet known when the call is made).

In summary, the called contract is executed essentially as if it were the initial contract execution of the underlying transaction. Calls can of course be nested, so we now see that a transaction should be considered as the top-level call, which can be followed by a number of nested calls (actually, this number is limited, for instance by the limited depth of the call stack).

Of course, executing an unknown contract can be a significant security risk. We have seen an example in our post on smart contract security, where a malicious contract calling back into your own contract can cause a double-spending. Therefore, it is natural to somehow try to restrict what a called contract can due. One of the first restrictions of this type is the introduction of the STATICCALL with EIP-214. A static call is very much like an ordinary call, except that the called contract is not allowed to make any state changes, in particular no value transfer is possible as part of a static call.

The function opStaticCall realizing this is actually very similar to the processing of an ordinary call. There are two essential differences. First, there is no value and therefore one parameter less that needs to be taken from the stack. Second, the method of the EVM that is eventually invoked is not Call but StaticCall. The structure of this function is very similar to that of an ordinary call, so let us focus on the differences. Here is a short snipped (leaving out some parts to focus on the differences) of the Call method.

evm.Context.Transfer(evm.StateDB, caller.Address(), addr, value)
code := evm.StateDB.GetCode(addr)
contract := NewContract(caller, AccountRef(addrCopy), value, gas)
contract.SetCallCode(&addrCopy, evm.StateDB.GetCodeHash(addrCopy), code)
ret, err = evm.interpreter.Run(contract, input, false)

And here is the corresponding code for a static call (again, I have made a few changes to better highlight the differences).

addrCopy := addr
code := evm.StateDB.GetCode(addr)
contract := NewContract(caller, AccountRef(addrCopy), new(big.Int), gas)
contract.SetCallCode(&addrCopy, evm.StateDB.GetCodeHash(addrCopy), code)
ret, err = evm.interpreter.Run(contract, input, true)

So we see that there are three essential differences. First, in a static call, there is value transfer – this is as expected, as a static call is not allowed to make a value transfer which represents a change to the state. Second, when we build the contract, the third parameter is zero – again, this is related to the fact that there is no value transfer, as this parameter determines the value that, for instance, the opcode CALLVALUE returns. Finally, we set the third parameter of the Run function to true. In our discussion of the Run method in the previous post, we have already seen that this disallows all instructions which are marked as state changing.

Delegation and the proxy pattern

Apart from calls and static calls, there is a third way to invoke another contract, namely a delegate call. Roughly speaking, a delegate call implies that instead of executing the code of the called contract within the context of the called contract, we execute the code within the context of the caller. Thus, we essentially run the code of the called contract as if it were a part of the caller code, as you would run a library (however, this is of course not how libraries are actually realized in Solidity where a library is simply linked into the contract at build time).

In the EVM, a delegate call is done using the opcode DELEGATECALL (well, that did probably not come as a real surprise). Similar to a static call, there is no value transfer for this call and correspondingly no value parameter on the stack. Going through the same analysis as for a static call, we find that execution of the opcode delegates to the method DelegateCall() of the EVM. Let us again look at the parts of the code that differ from an ordinary call.

addrCopy := addr
code := evm.StateDB.GetCode(addr)
contract := NewContract(caller, AccountRef(caller.Address()), nil, gas).AsDelegate()
contract.SetCallCode(&addrCopy, evm.StateDB.GetCodeHash(addrCopy), code) 
ret, err = evm.interpreter.Run(contract, input, false)

Looking at this , we spot three differences compared to an ordinary call. First, the second parameter used for the creation of the new contract (which is the parameter which will determine the self field of the new contract and with that the address used to read and change state during contract execution) is not set to the target contract, but to the address of the caller, i.e. the currently executing contract, while the address used to determine the code to be run is still that of the target contract. Thus, as promised, we execute the code of the target contract within the context of the currently executing contract.

A second difference is the third argument used for contract creation, which is the value transferred with this call. Again, this is zero (even nil). Finally, after creating the contract, we execute its AsDelegate() method. This changes the attributes CallerAddress and value of the contract to that of the currently executing contract. Thus, whenever we execute the opcodes CALLVALUE or CALLER, we get the same values as in the context of the currently executing contract, as promised by EIP-7, the EIP which introduced delegate calls.

One of the motivations behind introducing this possibility was that it allows for a pattern known as proxy pattern. In this pattern, there are two contracts involved. First, there is the proxy contract. The proxy contract accepts a call or transaction and is responsible for holding the state. It does, however, not contain any non-trivial logic. Instead, it uses a delegate call to invoke the logic residing in a second contract, the logic contract.

Why would you want to do this? There are, in fact, a couple of interesting use cases for this pattern. First, it allows you to build an upgradeable contract. Recall that – at least until the CREATE2 opcode was introduced – it was not possible to change a smart contract after is has been deployed. Even though this is of course by intention and increases trust in a smart contract (it will be the same, no matter when you interact with it), it also implies a couple of challenges, most notably that it makes it impossible to add features to a smart contract over time or to fix a security issue. The proxy pattern, however, does allow you to do this. You could, for instance, store the address of the logic contract in the proxy contract instead of hard-coding it, and then add a method to the proxy that allows you to change that address. You can then deploy a new version of the logic to a new address and then update the address stored in the proxy contract to migrate from the old version to the new version. As the state is part of the proxy contract which stays at its current location, the state will be untouched, and as the address that the users interact with does not change, the users might not even notice the change. Needless to say that this is very useful for some cases, but can also be abused by tricking a user into trusting a contract and then changing its functionality, so be careful when interacting with a smart contract that performs delegation.

A second use case is related to re-use. As an example, suppose you have developed a smart contract that implements some useful wallet-like functionality, maybe time-triggered transfers. You want to make this available to others. Now you could of course allow anybody to deploy your smart contract, but this would lead to many addresses on the blockchain containing exactly the same code. Alternatively, you could store the logic in one logic contract and than only distribute the code for the proxy. A new user would then simply deploy a proxy, so each proxy would act as a wallet with an individual state and balance, but all of them would run the same logic. Again, it goes without saying that this implies that your users trust you and your contract – if, for instance, your logic contract is able to remove itself (“self-destruct” using the corresponding opcode), than this would of course render all deployed proxies useless and the balance stored in them would be lost forever.

Finally (and this apparently was one of the motivation behind EIP-7) you could have a large contract whose deployment consumes more gas than the gas limit of a block allows. You could then split the logic into several smaller logic contracts and use a proxy to tie them together into a common interface.

There are several ongoing attempts to standardize this pattern and in particular upgradeable contracts. EIP-897, for instance, proposes a standard to expose the address to which a proxy is pointing. EIP-1967 addresses an interesting problem that the pattern has – the logic contract and the proxy contract share a common state, and thus the proxy contract needs to find a way to store the address of the logic contract without conflicting with the storage layout of the logic contract. Finally, EIP-1822 proposes a standard for upgradeable contracts. It is instructive to read through these EIPs and I highly advise you to do so and also have a look at the implementations described or linked in them.

Gas handling during a call

Let us now turn to gas handling during a call. We have already seen that, as for every instruction, there is a constant gas cost and a dynamic gas cost. In addition, there are two special contributions which are not present for other instructions – a refund and a stipend.

The constant gas cost is simple – this is simply a constant value of (currently) 700 units of gas, increased from previously 40 with EIP-150. The dynamic gas cost is already a bit more complicated and itself consists of four positions. The first three positions are rather straightforward

  • first, there is a fee of 9000 units of gas when a non-zero value is transferred as part of the call
  • second, there is an account creation fee of 25000 whenever a non-zero value is transferred to a non-existing account as part of the call
  • third, there is the usual gas fee for memory expansion, as for many other instructions

The fourth contribution to the dynamic gas cost is a bit more tricky. The problem we are facing at this point is that the contract which is called will of course consume gas as well, but at this point in time, we do not know how much this is going to be. To solve this, a position called the gas cap is used. Initially, this gas cap was simply the first stack item, i.e. the first argument to the CALL instruction, which specifies the gas limit for the contract to be executed, i.e. the part of our remaining gas that we want to pass on to the called contract. We could now simply use this number as additional gas cost and then, once the called contract returns, see how much of that is still unused and refund that amount.

This is indeed how the gas payment for a call worked before EIP-150 was introduced. This EIP was drafted to address denial-of-service attacks that utilized the fact that the costs for some instructions, among them making a call, was no longer reflecting the actual computing cost on the client. As a counter-measure, the cost for a call was increased from previously 40 to the new still valid 700. This, however, caused problems with existing contract that tried to calculate the amount of gas they would make available to called contract by taking the currently remaining gas (inquired via the GAS opcode) and subtracting the constant fee of 40 units of gas. To avoid this, the developers thought about coming up with a mechanism which allowed a contract to make “almost all” remaining gas available to the caller, without having to hard-code gas fees. More precisely, “almost all” means that the following algorithm is applied to calculate the gas cap.

  • Determine the gas which is currently still available, after having deducted the constant gas cost already
  • Determine the base fee, i.e. the dynamic gas cost for the call calculated so far (memory fee, transfer fee and creation fee)
  • Subtract this from the remaining gas to determine the gas which will still be available after paying for all other gas cost contributions (“projected available gas”)
  • Read out the first value from the stack (the first parameter of the GAS instruction), i.e. the requested gas limit
  • determine a gas cap as 63 / 64 times the projected available gas
  • if the requested gas limit is higher than the gas cap, return the gas cap, otherwise return the requested gas limit

Thus a contract can effectively pass almost all of the remaining gas to the callee by providing a very large requested gas limit as first argument to the CALL instruction, so that the requested gas limit is definitely smaller than the calculated cap. The factor of 63 / 64 has been put in as an additional protection against recursive calls. The outcome of this algorithm is then used for two purposes – as an upfront payment to cover the maximum amount of gas that the callee might need, and as the gas supply that the callee actually obtains for its execution.

Now, I have been cheating a bit as there are two components in the diagram above that we have not yet discussed. First, I have just told you that the outcome of the EIP-150 algorithm is passed as available gas to the callee. This, however, is only true if the call does not transfer any Ether. If it does, there is an additional stipend of 2300 gas which is added to the gas made available to the callee before actually making the call. Note that this stipend does not count against the gas cost of the callee, as it is not part of the dynamic gas cost, so it effectively has two implications – it reduces the cost of the call by 2300 units of gas and, at the same time, it makes sure that even if the caller specified zero as gas limit for the call, the callee has at least 2300 units of gas available. The motivation of this is that a call with a non-zero value typically triggers the receive function or fallback function of the called contract, and calls with a gas supply of zero will let this function fail. Thus the gas stipend serves as a safe-guard to reduce the risk of a value transfer failing because the recipient is a smart contract and its receive- or fallback-function runs out of gas.

Finally, there is the refund, which happens here and simply amounts to adding the gas that the callee has not consumed to the available gas of the current execution context again.

The gas stipend and transfers in Solidity

The gas stipend is one of the less documented features of smart contracts, and a part of the confusion that I have seen around this topic (which, in fact, was the main motivation for the slightly elaborated treatment in this post) comes from the fact that a gas stipend exists in the EVM as well as in Solidity.

As explained above, the EVM adds the gas stipend depending on the value transferred with the call – in fact, the stipend only applies to calls with a non-zero value. In addition to this, Solidity applies the same logic, but only if the value is zero. To see this, you might want to use a simple contract like this one.

contract Transfer {

    uint256 value;

    function doIt() public {
        payable(msg.sender).transfer(value);
    }
}

If you compile this, for instance in Remix, and take a look at the generated bytecode, you will see that eventually, the transfer translates into a CALL instruction. The preparation of the stack preceding this instruction is a bit involved, but if you go through this carefully and wait until the dust has settled, you will find that the top of the stack looks as follows.

(value == 0) * 2300 | sender | value |

Thus the first value, which specified the gas to be made available for the subcontract, is 2300 (the gas stipend) if the value is zero, and zero otherwise. In the first case, the EVM will not add anything, in the second case, the EVM will add its own gas stipend. Thus, regardless of the value, the net effect will be that the gas stipend of 2300 units of gas always applies for a transfer. You might also want to look at this snippet in the Solidity source code that creates the corresponding code (at least if I interpret the code correctly).

What this analysis tells us as well is that there is no way to instruct the compiler to increase the gas limit of the transfer. As the 2300 units of gas will only be sufficient for very simple functions, you need a different approach when invoking contracts with a more complex receive function. When we discuss NFTs in a later post in this series, we will see how you can use interfaces in Solidity to easily call functions of a target contract. Alternatively, to simply invoke the fallback function or the receive function with a higher gas limit, you can use a low-level call. To see this in action, change the transfer in the above sample code to

(bool success, ) = 
     payable(msg.sender).call{value: value}("");

When you now compile again, take a look at the resulting bytecode and locate the CALL instruction, you will see that immediately before we do the CALL, we execute the GAS opcode. As we know, this pushes the remaining available gas onto the stack. Thus the first argument to the CALL is the remaining gas. As, by the EIP-150 algorithm above, this is in every case more than the calculated cap, the result is that the cap will be used, i.e. almost all remaining gas will be made available to the called contract. Be sure, however, to check the return value and handle any errors that might have occurred in the called contract, as Solidity does not add extra code to make sure that we revert upon errors. Note that there is an ongoing discussion to extend the functionality of transfer in Solidity to allow a transfer to explicitly pass on all the remaining gas, see this thread.

With this, we have reached the end of our post for today. In this and the previous two posts, we have taken a deep-dive into how the Ethereum virtual machine actually works, guided by the yellow paper and the code of the Go-Ethereum client. In the next post, we will move on and start to explore one of the currently “hottest” applications of smart contract – non-fungible token. Hope to see you soon!

1 Comment

Leave a Comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s