Ethereum Internals

Matt Suiche edited this page Jul 22, 2017 · 9 revisions

Abstract

Ethereum is gaining significant popularity in the blockchain community, mainly because it is designed to let developers write decentralized applications (Dapps) and smart-contracts using blockchain technology. This new paradigm of applications opens the door to many possibilities and opportunities. Blockchain is often referred to as secure by design, but now that blockchains can embed applications, this raises multiple questions regarding architecture, design, attack vectors and patch deployment. In this paper I will discuss the architecture of the core component of Ethereum (the Ethereum Virtual Machine), its vulnerabilities, and my open-source tool "Porosity", a decompiler for EVM bytecode that generates readable Solidity-syntax contracts, enabling static and dynamic analysis of such compiled contracts.

Ethereum Virtual Machines (EVM)

The Ethereum Virtual Machine (EVM) is the runtime environment for smart-contracts in Ethereum. The EVM runs smart-contracts that are built up from bytecode. Each piece of bytecode is identified by a 160-bit address and stored in the blockchain, in what is known as an "account". The EVM operates on 256-bit pseudo-registers: it has no actual registers, but an expandable stack which is used to pass parameters to functions and instructions, as well as for memory and other algorithmic operations. The following excerpt from the Solidity documentation is also worth mentioning:

There are two kinds of accounts in Ethereum which share the same address space: External accounts that are controlled by public-private key pairs (i.e. humans) and contract accounts which are controlled by the code stored together with the account.

The address of an external account is determined from the public key while the address of a contract is determined at the time the contract is created (it is derived from the creator address and the number of transactions sent from that address, the so-called “nonce”).

Regardless of whether or not the account stores code, the two types are treated equally by the EVM.

Memory Management

Stack

The EVM does not have the concept of registers. A virtual stack is used instead, for example to hold the parameters of the opcodes. The EVM operates on 256-bit values from that virtual stack, which has a maximum size of 1024 elements.

Storage (Persistent)

Storage is a persistent key-value mapping from 256-bit integers to 256-bit integers. It is documented as follows:

Every account has a persistent key-value store mapping 256-bit words to 256-bit words called storage. Furthermore, every account has a balance which can be modified by sending transactions.

Each account has a persistent memory area which is called storage. Storage is a key-value store that maps 256-bit words to 256-bit words. It is not possible to enumerate storage from within a contract and it is comparatively costly to read and even more so, to modify storage. A contract can neither read nor write to any storage apart from its own.

Storage holds the variables declared outside of the user-defined functions, within the contract context. For instance, in listing 1, userBalances and withdrawn will reside in storage. Storage accesses can also be identified by the SSTORE / SLOAD instructions.

contract SendBalance {
    mapping (address => uint) userBalances;
    bool withdrawn = false;
(...)
}

Listing 1: Storage (Persistent) Example

Memory (Volatile)

This memory is mainly used when calling functions or for regular memory operations. The official documentation explicitly indicates that the EVM does not have traditional registers, which means that the virtual stack previously discussed is used primarily to push arguments to the instructions. The following excerpt describes this memory area:

The second memory area is called memory, of which a contract obtains a freshly cleared instance for each message call. Memory is linear and can be addressed at byte level, but reads are limited to a width of 256 bits, while writes can be either 8 bits or 256 bits wide. Memory is expanded by a word (256-bit), when accessing (either reading or writing) a previously untouched memory word (ie. any offset within a word). At the time of expansion, the cost in gas must be paid. Memory is more costly the larger it grows (it scales quadratically).

The MSTORE and MLOAD instructions play the role that the memory store and load instructions play on a typical x86/x64 system: MSTORE writes a value taken from the stack into memory, and MLOAD reads a value from memory back onto the stack. Consequently, both mstore(where, what) and mload(where) are frequently used.
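As a rough illustration of the memory model quoted above, the following Python sketch models a byte-addressable memory that grows one 256-bit word at a time. The class and function names are hypothetical and not taken from any Ethereum client. The cost formula (3 gas per word plus a quadratic term) is an assumption taken from the Yellow Paper rather than from this text.

```python
WORD = 32  # bytes in one 256-bit EVM word


class Memory:
    """Byte-addressable, zero-initialised memory that grows in 32-byte words."""

    def __init__(self):
        self.data = bytearray()

    def _expand(self, offset, size):
        # Round the highest touched byte up to a whole number of words.
        needed = (offset + size + WORD - 1) // WORD * WORD
        if needed > len(self.data):
            self.data.extend(bytes(needed - len(self.data)))

    def mstore(self, where, what):
        """mstore(where, what): write a 256-bit big-endian word."""
        self._expand(where, WORD)
        self.data[where:where + WORD] = (what % 2**256).to_bytes(WORD, "big")

    def mload(self, where):
        """mload(where): read a 256-bit big-endian word."""
        self._expand(where, WORD)
        return int.from_bytes(self.data[where:where + WORD], "big")


def memory_cost(words):
    # Assumed Yellow Paper formula: linear term plus a quadratic term.
    return 3 * words + words * words // 512


mem = Memory()
mem.mstore(0x40, 0x60)  # the PUSH1 60 / PUSH1 40 / MSTORE prologue seen in compiled contracts
```

Writing at offset 0x40 touches bytes 0x40 to 0x5f, so the memory grows to three words even though only one word was written.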

Addresses

The EVM uses 160-bit addresses. This is crucial to understand when dealing with type discovery, as we often see the mask 0xffffffffffffffffffffffffffffffffffffffff being applied for optimization purposes, either in code or on the EVM pseudo-registers.

Call Types

There are two types of functions to differentiate when working with the EVM. The first type is the EVM functions (or EVM instructions), while the second type is the user-defined functions written when creating the smart-contract.

EVM

Basic Blocks

Basic blocks usually start with the instruction JUMPDEST, with very few exceptions. Most conditional and unconditional jumps are preceded by a PUSH instruction that pushes the destination offset onto the stack. In some cases, however, the PUSH instruction containing the offset is executed well before the actual JUMP instruction, and the offset is retrieved using stack manipulation instructions such as DUP, SWAP or POP. Those cases require dynamic execution of the code to record the stack for each JUMP instruction, as we discuss later in sub-section 6.2.2.

EVM functions

EVM functions and/or instructions include, but are not limited to, the following:

  • Arithmetic Operations
  • Comparison & Bitwise Logic Operations
  • SHA3
  • Environmental Information
  • Block Information
  • Stack, Memory, Storage and Flow Operations
  • Push/Duplication/Pop/Exchange Operations
  • Logging Operations
  • System Operations

Since the EVM does not have registers, all instruction invocations are done through the EVM stack. For example, an instruction taking two parameters, such as an addition or a subtraction, uses the stack entries at index 0 and 1, and its return value is stored in the stack entry at index 0. Listing 2 shows more clearly what this looks like under the hood.

PUSH1 0x1 ==> {stack[0x0] = 0x1}
PUSH1 0x2 ==> {stack[0x0] = 0x2, stack[0x1] = 0x1}
ADD       ==> {stack[0x0] = 0x3}

Listing 2: EVM Parameter/Return Stack Location Example

The above EVM assembly snippet translates to the EVM pseudo-code add(0x2, 0x1) and returns 0x3 in stack entry 0. The EVM stack model follows the standard last-in, first-out (LIFO) algorithm.
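The stack behaviour described above can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not an EVM implementation: the Stack class and run function are hypothetical names, and only PUSH and ADD are supported.

```python
STACK_LIMIT = 1024
WORD_MASK = 2**256 - 1


class Stack:
    def __init__(self):
        self.items = []  # the end of the list is the top, i.e. stack[0x0] in the listing

    def push(self, value):
        if len(self.items) >= STACK_LIMIT:
            raise OverflowError("EVM stack limit of 1024 elements exceeded")
        self.items.append(value & WORD_MASK)  # every entry is a 256-bit value

    def pop(self):
        if not self.items:
            raise IndexError("stack underflow")
        return self.items.pop()


def run(program):
    """Execute a list of (opcode, operand...) tuples and return the final stack."""
    stack = Stack()
    for op, *args in program:
        if op.startswith("PUSH"):
            stack.push(args[0])
        elif op == "ADD":
            a, b = stack.pop(), stack.pop()
            stack.push(a + b)  # push() wraps the sum modulo 2**256
        else:
            raise ValueError(f"unsupported opcode {op}")
    return stack.items


result = run([("PUSH1", 0x1), ("PUSH1", 0x2), ("ADD",)])
```

Both operands are consumed from the top of the stack and the single result takes their place, which is the LIFO behaviour the listing illustrates.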

EVM Call

There are two possible types of external EVM function calls, and both can be identified by the CALL instruction. However, CALL is not always a reliable indicator that the call targets an external contract. Some mathematical and cryptographic functions, such as sha256 or ripemd160, have to be called through external contracts using the call function, whereas sha3 has an explicitly defined instruction of its own. This is due to its frequent usage, especially with mappings such as mapping(address => uint256) balances, where the sha3 function is used to compute the index. The function call/caseCall() is where the dispatching magic happens. Listing 3 shows the prototype declaration of that function.

It follows the declaration below:

call(gasLimit, to, value, inputOffset, inputSize, outputOffset, outputSize)

Listing 3: call Prototype Declaration

There are four ‘pre-compiled’ contracts that are present as extensions of the current design. The four contracts at addresses 1, 2, 3 and 4 execute the elliptic curve public key recovery function, the SHA2 256-bit hash scheme, the RIPEMD 160-bit hash scheme and the identity function respectively. Listing 4 shows these contracts, as declared in the EVM source code.

	precompiled.insert(make_pair(Address(1), PrecompiledContract(3000, 0, PrecompiledRegistrar::executor("ecrecover"))));
	precompiled.insert(make_pair(Address(2), PrecompiledContract(60, 12, PrecompiledRegistrar::executor("sha256"))));
	precompiled.insert(make_pair(Address(3), PrecompiledContract(600, 120, PrecompiledRegistrar::executor("ripemd160"))));
	precompiled.insert(make_pair(Address(4), PrecompiledContract(15, 3, PrecompiledRegistrar::executor("identity"))));

Listing 4: Pre-compiled Contracts
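To make the numbers in listing 4 more concrete, the sketch below assumes that the two integers passed to PrecompiledContract are a base gas cost and a cost per 32-byte input word. That reading is an assumption, and the Python function names are hypothetical. Only sha256 and identity are emulated, since they need nothing beyond the standard library.

```python
import hashlib

# (base gas, gas per 32-byte word), copied from listing 4.
PRECOMPILED_GAS = {
    1: (3000, 0),   # ecrecover
    2: (60, 12),    # sha256
    3: (600, 120),  # ripemd160
    4: (15, 3),     # identity
}


def precompiled_gas(address, input_size):
    """Gas charged for calling a pre-compiled contract with input_size bytes."""
    base, per_word = PRECOMPILED_GAS[address]
    words = (input_size + 31) // 32  # round up to whole 256-bit words
    return base + per_word * words


def call_precompiled(address, data):
    """Emulate the output of the sha256 and identity contracts only."""
    if address == 2:
        return hashlib.sha256(data).digest()
    if address == 4:
        return bytes(data)  # identity: the output equals the input
    raise NotImplementedError("ecrecover and ripemd160 are left out of this sketch")
```

Under this reading, hashing 64 bytes with the sha256 contract would cost 60 + 12 * 2 = 84 gas.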

User-defined functions (Solidity)

In order to call user-defined functions, another level of abstraction is managed by the instruction CALLDATALOAD. The first parameter of that instruction is the offset in the current environment block.

The first 4 bytes hold the 32-bit hash of the called function, and the input parameters follow. Listing 5 shows an example.

function foo(int a, int b) returns (int) {
   return a + b;
}

Listing 5: CALLDATALOAD Example

In the previous example, the outcome would be a = calldataload(0x4) and b = calldataload(0x24). It is imperative to remember that by default "registers" are 256 bits wide. Since the first 4 bytes are reserved for the function's hash value, the first parameter is at offset 0x4, followed by the second parameter at offset 0x24. The second offset is obtained by adding the size in bytes of the first parameter to its offset: 4 + (256/8) = 36 = 0x24. We can then conclude the EVM pseudo-code shown in listing 6.

return(add(calldataload(0x4), calldataload(0x24)))

Listing 6: CALLDATALOAD EVM Pseudo-code
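The offsets 0x4 and 0x24 can be checked with a small Python model of the call data layout. The helper names are hypothetical, and the 4-byte selector is a placeholder rather than the real hash of foo.

```python
def encode_call(selector, *args):
    """Concatenate a 4-byte function hash with 32-byte big-endian arguments."""
    assert len(selector) == 4
    return selector + b"".join((a % 2**256).to_bytes(32, "big") for a in args)


def calldataload(calldata, offset):
    """Read 32 bytes at `offset`, zero-padded past the end of the call data."""
    chunk = calldata[offset:offset + 32]
    return int.from_bytes(chunk.ljust(32, b"\x00"), "big")


SELECTOR = bytes.fromhex("aabbccdd")  # placeholder, NOT the real hash of foo(int256,int256)
data = encode_call(SELECTOR, 7, 35)

a = calldataload(data, 0x4)   # first parameter, right after the 4-byte hash
b = calldataload(data, 0x24)  # second parameter, 4 + 32 bytes in
```

The call data for a two-parameter function is therefore 4 + 2 * 32 = 0x44 bytes long.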

Type Discovery

Address

Addresses can be identified by their source, such as a specific instruction like caller, but in most cases we get better results by identifying the mask applied to those values.

Non-optimized address mask

In listing 7, the 0x16 bytes of EVM assembly code translate to reg256 & 0xffffffffffffffffffffffffffffffffffffffff.

0x00000188 73 ff  ff  ff  ff  +      PUSH20 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0000019d 16                        AND

Listing 7: Non-optimized Assembly Code Example

Optimized address mask

Listing 8 shows the optimized 0x9 bytes of EVM assembly code, which perform the same operation as shown previously in listing 7.

0x00000043: 60 01                 PUSH1 0x01
0x00000045: 60 A0                 PUSH1 0xA0
0x00000047: 60 02                 PUSH1 0x02
0x00000049: 0A                    EXP
0x0000004A: 03                    SUB
0x0000004B: 16                    AND

Listing 8: Optimized Assembly Code Example

We can then translate the EVM assembly code shown in listing 8 to the following 3 items:

  • and(reg256, sub(exp(2, 0xa0), 1)) (EVM)
  • reg256 & ((2 ** 0xA0) - 1) (Intermediate)
  • address (Solidity)
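A quick Python check confirms that the optimized sequence computes the same 160-bit mask as the PUSH20 constant. The variable and function names are illustrative only.

```python
ADDRESS_MASK = 2**0xA0 - 1  # sub(exp(2, 0xa0), 1)


def as_address(word):
    """Keep only the low 160 bits of a 256-bit stack value."""
    return word & ADDRESS_MASK


# A 256-bit value whose upper 96 bits contain unrelated data.
dirty = (0xDEADBEEF << 160) | 0x00112233445566778899AABBCCDDEEFF00112233
clean = as_address(dirty)
```

Seeing either form of this mask applied to a value is therefore a strong hint that the value is of type address.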

For instance, the EVM bytecode in listing 9 is simply the equivalent of the msg.sender variable in Solidity.

CALLER
PUSH1 0x01
PUSH1 0xA0
PUSH1 0x02
EXP
SUB
AND

Listing 9: msg.sender EVM Bytecode Example

Parameter Address Mask

0x0000003a 60 04                      PUSH1 04
0x0000003e 35                         CALLDATALOAD
(...)
0x00000058 73 ff  ff  ff  ff  +      PUSH20 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0000006d 16                         AND
0x0000006e 6c 00  00  00  00  +      PUSH13 00 00 00 00 00 00 00 00 00 00 00 00 01
0x0000007c 02                         MUL

Listing 10: Parameter Address Mask Example

In listing 10, we can see EVM assembly code that translates to mul(and(arg_4, 0xffffffffffffffffffffffffffffffffffffffff), 0x1000000000000000000000000), which is in fact an optimization to mask addresses passed as parameters before storing them in memory.

Smart-Contract

When compiling a new smart-contract with Solidity, you can choose between the two options shown below to retrieve the bytecode.

  • --bin
  • --bin-runtime

The first one outputs the binary of the entire contract, which includes its pre-loader, while the second one outputs the binary of the runtime part of the contract, which is the part we are interested in for analysis.

Pre-Loader

Listing 11 is a copy of the output from the Porosity disassembler representing the pre-loader. The instruction CODECOPY is used to copy the runtime part of the contract into the EVM's memory. The offset 0x002b is where the runtime part starts, while 0x00 is the destination address.

Note that in Ethereum assembly, PUSH/RETURN means the value pushed will be the return value from the function and won't affect the execution address.

0x00000000 60 60                      PUSH1 60
0x00000002 60 40                      PUSH1 40
0x00000004 52                         MSTORE
0x00000005 60 00                      PUSH1 00
0x00000007 60 01                      PUSH1 01
0x00000009 60 00                      PUSH1 00
0x0000000b 61 00  01                  PUSH2 00 01
0x0000000e 0a                         EXP
0x0000000f 81                         DUP2
0x00000010 54                         SLOAD
0x00000011 81                         DUP2
0x00000012 60 ff                      PUSH1 ff
0x00000014 02                         MUL
0x00000015 19                         NOT
0x00000016 16                         AND
0x00000017 90                         SWAP1
0x00000018 83                         DUP4
0x00000019 02                         MUL
0x0000001a 17                         OR
0x0000001b 90                         SWAP1
0x0000001c 55                         SSTORE
0x0000001d 50                         POP
0x0000001e 61 bb  01                  PUSH2 bb 01
0x00000021 80                         DUP1
0x00000022 61 2b  00                  PUSH2 2b 00
0x00000025 60 00                      PUSH1 00
0x00000027 39                         CODECOPY
0x00000028 60 00                      PUSH1 00
0x0000002a f3                         RETURN

Listing 11: Porosity Pre-loader Disassembly Output

Runtime Dispatcher

At the beginning of the runtime part of each contract, we find a dispatcher that branches to the right function when the contract is invoked.

Function Hashes

As we discussed earlier in the user-defined function section, the first 4 bytes of the environment block are used to pass the function hash to the runtime dispatcher that we will describe shortly. The function hash itself is generated from the ABI definition of the function using the logic presented in listing 12.

[{"constant":false,"inputs":[{"name":"a","type":"uint256"}],"name":"double","outputs":[{"name":"","type":"uint256"}],"type":"function"},{"constant":false,"inputs":[{"name":"a","type":"uint256"}],"name":"triple","outputs":[{"name":"","type":"uint256"}], "type":"function"}]

Listing 12: ABI Definition

We take the first 4 bytes of the sha3 (keccak256) value for the string functionName(param1Type,param2Type,etc.). For instance, if we consider the above function double then we need to consider the string double(uint256) as illustrated below in listing 13:

keccak256("double(uint256)") => eee972066698d890c32fec0edb38a360c32b71d0a29ffc75b6ab6d2774ec9901

Listing 13: double Function Declaration

This means that the function signature/hash is 0xeee97206 as extracted from the return value shown above in listing 13. If we repeat the same operation for the triple(uint256) function then we will get the values shown in listing 14.

Contract::setABI: Name: double(uint256)
Contract::setABI: signature: 0xeee97206
Contract::setABI: Name: triple(uint256)
Contract::setABI: signature: 0xf40a049d

Listing 14: double/triple Function Hashes
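The hashes above can be reproduced without any Ethereum tooling. Python's hashlib.sha3_256 implements the final SHA-3 standard, which uses a different padding byte than the original Keccak-256 that Ethereum relies on, so the sketch below carries its own compact pure-Python Keccak-256, modelled on the Keccak team's compact reference code. The function names are hypothetical, and the code favours brevity over speed.

```python
def _rol64(a, n):
    n %= 64
    return ((a << n) | (a >> (64 - n))) & (2**64 - 1)


def _keccak_f1600(state):
    """Apply the 24-round Keccak-f[1600] permutation to a 200-byte state."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    r = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: round constants generated by an 8-bit LFSR
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data):
    """Original Keccak-256 (padding byte 0x01), as used by Ethereum's sha3."""
    rate = 136  # bytes: 1088-bit rate, 512-bit capacity
    padded = bytearray(data) + b"\x01" + bytes(-(len(data) + 1) % rate)
    padded[-1] ^= 0x80
    state = bytearray(200)
    for off in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[off + i]
        state = _keccak_f1600(state)
    return bytes(state[:32])


def function_hash(signature):
    """First 4 bytes of keccak256 over a signature such as 'double(uint256)'."""
    return keccak256(signature.encode("ascii"))[:4].hex()
```

Calling function_hash("double(uint256)") and function_hash("triple(uint256)") should give the two signatures reported by Porosity in listing 14.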

Dispatcher

Using the --disassm parameter of Porosity together with the --abi definition, Porosity generates a readable disassembly output that resolves the symbols based on the ABI definition. It also isolates each basic block, which will help a lot in the explanation of this section. We can go ahead and examine the runtime bytecode shown in listing 15.

606060405260e060020a6000350463eee9720681146024578063f40a049d146035575b005b60456004356000604f8260025b0290565b60456004356000604f8260036031565b6060908152602090f35b9291505056

Listing 15: EVM Runtime Bytecode Example

Porosity generates the disassembly shown in listing 16 for the previously mentioned runtime bytecode, which was obtained from the EVM itself.

loc_00000000:
0x00000000 60 60                      PUSH1 60 
0x00000002 60 40                      PUSH1 40 
0x00000004 52                         MSTORE 
0x00000005 60 e0                      PUSH1 e0 
0x00000007 60 02                      PUSH1 02 
0x00000009 0a                         EXP 
0x0000000a 60 00                      PUSH1 00 
0x0000000c 35                         CALLDATALOAD 
0x0000000d 04                         DIV 
0x0000000e 63 06  72  e9  ee          PUSH4 06 72 e9 ee 
0x00000013 81                         DUP2 
0x00000014 14                         EQ 
0x00000015 60 24                      PUSH1 24 
0x00000017 57                         JUMPI 

loc_00000018:
0x00000018 80                         DUP1 
0x00000019 63 9d  04  0a  f4          PUSH4 9d 04 0a f4 
0x0000001e 14                         EQ 
0x0000001f 60 35                      PUSH1 35 
0x00000021 57                         JUMPI 

loc_00000022:
0x00000022 5b                         JUMPDEST 
0x00000023 00                         STOP 

double(uint256):
0x00000024 5b                         JUMPDEST 
0x00000025 60 45                      PUSH1 45 
0x00000027 60 04                      PUSH1 04 
0x00000029 35                         CALLDATALOAD 
0x0000002a 60 00                      PUSH1 00 
0x0000002c 60 4f                      PUSH1 4f 
0x0000002e 82                         DUP3 
0x0000002f 60 02                      PUSH1 02 

loc_00000031:
0x00000031 5b                         JUMPDEST 
0x00000032 02                         MUL 
0x00000033 90                         SWAP1 
0x00000034 56                         JUMP 

triple(uint256):
0x00000035 5b                         JUMPDEST 
0x00000036 60 45                      PUSH1 45 
0x00000038 60 04                      PUSH1 04 
0x0000003a 35                         CALLDATALOAD 
0x0000003b 60 00                      PUSH1 00 
0x0000003d 60 4f                      PUSH1 4f 
0x0000003f 82                         DUP3 
0x00000040 60 03                      PUSH1 03 
0x00000042 60 31                      PUSH1 31 
0x00000044 56                         JUMP 

loc_00000045:
0x00000045 5b                         JUMPDEST 
0x00000046 60 60                      PUSH1 60 
0x00000048 90                         SWAP1 
0x00000049 81                         DUP2 
0x0000004a 52                         MSTORE 
0x0000004b 60 20                      PUSH1 20 
0x0000004d 90                         SWAP1 
0x0000004e f3                         RETURN 

loc_0000004f:
0x0000004f 5b                         JUMPDEST 
0x00000050 92                         SWAP3 
0x00000051 91                         SWAP2 
0x00000052 50                         POP 
0x00000053 50                         POP 
0x00000054 56                         JUMP 

Listing 16: Runtime Bytecode Porosity Disassembly
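The block boundaries in the disassembly above can be recovered with a simple linear sweep. The Python sketch below is not how Porosity is implemented. It is a minimal illustration that starts a new block at every JUMPDEST and after every terminating instruction, and records only the jump targets that are pushed immediately before the jump. The function names are hypothetical, and the bytecode is the one from listing 15.

```python
RUNTIME = bytes.fromhex(
    "606060405260e060020a6000350463eee9720681146024578063f40a049d14603557"
    "5b005b60456004356000604f8260025b029056"
    "5b60456004356000604f826003603156"
    "5b6060908152602090f3"
    "5b9291505056"
)

JUMP, JUMPI, JUMPDEST = 0x56, 0x57, 0x5B
TERMINATORS = {0x00, JUMP, JUMPI, 0xF3, 0xFF}  # STOP, JUMP, JUMPI, RETURN, SUICIDE


def instructions(code):
    """Yield (offset, opcode, immediate bytes), skipping over PUSH operands."""
    pc = 0
    while pc < len(code):
        op = code[pc]
        width = op - 0x5F if 0x60 <= op <= 0x7F else 0  # PUSH1..PUSH32
        yield pc, op, code[pc + 1:pc + 1 + width]
        pc += 1 + width


def basic_blocks(code):
    """Return the sorted block start offsets and the statically visible jump targets."""
    leaders, static_targets = {0}, {}
    prev = None
    for pc, op, imm in instructions(code):
        if op == JUMPDEST or (prev is not None and prev[1] in TERMINATORS):
            leaders.add(pc)
        if op in (JUMP, JUMPI) and prev is not None and 0x60 <= prev[1] <= 0x7F:
            static_targets[pc] = int.from_bytes(prev[2], "big")
        prev = (pc, op, imm)
    return sorted(leaders), static_targets


blocks, targets = basic_blocks(RUNTIME)
```

The JUMP instructions at 0x34 and 0x54 do not appear in targets because their destinations were pushed earlier and moved around with SWAP, which is exactly the situation that requires dynamic execution.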

First, the dispatcher reads the 4-byte function hash from the environment block by computing calldataload(0x0) / exp(0x2, 0xe0). Since the calldataload instruction reads a 256-bit integer by default, it is followed by a division that keeps only the first 32 bits.

(0x12345678aaaaaaaabbbbbbbbccccccccdddddddd000000000000000000000000 / 0x0000000100000000000000000000000000000000000000000000000000000000) = 0x12345678

Listing 17: Function Hash Extraction Example

We can emulate this code using the EVM emulator, or using Porosity, as illustrated in listing 18.

PS C:\Program Files\Geth> .\evm.exe --code 60e060020a6000350463deadbabe --debug --input 12345678aaaaaaaabbbbbbbbccccccccdddddddd
PC 00000014: STOP GAS: 9999999920 COST: 0
STACK = 2
0000: 00000000000000000000000000000000000000000000000000000000deadbabe
0001: 0000000000000000000000000000000000000000000000000000000012345678
MEM = 0
STORAGE = 0

Listing 18: EVM Emulator

We can notice there are two PUSH4 instructions that correspond to the function hashes we previously computed. In the above scenario the equivalent EVM code translates to the pseudo-code jumpi(eq(calldataload(0x0) / exp(0x2, 0xe0), 0xeee97206)). Using the Control Flow Graph (CFG) feature of Porosity, we can generate a static CFG or a dynamic CFG. Both graphs are generated in GraphViz format. The static CFG often contains orphan basic blocks, because some destination addresses are computed at runtime, while the dynamic CFG resolves those orphan basic blocks by emulating the code, as we can see in the output of both fig. 1 and fig. 2.

porosity CFG

digraph porosity {
rankdir = TB;
size = "12"
graph[fontname = Courier, fontsize = 10.0, labeljust = l, nojustify = true];node[shape = record];
    "0x00000000"[label = "loc_0x00000000"];
    "0x00000000" -> "0x00000018" [color="red"];
    "0x00000000" -> "0x00000024" [color="green"];
    "0x00000018"[label = "loc_0x00000018"];
    "0x00000018" -> "0x00000022" [color="red"];
    "0x00000018" -> "0x00000035" [color="green"];
    "0x00000022"[label = "loc_0x00000022"];
    "0x00000024"[label = "double(uint256)"];
    "0x00000024" -> "0x00000031" [color="black"];
    "0x00000031"[label = "loc_0x00000031"];
    "0x00000031" -> "0x00000045" [color="black"];
    "0x00000035"[label = "triple(uint256)"];
    "0x00000035" -> "0x00000031" [color="black"];
    "0x00000045"[label = "loc_0x00000045"];
    "0x0000004f"[label = "loc_0x0000004f"];
    "0x0000004f" -> "0xdeadbabe" [color="black"];
}

After:

digraph porosity {
rankdir = TB;
size = "12"
graph[fontname = Courier, fontsize = 10.0, labeljust = l, nojustify = true];node[shape = record];
    "0x00000000"[label = "loc_0x00000000"];
    "0x00000000" -> "0x00000018" [color="red"];
    "0x00000000" -> "0x00000024" [color="green"];
    "0x00000018"[label = "loc_0x00000018"];
    "0x00000018" -> "0x00000022" [color="red"];
    "0x00000018" -> "0x00000035" [color="green"];
    "0x00000022"[label = "loc_0x00000022"];
    "0x00000024"[label = "double(uint256)"];
    "0x00000024" -> "0x00000031" [color="black"];
    "0x00000031"[label = "loc_0x00000031"];
    "0x00000031" -> "0x0000004f" [color="black"];
    "0x00000035"[label = "triple(uint256)"];
    "0x00000035" -> "0x00000031" [color="black"];
    "0x00000045"[label = "loc_0x00000045"];
    "0x0000004f"[label = "loc_0x0000004f"];
    "0x0000004f" -> "0x00000045" [color="black"];
}

This helps us translate the graph into the following C-like pseudo-code, as shown in listing 19.

hash = calldataload(0x0) / exp(0x2, 0xe0);
switch (hash) {
    case 0xeee97206: // double(uint256)
        memory[0x60] = calldataload(0x4) * 2;
        return memory[0x60];
    break;
    case 0xf40a049d: // triple(uint256)
        memory[0x60] = calldataload(0x4) * 3;
        return memory[0x60];
    break;
    default:
    // STOP
    break;
}

Listing 19: Static/Dynamic Graph Pseudo-C Code
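The same dispatching logic can be emulated in Python to check the pseudo-code against concrete call data. This is a sketch under stated assumptions: the helper names are hypothetical, the function hashes are the ones from listing 14, and the arithmetic wraps at 256 bits as it would on the EVM.

```python
def calldataload(calldata, offset):
    """Read a 256-bit word, zero-padded past the end of the call data."""
    chunk = calldata[offset:offset + 32]
    return int.from_bytes(chunk.ljust(32, b"\x00"), "big")


# Function hash -> handler, mirroring the two cases of the switch statement.
HANDLERS = {
    0xEEE97206: lambda arg: arg * 2 % 2**256,  # double(uint256)
    0xF40A049D: lambda arg: arg * 3 % 2**256,  # triple(uint256)
}


def dispatch(calldata):
    function_hash = calldataload(calldata, 0x0) // 2**0xE0
    handler = HANDLERS.get(function_hash)
    if handler is None:
        return None  # default branch: STOP, nothing is returned
    return handler(calldataload(calldata, 0x4))


call_double = bytes.fromhex("eee97206") + (21).to_bytes(32, "big")
call_triple = bytes.fromhex("f40a049d") + (14).to_bytes(32, "big")
unknown = bytes.fromhex("12345678aaaaaaaabbbbbbbbccccccccdddddddd")
```

The last input is the one used with the EVM emulator earlier. Its hash 0x12345678 matches neither function, so the dispatcher falls through to STOP.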

As we can notice from the above pseudo-code, each runtime code has a dispatcher that branches to each user-defined function. Once it is decompiled we get the output shown in listing 20.

contract C { 
    function double(int arg_4) {
        return arg_4 * 2;
    }

    function triple(int arg_4) {
        return arg_4 * 3;
    }
}

Listing 20: Decompiled Pseudo-C code

Code Analysis

Vulnerable Contract

Let's take a simple vulnerable smart-contract such as the one shown in listing 21. The detailed analysis of the vulnerability has been published by Abhiroop Sarkar in his blog.

Solidity source code

contract SendBalance {
    mapping (address => uint) userBalances;
    bool withdrawn = false;

    function getBalance(address u) constant returns (uint) {
        return userBalances[u];
    }

    function addToBalance() {
        userBalances[msg.sender] += msg.value;
    }

    function withdrawBalance() {
        if (!(msg.sender.call.value(userBalances[msg.sender])())) { throw; }
        userBalances[msg.sender] = 0;
    }
}

Listing 21: Vulnerable Smart Contract

runtime bytecode:

60606040526000357c0100000000000000000000000000000000000000000000000000000000900480635fd8c7101461004f578063c0e317fb1461005e578063f8b2cb4f1461006d5761004d565b005b61005c6004805050610099565b005b61006b600480505061013e565b005b610083600480803590602001909190505061017d565b6040518082815260200191505060405180910390f35b3373ffffffffffffffffffffffffffffffffffffffff16600060005060003373ffffffffffffffffffffffffffffffffffffffff1681526020019081526020016000206000505460405180905060006040518083038185876185025a03f192505050151561010657610002565b6000600060005060003373ffffffffffffffffffffffffffffffffffffffff168152602001908152602001600020600050819055505b565b34600060005060003373ffffffffffffffffffffffffffffffffffffffff1681526020019081526020016000206000828282505401925050819055505b565b6000600060005060008373ffffffffffffffffffffffffffffffffffffffff1681526020019081526020016000206000505490506101b6565b91905056

Listing 22: Vulnerable Smart Contract Runtime Bytecode

ABI Definition:

"[{"constant":false,"inputs":[],"name":"withdrawBalance","outputs":[],"type":"function"},{"constant":false,"inputs":[],"name":"addToBalance","outputs":[],"type":"function"},{"constant":true,"inputs":[{"name":"u","type":"address"}],"name":"getBalance","outputs":[{"name":"","type":"uint256"}],"type":"function"}]"

Listing 23: Vulnerable Smart Contract ABI Definition

Decompiled version

function getBalance(address) {
      return store[arg_4];
}

function addToBalance() {
      store[msg.sender] = store[msg.sender];
      return;
}

function withdrawBalance() {
      if (msg.sender.call.value(store[msg.sender])()) {
         store[msg.sender] = 0x0;
      }
}


**L12 (D8193): Potential reentrant vulnerability found.**

Listing 24: Vulnerable Smart Contract Decompilation

Bugs

Keeping an eye on Solidity compiler bugs is also important when auditing contracts.

Reentrant Vulnerability / Race Condition

Also known as the DAO vulnerability, this is similar to the issue in the SendBalance contract above. In the meantime, significant changes have been made to the EVM, including the introduction of a REVERT instruction to restore a state.

As explained here

call the function to execute a split before that withdrawal finishes. The function will start running without updating your balance, and the line we marked above as "the attacker wants to run more than once" will run more than once.
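The effect on the SendBalance contract can be simulated in plain Python. This is a simplified model, not EVM semantics: the classes are hypothetical stand-ins, the receive method plays the role of the attacker's fallback function, and gas and state reverts are ignored.

```python
class SendBalance:
    """Python stand-in for the contract in listing 21 (amounts in wei)."""

    def __init__(self):
        self.user_balances = {}
        self.funds = 0  # ether held by the contract

    def add_to_balance(self, sender, value):
        self.user_balances[sender] = self.user_balances.get(sender, 0) + value
        self.funds += value

    def withdraw_balance(self, sender):
        amount = self.user_balances.get(sender, 0)
        if amount > self.funds:
            raise RuntimeError("throw")  # the external call would fail
        self.funds -= amount
        sender.receive(self, amount)     # the external call happens first...
        self.user_balances[sender] = 0   # ...and the balance is cleared too late


class Attacker:
    def __init__(self, reentries):
        self.reentries = reentries
        self.stolen = 0

    def receive(self, bank, amount):
        self.stolen += amount
        if self.reentries > 0:           # call back in before the balance is zeroed
            self.reentries -= 1
            bank.withdraw_balance(self)


class Victim:
    def receive(self, bank, amount):
        pass


bank = SendBalance()
victim, attacker = Victim(), Attacker(reentries=2)
bank.add_to_balance(victim, 20)
bank.add_to_balance(attacker, 10)
bank.withdraw_balance(attacker)
```

With a deposit of 10 and two re-entries, the attacker withdraws 30 and drains the victim's funds. Clearing the balance before making the external call would limit the withdrawal to the 10 that was deposited.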

Call stack Vulnerability

The call stack attack, described here by LeastAuthority, takes advantage of the fact that a CALL operation fails if it causes the stack depth to exceed 1024 frames, which also happens to be the limit of the stack described earlier. The CALL fails without raising an exception, unlike a stack underflow, which happens when the expected entries are not present on the stack when a specific instruction is invoked. This is a known problem: the failure is reported as an error value instead of reverting the state back to the caller. Solidity contracts often lack assert checks, due to the poor support for actual unit testing. Because triggering this problem requires a special, environment-specific condition, we cannot easily spot it through static analysis. One potential mitigation would be for the EVM to implement integrity checks before executing a contract, ensuring that the state of the stack and the depth required by the contract (computed either dynamically or statically by the compiler) are met.
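The silent failure can be illustrated with a toy Python model. The names are hypothetical and the depth accounting is an assumption made for illustration: call returns False at the depth limit instead of raising, and a careless contract that ignores the return value carries on as if the payment had been made.

```python
MAX_CALL_DEPTH = 1024


def call(target, depth):
    """Mimic CALL: return False instead of raising when the depth limit is hit."""
    if depth >= MAX_CALL_DEPTH:
        return False
    target(depth + 1)
    return True


def careless_payout(depth, log):
    # The return value of call() is ignored, as in a contract without checks.
    call(lambda d: log.append("paid"), depth)
    log.append("marked as paid")


normal, attacked = [], []
careless_payout(0, normal)                  # regular invocation
careless_payout(MAX_CALL_DEPTH, attacked)   # invoked by an attacker at maximum depth
```

In the attacked case the recipient is marked as paid even though the payment never happened, which is why the return value of every external call needs to be checked.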

Time Dependence Vulnerability

TIMESTAMP returns the timestamp of the current block and should not be relied upon, as the timestamp of a block can be predicted or manipulated by the miner. Developers must keep this in mind and be extremely careful when implementing routines that depend on such a variable. This was well explained by the case study from @mhswende on Ethereum Roulette [12], which shows how an implementation of Ethereum Roulette was abused.

Future

As contracts are embedded in the blockchain, there is no easy way to deploy updates to patch existing contracts like we would do with any regular software. This is an implementation limitation to understand. Regular software development has seen the integration and the rise of the Security Development Lifecycle (SDL) as part of its development lifecycle. This increasingly popular process also includes practices such as threat modeling, which have yet to be seen in the smart-contract world, regardless of the platform itself.

There is also a growing community that aims at raising awareness of how to write secure Solidity code, such as the "Underhanded Solidity Coding Contest" (USCC) [15], announced in early July for the first time, which judges code containing hidden vulnerabilities that can be interpreted as backdoors. Such vulnerabilities or backdoors are not obvious during the code auditing process, and can easily be misinterpreted and dismissed as coder errors. The first USCC contest is themed around Initial Coin Offerings (ICOs), and includes the Solidity lead developer, Christian Reitwiessner, in its jury. In addition, some forks such as Quorum [16] are attracting interest by adding a privacy layer on top of the smart-contract blockchain, which is often required and currently missing from the actual Ethereum implementation.

In March 2017, Martin Becze, the Ethereum Foundation's JavaScript client developer, outlined the next stages of the eWASM initiative, which aims at entirely replacing the Ethereum Virtual Machine with WebAssembly. Since most browser JavaScript engines (Google's V8, Microsoft's Chakra, Mozilla's SpiderMonkey, etc.) will have native support for WebAssembly, this will definitely enlarge the landscape of software and application development on Ethereum and blockchain, including its future attack surface.

Resources

References

Acknowledgments

  • Mohamed Saher
  • Halvar Flake
  • DEFCON Review Board Team
  • Max Vorobjov & Andrey Bazhan
  • Gavin Wood
  • Andreas Olofsson