The eDSL high-level notations make the EVM specifications more succinct and closer to their high-level specifications. The succinctness increases the readability, and the closeness helps "eye-ball validation" of the specification refinement. The high-level notations are defined by translation to the corresponding EVM terms, and thus can be freely used with other EVM terms. The notations are inspired by the production compilers of the smart contract languages like Solidity and Vyper, and their definition is derived by formalizing the corresponding translation made by the compilers.
module EDSL-NOTATIONS [symbolic]
imports EVM
When a function is called in the EVM, its arguments are encoded in a single byte-array and put in the so-called 'call data' section.
The encoding is defined in the Ethereum contract application binary interface (ABI) specification.
The eDSL provides #abiCallData
, a notation to specify the ABI call data in a way similar to a high-level function call notation, defined below.
It specifies the function name and the (symbolic) arguments along with their types.
For example, the following notation represents a data that encodes a call to the transfer
function with two arguments: TO
, the receiver account address of type address
(an 160-bit unsigned integer), and VALUE
, the value to transfer of type unit256
(a 256-bit unsigned integer).
#abiCallData("transfer", #address(TO), #uint256(VALUE))
which denotes (indeed, is translated to) the following byte array:
F1 : F2 : F3 : F4 : T1 : ... : T32 : V1 : ... : V32
where F1 : F2 : F3 : F4
is the (two's complement) byte-array representation of 2835717307
, the first four bytes of the hash value of the transfer
function signature, keccak256("transfer(address,unit256)")
, and T1 : ... : T32
and V1 : ... : V32
are the byte-array representations of TO
and VALUE
respectively.
syntax TypedArg ::= #uint160 ( Int )
| #address ( Int )
| #uint256 ( Int )
// ------------------------------------
syntax TypedArgs ::= List{TypedArg, ","} [klabel(typedArgs)]
// ------------------------------------------------------------
syntax WordStack ::= #abiCallData ( String , TypedArgs ) [function]
// -------------------------------------------------------------------
rule #abiCallData( FNAME , ARGS )
=> #parseByteStack(substrString(Keccak256(#generateSignature(FNAME, ARGS)), 0, 8))
++ #encodeArgs(ARGS)
syntax String ::= #generateSignature ( String, TypedArgs ) [function]
| #generateSignatureArgs ( TypedArgs ) [function]
// -------------------------------------------------------------------------
rule #generateSignature( FNAME , ARGS ) => FNAME +String "(" +String #generateSignatureArgs(ARGS) +String ")"
rule #generateSignatureArgs(.TypedArgs) => ""
rule #generateSignatureArgs(TARGA:TypedArg, .TypedArgs) => #typeName(TARGA)
rule #generateSignatureArgs(TARGA:TypedArg, TARGB:TypedArg, TARGS) => #typeName(TARGA) +String "," +String #generateSignatureArgs(TARGB, TARGS)
syntax String ::= #typeName ( TypedArg ) [function]
// ---------------------------------------------------
rule #typeName(#uint160( _ )) => "uint160"
rule #typeName(#address( _ )) => "address"
rule #typeName(#uint256( _ )) => "uint256"
syntax WordStack ::= #encodeArgs ( TypedArgs ) [function]
// ---------------------------------------------------------
rule #encodeArgs(ARG, ARGS) => #getData(ARG) ++ #encodeArgs(ARGS)
rule #encodeArgs(.TypedArgs) => .WordStack
syntax WordStack ::= #getData ( TypedArg ) [function]
// -----------------------------------------------------
rule #getData(#uint160( DATA )) => #asByteStackInWidth( DATA , 32 )
rule #getData(#address( DATA )) => #asByteStackInWidth( DATA , 32 )
rule #getData(#uint256( DATA )) => #asByteStackInWidth( DATA , 32 )
EVM logs are special data structures in the blockchain, being searchable by off-chain clients. Events are high-level wrappers of the EVM logs provided in the high-level languages. Contracts can declare and generate the events, which will be compiled down to the EVM bytecode using the EVM log instructions. The encoding scheme of the events in the EVM logs is defined in the Ethereum contract application binary interface (ABI) specification, leveraging the ABI call data encoding scheme.
The eDSL provides #abiEventLog
, a notation to specify the EVM logs in the high-level events, defined below.
It specifies the contract account address, the event name, and the event arguments.
For example, the following notation represents an EVM log data that encodes the Transfer
event generated by the transfer
function, where ACCT_ID
is the account address, and CALLER_ID
, TO_ID
, and VALUE
are the event arguments.
Each argument is tagged with its ABI type (#address
or #uint256
), and the indexed
attribute (#indexed
) if any, according to the event declaration in the contract.
#abiEventLog(ACCT_ID, "Transfer", #indexed(#address(CALLER_ID)), #indexed(#address(TO_ID)), #uint256(VALUE))
The above notation denotes (i.e., is translated to) the following EVM log data structure:
{ ACCT_ID | |`
| 100389287136786176327247604509743168900146139575972864366142685224231313322991
: CALLER_ID |/|
: TO_ID | |
: .WordStack | |
| #asByteStackInWidth(VALUE, 32) | |
}
where 100389287136786176327247604509743168900146139575972864366142685224231313322991
is the hash value of the event signature, keccak256("Transfer(address,address,unit256)")
.
syntax EventArg ::= TypedArg
| #indexed ( TypedArg )
// -----------------------------------------
syntax EventArgs ::= List{EventArg, ","} [klabel(eventArgs)]
// ------------------------------------------------------------
syntax SubstateLogEntry ::= #abiEventLog ( Int , String , EventArgs ) [function]
// --------------------------------------------------------------------------------
rule #abiEventLog(ACCT_ID, EVENT_NAME, EVENT_ARGS)
=> { ACCT_ID | #getEventTopics(EVENT_NAME, EVENT_ARGS) | #getEventData(EVENT_ARGS) }
syntax WordStack ::= #getEventTopics ( String , EventArgs ) [function]
// ----------------------------------------------------------------------
rule #getEventTopics(ENAME, EARGS)
=> #parseHexWord(Keccak256(#generateSignature(ENAME, #getTypedArgs(EARGS))))
: #getIndexedArgs(EARGS)
syntax TypedArgs ::= #getTypedArgs ( EventArgs ) [function]
// -----------------------------------------------------------
rule #getTypedArgs(#indexed(E), ES) => E, #getTypedArgs(ES)
rule #getTypedArgs(E:TypedArg, ES) => E, #getTypedArgs(ES)
rule #getTypedArgs(.EventArgs) => .TypedArgs
syntax WordStack ::= #getIndexedArgs ( EventArgs ) [function]
// -------------------------------------------------------------
rule #getIndexedArgs(#indexed(E), ES) => #getValue(E) : #getIndexedArgs(ES)
rule #getIndexedArgs(_:TypedArg, ES) => #getIndexedArgs(ES)
rule #getIndexedArgs(.EventArgs) => .WordStack
syntax WordStack ::= #getEventData ( EventArgs ) [function]
// -----------------------------------------------------------
rule #getEventData(#indexed(_), ES) => #getEventData(ES)
rule #getEventData(E:TypedArg, ES) => #getData(E) ++ #getEventData(ES)
rule #getEventData(.EventArgs) => .WordStack
syntax Int ::= #getValue ( TypedArg ) [function]
// ------------------------------------------------
rule #getValue(#uint160(V)) => V
rule #getValue(#address(V)) => V
rule #getValue(#uint256(V)) => V
The storage accommodates permanent data such as the balances
map.
A map is laid out in the storage where the map entries are scattered over the entire storage space using the (256-bit) hash of each key to determine the location.
The detailed mechanism of calculating the location varies by compilers.
In Vyper, for example, map[key1][key2]
is stored at the location:
hash(hash(idx(map)) + key1) + key2
where idx(map)
is the position index of map
in the program, and +
is the addition modulo 2**256
, while in Solidity, it is stored at:
hash(key2 ++ hash(key1 ++ idx(map)))
where ++
is byte-array concatenation.
The eDSL provides #hashedLocation
that allows to uniformly specify the locations in a form parameterized by the underlying compilers. For example, the location of map[key1][key2]
can be specified as follows, where {COMPILER}
is a place-holder to be replaced by the name of the compiler.
Note that the keys are separated by the white spaces instead of commas.
#hashedLocation({COMPILER}, idx(map), key1 key2)
This notation makes the specification independent of the underlying compilers, enabling it to be reused for differently compiled programs.
Specifically, #hashedLocation
is defined as follows, capturing the storage layout schemes of Solidity and Vyper.
syntax IntList ::= List{Int, ""} [klabel(intList)]
syntax Int ::= #hashedLocation( String , Int , IntList ) [function]
// -----------------------------------------------------------------------
rule #hashedLocation(LANG, BASE, .IntList) => BASE
rule #hashedLocation("Vyper", BASE, OFFSET OFFSETS) => #hashedLocation("Vyper", keccakIntList(BASE) +Word OFFSET, OFFSETS)
rule #hashedLocation("Solidity", BASE, OFFSET OFFSETS) => #hashedLocation("Solidity", keccakIntList(OFFSET BASE), OFFSETS)
syntax Int ::= keccakIntList( IntList ) [function]
// --------------------------------------------------
rule keccakIntList(VS) => keccak(intList2ByteStack(VS))
syntax WordStack ::= intList2ByteStack( IntList ) [function]
// ------------------------------------------------------------
rule intList2ByteStack(.IntList) => .WordStack
rule intList2ByteStack(V VS) => #asByteStackInWidth(V, 32) ++ intList2ByteStack(VS)
requires 0 <=Int V andBool V <Int pow256
endmodule