Skip to content

Latest commit

 

History

History
185 lines (154 loc) · 11.3 KB

edsl-notations.md

File metadata and controls

185 lines (154 loc) · 11.3 KB

eDSL High-Level Notations

The eDSL high-level notations make the EVM specifications more succinct and closer to their high-level specifications. The succinctness increases the readability, and the closeness helps "eye-ball validation" of the specification refinement. The high-level notations are defined by translation to the corresponding EVM terms, and thus can be freely used with other EVM terms. The notations are inspired by the production compilers of the smart contract languages like Solidity and Vyper, and their definition is derived by formalizing the corresponding translation made by the compilers.

module EDSL-NOTATIONS [symbolic]
    imports EVM

ABI Call Data

When a function is called in the EVM, its arguments are encoded in a single byte-array and put in the so-called 'call data' section. The encoding is defined in the Ethereum contract application binary interface (ABI) specification. The eDSL provides #abiCallData, a notation to specify the ABI call data in a way similar to a high-level function call notation, defined below. It specifies the function name and the (symbolic) arguments along with their types. For example, the following notation represents a data that encodes a call to the transfer function with two arguments: TO, the receiver account address of type address (an 160-bit unsigned integer), and VALUE, the value to transfer of type unit256 (a 256-bit unsigned integer).

  #abiCallData("transfer", #address(TO), #uint256(VALUE))

which denotes (indeed, is translated to) the following byte array:

  F1 : F2 : F3 : F4 : T1 : ... : T32 : V1 : ... : V32

where F1 : F2 : F3 : F4 is the (two's complement) byte-array representation of 2835717307, the first four bytes of the hash value of the transfer function signature, keccak256("transfer(address,unit256)"), and T1 : ... : T32 and V1 : ... : V32 are the byte-array representations of TO and VALUE respectively.

    syntax TypedArg ::= #uint160 ( Int )
                      | #address ( Int )
                      | #uint256 ( Int )
 // ------------------------------------

    syntax TypedArgs ::= List{TypedArg, ","} [klabel(typedArgs)]
 // ------------------------------------------------------------

    syntax WordStack ::= #abiCallData ( String , TypedArgs ) [function]
 // -------------------------------------------------------------------
    rule #abiCallData( FNAME , ARGS )
      => #parseByteStack(substrString(Keccak256(#generateSignature(FNAME, ARGS)), 0, 8))
      ++ #encodeArgs(ARGS)

    syntax String ::= #generateSignature     ( String, TypedArgs ) [function]
                    | #generateSignatureArgs ( TypedArgs )         [function]
 // -------------------------------------------------------------------------
    rule #generateSignature( FNAME , ARGS ) => FNAME +String "(" +String #generateSignatureArgs(ARGS) +String ")"

    rule #generateSignatureArgs(.TypedArgs)                            => ""
    rule #generateSignatureArgs(TARGA:TypedArg, .TypedArgs)            => #typeName(TARGA)
    rule #generateSignatureArgs(TARGA:TypedArg, TARGB:TypedArg, TARGS) => #typeName(TARGA) +String "," +String #generateSignatureArgs(TARGB, TARGS)

    syntax String ::= #typeName ( TypedArg ) [function]
 // ---------------------------------------------------
    rule #typeName(#uint160( _ )) => "uint160"
    rule #typeName(#address( _ )) => "address"
    rule #typeName(#uint256( _ )) => "uint256"

    syntax WordStack ::= #encodeArgs ( TypedArgs ) [function]
 // ---------------------------------------------------------
    rule #encodeArgs(ARG, ARGS)  => #getData(ARG) ++ #encodeArgs(ARGS)
    rule #encodeArgs(.TypedArgs) => .WordStack

    syntax WordStack ::= #getData ( TypedArg ) [function]
 // -----------------------------------------------------
    rule #getData(#uint160( DATA )) => #asByteStackInWidth( DATA , 32 )
    rule #getData(#address( DATA )) => #asByteStackInWidth( DATA , 32 )
    rule #getData(#uint256( DATA )) => #asByteStackInWidth( DATA , 32 )

ABI Event Logs

EVM logs are special data structures in the blockchain, being searchable by off-chain clients. Events are high-level wrappers of the EVM logs provided in the high-level languages. Contracts can declare and generate the events, which will be compiled down to the EVM bytecode using the EVM log instructions. The encoding scheme of the events in the EVM logs is defined in the Ethereum contract application binary interface (ABI) specification, leveraging the ABI call data encoding scheme.

The eDSL provides #abiEventLog, a notation to specify the EVM logs in the high-level events, defined below. It specifies the contract account address, the event name, and the event arguments. For example, the following notation represents an EVM log data that encodes the Transfer event generated by the transfer function, where ACCT_ID is the account address, and CALLER_ID, TO_ID, and VALUE are the event arguments. Each argument is tagged with its ABI type (#address or #uint256), and the indexed attribute (#indexed) if any, according to the event declaration in the contract.

  #abiEventLog(ACCT_ID, "Transfer", #indexed(#address(CALLER_ID)), #indexed(#address(TO_ID)), #uint256(VALUE))

The above notation denotes (i.e., is translated to) the following EVM log data structure:

  { ACCT_ID                                                                                                                                                                                                                                                  | |`
  | 100389287136786176327247604509743168900146139575972864366142685224231313322991
  : CALLER_ID                                                                                                                                                                                                                                                |/|
  : TO_ID                                                                                                                                                                                                                                                    | |
  : .WordStack                                                                                                                                                                                                                                               | |
  | #asByteStackInWidth(VALUE, 32)                                                                                                                                                                                                                           | |
  }

where 100389287136786176327247604509743168900146139575972864366142685224231313322991 is the hash value of the event signature, keccak256("Transfer(address,address,unit256)").

    syntax EventArg ::= TypedArg
                      | #indexed ( TypedArg )
 // -----------------------------------------

    syntax EventArgs ::= List{EventArg, ","} [klabel(eventArgs)]
 // ------------------------------------------------------------

    syntax SubstateLogEntry ::= #abiEventLog ( Int , String , EventArgs ) [function]
 // --------------------------------------------------------------------------------
    rule #abiEventLog(ACCT_ID, EVENT_NAME, EVENT_ARGS)
      => { ACCT_ID | #getEventTopics(EVENT_NAME, EVENT_ARGS) | #getEventData(EVENT_ARGS) }

    syntax WordStack ::= #getEventTopics ( String , EventArgs ) [function]
 // ----------------------------------------------------------------------
    rule #getEventTopics(ENAME, EARGS)
      => #parseHexWord(Keccak256(#generateSignature(ENAME, #getTypedArgs(EARGS))))
       : #getIndexedArgs(EARGS)

    syntax TypedArgs ::= #getTypedArgs ( EventArgs ) [function]
 // -----------------------------------------------------------
    rule #getTypedArgs(#indexed(E), ES) => E, #getTypedArgs(ES)
    rule #getTypedArgs(E:TypedArg,  ES) => E, #getTypedArgs(ES)
    rule #getTypedArgs(.EventArgs)      => .TypedArgs

    syntax WordStack ::= #getIndexedArgs ( EventArgs ) [function]
 // -------------------------------------------------------------
    rule #getIndexedArgs(#indexed(E), ES) => #getValue(E) : #getIndexedArgs(ES)
    rule #getIndexedArgs(_:TypedArg,  ES) =>                #getIndexedArgs(ES)
    rule #getIndexedArgs(.EventArgs)      => .WordStack

    syntax WordStack ::= #getEventData ( EventArgs ) [function]
 // -----------------------------------------------------------
    rule #getEventData(#indexed(_), ES) =>                #getEventData(ES)
    rule #getEventData(E:TypedArg,  ES) => #getData(E) ++ #getEventData(ES)
    rule #getEventData(.EventArgs)      => .WordStack

    syntax Int ::= #getValue ( TypedArg ) [function]
 // ------------------------------------------------
    rule #getValue(#uint160(V)) => V
    rule #getValue(#address(V)) => V
    rule #getValue(#uint256(V)) => V

Hashed Location for Storage

The storage accommodates permanent data such as the balances map. A map is laid out in the storage where the map entries are scattered over the entire storage space using the (256-bit) hash of each key to determine the location. The detailed mechanism of calculating the location varies by compilers. In Vyper, for example, map[key1][key2] is stored at the location:

  hash(hash(idx(map)) + key1) + key2

where idx(map) is the position index of map in the program, and + is the addition modulo 2**256, while in Solidity, it is stored at:

  hash(key2 ++ hash(key1 ++ idx(map)))

where ++ is byte-array concatenation.

The eDSL provides #hashedLocation that allows to uniformly specify the locations in a form parameterized by the underlying compilers. For example, the location of map[key1][key2] can be specified as follows, where {COMPILER} is a place-holder to be replaced by the name of the compiler. Note that the keys are separated by the white spaces instead of commas.

  #hashedLocation({COMPILER}, idx(map), key1 key2)

This notation makes the specification independent of the underlying compilers, enabling it to be reused for differently compiled programs. Specifically, #hashedLocation is defined as follows, capturing the storage layout schemes of Solidity and Vyper.

    syntax IntList ::= List{Int, ""}                             [klabel(intList)]
    syntax Int     ::= #hashedLocation( String , Int , IntList ) [function]
 // -----------------------------------------------------------------------
    rule #hashedLocation(LANG, BASE, .IntList) => BASE

    rule #hashedLocation("Vyper",    BASE, OFFSET OFFSETS) => #hashedLocation("Vyper",    keccakIntList(BASE) +Word OFFSET, OFFSETS)
    rule #hashedLocation("Solidity", BASE, OFFSET OFFSETS) => #hashedLocation("Solidity", keccakIntList(OFFSET BASE),       OFFSETS)

    syntax Int ::= keccakIntList( IntList ) [function]
 // --------------------------------------------------
    rule keccakIntList(VS) => keccak(intList2ByteStack(VS))

    syntax WordStack ::= intList2ByteStack( IntList ) [function]
 // ------------------------------------------------------------
    rule intList2ByteStack(.IntList) => .WordStack
    rule intList2ByteStack(V VS)     => #asByteStackInWidth(V, 32) ++ intList2ByteStack(VS)
      requires 0 <=Int V andBool V <Int pow256
endmodule