diff --git a/docs/debugging.md b/docs/debugging.md index f10624bdc..81a545c98 100644 --- a/docs/debugging.md +++ b/docs/debugging.md @@ -19,11 +19,13 @@ export NATIVE_DEBUG_DUMP=1 ### Debugging with LLDB To debug with LLDB (or another debugger), we must compile the binary with the `with-debug-utils` feature. + ```bash cargo build --bin cairo-native-run --features with-debug-utils ``` Then, we can add a debugger breakpoint trap. To add it at a given Sierra statement, we can set the following env var: + ```bash export NATIVE_DEBUG_TRAP_AT_STMT=10 ``` @@ -31,6 +33,7 @@ export NATIVE_DEBUG_TRAP_AT_STMT=10 The trap instruction may not end up exactly where the statement is. If we want to manually set the breakpoint (for example, when executing a particular libfunc), then we can use the `DebugUtils` metadata in the code. + ```rust,ignore #[cfg(feature = "with-debug-utils")] { @@ -47,6 +50,7 @@ lldb -- target/debug/cairo-native-run -s programs/recursion.cairo --available-ga ``` Some useful LLDB commands: + - `process launch`: starts the program - `frame select`: shows the current line information - `thread step-in`: makes a source level single step @@ -54,6 +58,7 @@ Some useful LLDB commands: - `disassemble --frame --mixed`: shows assembly instructions mixed with source level code ## Logging + Enable logging to see the compilation process: ```bash @@ -77,6 +82,7 @@ export RUST_LOG="cairo_native=trace" ## Debugging Contracts Contracts are difficult to debug for various reasons, including: + - They are external to the project. - We don’t have their source code. - They run autogenerated code (the wrapper). @@ -86,6 +92,7 @@ Contracts are difficult to debug for various reasons, including: Some of them have workarounds: ### Obtaining the contract + There are various options for obtaining the contract, which include: - Manually invoking the Starknet API using `curl` with the contract class.
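For reference, the request body for fetching a contract class can be sketched by hand. The method name (`starknet_getClassAt`) comes from the Starknet JSON-RPC specification; the contract address below is a placeholder, and the node endpoint is whatever RPC provider you have access to:

```rust
/// Build a `starknet_getClassAt` JSON-RPC request body by hand, without a JSON
/// crate. The method name comes from the Starknet JSON-RPC spec; the contract
/// address passed in is a placeholder.
fn get_class_at_request(contract_address: &str) -> String {
    format!(
        concat!(
            r#"{{"jsonrpc":"2.0","id":1,"method":"starknet_getClassAt","#,
            r#""params":{{"block_id":"latest","contract_address":"{}"}}}}"#,
        ),
        contract_address,
    )
}

fn main() {
    // Pipe this body into `curl -d @-` against your node's RPC endpoint.
    println!("{}", get_class_at_request("0x1234"));
}
```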
@@ -114,6 +121,7 @@ Both should provide us with the contract, but if we’re manually invoking the A - Convert the ABI from a string of JSON into a JSON object. ### Interpreting the contract + The contract JSON contains the Sierra program in a useless form (in the sense that we cannot understand anything), as well as some information about the entry points and some ABI types. We’ll need the Sierra program (in Sierra @@ -278,6 +286,7 @@ why it’s important to check for those cases and keep following the control flow backwards as required. ### Fixing the bug + Before fixing the bug it’s really important to know: - **Where** it happens (in our compiler, not so much in the contract at this point) @@ -307,7 +316,7 @@ To aid in the debugging process, we developed [sierra-emu](https://github.com/la In addition to this, we developed the `with-trace-dump` feature for Cairo Native, which generates an execution trace that records every statement executed. It has the same shape as the one generated by the Sierra emulator. Supporting transaction execution with Cairo Native trace dump required quite a few hacks, which is why we haven’t merged it to main. This is why we need to use a specific Cairo Native branch. -By combining both tools, we can hopefully pinpoint exactly which *libfunc* implementation is buggy. +By combining both tools, we can hopefully pinpoint exactly which _libfunc_ implementation is buggy. Before starting, make sure to clone [starknet-replay](https://github.com/lambdaclass/starknet-replay). @@ -315,9 +324,9 @@ Before starting, make sure to clone [starknet-replay](https://github.com/lambdac 1. Check out the starknet-replay `trace-dump` branch. 2. Execute a single transaction with the `use-sierra-emu` feature - ```bash - cargo run --features use-sierra-emu tx - ``` + ```bash + cargo run --features use-sierra-emu tx + ``` 3. Once finished, it will have written the traces of each inner contract inside of `traces/emu`, relative to the current working directory.
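Once the matching Cairo Native traces have been generated as well (next heading), the goal is to locate the first point where two dumps of the same contract diverge. A rough line-based sketch of that search in Rust (the traces are JSON, so `diff`/`delta` or a structural diff is usually nicer in practice):

```rust
/// Return the 1-based line at which two trace dumps first diverge, or `None`
/// if they are identical. This is a rough line-based sketch; the trace files
/// are JSON, so a structural diff gives more readable results.
fn first_divergent_line(emu: &str, native: &str) -> Option<usize> {
    for (i, (a, b)) in emu.lines().zip(native.lines()).enumerate() {
        if a != b {
            return Some(i + 1);
        }
    }
    // One trace may be a prefix of the other (e.g. an early abort).
    let (n, m) = (emu.lines().count(), native.lines().count());
    if n != m { Some(n.min(m) + 1) } else { None }
}

fn main() {
    // In practice the inputs come from `traces/emu/trace_0.json` and
    // `traces/native/trace_0.json` via `std::fs::read_to_string`.
    let emu = "stmt 0\nstmt 1\nstmt 2";
    let native = "stmt 0\nstmt 9\nstmt 2";
    assert_eq!(first_divergent_line(emu, native), Some(2));
}
```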
As a single transaction can invoke multiple contracts (by contract and library calls), this generates a trace file for each contract executed, numbered in ascending order: `trace_0.json`, `trace_1.json`, etc. @@ -326,9 +335,9 @@ As a single transaction can invoke multiple contracts (by contract and library c 1. Check out the starknet-replay `trace-dump` branch. 2. Execute a single transaction with the `with-trace-dump` feature - ```bash - cargo run --features with-trace-dump tx - ``` + ```bash + cargo run --features with-trace-dump tx + ``` 3. Once finished, it will have written the traces of each inner contract inside of `traces/native`, relative to the current working directory. #### Patching Dependencies @@ -347,41 +356,41 @@ sierra-emu = { path = "../sierra-emu" } Once you have generated the traces for both the Sierra emulator and Cairo Native, you can begin debugging. 1. Compare the traces of the same contract with your favorite tool: - ```bash - diff "traces/{emu,native}/trace_0.json" # or - delta "traces/{emu,native}/trace_0.json" --side-by-side - ``` + ```bash + diff traces/{emu,native}/trace_0.json # or + delta traces/{emu,native}/trace_0.json --side-by-side + ``` 2. Look for the first significant difference between the traces. Not all the differences are significant, for example: - 1. Sometimes the emulator and Cairo Native differ in the Gas builtin. It usually doesn’t affect the outcome of the contract. - 2. The ec_state_init libfunc randomizes an elliptic curve point, which is why they always differ. + 1. Sometimes the emulator and Cairo Native differ in the Gas builtin. It usually doesn’t affect the outcome of the contract. + 2. The `ec_state_init` libfunc randomizes an elliptic curve point, which is why they always differ. 3. Find the index of the statement executed immediately before the first difference. 4. Open `traces/prog_0.sierra` and look for that statement. - 1. If it’s a return, then you are dealing with a control flow bug.
These are difficult to debug. - 2. If it’s a libfunc invocation, then that libfunc is probably the one that is buggy. - 3. If it’s a library or contract call, then the bug is probably in another contract, and you should move onto the next trace. + 1. If it’s a return, then you are dealing with a control flow bug. These are difficult to debug. + 2. If it’s a libfunc invocation, then that libfunc is probably the one that is buggy. + 3. If it’s a library or contract call, then the bug is probably in another contract, and you should move on to the next trace. #### Useful Scripts In the `scripts` folder of starknet-replay, you can find useful scripts for debugging. Make sure to execute them in the root directory. Some scripts require `delta` to be installed. - `compare-traces`: Compares every trace and outputs which are different. This can help find the buggy contract when there are a lot of traces. - ```bash - > ./scripts/compare-traces.sh - difference: ./traces/emu/trace_0.json ./traces/native/trace_0.json - difference: ./traces/emu/trace_1.json ./traces/native/trace_1.json - difference: ./traces/emu/trace_3.json ./traces/native/trace_3.json - missing file: ./traces/native/trace_4.json - ``` + ```bash + > ./scripts/compare-traces.sh + difference: ./traces/emu/trace_0.json ./traces/native/trace_0.json + difference: ./traces/emu/trace_1.json ./traces/native/trace_1.json + difference: ./traces/emu/trace_3.json ./traces/native/trace_3.json + missing file: ./traces/native/trace_4.json + ``` - `diff-trace`: Receives a trace number and executes `delta` to compare that trace. - ```bash - ./scripts/diff-trace.sh 1 - ``` + ```bash + ./scripts/diff-trace.sh 1 + ``` - `diff-trace-flow`: Like `diff-trace`, but only diffs (with `delta`) the statement indexes. It can be used to visualize the control flow difference. - ```bash - ./scripts/diff-trace-flow.sh 1 - ``` + ```bash + ./scripts/diff-trace-flow.sh 1 + ``` - `string-to-felt`: Converts the given string to a felt.
Can be used to search in the code where a specific error message was generated. - ```bash - > ./scripts/string-to-felt.sh "u256_mul Overflow" - 753235365f6d756c204f766572666c6f77 - ``` + ```bash + > ./scripts/string-to-felt.sh "u256_mul Overflow" + 753235365f6d756c204f766572666c6f77 + ``` diff --git a/docs/execution_walkthrough.md b/docs/execution_walkthrough.md index d5b78d26f..47ce22499 100644 --- a/docs/execution_walkthrough.md +++ b/docs/execution_walkthrough.md @@ -1,104 +1,136 @@ # Execution Walkthrough -Given the following Cairo program: +Let's walk through the execution of the following Cairo program: ```rust,ignore // This is the cairo program. It just adds two numbers together and returns the // result in an enum whose variant is selected using the result's parity. enum Parity { - Even: T, - Odd: T, + Even: T, + Odd: T, } + /// Add `lhs` and `rhs` together and return the result in `Parity::Even` if it's /// even or `Parity::Odd` otherwise. fn run(lhs: u128, rhs: u128) -> Parity { - let res = lhs + rhs; - if (res & 1) == 0 { - Parity::Even(res) - } else { - Parity::Odd(res) -} } + let res = lhs + rhs; + if (res & 1) == 0 { + Parity::Even(res) + } else { + Parity::Odd(res) + } +} ``` -Let's see how it is executed. We start with the following Rust code: +First, we need to compile the program to Sierra and then MLIR: ```rust,ignore -let program = get_sierra_program(); // The result of the `cairo-compile` program. -let module = get_native_module(&program); // This compiles the Sierra program to - // MLIR (not covered here). +// Compile the Cairo to Sierra (using the Cairo compiler). +let program = get_sierra_program(); + +// Compile the Sierra to MLIR (using Cairo native, not covered here). +let module = get_native_module(&program); ``` ## Execution engine preparation -Given a compiled Cairo program in an MLIR module, once it is lowered to the LLVM dialect we have two options to execute it: AOT and JIT. 
+ +Once we have the lowered MLIR module (using only the LLVM dialect) we can +instantiate an execution engine. + +There are two kinds of execution engines: + +- The just-in-time (JIT) engine: Generates machine code on the fly. Can be + optimized further taking into account hot paths and other metrics. +- The ahead-of-time (AOT) engine: Uses pre-generated machine code. Has lower + overhead because the machine code is fixed and already compiled, but cannot be + optimized further. ### Using the JIT executor -If we decide to use the JIT executor we just create the jit runner and we're done. + +Using the JIT executor is the easiest option, since we just need to create it +and we're done: ```rust,ignore let program = get_sierra_program(); let module = get_native_module(&program); -// The optimization level can be `None`, `Less`, `Default` or `Aggressive`. They -// are equivalent to compiling a C program using `-O0`, `-O1`, `-O2` and `-O3` -// respectively. +// The JIT engine accepts an optimization level. The available optimization +// levels are: +// - `OptLevel::None`: Applies no optimization (other than what's already been +// optimized by earlier passes). +// - `OptLevel::Less`: Uses only a reduced set of optimizations. +// - `OptLevel::Default`: The default. +// - `OptLevel::Aggressive`: Tries to apply all the (safe) optimizations. +// They're equivalent to using `-O0`, `-O1`, `-O2` and `-O3` when compiling +// C/C++ respectively. let engine = JitNativeExecutor::from_native_module(module, OptLevel::Default); ``` ### Using the AOT executor -Preparing the AOT executor is more complicated since we need to compile it into a shared library and load it from disk.
+ +Using the AOT executor is a bit more complicated because we need to compile it +into a shared library on disk, but all that complexity has been hidden within +the `AotNativeExecutor::from_native_module` method: ```rust,ignore let program = get_sierra_program(); let module = get_native_module(&program); -// Internally, this method will run all the steps mentioned before internally into -// temporary files and return a working `AotNativeExecutor`. +// Check out the previous section for information about `OptLevel`. let engine = AotNativeExecutor::from_native_module(module, OptLevel::Default); ``` -### Using caches -You can use caches to keep the compiled programs in memory or disk and reuse them between runs. You may use the `ProgramCache` type, or alternatively just `AotProgramCache` or `JitProgramCache` directly. +### Caching the compiled programs -Adding programs to the program cache involves steps not covered here, but once they're inserted you can get executors like this: +Some use cases may benefit from storing the final (machine code) programs. Both +the JIT and AOT programs can be cached within the same process using the +`JitProgramCache` or `AotProgramCache` respectively, or just `ProgramCache` for +a cache that supports both. However, only the AOT supports persisting programs +between runs. They are stored using a different API from the `AotProgramCache`. ```rust,ignore -let engine = program_cache.get(key).expect("program not found"); +// An `Option<...>` is returned, indicating whether the program was present or +// not. +let executor = program_cache.get(key).unwrap(); ``` ## Invoking the program -Regardless of whether we decided to go with AOT or JIT, the program invocation involves the exact same steps. We need to know the entrypoint that we'll be calling and its arguments. 
-In a future we may be able to implement compile-time trampolines for known program signatures, but for now we need to call the `invoke_dynamic` or `invoke_dynamic_with_syscall_handler` methods which works with any signature. +Invoking the program involves the same steps for both AOT and JIT executors. +There are various methods that may help with invoking both normal programs and +Starknet contracts: -> Note: A trampoline is a function that invokes an compiled MLIR function from Rust code.], +- `invoke_dynamic`: Call into a normal program that doesn't require a syscall + handler. +- `invoke_dynamic_with_syscall_handler`: Same as before, but providing a syscall + handler in case the program needs it. +- `invoke_contract_dynamic`: Call a contract's entry point. It accepts the entry + point's ABI (a span of felts) instead of `Value`s and requires a syscall + handler. -Now we need to find the function id: +There's an extra, more performant way to invoke programs and contracts when we +know the exact signature of the function: we should obtain the function pointer, +cast it into an `extern "C" fn(...) -> ...` and invoke it directly from Rust. It +requires the user to convert the inputs and outputs into/from the expected +internal representation, and to manage the builtins manually. Because of that, +it has not been covered here. -```rust,ignore -let program = get_sierra_program(); - -// The utility function needs the symbol of the entry point, which is built as -// follows: -// ::::() -// -// The `` comes from the Sierra program. It's the index of the -// function in the function declaration section. -let function_id = find_function_id(&program, "program::program::main(f0)"); -``` +All those methods for invoking the program need to know which entrypoint we're +trying to call. We can use the Sierra function id directly. -The arguments must be placed in a list of `JitValue` instances. The builtins should be ignored since they are filled in automatically.
The only builtins required are the `GasBuiltin` and `System` (aka. the syscall handler). They are only mandatory when required by the program itself. +Then we'll need the arguments. Since they can have any supported type in any +order we need to wrap them all in `Value`s and send those to the invoke method. +Builtins are automatically added by the invoke method and should be skipped. ```rust,ignore let engine = get_execution_engine(); // This creates the execution engine (covered before). let args = [ - JitValue::Uint128(1234), - JitValue::Uint128(4321), + Value::Uint128(1234), + Value::Uint128(4321), ]; ``` -> Note: Although it's called `JitValue` for now, it's not tied in any way to the JIT engine. `JitValue`s are used for both the AOT and JIT engines.], - Finally we can invoke the program like this: ```rust,ignore @@ -106,8 +138,8 @@ let engine = get_execution_engine(); let function_id = find_function_id(&program, "program::program::main(f0)"); let args = [ - JitValue::Uint128(1234), - JitValue::Uint128(4321), + Value::Uint128(1234), + Value::Uint128(4321), ]; let execution_result = engine.invoke_dynamic( @@ -155,15 +187,21 @@ Builtin stats: BuiltinStats { bitwise: 1, ec_op: 0, range_check: 1, pedersen: 0, ``` ### Contracts -Contracts always have the same interface, therefore they have an alternative to `invoke_dynamic` called `invoke_contract_dynamic`. + +Contracts always have the same interface, therefore they have an alternative to +`invoke_dynamic` called `invoke_contract_dynamic`. ```rust,ignore fn(Span) -> PanicResult>; ``` -This wrapper will attempt to deserialize the real contract arguments from the span of felts, invoke the contracts, and finally serialize and return the result. When this deserialization fails, the contract will panic with the mythical `Failed to deserialize param #N` error. +This wrapper will attempt to deserialize the real contract arguments from the +span of felts, invoke the contracts, and finally serialize and return the +result. 
When this deserialization fails, the contract will panic with the +mythical `Failed to deserialize param #N` error. -If the example program had the same interface as a contract (a span of felts) then it'd be invoked like this: +If the example program had the same interface as a contract (a span of felts) +then it'd be invoked like this: ```rust,ignore let engine = get_execution_engine(); @@ -204,27 +242,55 @@ Builtin stats: BuiltinStats { bitwise: 1, ec_op: 0, range_check: 1, pedersen: 0, ``` ## The Cairo Native runtime -Sometimes we need to use stuff that would be too complicated or error-prone to implement in MLIR, but that we have readily available from Rust. That's when we use the runtime library. -When using the JIT it'll be automatically linked (if compiled with support for it, which is enabled by default). If using the AOT, the `CAIRO_NATIVE_RUNTIME_LIBRARY` environment variable will have to be modified to point to the `libcairo_native_runtime.a` file, which is built and placed in said folder by `make build`. +Sometimes we need to use stuff that would be too complicated or error-prone to +implement in MLIR, but that we have readily available from Rust. That's when we +use the runtime library. -Although it's implemented in Rust, its functions use the C ABI and have Rust's name mangling disabled. This means that to the extern observer it's technically indistinguishible from a library written in C. By doing this we're making the functions callable from MLIR. +When using the JIT it'll be automatically linked (if compiled with support for +it, which is enabled by default). If using the AOT, the +`CAIRO_NATIVE_RUNTIME_LIBRARY` environment variable will have to be modified to +point to the `libcairo_native_runtime.a` file, which is built and placed in said +folder by `make build`. + +Although it's implemented in Rust, its functions use the C ABI and have Rust's +name mangling disabled. 
This means that to the external observer it's technically +indistinguishable from a library written in C. By doing this we're making the +functions callable from MLIR. ### Syscall handlers -The syscall handler is similar to the runtime in the sense that we have C-compatible functions called from MLIR, but it's different in that they're built into Cairo Native itself rather than an external library, and that their implementation is user-dependent. -To allow for user-provided syscall handler implementations we pass a pointer to a vtable every time we detect a `System` builtin. We need a vtable and cannot use function names because the methods themselves are generic over the syscall handler implementation. +The syscall handler is similar to the runtime in the sense that we have +C-compatible functions called from MLIR, but it's different in that they're +built into Cairo Native itself rather than an external library, and that their +implementation is user-dependent. + +To allow for user-provided syscall handler implementations we pass a pointer to +a vtable every time we detect a `System` builtin. We need a vtable and cannot +use function names because the methods themselves are generic over the syscall +handler implementation. -> Note: The `System` is used only for syscalls; every syscall has it, therefore it's a perfect candidate for this use. +> Note: The `System` is used only for syscalls; every syscall has it, therefore +> it's a perfect candidate for this use. -Those wrappers then receive a mutable reference to the syscall handler implementation. They are responsible of converting the MLIR-compatible inputs to the Rust representations, calling the implementation, and then converting the results back into MLIR-compatible formats. +Those wrappers then receive a mutable reference to the syscall handler +implementation.
They are responsible of converting the MLIR-compatible inputs to +the Rust representations, calling the implementation, and then converting the +results back into MLIR-compatible formats. -This means that as far as the user is concerned, writing a syscall handler is equivalent to implementing the trait `StarknetSyscallHandler` for a custom type. +This means that as far as the user is concerned, writing a syscall handler is +equivalent to implementing the trait `StarknetSyscallHandler` for a custom type. ## Appendix: The C ABI and the trampoline -Normally, calling FFI functions in Rust is as easy as defining an extern function using C-compatible types. We can't do this here because we don't know the function's signature. -It all boils down to the [SystemV ABI](https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf) in `x86_64` or its equivalent for ARM. Both of them are really similar: +Normally, calling FFI functions in Rust is as easy as defining an extern +function using C-compatible types. We can't do this here because we don't know +the function's signature. + +It all boils down to the +[SystemV ABI](https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf) in +`x86_64` or its equivalent for ARM. Both of them are really similar: + - The stack must be aligned to 16 bytes before calling. - Function arguments are spread between some registers and the stack. - Return values use either a few registers or require a pointer. @@ -234,79 +300,103 @@ There's a few other quirks, like which registers are caller vs callee-saved, but ### Arguments Argument location in `x86_64`: -| # | Reg. | Description | +| # | Reg. | Description | |----|-------|------------------------| -| 1 | rdi | A single 64-bit value. | -| 2 | rsi | A single 64-bit value. | -| 3 | rdx | A single 64-bit value. | -| 4 | rcx | A single 64-bit value. | -| 5 | r8 | A single 64-bit value. | -| 6 | r9 | A single 64-bit value. | -| 7+ | Stack | Everything else. | +| 1 | rdi | A single 64-bit value. 
| +| 2 | rsi | A single 64-bit value. | +| 3 | rdx | A single 64-bit value. | +| 4 | rcx | A single 64-bit value. | +| 5 | r8 | A single 64-bit value. | +| 6 | r9 | A single 64-bit value. | +| 7+ | Stack | Everything else. | Argument location in `aarch64`: -| # | Reg. | Description | -|----|-------|------------------------| -| 1 | x0 | A single 64-bit value. | -| 2 | x1 | A single 64-bit value. | -| 3 | x2 | A single 64-bit value. | -| 4 | x3 | A single 64-bit value. | -| 5 | x4 | A single 64-bit value. | -| 6 | x5 | A single 64-bit value. | -| 7 | x6 | A single 64-bit value. | -| 8 | x7 | A single 64-bit value. | -| 9+ | Stack | Everything else. | - -Usually function calls have arguments of types other than just 64-bit integers. In those cases, for values smaller than 64 bits the smaller register variants are written. For values larger than 64 bits the value is split into multiple registers, but there's a catch: if when splitting the value only one value would remain in registers then that register is padded and the entire value goes into the stack. For example, an `u128` that would be split between registers and the stack is always padded and written entirely in the stack. - -For complex values like structs, the types are flattened into a list of values when written into registers, or just written into the stack the same way they would be written into memory (aka. with the correct alignment, etc). +| # | Reg. | Description | +| --- | ----- | ---------------------- | +| 1 | x0 | A single 64-bit value. | +| 2 | x1 | A single 64-bit value. | +| 3 | x2 | A single 64-bit value. | +| 4 | x3 | A single 64-bit value. | +| 5 | x4 | A single 64-bit value. | +| 6 | x5 | A single 64-bit value. | +| 7 | x6 | A single 64-bit value. | +| 8 | x7 | A single 64-bit value. | +| 9+ | Stack | Everything else. | + +Usually function calls have arguments of types other than just 64-bit integers. +In those cases, for values smaller than 64 bits the smaller register variants +are written. 
For values larger than 64 bits the value is split into multiple +registers, but there's a catch: if when splitting the value only one value would +remain in registers then that register is padded and the entire value goes into +the stack. For example, an `u128` that would be split between registers and the +stack is always padded and written entirely in the stack. + +For complex values like structs, the types are flattened into a list of values +when written into registers, or just written into the stack the same way they +would be written into memory (aka. with the correct alignment, etc). ### Return values -As mentioned before, return values may be either returned in registers or memory (most likely the stack, but not necessarily). + +As mentioned before, return values may be either returned in registers or memory +(most likely the stack, but not necessarily). Argument location in `x86_64`: -| # | Reg | Description | -|---|-----|-----------------------------| -| 1 | rax | A single 64-bit value. | -| 2 | rdx | The "continuation" of `rax` | +| # | Reg | Description | +| --- | --- | --------------------------- | +| 1 | rax | A single 64-bit value. | +| 2 | rdx | The "continuation" of `rax` | Argument location in `aarch64`: -| # | Reg | Description | -|---|-----|----------------------------| -| 1 | x0 | A single 64-bit value | -| 2 | x1 | The "continuation" of `x0` | -| 3 | x2 | The "continuation" of `x1` | -| 4 | x3 | The "continuation" of `x2` | +| # | Reg | Description | +| --- | --- | -------------------------- | +| 1 | x0 | A single 64-bit value | +| 2 | x1 | The "continuation" of `x0` | +| 3 | x2 | The "continuation" of `x1` | +| 4 | x3 | The "continuation" of `x2` | -Values are different that arguments in that only a single value is returned. If more than a single value needs to be returned then it'll use a pointer. +Values are different that arguments in that only a single value is returned. 
If +more than a single value needs to be returned then it'll use a pointer. -When a pointer is involved we need to pass it as the first argument. This means that every actual argument has to be shifted down one slot, pushing more stuff into the stack in the process. +When a pointer is involved we need to pass it as the first argument. This means +that every actual argument has to be shifted down one slot, pushing more stuff +into the stack in the process. ### The trampoline -We cannot really influence what values are in the register or the stack from Rust, therefore we need something written in assembler to put everything into place and invoke the function pointer. -This is where the trampoline comes in. It's a simple assembler function that does three things: -1. Fill in the 6 or 8 argument registers with the first values in the data pointer and copy the rest into the stack as-is (no stack alignment or anything, we guarantee from the Rust side that the stack will end up properly aligned). +We cannot really influence what values are in the register or the stack from +Rust, therefore we need something written in assembler to put everything into +place and invoke the function pointer. + +This is where the trampoline comes in. It's a simple assembler function that +does three things: + +1. Fill in the 6 or 8 argument registers with the first values in the data + pointer and copy the rest into the stack as-is (no stack alignment or anything, + we guarantee from the Rust side that the stack will end up properly aligned). 2. Invoke the function pointer. 3. Write the return values (in registers only) into the return pointer. -This function always has the same signature, which is C-compatible, and therefore can be used with Rust's FFI facilities without problems. +This function always has the same signature, which is C-compatible, and +therefore can be used with Rust's FFI facilities without problems. 
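The cast at the heart of this machinery can be sketched in plain Rust. Here the signature is known at compile time, which is exactly what the real trampoline cannot assume (hence the assembly); the sketch only shows the type-erased pointer round trip:

```rust
// A C-ABI function standing in for a JIT/AOT-compiled entry point.
extern "C" fn add(a: u64, b: u64) -> u64 {
    a + b
}

fn main() {
    // The engine hands us an opaque pointer...
    let erased: *const () = add as extern "C" fn(u64, u64) -> u64 as *const ();
    // ...which we reinterpret with the (here, known) C-ABI signature and call.
    let f: extern "C" fn(u64, u64) -> u64 = unsafe { std::mem::transmute(erased) };
    assert_eq!(f(40, 2), 42);
}
```

When the signature is only known at runtime, no such cast exists, which is why the trampoline instead copies an argument buffer into registers and the stack by hand.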
#### AOT calling convention: ##### Arguments + - Written on registers, then the stack. - Structs' fields are treated as individual arguments (flattened). - Enums are structs internally, therefore they are also flattened (including the padding). - The default payload works as expected since it has the correct signature. - - All other payloads require breaking it down into bytes and scattering it through the padding - and default payload's space. + - All other payloads require breaking it down into bytes and scattering it + through the padding and default payload's space. ##### Return values + - Indivisible values that do not fit within a single register (ex. felt252) use multiple registers (x0-x3 for felt252). - Struct arguments, etc... use the stack. -In other words, complex values require a return pointer while simple values do not but may still use multiple registers if they don't fit within one. +In other words, complex values require a return pointer while simple values do +not but may still use multiple registers if they don't fit within one.
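As a sanity check on the felt252 case mentioned above: a 252-bit value needs four 64-bit registers, which matches the size of its natural in-memory representation. The `[u64; 4]` limb layout is our assumption for illustration; the actual register assignment is the ABI's business:

```rust
use std::mem::{align_of, size_of};

/// A stand-in for the runtime representation of a felt252: 252 bits rounded
/// up to four 64-bit limbs. (Assumed layout, for illustration only.)
#[repr(C)]
struct Felt252([u64; 4]);

fn main() {
    // 32 bytes = four 64-bit registers (x0-x3 on aarch64), matching the text.
    assert_eq!(size_of::<Felt252>(), 32);
    // 8-byte alignment: no hidden padding between the limbs.
    assert_eq!(align_of::<Felt252>(), 8);
}
```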