From e97d5bbe56d3edc5d3d60866ebb1408a66049797 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 15:37:14 -0300 Subject: [PATCH 01/17] Add tooling section --- docs/README.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/docs/README.md b/docs/README.md index 5d70a46fea..2bfa065f5e 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,4 +1,5 @@ # Documentation + * [How does the original Cairo VM work?](./python_vm/) * [Benchmarks](./benchmarks/) * [Custom Hint Processor](./hint_processor/) @@ -6,3 +7,11 @@ * [References parsing](./references_parsing/) * [Tracer](./tracer/) * [Debugging](./debugging.md) + +## Tooling + +* [cairo1-run](/cairo1-run) +* [cairo-vm-cli](/cairo-vm-cli) +* [cairo-vm-tracer](/cairo-vm-tracer) +* [fuzzer](/fuzzer) +* [hint_accountant](/hint_accountant) From ea0d0042cac36b1406b65698da856981dfa7c148 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 15:41:20 -0300 Subject: [PATCH 02/17] Link to new section with doc --- docs/README.md | 1 + docs/vm/README.md | 1 + 2 files changed, 2 insertions(+) create mode 100644 docs/vm/README.md diff --git a/docs/README.md b/docs/README.md index 2bfa065f5e..8b9477258a 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,5 +1,6 @@ # Documentation +* [How does the Cairo VM work?](./vm/) * [How does the original Cairo VM work?](./python_vm/) * [Benchmarks](./benchmarks/) * [Custom Hint Processor](./hint_processor/) diff --git a/docs/vm/README.md b/docs/vm/README.md new file mode 100644 index 0000000000..e2e6b201a5 --- /dev/null +++ b/docs/vm/README.md @@ -0,0 +1 @@ +# How does the Cairo VM work? From 3bee81c3222d6820f1c439e8a4dc4e524140996a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 15:45:58 -0300 Subject: [PATCH 03/17] Copy documentation --- docs/vm/README.md | 159 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 159 insertions(+) diff --git a/docs/vm/README.md b/docs/vm/README.md index e2e6b201a5..dd50e6fd51 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -1 +1,160 @@ # How does the Cairo VM work? + +## High Level Overview + +The Cairo virtual machine is meant to be used in the context of STARK validity proofs. What this means is that the point of Cairo is not just to execute some code and get a result, but to *prove* to someone else that said execution was done correctly, without them having to re-execute the entire thing. The rough flow for it looks like this: + +- A user writes a Cairo program. +- The program is compiled into Cairo's VM bytecode. +- The VM executes said code and provides a *trace* of execution, i.e. a record of the state of the machine and its memory *at every step of the computation*. +- This trace is passed on to a STARK prover, which creates a cryptographic proof from it, attesting to the correct execution of the program. +- The proof is passed to a verifier, who checks that the proof is valid in a fraction of a second, without re-executing. + +The main three components of this flow are: + +- A Cairo compiler to turn a program written in the [Cairo programming language](https://www.cairo-lang.org/) into bytecode. +- A Cairo VM to then execute it and generate a trace. +- [A STARK prover and verifier](https://github.com/lambdaclass/starknet_stack_prover_lambdaworks) so one party can prove correct execution, while another can verify it. + +While this repo is only concerned with the second component, it's important to keep in mind the other two; especially important are the prover and verifier that this VM feeds its trace to, as a lot of its design decisions come from them. This virtual machine is designed to make proving and verifying both feasible and fast, and that makes it quite different from most other VMs you are probably used to. + +## Basic VM flow + +Our virtual machine has a very simple flow: + +- Take a compiled cairo program as input. You can check out an example program [here](https://github.com/lambdaclass/cairo-vm.go/blob/main/cairo_programs/fibonacci.cairo). +- Run the bytecode from the compiled program, doing the usual `fetch->decode->execute` loop, running until program termination. +- On every step of the execution, record the values of each register. +- Take the register values and memory at every step and write them to a file, called the `execution trace`. + +Barring some simplifications we made, this is all the Cairo VM does. The two main things that stand out as radically different are the memory model and the use of `Field Elements` to perform arithmetic. Below we go into more detail on each step, and in the process explain the ommisions we made. + +## Architecture + +The Cairo virtual machine uses a Von Neumann architecture with a Non-deterministic read-only memory. What this means, roughly, is that memory is immutable after you've written to it (i.e. you can only write to it once); this is to make the STARK proving easier, but we won't go into that here. + +### Memory Segments and Relocation + +The process of memory allocation in a contiguous write-once memory region can get pretty complicated. Imagine you want to have a regular call stack, with a stack pointer pointing to the top of it and allocation and deallocation of stack frames and local variables happening throughout execution. Because memory is immutable, this cannot be done the usual way; once you allocate a new stack frame that memory is set, it can't be reused for another one later on. + +Because of this, memory in Cairo is divided into `segments`. This is just a way of organizing memory more conveniently for this write-once model. Each segment is nothing more than a contiguous memory region. Segments are identified by an `index`, an integer value that uniquely identifies them. + +Memory `cells` (i.e. values in memory) are identified by the index of the segment they belong to and an `offset` into said segment. Thus, the memory cell `{2,0}` is the first cell of segment number `2`. + +Even though this segment model is extremely convenient for the VM's execution, the STARK prover needs to have the memory as just one contiguous region. Because of this, once execution of a Cairo program finishes, all the memory segments are collapsed into one; this process is called `Relocation`. We will go into more detail on all of this below. + +### Registers + +There are only three registers in the Cairo VM: + +- The program counter `pc`, which points to the next instruction to be executed. +- The allocation pointer `ap`, pointing to the next unused memory cell. +- The frame pointer `fp`, pointing to the base of the current stack frame. When a new function is called, `fp` is set to the current `ap`. When the function returns, `fp` goes back to its previous value. The VM creates new segments whenever dynamic allocation is needed, so for example the cairo analog to a Rust `Vec` will have its own segment. Relocation at the end meshes everything together. + +### Instruction Decoding/Execution + +TODO: explain the components of an instruction (`dst_reg`, `op0_reg`, etc), what each one is used for and how they're encoded/decoded. + +### Felts + +Felts, or Field Elements, are cairo's basic integer type. Every variable in a cairo vm that is not a pointer is a felt. From our point of view we could say a felt in cairo is an unsigned integer in the range [0, CAIRO_PRIME). This means that all operations are done modulo CAIRO_PRIME. The CAIRO_PRIME is 0x800000000000011000000000000000000000000000000000000000000000001, which means felts can be quite big (up to 252 bits), luckily, we have the [Lambdaworks](https://github.com/lambdaclass/lambdaworks) library to help with handling these big integer values and providing fast and efficient modular arithmetic. + +### Lambdaworks library wrapper + +[Lambdaworks](https://github.com/lambdaclass/lambdaworks) is a custom performance-focused library that aims to ease programming for developers. It provides essential mathematical and cryptographic methods required for this project, enabling arithmetic operations between `felts` and type conversions efficiently. +We've developed a C wrapper to expose the library's functions and enable easy usage from Go. This allows seamless integration of the library's features within Go projects, enhancing performance and functionality. + +### More on memory + +The cairo memory is made up of contiguous segments of variable length identified by their index. The first segment (index 0) is the program segment, which stores the instructions of a cairo program. The following segment (index 1) is the execution segment, which holds the values that are created along the execution of the vm, for example, when we call a function, a pointer to the next instruction after the call instruction will be stored in the execution segment which will then be used to find the next instruction after the function returns. The following group of segments are the builtin segments, one for each builtin used by the program, and which hold values used by the builtin runners. The last group of segments are the user segments, which represent data structures created by the user, for example, when creating an array on a cairo program, that array will be represented in memory as its own segment. + +An address (or pointer) in cairo is represented as a `relocatable` value, which is made up of a `segment_index` and an `offset`, the `segment_index` tells us which segment the value is stored in and the `offset` tells us how many values exist between the start of the segment and the value. + +As the cairo memory can hold both felts and pointers, the basic memory unit is a `maybe_relocatable`, a variable that can be either a `relocatable` or a `felt`. + +While memory is continous, some gaps may be present. These gaps can be created on purpose by the user, for example by running: + +```text +[ap + 1] = 2; +``` + +Where a gap is created at ap. But they may also be created indireclty by diverging branches, as for example one branch may declare a variable that the other branch doesn't, as memory needs to be allocated for both cases if the second case is ran then a gap is left where the variable should have been written. + +#### Memory API + +The memory can perform the following basic operations: + +- `memory_add_segment`: Creates a new, empty segment in memory and returns a pointer to its start. Values cannot be inserted into a memory segment that hasn't been previously created. + +- `memory_insert`: Inserts a `maybe_relocatable` value at an address indicated by a `relocatable` pointer. For this operation to succeed, the pointer's segment_index must be an existing segment (created using `memory_add_segment`), and there mustn't be a value stored at that address, as the memory is immutable after its been written once. If there is a value already stored at that address but it is equal to the value to be inserted then the operation will be successful. + +- `memory_get`: Fetches a `maybe_relocatable` value from a memory address indicated by a `relocatable` pointer. + +Other operations: + +- `memory_load_data`: This is a convenience method, which takes an array of `maybe_relocatable` and inserts them contiguosuly in memory by calling `memory_insert` and advancing the pointer by one after each insertion. Returns a pointer to the next free memory slot after the inserted data. + +#### Memory Relocation + +During execution, the memory consists of segments of varying length, and they can be accessed by indicating their segment index, and the offset within that segment. When the run is finished, a relocation process takes place, which transforms this segmented memory into a contiguous list of values. The relocation process works as follows: + +1. The size of each segment is calculated (The size is equal to the highest offset within the segment + 1, and not the amount of `maybe_relocatable` values, as there can be gaps) +2. A base is assigned to each segment by accumulating the size of the previous segment. The first segment's base is set to 1. +3. All `relocatable` values are converted into a single integer by adding their `offset` value to their segment's base calculated in the previous step + +For example, if we have this memory represented by address, value pairs: + +```text + 0:0 -> 1 + 0:1 -> 4 + 0:2 -> 7 + 1:0 -> 8 + 1:1 -> 0:2 + 1:4 -> 0:1 + 2:0 -> 1 +``` + +Step 1: Calculate segment sizes: + +```text + 0 --(has size)--> 3 + 1 --(has size)--> 5 + 2 --(has size)--> 1 +``` + +Step 2: Assign a base to each segment: + +```text + 0 --(has base value)--> 1 + 1 --(has base value)--> 4 (that is: 1 + 3) + 2 --(has base value)--> 9 (that is: 4 + 5) +``` + +Step 3: Convert relocatables to integers + +```text + 1 (base[0] + 0) -> 1 + 2 (base[0] + 1) -> 4 + 3 (base[0] + 2) -> 7 + 4 (base[1] + 0) -> 8 + 5 (base[1] + 1) -> 3 (that is: base[0] + 2) + .... (memory gaps) + 8 (base[1] + 4) -> 2 (that is: base[0] + 1) + 9 (base[2] + 0) -> 1 +``` + +### Program parsing + +The input of the Virtual Machine is a compiled Cairo program in Json format. The main part of the file are listed below: + +- **data:** List of hexadecimal values that represent the instructions and immediate values defined in the cairo program. Each hexadecimal value is stored as a maybe_relocatable element in memory, but they can only be felts because the decoder has to be able to get the instruction fields in its bit representation. +- **debug_info:** This field provides information about the instructions defined in the data list. Each one is identified with its index inside the data list. For each one it contains information about the cairo variables in scope, the hints executed before that instruction if any, and its location inside the cairo program. +- **hints:** All the hints used in the program, ordered by the pc offset at which they should be executed. +- **identifiers:** User-defined symbols in the Cairo code representing variables, functions, classes, etc. with unique names. The expected offset, type and its corresponding information is provided for each identifier + + For example, the identifier representing the main function (usually the entrypoint of the program) is of `function` type, and a list of decorators wrappers (if there are any) are provided as additional information. + Another example is a user defined struct, is of `struct` type, it provides its size, the members it contains (with its information) and more. + +- **main_scope:** Usually something like `__main__`. All the identifiers associated with main function will be identified as `__main__`.identifier_name. Useful to identify the entrypoint of the program. +- **prime:** The cairo prime in hexadecimal format. As explained above, all arithmetic operations are done over a base field, modulo this primer number. +- **reference_manager:** Contains information about cairo variables. This information is useful to access to variables when executing cairo hints. From d2ded9b2d4c1e3191f4296d21b7699d6c7fd4562 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 15:57:56 -0300 Subject: [PATCH 04/17] Remove unused section --- docs/vm/README.md | 5 ----- 1 file changed, 5 deletions(-) diff --git a/docs/vm/README.md b/docs/vm/README.md index dd50e6fd51..243f20ad0e 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -59,11 +59,6 @@ TODO: explain the components of an instruction (`dst_reg`, `op0_reg`, etc), what Felts, or Field Elements, are cairo's basic integer type. Every variable in a cairo vm that is not a pointer is a felt. From our point of view we could say a felt in cairo is an unsigned integer in the range [0, CAIRO_PRIME). This means that all operations are done modulo CAIRO_PRIME. The CAIRO_PRIME is 0x800000000000011000000000000000000000000000000000000000000000001, which means felts can be quite big (up to 252 bits), luckily, we have the [Lambdaworks](https://github.com/lambdaclass/lambdaworks) library to help with handling these big integer values and providing fast and efficient modular arithmetic. -### Lambdaworks library wrapper - -[Lambdaworks](https://github.com/lambdaclass/lambdaworks) is a custom performance-focused library that aims to ease programming for developers. It provides essential mathematical and cryptographic methods required for this project, enabling arithmetic operations between `felts` and type conversions efficiently. -We've developed a C wrapper to expose the library's functions and enable easy usage from Go. This allows seamless integration of the library's features within Go projects, enhancing performance and functionality. - ### More on memory The cairo memory is made up of contiguous segments of variable length identified by their index. The first segment (index 0) is the program segment, which stores the instructions of a cairo program. The following segment (index 1) is the execution segment, which holds the values that are created along the execution of the vm, for example, when we call a function, a pointer to the next instruction after the call instruction will be stored in the execution segment which will then be used to find the next instruction after the function returns. The following group of segments are the builtin segments, one for each builtin used by the program, and which hold values used by the builtin runners. The last group of segments are the user segments, which represent data structures created by the user, for example, when creating an array on a cairo program, that array will be represented in memory as its own segment. From ecb95681251a726fdc49207a4eb2c48beac3a3c4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 15:57:58 -0300 Subject: [PATCH 05/17] Fix typo --- docs/vm/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/vm/README.md b/docs/vm/README.md index 243f20ad0e..9aad47aefa 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -140,7 +140,7 @@ Step 3: Convert relocatables to integers ### Program parsing -The input of the Virtual Machine is a compiled Cairo program in Json format. The main part of the file are listed below: +The input of the Virtual Machine is a compiled Cairo program in Json format. The main parts of the file are listed below: - **data:** List of hexadecimal values that represent the instructions and immediate values defined in the cairo program. Each hexadecimal value is stored as a maybe_relocatable element in memory, but they can only be felts because the decoder has to be able to get the instruction fields in its bit representation. - **debug_info:** This field provides information about the instructions defined in the data list. Each one is identified with its index inside the data list. For each one it contains information about the cairo variables in scope, the hints executed before that instruction if any, and its location inside the cairo program. From ab05b2367a7467f91a53a0ee1cfe9255d5291398 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 16:19:49 -0300 Subject: [PATCH 06/17] Clean up doc --- docs/vm/README.md | 113 ++++++++++++++++++++++------------------------ 1 file changed, 54 insertions(+), 59 deletions(-) diff --git a/docs/vm/README.md b/docs/vm/README.md index 9aad47aefa..9799eac6e9 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -29,7 +29,7 @@ Our virtual machine has a very simple flow: Barring some simplifications we made, this is all the Cairo VM does. The two main things that stand out as radically different are the memory model and the use of `Field Elements` to perform arithmetic. Below we go into more detail on each step, and in the process explain the ommisions we made. -## Architecture +## Memory The Cairo virtual machine uses a Von Neumann architecture with a Non-deterministic read-only memory. What this means, roughly, is that memory is immutable after you've written to it (i.e. you can only write to it once); this is to make the STARK proving easier, but we won't go into that here. @@ -43,57 +43,35 @@ Memory `cells` (i.e. values in memory) are identified by the index of the segmen Even though this segment model is extremely convenient for the VM's execution, the STARK prover needs to have the memory as just one contiguous region. Because of this, once execution of a Cairo program finishes, all the memory segments are collapsed into one; this process is called `Relocation`. We will go into more detail on all of this below. -### Registers +The first segment (index 0) is the program segment, which stores the instructions of a cairo program. The following segment (index 1) is the execution segment, which holds the values that are created along the execution of the vm, for example, when we call a function, a pointer to the next instruction after the call instruction will be stored in the execution segment which will then be used to find the next instruction after the function returns. -There are only three registers in the Cairo VM: +The following group of segments are the builtin segments, one for each builtin used by the program, and which hold values used by the builtin runners. -- The program counter `pc`, which points to the next instruction to be executed. -- The allocation pointer `ap`, pointing to the next unused memory cell. -- The frame pointer `fp`, pointing to the base of the current stack frame. When a new function is called, `fp` is set to the current `ap`. When the function returns, `fp` goes back to its previous value. The VM creates new segments whenever dynamic allocation is needed, so for example the cairo analog to a Rust `Vec` will have its own segment. Relocation at the end meshes everything together. +The last group of segments are the user segments, which represent data structures created by the user, for example, when creating an array on a cairo program, that array will be represented in memory as its own segment. -### Instruction Decoding/Execution - -TODO: explain the components of an instruction (`dst_reg`, `op0_reg`, etc), what each one is used for and how they're encoded/decoded. - -### Felts - -Felts, or Field Elements, are cairo's basic integer type. Every variable in a cairo vm that is not a pointer is a felt. From our point of view we could say a felt in cairo is an unsigned integer in the range [0, CAIRO_PRIME). This means that all operations are done modulo CAIRO_PRIME. The CAIRO_PRIME is 0x800000000000011000000000000000000000000000000000000000000000001, which means felts can be quite big (up to 252 bits), luckily, we have the [Lambdaworks](https://github.com/lambdaclass/lambdaworks) library to help with handling these big integer values and providing fast and efficient modular arithmetic. - -### More on memory - -The cairo memory is made up of contiguous segments of variable length identified by their index. The first segment (index 0) is the program segment, which stores the instructions of a cairo program. The following segment (index 1) is the execution segment, which holds the values that are created along the execution of the vm, for example, when we call a function, a pointer to the next instruction after the call instruction will be stored in the execution segment which will then be used to find the next instruction after the function returns. The following group of segments are the builtin segments, one for each builtin used by the program, and which hold values used by the builtin runners. The last group of segments are the user segments, which represent data structures created by the user, for example, when creating an array on a cairo program, that array will be represented in memory as its own segment. - -An address (or pointer) in cairo is represented as a `relocatable` value, which is made up of a `segment_index` and an `offset`, the `segment_index` tells us which segment the value is stored in and the `offset` tells us how many values exist between the start of the segment and the value. - -As the cairo memory can hold both felts and pointers, the basic memory unit is a `maybe_relocatable`, a variable that can be either a `relocatable` or a `felt`. - -While memory is continous, some gaps may be present. These gaps can be created on purpose by the user, for example by running: +While memory is continous, some gaps may be present. These gaps can be created on purpose by the user, for example by executing the following CASM: ```text [ap + 1] = 2; ``` -Where a gap is created at ap. But they may also be created indireclty by diverging branches, as for example one branch may declare a variable that the other branch doesn't, as memory needs to be allocated for both cases if the second case is ran then a gap is left where the variable should have been written. - -#### Memory API +These gaps may also be created indireclty by diverging branches, as for example one branch may declare a variable that the other branch doesn't, as memory needs to be allocated for both cases if the second case is ran then a gap is left where the variable should have been written. -The memory can perform the following basic operations: - -- `memory_add_segment`: Creates a new, empty segment in memory and returns a pointer to its start. Values cannot be inserted into a memory segment that hasn't been previously created. +### Felts -- `memory_insert`: Inserts a `maybe_relocatable` value at an address indicated by a `relocatable` pointer. For this operation to succeed, the pointer's segment_index must be an existing segment (created using `memory_add_segment`), and there mustn't be a value stored at that address, as the memory is immutable after its been written once. If there is a value already stored at that address but it is equal to the value to be inserted then the operation will be successful. +Felts, or Field Elements, are cairo's basic integer type. Every variable in a cairo vm that is not a pointer is a felt. From our point of view we could say a felt in cairo is an unsigned integer in the range [0, P), where P is a very large prime currently equal to `2^251+17*2^192+1`. This means that all operations are done modulo P. -- `memory_get`: Fetches a `maybe_relocatable` value from a memory address indicated by a `relocatable` pointer. +### Relocatable -Other operations: +An address (or pointer) in cairo is represented as a `Relocatable` value, which is made up of a `segment_index` and an `offset`, the `segment_index` tells us which segment the value is stored in and the `offset` tells us how many values exist between the start of the segment and the value. -- `memory_load_data`: This is a convenience method, which takes an array of `maybe_relocatable` and inserts them contiguosuly in memory by calling `memory_insert` and advancing the pointer by one after each insertion. Returns a pointer to the next free memory slot after the inserted data. +As the cairo memory can hold both felts and pointers, the basic memory unit is a `MaybeRelocatable`, a variable that can be either a `Relocatable` or a `Felt`. -#### Memory Relocation +### Memory Relocation During execution, the memory consists of segments of varying length, and they can be accessed by indicating their segment index, and the offset within that segment. When the run is finished, a relocation process takes place, which transforms this segmented memory into a contiguous list of values. The relocation process works as follows: -1. The size of each segment is calculated (The size is equal to the highest offset within the segment + 1, and not the amount of `maybe_relocatable` values, as there can be gaps) +1. The size of each segment is calculated as the highest offset within the segment + 1. 2. A base is assigned to each segment by accumulating the size of the previous segment. The first segment's base is set to 1. 3. All `relocatable` values are converted into a single integer by adding their `offset` value to their segment's base calculated in the previous step @@ -109,36 +87,53 @@ For example, if we have this memory represented by address, value pairs: 2:0 -> 1 ``` -Step 1: Calculate segment sizes: +Then, to relocate: + +1. Calculate segment sizes: + ```text + 0 --(has size)--> 3 + 1 --(has size)--> 5 + 2 --(has size)--> 1 + ``` + +2. Assign a base to each segment: + ```text + 0 --(has base value)--> 1 + 1 --(has base value)--> 4 (that is: 1 + 3) + 2 --(has base value)--> 9 (that is: 4 + 5) + ``` + +3. Convert relocatables to integers + ```text + 1 (base[0] + 0) -> 1 + 2 (base[0] + 1) -> 4 + 3 (base[0] + 2) -> 7 + 4 (base[1] + 0) -> 8 + 5 (base[1] + 1) -> 3 (that is: base[0] + 2) + .... (memory gaps) + 8 (base[1] + 4) -> 2 (that is: base[0] + 1) + 9 (base[2] + 0) -> 1 + ``` + +## Instruction Set -```text - 0 --(has size)--> 3 - 1 --(has size)--> 5 - 2 --(has size)--> 1 -``` +### Registers -Step 2: Assign a base to each segment: +There are only three registers in the Cairo VM: -```text - 0 --(has base value)--> 1 - 1 --(has base value)--> 4 (that is: 1 + 3) - 2 --(has base value)--> 9 (that is: 4 + 5) -``` +- The program counter `pc`, which points to the next instruction to be executed. +- The allocation pointer `ap`, pointing to the next unused memory cell. +- The frame pointer `fp`, pointing to the base of the current stack frame. When a new function is called, `fp` is set to the current `ap`. When the function returns, `fp` goes back to its previous value. The VM creates new segments whenever dynamic allocation is needed, so for example the cairo analog to a Rust `Vec` will have its own segment. Relocation at the end meshes everything together. -Step 3: Convert relocatables to integers +### Instruction Decoding/Execution -```text - 1 (base[0] + 0) -> 1 - 2 (base[0] + 1) -> 4 - 3 (base[0] + 2) -> 7 - 4 (base[1] + 0) -> 8 - 5 (base[1] + 1) -> 3 (that is: base[0] + 2) - .... (memory gaps) - 8 (base[1] + 4) -> 2 (that is: base[0] + 1) - 9 (base[2] + 0) -> 1 -``` +TODO: explain the components of an instruction (`dst_reg`, `op0_reg`, etc), what each one is used for and how they're encoded/decoded. + +### Operations + +TODO: Explain primitive operations and how they affect the registers -### Program parsing +## Program parsing The input of the Virtual Machine is a compiled Cairo program in Json format. The main parts of the file are listed below: From 636548459aa79952e28a851496e530bcfd7519d8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 16:42:14 -0300 Subject: [PATCH 07/17] Add instruction format section --- docs/vm/README.md | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/docs/vm/README.md b/docs/vm/README.md index 9799eac6e9..1541a1b34e 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -125,9 +125,28 @@ There are only three registers in the Cairo VM: - The allocation pointer `ap`, pointing to the next unused memory cell. - The frame pointer `fp`, pointing to the base of the current stack frame. When a new function is called, `fp` is set to the current `ap`. When the function returns, `fp` goes back to its previous value. The VM creates new segments whenever dynamic allocation is needed, so for example the cairo analog to a Rust `Vec` will have its own segment. Relocation at the end meshes everything together. -### Instruction Decoding/Execution +### Instruction Format -TODO: explain the components of an instruction (`dst_reg`, `op0_reg`, etc), what each one is used for and how they're encoded/decoded. +CASM instruction have the following format. If the instruction uses an immediate operand, it's value is taken from the cell of the program segment. + +``` +// Structure of the 63-bit that form the first word of each instruction. +// See Cairo whitepaper, page 32 - https://eprint.iacr.org/2021/1063.pdf. +// ┌─────────────────────────────────────────────────────────────────────────┐ +// │ off_dst (biased representation) │ +// ├─────────────────────────────────────────────────────────────────────────┤ +// │ off_op0 (biased representation) │ +// ├─────────────────────────────────────────────────────────────────────────┤ +// │ off_op1 (biased representation) │ +// ├─────┬─────┬───────┬───────┬───────────┬────────┬───────────────────┬────┤ +// │ dst │ op0 │ op1 │ res │ pc │ ap │ opcode │ 0 │ +// │ reg │ reg │ src │ logic │ update │ update │ │ │ +// ├─────┼─────┼───┬───┼───┬───┼───┬───┬───┼───┬────┼────┬────┬────┬────┼────┤ +// │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │ 10 │ 11 │ 12 │ 13 │ 14 │ 15 │ +// └─────┴─────┴───┴───┴───┴───┴───┴───┴───┴───┴────┴────┴────┴────┴────┴────┘ +``` + +TODO: Explain the meaning of each element ### Operations From 0f94a3ed241dd7e11617af4f1c04f56d4b892dab Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 16:44:49 -0300 Subject: [PATCH 08/17] Add more todos --- docs/vm/README.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/docs/vm/README.md b/docs/vm/README.md index 1541a1b34e..709deaca99 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -152,6 +152,14 @@ TODO: Explain the meaning of each element TODO: Explain primitive operations and how they affect the registers +## Hints + +TODO: + +## Builtins + +TODO: + ## Program parsing The input of the Virtual Machine is a compiled Cairo program in Json format. The main parts of the file are listed below: From 74c3e5c3b63c4ce92af07aede61bb8b6c7ad675c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 16:46:22 -0300 Subject: [PATCH 09/17] Add description to some tools --- docs/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/README.md b/docs/README.md index 8b9477258a..c781d4d706 100644 --- a/docs/README.md +++ b/docs/README.md @@ -11,8 +11,8 @@ ## Tooling -* [cairo1-run](/cairo1-run) -* [cairo-vm-cli](/cairo-vm-cli) +* [cairo1-run](/cairo1-run): Execute Cairo 1 programs +* [cairo-vm-cli](/cairo-vm-cli): Execute Cairo 0 programs * [cairo-vm-tracer](/cairo-vm-tracer) * [fuzzer](/fuzzer) * [hint_accountant](/hint_accountant) From 5d08f7df8510c9f7beee8f8bf5ad610dc8774cd7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 16:49:10 -0300 Subject: [PATCH 10/17] Ack doc source --- docs/vm/README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/vm/README.md b/docs/vm/README.md index 709deaca99..2201a92f71 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -1,3 +1,5 @@ + + # How does the Cairo VM work? ## High Level Overview From 5acfa5e7cdfa47c4d76d5a0337418592acd064f6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 17:01:40 -0300 Subject: [PATCH 11/17] Add hint and builtin documentation --- docs/vm/README.md | 141 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 139 insertions(+), 2 deletions(-) diff --git a/docs/vm/README.md b/docs/vm/README.md index 2201a92f71..4cc161111b 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -156,11 +156,148 @@ TODO: Explain primitive operations and how they affect the registers ## Hints -TODO: +So far we have been thinking about the VM mostly abstracted from the prover and verifier it's meant to feed its results to. The last main feature we need to talk about, however, requires keeping this proving/verifying logic in mind. + +As a reminder, the whole point of the Cairo VM is to output a trace/memory file so that a `prover` can then create a cryptographic proof that the execution of the program was done correctly. A `verifier` can then take that proof and verify it in much less time than it would have taken to re-execute the entire program. + +In this model, the one actually using the VM to run a cairo program is *always the prover*. The verifier does not use the VM in any way, as that would defeat the entire purpose of validity proofs; they just get the program being run and the proof generated by the prover and run some cryptographic algorithm to check it. + +While the verifier does not execute the code, they do *check it*. As an example, if a cairo program computes a fibonacci number like this: + +``` +func main() { + // Call fib(1, 1, 10). + let result: felt = fib(1, 1, 10); +} +``` + +the verifier won't *run* this, but they will reject any incorrect execution of the call to `fib`. The correct value for `result` in this case is `144` (it's the 10th fibonacci number); any attempt by the prover to convince the verifier that `result` is not `144` will fail, because the call to the `fib` function is *being proven* and thus *seen* by the verifier. + +A `Hint` is a piece of code that is not proven, and therefore not seen by the verifier. If `fib` above were a hint, then the prover could convince the verifier that `result` is $144$, $0$, $1000$ or any other number. + +In cairo 0, hints are code written in `Python` and are surrounded by curly brackets. Here's an example from the `alloc` function, provided by the Cairo common library + +``` +func alloc() -> (ptr: felt*) { + %{ memory[ap] = segments.add() %} + ap += 1; + return (ptr=cast([ap - 1], felt*)); +} +``` + +The first line of the function, + +``` +%{ memory[ap] = segments.add() %} +``` + +is a hint called `ADD_SEGMENT`. All it does is create a new memory segment, then write its base to the current value of `ap`. This is python code that is being run in the context of the VM's execution; thus `memory` refers to the VM's current memory and `segments.add()` is just a function provided by the VM to allocate a new segment. + +At this point you might be wondering: why run code that's not being proven? Isn't the whole point of Cairo to prove correct execution? There are (at least) two reasons for hints to exist. + +### Nothing to prove + +For some operations there's simply nothing to prove, as they are just convenient things one wants to do during execution. The `ADD_SEGMENT` hint shown above is a good example of that. When proving execution, the program's memory is presented as one relocated continuous segment, it does not matter at all which segment a cell was in, or when that segment was added. The verifier doesn't care. + +Because of this, there's no reason to make `ADD_SEGMENT` a part of the cairo language and have an instruction for it. + +### Optimization + +Certain operations can be very expensive, in the sense that they might involve a huge amount of instructions or memory usage, and therefore contribute heavily to the proving time. For certain calculations, there are two ways to convince the verifier that it was done correctly: + +- Write the entire calculation in Cairo/Cairo Assembly. This makes it show up in the trace and therefore get proven. +- *Present the result of the calculation to the verifier through a hint*, then show said result indeed satisfies the relevant condition that makes it the actual result. + +To make this less abstract, let's show two examples. + +#### Square root + +Let's say the calculation in question is to compute the square root of a number `x`. The two ways to do it then become: + +- Write the usual square root algorithm in Cairo to compute `sqrt(x)`. +- Write a hint that computes `sqrt(x)`, then immediately after calling the hint show __in Cairo__ that `(sqrt(x))^2 = x`. + +The second approach is exactly what the `sqrt` function in the Cairo common library does: + +``` +// Returns the floor value of the square root of the given value. +// Assumptions: 0 <= value < 2**250. +func sqrt{range_check_ptr}(value) -> felt { + alloc_locals; + local root: felt; + + %{ + from starkware.python.math_utils import isqrt + value = ids.value % PRIME + assert value < 2 ** 250, f"value={value} is outside of the range [0, 2**250)." + assert 2 ** 250 < PRIME + ids.root = isqrt(value) + %} + + assert_nn_le(root, 2 ** 125 - 1); + tempvar root_plus_one = root + 1; + assert_in_range(value, root * root, root_plus_one * root_plus_one); + + return root; +} +``` + +If you read it carefully, you'll see that the hint in this function computes the square root in python, then this line + +``` +assert_in_range(value, root * root, root_plus_one * root_plus_one); +``` + +asserts __in Cairo__ that `(sqrt(x))^2 = x`. + +This is done this way because it is much cheaper, in terms of the generated trace (and thus proving time), to square a number than compute its square root. + +Notice that the last assert is absolutely mandatory to make this safe. If you forget to write it, the square root calculation does not get proven, and anyone could convince the verifier that the result of `sqrt(x)` is any number they like. + +#### Linear search turned into an O(1) lookup + +This example is taken from the [Cairo documentation](https://docs.cairo-lang.org/0.12.0/hello_cairo/program_input.html). + +Given a list of `(key, value)` pairs, if we want to write a `get_value_by_key` function that returns the value associated to a given key, there are two ways to do it: + +- Write a linear search in Cairo, iterating over each key until you find the requested one. +- Do that exact same linear search *inside a hint*, find the result, then show that the result's key is the one requested. + +Again, the second approach makes the resulting trace and proving much faster, because it's just a lookup; there's no linear search. Notice this only applies to proving, the VM has to execute the hint, so there's still a linear search when executing to generate the trace. In fact, the second approach is more expensive for the VM than the first one. It has to do both a linear search and a lookup. This is a tradeoff in favor of proving time. + +Also note that, as in the square root example, when writing this logic you need to remember to show the hint's result is the correct one in Cairo. If you don't, your code is not being proven. + +### Non-determinism + +The Cairo paper and documentation refers to this second approach to calculating things through hints as *non-determinism*. The reason for this is that sometimes there is more than one result that satisfies a certain condition. This means that cairo execution becomes non deterministic; a hint could output multiple values, and in principle there is no way to know which one it's going to be. Running the same code multiple times could give different results. + +The square root is an easy example of this. The condition `(sqrt(x))^2 = x` is not unique, there are two solutions to it. Without the hint, this is non-deterministic, `x` could have multiple values; the hint resolves that by choosing a specific value when being run. + +### Common Library and Hints + +As explained above, using hints in your code is highly unsafe. Forgetting to add a check after calling them can make your code vulnerable to any sorts of attacks, as your program will not prove what you think it proves. + +Because of this, most hints in Cairo 0 are wrapped around or used by functions in the Cairo common library that do the checks for you, thus making them safe to use. Ideally, Cairo developers should not be using hints on their own; only transparently through Cairo library functions they call. + +### Whitelisted Hints + +In Cairo, a hint could be any Python code you like. In the context of it as just another language someone might want to use, this is fine. In the context of Cairo as a programming language used to write smart contracts deployed on a blockchain, it's not. Users could deploy contracts with hints that simply do + +```python +while true: + pass +``` + +and grind the network down to a halt, as nodes get stuck executing an infinite loop when calling the contract. + +To address this, the starknet network maintains a list of *whitelisted* hints, which are the only ones that can be used in starknet contracts. These are the ones implemented in this VM. ## Builtins -TODO: +A builtin is a low level optimization integrated into the core loop of the VM that allows otherwise expensive computation to be performed more efficiently. Builtins have two ways to operate: via validation rules and via auto-deduction rules. + +- Validation rules are applied to every element that is inserted into a builtin's segment. For example, if I want to verify an ecdsa signature, I can insert it into the ecdsa builtin's segment and let a validation rule take care of verifying the signature. +- Auto-deduction rules take over during instruction execution, when we can't compute the value of an operand who's address belongs to a builtin segment, we can use that builtin's auto-deduction rule to calculate the value of the operand. For example, If I want to calculate the pedersen hash of two values, I can write the values into the pedersen builtin's segment and then ask for the next memory cell, without builtins, this instruction would have failed, as there is no value stored in that cell, but now we can use auto-deduction rules to calculate the hash and fill in that memory cell. ## Program parsing From 25ceaa7e4ac13c6d6a584182dd8562a43680199c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 17:05:39 -0300 Subject: [PATCH 12/17] Remove extra todo --- docs/vm/README.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/docs/vm/README.md b/docs/vm/README.md index 4cc161111b..07738ddfed 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -150,10 +150,6 @@ CASM instruction have the following format. If the instruction uses an immediate TODO: Explain the meaning of each element -### Operations - -TODO: Explain primitive operations and how they affect the registers - ## Hints So far we have been thinking about the VM mostly abstracted from the prover and verifier it's meant to feed its results to. The last main feature we need to talk about, however, requires keeping this proving/verifying logic in mind. From 506698c8e9d96bfc1065c4c8cb6fb8b1e3a9898b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 17:11:49 -0300 Subject: [PATCH 13/17] Fix grammar error --- docs/vm/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/vm/README.md b/docs/vm/README.md index 07738ddfed..4f1f3800c4 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -129,7 +129,7 @@ There are only three registers in the Cairo VM: ### Instruction Format -CASM instruction have the following format. If the instruction uses an immediate operand, it's value is taken from the cell of the program segment. +CASM instruction have the following format. If the instruction uses an immediate operand, it's value is taken from the next cell of the program segment. ``` // Structure of the 63-bit that form the first word of each instruction. From 7a4436299d5b39f3946b03b94ccd16544d9f6f86 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Mon, 4 Nov 2024 17:25:38 -0300 Subject: [PATCH 14/17] Update prover link --- docs/vm/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/vm/README.md b/docs/vm/README.md index 4f1f3800c4..a166117508 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -16,7 +16,7 @@ The main three components of this flow are: - A Cairo compiler to turn a program written in the [Cairo programming language](https://www.cairo-lang.org/) into bytecode. - A Cairo VM to then execute it and generate a trace. -- [A STARK prover and verifier](https://github.com/lambdaclass/starknet_stack_prover_lambdaworks) so one party can prove correct execution, while another can verify it. +- [A STARK prover and verifier](https://github.com/starkware-libs/stone-prover/tree/main) so one party can prove correct execution, while another can verify it. While this repo is only concerned with the second component, it's important to keep in mind the other two; especially important are the prover and verifier that this VM feeds its trace to, as a lot of its design decisions come from them. This virtual machine is designed to make proving and verifying both feasible and fast, and that makes it quite different from most other VMs you are probably used to. From ab60b1aad30ab6318d749d53d2eb28f01a9069c2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Thu, 19 Dec 2024 14:20:10 -0300 Subject: [PATCH 15/17] Mention Code Walkthrough --- docs/vm/README.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/vm/README.md b/docs/vm/README.md index a166117508..d39ed017bf 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -310,3 +310,7 @@ The input of the Virtual Machine is a compiled Cairo program in Json format. The - **main_scope:** Usually something like `__main__`. All the identifiers associated with main function will be identified as `__main__`.identifier_name. Useful to identify the entrypoint of the program. - **prime:** The cairo prime in hexadecimal format. As explained above, all arithmetic operations are done over a base field, modulo this primer number. - **reference_manager:** Contains information about cairo variables. This information is useful to access to variables when executing cairo hints. + +# Code Walkthrough + +If you want a step by step guide on how to implement your own VM you can read our [code walkthrough in Go](https://github.com/lambdaclass/cairo-vm_in_go?tab=readme-ov-file#code-walkthroughwrite-your-own-cairo-vm) From fddba577f7f9a364ae9d557d363fe1269f9712e7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Thu, 19 Dec 2024 14:46:15 -0300 Subject: [PATCH 16/17] Remove TODO --- docs/vm/README.md | 33 ++++++++++++++++++++------------- 1 file changed, 20 insertions(+), 13 deletions(-) diff --git a/docs/vm/README.md b/docs/vm/README.md index d39ed017bf..a2844963c8 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -134,21 +134,28 @@ CASM instruction have the following format. If the instruction uses an immediate ``` // Structure of the 63-bit that form the first word of each instruction. // See Cairo whitepaper, page 32 - https://eprint.iacr.org/2021/1063.pdf. -// ┌─────────────────────────────────────────────────────────────────────────┐ -// │ off_dst (biased representation) │ -// ├─────────────────────────────────────────────────────────────────────────┤ -// │ off_op0 (biased representation) │ -// ├─────────────────────────────────────────────────────────────────────────┤ -// │ off_op1 (biased representation) │ -// ├─────┬─────┬───────┬───────┬───────────┬────────┬───────────────────┬────┤ -// │ dst │ op0 │ op1 │ res │ pc │ ap │ opcode │ 0 │ -// │ reg │ reg │ src │ logic │ update │ update │ │ │ -// ├─────┼─────┼───┬───┼───┬───┼───┬───┬───┼───┬────┼────┬────┬────┬────┼────┤ -// │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │ 10 │ 11 │ 12 │ 13 │ 14 │ 15 │ -// └─────┴─────┴───┴───┴───┴───┴───┴───┴───┴───┴────┴────┴────┴────┴────┴────┘ +┌─────────────────────────────────────────────────────────────────────────┐ +│ off_dst (biased representation) │ +├─────────────────────────────────────────────────────────────────────────┤ +│ off_op0 (biased representation) │ +├─────────────────────────────────────────────────────────────────────────┤ +│ off_op1 (biased representation) │ +├─────┬─────┬───────┬───────┬───────────┬────────┬───────────────────┬────┤ +│ dst │ op0 │ op1 │ res │ pc │ ap │ opcode │ 0 │ +│ reg │ reg │ src │ logic │ update │ update │ │ │ +├─────┼─────┼───┬───┼───┬───┼───┬───┬───┼───┬────┼────┬────┬────┬────┼────┤ +│ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │ 10 │ 11 │ 12 │ 13 │ 14 │ 15 │ +└─────┴─────┴───┴───┴───┴───┴───┴───┴───┴───┴────┴────┴────┴────┴────┴────┘ ``` -TODO: Explain the meaning of each element +- The first 6 fields: `off_dst`, `off_op0`, `off_op1`, `dst_reg`, `op0_reg`, `op1_src` determine the memory locations of the operands, and the destination of the result (which is not always used). + - `op1_src` occupies 2 bits, as it can also be an immediate, in which case the value will be taken from the next instruction (`[pc + 1]`). +- The `res_logic` field determines how to compute the result (op1, sum, multiplication). The usage of the result depends on the following fields. +- The `opcode` field describes which operation is been performed (noop, assert, call, ret). It modifies the meaning of the following fields. +- The `pc_update` field determines how to update the program counter (advance, jump, branch, etc.). +- The `ap_update` field determines how to update the allocation pointer. + +For an in-depth explanation, you can see Cairo whitepaper, page 33 - https://eprint.iacr.org/2021/1063.pdf, or checkout [our implementation](/vm/src/vm/vm_core.rs). ## Hints From f112a3ea15946acf5d40bcdb82ee497e450b642c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Juli=C3=A1n=20Gonz=C3=A1lez=20Calder=C3=B3n?= Date: Thu, 19 Dec 2024 17:19:57 -0300 Subject: [PATCH 17/17] Add example --- docs/vm/README.md | 38 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+) diff --git a/docs/vm/README.md b/docs/vm/README.md index a2844963c8..f340a4b8c2 100644 --- a/docs/vm/README.md +++ b/docs/vm/README.md @@ -157,6 +157,44 @@ CASM instruction have the following format. If the instruction uses an immediate For an in-depth explanation, you can see Cairo whitepaper, page 33 - https://eprint.iacr.org/2021/1063.pdf, or checkout [our implementation](/vm/src/vm/vm_core.rs). +Take, for example, the following instruction: + +``` +[ap + 1] = [fp + 2] + 3 +``` + +The instruction (at `[pc]`) will be encoded as: + +``` +off_dst = 1 +off_op0 = 2 +off_op1 = 1 # It's always 1 when op1 is an immediate +dst_reg = 0 # AP +op0_reg = 1 # FP +op1_src = 1 # Immediate +res_logic = 1 # Add +pc_update = 0 # Regular (advance) +ap_update = 0 # Regular (no update) +opcode = 4 # Assert +``` + +The next instruction (at `[pc + 1]`) will be the immediate, and it will have a value of `3`. + +Given the following initial register values: +``` +fp = 5 +ap = 10 +pc = 15 +``` +Then: +- `op1` is `[fp + 2]`, which is resolved to `[7]`. +- `op2` is `[pc + 1]`, which is resolved to `[16] == 3`. +- `dst` is `[ap + 1]`, which is resolved to `[11]` +- The result of `op1 + op2` is stored at `dst` +- The register `pc` is increased by 2, we skip the next instruction because it was the immediate. +- The register `fp` is not updated +- The register `ap` is not updated + ## Hints So far we have been thinking about the VM mostly abstracted from the prover and verifier it's meant to feed its results to. The last main feature we need to talk about, however, requires keeping this proving/verifying logic in mind.