Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enlarge the instruction set #62

Open
28 of 30 tasks
Mesabloo opened this issue Jun 17, 2022 · 2 comments
Open
28 of 30 tasks

Enlarge the instruction set #62

Mesabloo opened this issue Jun 17, 2022 · 2 comments
Assignees
Labels
codegen I guess the object files generated are buggy design Something that influences the language's design enhancement New feature or request parsing Is the syntax parsed well enough ? type-checking Anything about the typechecking process
Milestone

Comments

@Mesabloo
Copy link
Member

Mesabloo commented Jun 17, 2022

Currently, there are too few instructions included in N*.

Here's an exhaustive list:

  • jmp unconditionnally jumps to the given label;
  • call is a restricted form of jmp which accounts for a "return address"
  • ret jumps to the address in the continuation space
  • sld loads a value from the stack into a register
  • sst stores the value inside a register into the stack at the given index
  • salloc allocates some space on top of the stack (similarly to alloca in C, though the size is restricted by the type instead of being a literal integer)
  • sfree removes the top-most stack cell
  • sref allows fetching a reference (pointer) to a given stack cell
  • ld loads a value from a memory address into a register
  • st writes a value inside a register at a memory address
  • mv moves a register or immediate value into a register
  • nop is useless and does absolutely nothing more than take space in the resulting object

However, there are quite a lot more useful instructions to have in order to make N⋆ usable as both a programming assembly language and a compiler backend for Zilch.
Here are some examples:

  • arithmetic operations such as add, sub, mul, etc. with floating point counterparts (addf, subf, mulf, etc. where the f suffix stands for “floating”)
  • conditionnal control flow operations using jz, jnz, je, jne, jl, jle, jg, jge, etc. which act as simple jumps (e.g. jz %r0, then, else jumps to the then label if the content of %r0 is 0, else jumps to the else label)
  • bitwise operators such as and, or, xor, not, etc. Boolean logic operators may also be of great use as specific instructions, in order not to fall into traps such as not 5 != 0 (we could have notb 5 = 0, or andb 1 2 = 1, where every non-zero value is considered the truth value)

Of course this list is absolutely not exhaustive (although it should be sufficient in the beginning). Many more instructions could be added (even for specific things such as SIMD), as long as they can be typed properly inside N⋆.
All the above instructions will need to have types, but this can be done while adding them to the language/compiler.

Roadmap

  • logical instructions:
    • and: bitwise logical AND
    • or: bitwise logical OR
    • xor: bitwise logical XOR
    • not: bitwise logical NOT
    • shiftl: bitwise logical LEFT SHIFT
    • shiftr: bitwise logical RIGHT SHIFT
  • Conditional moves:
    • cmvz: conditionally move if zero
    • cmvnz: conditionally move if not zero
    • cmvl: conditionally move if less than
    • cmvle: conditionally move if less than or equal
    • cmvg: conditionally move if greater than
    • cmvge: conditionally move if greater than or equal
    • cmve: conditionally move if equal
    • cmvne: conditionally move if not equal
  • Conditional jumps:
    • cjz: jump if zero
    • cjnz: jump if not zero
    • cjl: jump if less than
    • cjle: jump if less than or equal
    • cjg: jump if greater than
    • cjge: jump if greater than or equal
    • cje: jump if equal
    • cjne: jump if not equal
  • Integer arithmetic instructions:
    • add: add two integers
    • sub: subtract one integer from another
    • mul: multiply two integers together
    • div: divide one integer by another
@Mesabloo Mesabloo added enhancement New feature or request design Something that influences the language's design labels Jun 17, 2022
@Mesabloo Mesabloo added this to the N* v3 milestone Jun 17, 2022
@Mesabloo Mesabloo moved this to Todo in Issue tracker Jun 17, 2022
@Mesabloo
Copy link
Member Author

Mesabloo commented Jun 17, 2022

Note on boolean instructions: there are no direct equivalent in modern machine code (x86 does not have such concept of "boolean").
notb %r0 will therefore be compiled to code similar to (on x86)

CMP $0, %rax     # Compare %r0 with 0, set ZF=1 if equal
SETE %al         # set %r0[56:8] to ZF
MOVZX %al, %rax  # Copy %ro[56:8] to %r0, filling %ro[0:56] with zeros

The last two operations could be replaced with

CMOVZ $1, %rax  # set %r0 to 1 if ZF=1
CMOVNZ $0, %rax  # set %r0 to 0 if ZF=0

although performance could be different (because CMOVcc needs to look at the flags every time, whereas only SETE looks at them above).

As there is no notion of "compare flag (CF)" and "zero flag (ZF), the cmp operation cannot exist in N⋆. This explains the need for boolean instructions.


On ARM, we may have these opcodes

cmp r0, #0
ite eq        /* if-then-else  */
moveq r0, #1  /* r0 = 1 iff r0 = 0 */
movne r0, #0  /* r0 = 0 iff r0 = 1 */
uxtb r0       /* 0-extend %r0 */

On ARM64 (AArch64) we have

cmp w0, #0
cset w0, eq
and w0, w0, K  /* where K depends on the size of w0, as long as it is 1-filled (e.g. 0xFFFF) */

The idea is that the and is used to 0-fill the register.


On MIPS, we may have

sltu $0, $0, 1
andi $0, $0, K  # same as above for ARM64 

@Mesabloo Mesabloo mentioned this issue Aug 26, 2022
10 tasks
@Mesabloo Mesabloo moved this from Todo to In Progress in Issue tracker Sep 20, 2022
@Mesabloo
Copy link
Member Author

Mesabloo commented Sep 22, 2022

Conditional moves are also a very good part of the instruction set. They avoid most of branches.
See the example above for notb, which uses conditional moves. If they were not available, one would have to define notb as the following:

  cmp %rax, $0
  jz set1
set0:
  mv $0, %rax
  j end_not
set1:
  mv $1, %rax
end_not:
  # ...

This is very heavy and makes quite a lot of jumps… And it's also not quite readable compared to the earlier version.

In N* though, this control flow is a lot heavier:

  # ...
  jz %r0, set1, set0

set0: ...
  mv %r0, 1;
  jmp end_not<...>
set1: ...
  mv %r0, 0;
  jmp end_not<...>

end_not: ...
  # ...  

If we take a look at the omitted type, one will see that the types for both set0 and set1 are exactly the same (they require the continuation in the same space, the same registers, and will return the same registers.
end_not on the other end is the same type as for set0 with the added constraint that it must take %r0: uN (which we may not propagate in both set0 and set1).
So there is quite a lot of redundancy for that few instructions, which conditional moves actually solve (by integrating them within the type inference rules).


As we have every kind of branching instructions (see above), we will also need all the conditional moves for the same “branchings”.

@Mesabloo Mesabloo self-assigned this Sep 23, 2022
@Mesabloo Mesabloo added type-checking Anything about the typechecking process parsing Is the syntax parsed well enough ? codegen I guess the object files generated are buggy labels Sep 23, 2022
Mesabloo added a commit to zilch-lang/zilch.kak that referenced this issue Sep 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
codegen I guess the object files generated are buggy design Something that influences the language's design enhancement New feature or request parsing Is the syntax parsed well enough ? type-checking Anything about the typechecking process
Projects
Status: In Progress
Development

No branches or pull requests

1 participant