Support Short (8/16 bit) atomic RMW operations on RISCV #297

matetokodi · 2024-10-21T23:53:01Z

Add support for short atomic RMW operations for RISCV

Since RISCV only supports 32 and 64 bit atomic RMW operations, not short (8/16 bit) ones, we must modify the correct bytes inside the containing 32 bit value ourselves.
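
As a rough illustration of the idea in the description, here is a minimal standalone C++ sketch of an 8 bit atomic add built from a 32 bit compare-and-swap on the containing word. It is not the JIT code from this PR: `atomicAdd8ViaWord` and its parameters are hypothetical names, a little-endian byte layout is assumed, and `std::atomic` stands in for the target's 32 bit atomic primitives.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper (not from the PR): atomically adds `value` to the
// 8-bit lane `byteIndex` (0..3, little-endian) of a 32-bit word and returns
// the previous 8-bit value, using only a 32-bit compare-and-swap.
inline uint8_t atomicAdd8ViaWord(std::atomic<uint32_t>& word, unsigned byteIndex, uint8_t value)
{
    const unsigned shift = byteIndex * 8;          // bit offset of the lane
    const uint32_t mask = uint32_t{0xff} << shift; // selects the lane

    uint32_t oldWord = word.load(std::memory_order_relaxed);
    uint32_t newWord;
    do {
        // Apply the operation to the extracted lane only.
        const uint32_t oldLane = (oldWord & mask) >> shift;
        const uint32_t newLane = (oldLane + value) & 0xff;
        // Splice the new lane back, leaving the neighbouring bytes untouched.
        newWord = (oldWord & ~mask) | (newLane << shift);
        // On failure, compare_exchange_weak reloads oldWord and the loop retries.
    } while (!word.compare_exchange_weak(oldWord, newWord,
                                         std::memory_order_seq_cst,
                                         std::memory_order_relaxed));

    return static_cast<uint8_t>((oldWord & mask) >> shift);
}
```

The same shape applies to 16 bit lanes with a `0xffff` mask; the retry loop is what keeps concurrent updates to the neighbouring bytes from being lost.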

zherczeg · 2024-10-22T04:31:06Z

src/jit/MemoryInl.h

    if (!(operationSize & SLJIT_32) && operationSize != SLJIT_MOV32) {
        compareTopFalse = sljit_emit_cmp(compiler, SLJIT_NOT_EQUAL, SLJIT_IMM, 0, srcExpectedPair.arg2, srcExpectedPair.arg2w);
    }
+    if (noShortAtomic && size <= 2) {
+        sljit_emit_op2(compiler, SLJIT_AND, maskReg, 0, baseReg, 0, SLJIT_IMM, 0x3);
+        sljit_emit_op2(compiler, SLJIT_SHL, maskReg, 0, maskReg, 0, SLJIT_IMM, 3); // multiply by 8


32 bit atomics should be present, so multiply by 4 should be enough.
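
For reference, the two added lines in the hunk take the low two address bits (the byte offset inside the aligned 32 bit word) and shift them left by 3, which converts a byte offset into a bit offset. A minimal C++ sketch of that arithmetic, assuming a little-endian layout; `laneShift` and `laneMask` are hypothetical names, not functions from the PR:

```cpp
#include <cstdint>

// Hypothetical helper: bit offset of the byte at `address` inside its
// naturally aligned 32-bit word (little-endian layout assumed).
constexpr unsigned laneShift(uintptr_t address)
{
    return static_cast<unsigned>(address & 0x3) << 3; // byte offset * 8 bits
}

// Hypothetical helper: mask selecting an 8- or 16-bit lane at that offset.
// `sizeInBytes` is expected to be 1 or 2.
constexpr uint32_t laneMask(uintptr_t address, unsigned sizeInBytes)
{
    const uint32_t base = (sizeInBytes == 1) ? 0xffu : 0xffffu;
    return base << laneShift(address);
}
```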

zherczeg · 2024-10-24T07:20:44Z

src/jit/MemoryInl.h

    sljit_s32 operationSize = SLJIT_MOV;
    sljit_s32 size = 0;
    sljit_s32 offset = 0;
    sljit_s32 operation;
    uint32_t options = MemAddress::CheckNaturalAlignment | MemAddress::AbsoluteAddress;
+    sljit_sw stackTmpStart = CompileContext::get(compiler)->stackTmpStart;
+


No need for this newline.

zherczeg · 2024-10-24T07:23:50Z

src/jit/ByteCodeParser.cpp

+        case ByteCode::I32AtomicRmwAndOpcode:
+        case ByteCode::I32AtomicRmwOrOpcode:
+        case ByteCode::I32AtomicRmwXorOpcode:
+        case ByteCode::I32AtomicRmwXchgOpcode: {
            info = Instruction::kIs32Bit;
            requiredInit = OTAtomicRmwI32;


There should be an OTAtomicRmwShort which adds an extra tmp register when short atomic is not present. Could use the compiler option bits to check this.

zherczeg · 2024-10-30T08:06:25Z

src/jit/MemoryInl.h

+            maskReg = instr->requiredReg(3);
+        }
+
+        if (SLJIT_IS_IMM(memValueReg)) {


What is this case?

In the case of unshared memory, the arguments are immediate zeroes

walrus/test/extended/threads/atomic.wast

Line 440 in 7f492b3

;; unshared memory is OK

Is it always 0? Cannot use other immediates?

On further investigation, memValueReg is only an immediate in the RMW cases (not RMW Cmpxchg), and in my observation only in the unshared memory test cases. The only value I have seen it take is 0, although that does not guarantee it will always be 0. Its value is not used in any case; only its register is needed, as a working TMP register.

The deciding factor is not the memory being unshared, though: the code works fine without the check for immediates even if I remove the shared flag from, for example, the very first module at the top of the test file.

However, just returning in these cases is most likely incorrect; I have updated it so that a TMP register is assigned in these cases instead.

zherczeg · 2024-10-30T08:07:21Z

src/jit/MemoryInl.h

+        JITArg memValue(operands + 0);
+        sljit_s32 memValueReg = SLJIT_EXTRACT_REG(memValue.arg);
+        sljit_s32 maskReg;
+        sljit_s32 shiftReg;


Usually these should be initialized to 0.

zherczeg · 2024-10-30T08:08:56Z

src/jit/MemoryInl.h

+#if (defined SLJIT_32BIT_ARCHITECTURE && SLJIT_32BIT_ARCHITECTURE)
+            sljit_emit_op2(compiler, SLJIT_AND32, baseReg, 0, baseReg, 0, SLJIT_IMM, ~0x3);
+#else /* !SLJIT_32BIT_ARCHITECTURE */
+            sljit_emit_op2(compiler, SLJIT_AND, baseReg, 0, baseReg, 0, SLJIT_IMM, ~0x3);


AND and AND32 are the same on 32 bit systems.
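
The masking in this hunk rounds the address down to the start of the containing 32 bit word. A small C++ sketch of that step on a pointer-width integer, which is why a single word-sized AND covers both the 32 and 64 bit case; `alignedWordAddress` is a hypothetical name used only for illustration:

```cpp
#include <cstdint>

// Hypothetical helper: address of the aligned 32-bit word containing `address`.
constexpr uintptr_t alignedWordAddress(uintptr_t address)
{
    // ~0x3 is the int value -4; converting it to uintptr_t sign-extends it,
    // so only the low two bits are cleared at any pointer width.
    return address & static_cast<uintptr_t>(~0x3);
}
```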

zherczeg · 2024-10-30T08:09:31Z

src/jit/MemoryInl.h

+        }
+
+        if (noShortAtomic && size <= 2) {
+            sljit_emit_op2(compiler, SLJIT_AND32, shiftReg, 0, baseReg, 0, SLJIT_IMM, 0x3);


Isn't baseReg a word sized reg?

It is; would the more proper way to do this be using SLJIT_AND here, and then using an SLJIT_MOV32 from and to shiftReg?

Yes, use an and(shiftreg, basereg, 0x3), then move32(shiftreg,shiftreg). The latter is optimized out if not necessary.

zherczeg · 2024-10-30T08:15:54Z

src/jit/MemoryInl.h

+
+        if (noShortAtomic && size <= 2) {
+            sljit_emit_op2(compiler, SLJIT_AND32, tmpReg, 0, SLJIT_TMP_DEST_REG, 0, maskReg, 0);
+            sljit_emit_op2(compiler, SLJIT_LSHR32, SLJIT_TMP_DEST_REG, 0, tmpReg, 0, shiftReg, 0);


If these operations are reversed and an immediate is used, then maskReg can be reused, if that helps in some code above.
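
One reading of this suggestion, which is an assumption rather than something the thread spells out, is that shifting first and masking second lets the mask be a constant (0xff or 0xffff) instead of a value held in a register. A minimal C++ sketch comparing the two orders; both function names are hypothetical:

```cpp
#include <cstdint>

// Order used in the hunk: AND with a pre-shifted mask, then shift right.
constexpr uint32_t extractMaskThenShift(uint32_t word, uint32_t shiftedMask, unsigned shift)
{
    return (word & shiftedMask) >> shift;
}

// Reversed order: shift right first, then AND with a constant.
// `laneBits` is 0xff for 8-bit and 0xffff for 16-bit accesses.
constexpr uint32_t extractShiftThenMask(uint32_t word, unsigned shift, uint32_t laneBits)
{
    return (word >> shift) & laneBits;
}
```

Both orders yield the same lane value, but the second does not need the shifted mask at this point.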

zherczeg · 2024-10-30T08:18:16Z

Is the code tested on RISCV?

matetokodi · 2024-11-04T14:30:00Z

Is the code tested on RISCV?

I have now tested it on both 32 and 64 bit RISCV (using vorosl's WIP RISCV patches so the project compiles, because the base project currently does not compile on 32 or 64 bit RISCV).

zherczeg

Please change the status from draft to ready for review

zherczeg · 2024-11-06T11:23:44Z

src/jit/MemoryInl.h

+        if (noShortAtomic && size <= 2) {
+            sljit_emit_op2(compiler, SLJIT_AND, shiftReg, 0, baseReg, 0, SLJIT_IMM, 0x3);
+            sljit_emit_op1(compiler, SLJIT_MOV32, shiftReg, 0, shiftReg, 0);
+            sljit_emit_op1(compiler, SLJIT_MOV32, shiftReg, 0, shiftReg, 0);


Doing it once is enough.

zherczeg

LGTM

clover2123

LGTM

matetokodi force-pushed the jit_threads_riscv_manual branch from 8253ce1 to 4dd002b Compare October 22, 2024 00:02

zherczeg reviewed Oct 22, 2024

zherczeg requested changes Oct 24, 2024

matetokodi force-pushed the jit_threads_riscv_manual branch from 4dd002b to 28b7aeb Compare October 29, 2024 12:58

matetokodi changed the title ~~WIP: Support Short (8/16 bit) atomic RMW operations on RISCV~~ Support Short (8/16 bit) atomic RMW operations on RISCV Oct 29, 2024

zherczeg requested changes Oct 30, 2024

matetokodi force-pushed the jit_threads_riscv_manual branch from 28b7aeb to 498e142 Compare November 4, 2024 14:23

matetokodi force-pushed the jit_threads_riscv_manual branch 2 times, most recently from c7e61d1 to c16ccd8 Compare November 5, 2024 17:49

zherczeg requested changes Nov 6, 2024

Support short (8/16 bit) atomic RMW operations on RISCV

bd9a5fa

Since RISCV does not support short atomic RMW operations, only 32 and 64 bit ones, we must modify the correct values inside the 32 bit values ourselves. Signed-off-by: Máté Tokodi [email protected]

matetokodi force-pushed the jit_threads_riscv_manual branch from c16ccd8 to bd9a5fa Compare November 6, 2024 12:20

matetokodi marked this pull request as ready for review November 6, 2024 12:20

matetokodi requested review from ksh8281 and clover2123 as code owners November 6, 2024 12:20

zherczeg approved these changes Nov 6, 2024

clover2123 approved these changes Nov 8, 2024

clover2123 merged commit 3c714e5 into Samsung:main Nov 8, 2024
14 checks passed
