cpu-o3: Transform the lsqunit #214

happy-lx · 2024-11-26T11:05:51Z

Transform the load/store execution logic into a multi-stage pipeline form

Change-Id: Iaf7558ad75ed8fe2bbf4a776359db113b6126453

tastynoob · 2024-11-27T10:45:27Z

Any performance test data here?

happy-lx · 2024-11-28T01:58:33Z

> Any performance test data here?

Not yet, I'm about to test now

happy-lx · 2024-11-28T06:34:27Z

Transform the load/store execution logic into a multi-stage pipeline form.
Change-Id: Iaf7558ad75ed8fe2bbf4a776359db113b6126453

Originally, the fence instruction is dispatched to mem's dispatchQueue, but its opType is No_OpClass, so it has to wait for a free entry in the integer issue queue IQ2 (IntMisc) before it can continue execution. If the instructions following the fence occupy intIQ2, the fence cannot execute and the CPU gets stuck. Therefore, change the opType of the fence instruction to MemReadOp to prevent this situation (in fact, the fence will not be dispatched to an IQ).
Change-Id: Ie38a901e038db9906c43f78675e69391e847c88b

Now initiateAcc only performs the TLB access and is located at s0 of the load/store pipeline. A load accesses the cache and queries for violations at s1, receives the cache response at s2, and writes back at s3. A store updates the SQ and queries for violations at s1, and writes back at s4. AMO operations are now executed using `executeAmo`.
Change-Id: Iac678b7de3a690329f279c70fdcd22be4ed22715
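The stage assignment described above can be sketched as a small lookup model. This is only an illustration, not code from the patch: `LOAD_STAGES`, `STORE_STAGES`, `stage_action`, and `trace` are hypothetical names, the sketch assumes one stage per cycle, and store stages s2 and s3 are marked "unspecified" because the description does not say what happens there.

```python
# Illustrative model only: these tables and helpers are hypothetical names,
# not identifiers from the gem5 sources.
LOAD_STAGES = {
    0: "TLB access (initiateAcc)",
    1: "cache access + violation query",
    2: "receive cache response",
    3: "writeback",
}

STORE_STAGES = {
    0: "TLB access (initiateAcc)",
    1: "update SQ + violation query",
    4: "writeback",
}


def stage_action(is_load, stage):
    """Return the work named for a stage, or 'unspecified' if none is described."""
    table = LOAD_STAGES if is_load else STORE_STAGES
    return table.get(stage, "unspecified")


def trace(is_load, issue_cycle):
    """List (cycle, action) pairs, assuming one pipeline stage per cycle."""
    table = LOAD_STAGES if is_load else STORE_STAGES
    last_stage = max(table)
    return [
        (issue_cycle + stage, stage_action(is_load, stage))
        for stage in range(last_stage + 1)
    ]


if __name__ == "__main__":
    for cycle, action in trace(is_load=True, issue_cycle=10):
        print(cycle, action)
```

Calling `trace(False, 0)` gives the corresponding store timeline, ending with the writeback at s4.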

This commit is only for normal loads. Uncached/AMO loads follow the original process.
Change-Id: Idc98ee18a6e94a39774ebba0f772820699b834de

Add a fence before and after the LRSC instruction.
Change-Id: I66021d0a5a653d2a7e30cd262166363a84184ed6

Change-Id: Ifc1a586df8beab65772d48a75106155f9e723cba

Adjust the cache-miss load replay logic: replay every load that cannot get its data at load s2, so the cache no longer needs to issue `sendCustomSignal` on a miss. Add RAW nuke replay at load s1 and s2. Move most of the writeback logic to load s2 and actually write back at s3.
Change-Id: Idfd3480969958826f4820349168f17c9522f791e
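A rough sketch of the load s2 decision described above, under stated assumptions: `LoadS2State` and `s2_decision` are hypothetical names, and checking the RAW nuke before the data-miss case is an arbitrary ordering chosen for the example, not something the commit message specifies.

```python
from dataclasses import dataclass


@dataclass
class LoadS2State:
    """Hypothetical snapshot of a load when it reaches stage s2."""
    got_data: bool  # the cache returned data by s2
    raw_nuke: bool  # a RAW nuke was detected at s1 or s2


def s2_decision(state):
    """Decide what happens to a load at s2 (ordering of checks is assumed)."""
    if state.raw_nuke:
        return "replay (RAW nuke)"
    if not state.got_data:
        return "replay (no data at s2)"
    # Most writeback work is prepared at s2; the actual writeback is at s3.
    return "writeback at s3"


if __name__ == "__main__":
    print(s2_decision(LoadS2State(got_data=True, raw_nuke=False)))
    print(s2_decision(LoadS2State(got_data=False, raw_nuke=False)))
```

The point of the sketch is that the replay decision is made in the load pipeline itself, which is why no miss signal from the cache is needed.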

Use `hint_wakeup_ahead_cycles` in Cache.py to control how far in advance the TimingResp is sent. `hint_wakeup_ahead_cycles` is currently set to 3.
Change-Id: Ie93de7cbe66ce09988101a44db819d1cad1d27d2
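As a worked illustration of the arithmetic, assuming that "in advance" means the response is sent `hint_wakeup_ahead_cycles` cycles before the data would otherwise be ready (this reading, the helper name `timing_resp_cycle`, and the clamp to the request cycle are all assumptions for the example, not taken from the patch):

```python
def timing_resp_cycle(request_cycle, data_ready_cycle, hint_wakeup_ahead_cycles=3):
    """Cycle at which an early response would be sent under the assumed semantics.

    The response cannot be sent before the request itself arrives, so the
    result is clamped to request_cycle (an assumption of this sketch).
    """
    early = data_ready_cycle - hint_wakeup_ahead_cycles
    return max(request_cycle, early)


if __name__ == "__main__":
    # Data ready at cycle 10, request seen at cycle 0, default of 3 cycles ahead.
    print(timing_resp_cycle(0, 10))
```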

Set `EnableLdMissReplay` to True to enable replaying missed loads from the replayQ. Set `EnablePipeNukeCheck` to True to detect RAW nuke replays in the load pipe. NOTE: if `EnableLdMissReplay` is False, `EnablePipeNukeCheck` cannot be set to True.
Change-Id: Ic4235bffba01d5dc4c39cec8ae92f2d27b28d98a
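The constraint in the NOTE can be expressed as a small validity check. This is a sketch only: `check_replay_flags` is a hypothetical helper, and whether or where the patch enforces the rule is not stated here.

```python
def check_replay_flags(enable_ld_miss_replay, enable_pipe_nuke_check):
    """Reject the one combination the NOTE above rules out.

    Hypothetical helper: it mirrors the EnableLdMissReplay and
    EnablePipeNukeCheck flags but is not part of the patch.
    """
    if enable_pipe_nuke_check and not enable_ld_miss_replay:
        raise ValueError(
            "EnablePipeNukeCheck requires EnableLdMissReplay to be True"
        )
    return True


if __name__ == "__main__":
    check_replay_flags(True, True)    # allowed
    check_replay_flags(True, False)   # allowed
    check_replay_flags(False, False)  # allowed
```

Of the four flag combinations, only `EnableLdMissReplay=False` with `EnablePipeNukeCheck=True` is rejected.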

Store writes back at S4 by default. When using --ideal-kmhv3, store writes back at S2.
Change-Id: I6a318ff6c182daca0ab041840d76575a16e45d82
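A minimal sketch of how the flag could select the store writeback stage. Only the `--ideal-kmhv3` option and the S4/S2 values come from the description above; the function `store_wb_stage` and the way the flag is parsed are assumptions for illustration.

```python
import argparse


def store_wb_stage(argv):
    """Return the store writeback stage implied by the command line.

    Hypothetical helper: S4 by default, S2 when --ideal-kmhv3 is given.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--ideal-kmhv3", action="store_true")
    args = parser.parse_args(argv)
    return 2 if args.ideal_kmhv3 else 4


if __name__ == "__main__":
    print(store_wb_stage([]))
    print(store_wb_stage(["--ideal-kmhv3"]))
```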

happy-lx added the do not merge label Nov 26, 2024

happy-lx force-pushed the split-lsu-pipe branch 2 times, most recently from fb23649 to d874976 Compare December 9, 2024 08:44

happy-lx force-pushed the split-lsu-pipe branch 4 times, most recently from cb37a72 to 2f898df Compare December 20, 2024 06:17

happy-lx added 10 commits December 20, 2024 14:17

cpu-o3: Transform the lsqunit

bab9b68

Transform the load/store execution logic into a multi-stage pipeline form.
Change-Id: Iaf7558ad75ed8fe2bbf4a776359db113b6126453

cpu-o3: replay cache missed load from replayQ

61adddf

This commit is only for normal loads. Uncached/AMO loads follow the original process.
Change-Id: Idc98ee18a6e94a39774ebba0f772820699b834de

arch: use strictly order-preserving LRSC

2adee2e

Add a fence before and after the LRSC instruction.
Change-Id: I66021d0a5a653d2a7e30cd262166363a84184ed6

mem: let load have a certain latency in ruby cache

72a0a53

Change-Id: Ifc1a586df8beab65772d48a75106155f9e723cba

mem: send TimingResp in advance

df4e7a7

Use `hint_wakeup_ahead_cycles` in Cache.py to control it. `hint_wakeup_ahead_cycles` is currently set to 3.
Change-Id: Ie93de7cbe66ce09988101a44db819d1cad1d27d2

cpu-o3: make store wb stage configurable

2f898df

Store writes back at S4 by default. When using --ideal-kmhv3, store writes back at S2.
Change-Id: I6a318ff6c182daca0ab041840d76575a16e45d82

happy-lx added do not merge and removed do not merge labels Dec 24, 2024
