cpu-o3: Transform the lsqunit #214

Open: 10 commits to be merged into base branch xs-dev

Conversation

happy-lx
Contributor

Transform the load/store execution logic into a multi-stage pipeline form

Change-Id: Iaf7558ad75ed8fe2bbf4a776359db113b6126453

@tastynoob
Collaborator

Any performance test data here?

@happy-lx
Contributor Author

> Any performance test data here?

Not yet. I'm about to run the tests now.

@happy-lx
Contributor Author

happy-lx commented Nov 28, 2024

happy-lx force-pushed the split-lsu-pipe branch 2 times, most recently from fb23649 to d874976 (December 9, 2024 08:44)
happy-lx force-pushed the split-lsu-pipe branch 4 times, most recently from cb37a72 to 2f898df (December 20, 2024 06:17)
Transform the load/store execution logic into a multi-stage pipeline form

Change-Id: Iaf7558ad75ed8fe2bbf4a776359db113b6126453
Originally, the fence instruction was dispatched to mem's dispatchQueue,
but because its opType is No_OpClass, it had to wait for a free entry in the integer issue queue IQ2 (IntMisc) before it could continue execution.

If instructions younger than the fence occupy intIQ2, the fence can never execute and the CPU gets stuck.

Therefore, change the opType of the fence instruction to MemReadOp to prevent this situation (in fact, the fence is never dispatched to an IQ).

Change-Id: Ie38a901e038db9906c43f78675e69391e847c88b
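
The deadlock described in the fence commit above can be illustrated with a toy model. This is only a sketch, not the gem5 code: `fence_can_proceed`, `simulate`, the queue capacity, and the rule that younger IQ2 entries cannot drain before the fence completes are all assumptions made for illustration.

```python
def fence_can_proceed(op_class, iq2_free_entries):
    """Hypothetical dispatch check: only No_OpClass waits for a free IQ2 entry."""
    if op_class == "No_OpClass":
        return iq2_free_entries > 0
    # Any other class (e.g. MemReadOp) does not depend on IQ2 in this model.
    return True


def simulate(op_class, iq2_capacity=2, younger_int_ops=2, max_cycles=10):
    """Return the cycle at which the fence proceeds, or None if it never does."""
    # Assumption: younger integer ops already sit in IQ2 and cannot leave
    # it until the fence has executed, so the occupancy never shrinks.
    iq2_occupancy = min(younger_int_ops, iq2_capacity)
    for cycle in range(max_cycles):
        free_entries = iq2_capacity - iq2_occupancy
        if fence_can_proceed(op_class, free_entries):
            return cycle
    return None


if __name__ == "__main__":
    print("No_OpClass:", simulate("No_OpClass"))
    print("MemReadOp:", simulate("MemReadOp"))
```

With IQ2 full of younger instructions, the `No_OpClass` fence never proceeds in this model, while the `MemReadOp` fence proceeds immediately because it no longer depends on IQ2 occupancy.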
initiateAcc now only performs the TLB access and sits at s0 of the load/store pipeline.

Loads access the cache and query for violations at s1, receive the cache response at s2, and write back at s3.
Stores update the SQ and query for violations at s1, and write back at s4.

AMO operations are now executed using `executeAmo`.

Change-Id: Iac678b7de3a690329f279c70fdcd22be4ed22715
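
For readers following the stage split in the commit above, here is a small sketch that tabulates the stages as the commit message lists them. The dictionaries and the `trace` helper are hypothetical names for illustration. The message does not say what stores do at s2 and s3, so those stages are marked as not described.

```python
# Stage number -> action, taken from the commit message.
LOAD_STAGES = {
    0: "initiateAcc: TLB access",
    1: "cache access, query violations",
    2: "receive cache response",
    3: "writeback",
}
STORE_STAGES = {
    0: "initiateAcc: TLB access",
    1: "update SQ, query violations",
    4: "writeback",
}


def trace(stages, issue_cycle=0):
    """List (cycle, action) pairs, assuming one stage per cycle and no stalls."""
    last_stage = max(stages)
    return [
        (issue_cycle + s, stages.get(s, "(not described in the commit message)"))
        for s in range(last_stage + 1)
    ]


if __name__ == "__main__":
    for name, stages in (("load", LOAD_STAGES), ("store", STORE_STAGES)):
        for cycle, action in trace(stages):
            print(f"{name} cycle {cycle}: {action}")
```

The one-stage-per-cycle timing is an assumption of the sketch. Replays and stalls would stretch the real timeline.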
This commit only affects normal loads. Uncache/AMO loads follow the original flow.

Change-Id: Idc98ee18a6e94a39774ebba0f772820699b834de
Add a fence before and after the LRSC instruction.

Change-Id: I66021d0a5a653d2a7e30cd262166363a84184ed6
Change-Id: Ifc1a586df8beab65772d48a75106155f9e723cba
Adjust the cache-miss load replay logic: replay every load that cannot get
its data at load s2. The cache no longer needs to send `sendCustomSignal` on a miss.

Add RAW nuke replay at load s1 and s2.

Move most of the writeback logic to load s2; the actual writeback happens at s3.

Change-Id: Idfd3480969958826f4820349168f17c9522f791e
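
The replay rules in the commit above can be summarized as a per-stage decision. This is a rough sketch: `load_pipe_decision` and its return strings are hypothetical, and checking the RAW nuke before the missing-data case is an assumption, not something the commit message states.

```python
def load_pipe_decision(stage, has_data, raw_nuke):
    """Return what a normal load does at a given pipeline stage in this toy model."""
    # RAW nuke replay is checked at both s1 and s2.
    if stage in (1, 2) and raw_nuke:
        return "replay: RAW nuke"
    if stage == 2:
        # A load that has no data at s2 is replayed, with no signal from the cache.
        if not has_data:
            return "replay: no data at s2"
        # Most writeback preparation happens here.
        return "prepare writeback"
    if stage == 3:
        return "writeback"
    return "continue"


if __name__ == "__main__":
    print(load_pipe_decision(2, has_data=False, raw_nuke=False))
    print(load_pipe_decision(1, has_data=False, raw_nuke=True))
```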
Use `hint_wakeup_ahead_cycles` in Cache.py to control it.
`hint_wakeup_ahead_cycles` is now set to 3.

Change-Id: Ie93de7cbe66ce09988101a44db819d1cad1d27d2
Set `EnableLdMissReplay` to True to enable replaying missed loads from
the replayQ.

Set `EnablePipeNukeCheck` to True to detect RAW nuke replay in the load pipe.

NOTE: if `EnableLdMissReplay` is False, `EnablePipeNukeCheck` can't be
set to True.

Change-Id: Ic4235bffba01d5dc4c39cec8ae92f2d27b28d98a
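
The constraint in the NOTE above could be enforced with a small check in a config script. This is a sketch under assumptions: `check_replay_flags` is a hypothetical helper, not part of this PR, and the PR does not say whether the simulator performs such a check itself.

```python
def check_replay_flags(enable_ld_miss_replay, enable_pipe_nuke_check):
    """Reject the one combination the commit message rules out."""
    if enable_pipe_nuke_check and not enable_ld_miss_replay:
        raise ValueError(
            "EnablePipeNukeCheck requires EnableLdMissReplay to be True"
        )
    return enable_ld_miss_replay, enable_pipe_nuke_check


if __name__ == "__main__":
    # The three allowed combinations pass through unchanged.
    for flags in ((True, True), (True, False), (False, False)):
        print(check_replay_flags(*flags))
```

Only `EnableLdMissReplay=False` with `EnablePipeNukeCheck=True` is rejected. The other three combinations are accepted.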
Stores write back at s4 by default.
When using --ideal-kmhv3, stores write back at s2.

Change-Id: I6a318ff6c182daca0ab041840d76575a16e45d82