Replay queue depth insufficient for RoCC accelerators #3653

PhilippKaesgen · 2024-06-29T13:12:23Z

Type of issue: other enhancement

Impact: no functional change

Development Phase: proposal

Other information

When a RoCC accelerator sends memory requests back-to-back, the two entries in the replay queue are not sufficient to handle a request every cycle.

rocket-chip/src/main/scala/rocket/SimpleHellaCacheIF.scala

Line 103 in dbcb06a

val replayq = Module(new SimpleHellaCacheIFReplayQueue(2))

This can be fixed simply by increasing the depth of the replayq to 3.

If the current behavior is a bug, please provide the steps to reproduce the problem:

The problem and the proposed fix can be explored by adding an accelerator that simply loads the same address multiple times, back-to-back. After the initial miss, the L1D should be able to service a load every cycle, since the data is already in the L1D. In the current state, the replayq causes back-pressure in one out of every three clock cycles. With the suggested fix applied, it can handle a memory request every cycle.
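The stall pattern described above can be reproduced with a toy cycle-level model in plain Scala. This is only a sketch, not the rocket-chip implementation: it assumes a new request is accepted only while a replay-queue entry is free, and that each accepted request holds its entry for a fixed number of cycles. The object `ReplayQueueModel`, the function `accepted`, and the value `holdCycles = 3` are hypothetical; the hold time is chosen to match the one-in-three stall reported here, not taken from the source.

```scala
// Toy occupancy model of a fixed-depth replay queue.
// ASSUMPTION: not derived from SimpleHellaCacheIF.scala; all names are illustrative.
object ReplayQueueModel {

  /** Counts how many requests are accepted over `cycles` cycles when the
    * requester offers one request every cycle.
    *
    * @param depth      number of replay-queue entries
    * @param holdCycles ASSUMPTION: cycles an accepted request occupies its entry
    * @param cycles     number of simulated cycles
    */
  def accepted(depth: Int, holdCycles: Int, cycles: Int): Int = {
    // releaseAt(i) is the first cycle in which entry i is free again.
    val releaseAt = Array.fill(depth)(0)
    var count = 0
    for (cycle <- 0 until cycles) {
      val free = releaseAt.indexWhere(_ <= cycle)
      if (free >= 0) {
        // An entry is free: accept the request and occupy the entry.
        releaseAt(free) = cycle + holdCycles
        count += 1
      }
      // Otherwise every entry is busy and the requester sees back-pressure.
    }
    count
  }

  def main(args: Array[String]): Unit = {
    val cycles = 300
    for (depth <- Seq(2, 3)) {
      val n = accepted(depth, holdCycles = 3, cycles = cycles)
      println(s"depth=$depth: accepted $n of $cycles offered requests")
    }
  }
}
```

Under these assumptions the model accepts two of every three offered requests at depth 2 and every offered request at depth 3. A different hold time would change the depth needed, so the result depends entirely on the assumed value.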

What is the current behavior?

At the moment, the insufficient replayq depth causes back-pressure to the accelerator in one out of every three cycles.

What is the expected behavior?

Memory requests are handled every cycle, without back-pressure caused by insufficient replayq depth.

Please tell us about your environment:

What is the use case for changing the behavior?

RoCC accelerators accessing the L1D might gain up to 50% in performance when sending memory requests back-to-back.
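As a rough check of the 50% figure, assume (this is not stated in the issue) that a new request is accepted only while a replay-queue entry is free, and that each request holds its entry for $L$ cycles. With $D$ entries, the sustained throughput in requests per cycle is then

$$
T(D) = \min\left(1, \frac{D}{L}\right)
$$

Taking $L = 3$, a value chosen only because it matches the reported stall in one of every three cycles:

$$
T(2) = \frac{2}{3}, \qquad T(3) = 1, \qquad \frac{T(3)}{T(2)} = 1.5
$$

That is a gain of up to 50% for an accelerator limited purely by back-to-back L1D requests.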

jerryz123 · 2024-08-21T07:39:09Z

Good observation. I'd be happy to approve a PR with the fix implemented. (Please open the PR against the dev branch.)

caizixian · 2024-10-29T11:51:27Z

This can be closed now I think

PhilippKaesgen mentioned this issue Aug 21, 2024

increase depth of SimpleHellaCacheIFReplayQ #3653 #3678

Merged

jerryz123 added a commit that referenced this issue Aug 21, 2024

Merge pull request #3678 from PhilippKaesgen/fix_replayq

6cd2793

increase depth of SimpleHellaCacheIFReplayQ #3653
