Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default branch direction prediction should be TAKEN (currently NOT TAKEN) #79

Open
ronan-lashermes opened this issue Jan 30, 2024 · 1 comment

Comments

@ronan-lashermes
Copy link

Rationale

Because our programs usually have lots of loops, the cold state of the branch direction predictor (GSHARE for Nax) should be TAKEN.
In the current state of affairs, we need to fill the GHB before we get correct TAKEN predictions in a loop.

Implementation

Got a ~2% performance gain on the small program I am actually working on (not a serious test, but an indication nonetheless)

Simulation

In GSharePlugin.scala, l71-73, replace the 0 default value with -1:

if(GenerationFlags.simulation) {
        counter.initBigInt(List.fill(counter.wordCount)(BigInt(- 1)), allowNegative = true)
}

Targeting synthesis

Here I am not sure since we want to initialize a Mem value and I am only simulating.
My tentative, in GSharePlugin.scala, l39-41, replace with:

val keys = new AreaRoot {
      val GSHARE_COUNTER = Stageable(Vec.fill(SLICE_COUNT)(U(pow(2, counterWidth).intValue - 1, counterWidth bits)))
}

and add import

import scala.math.pow

What's your opinion on this simple change ? I can create a merge request if need be.

@Dolu1990
Copy link
Member

In GSharePlugin.scala, l71-73, replace the 0 default value with -1:

Sound ok to me.

My tentative, in GSharePlugin.scala, l39-41, replace with:

I think what we would need is to have a little "state machine" to write over all the gshare memory on boot using a counter. A bit in the same way than the instruction cache is initialized. (see https://github.com/SpinalHDL/NaxRiscv/blob/main/src/main/scala/naxriscv/fetch/FetchCachePlugin.scala#L383)

The same should idealy be done for the BTB, in order to avoid 'X during ASIC simulation

That would solve all the cases (synthesis and simulation)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants