Branchless Instructions & Dependencies #385

Open
ThomasHaas opened this issue Jan 26, 2023 · 1 comment
Comments

@ThomasHaas
Collaborator

Some hardware supports branchless conditional instructions, e.g. CMOV on x86 and CSEL on ARM64.
PowerPC and RISC-V seem to have fewer of those.

Such branchless instructions, as their name suggests, do not involve any control-flow branching and hence do not cause ctrl-dependencies. Instead, they result in data-dependencies.
This can have subtle effects. Consider this code:

int r = load(&x);
int s = load(&y);
if (r == 0)
    s = 42;
store(&x, s);

With proper branching, assume r==0 holds. Then store(&x, s); has a data-dep on s = 42;, which in turn has a ctrl-dep on int r = load(&x);.
However, there is no dependency chain that connects int s = load(&y); and store(&x, s);, allowing those two operations to be reordered.

Now consider the branchless version:

int r = load(&x);
int s = load(&y);
s = ITE(r==0, 42, s); // e.g. a CMOV on x86
store(&x, s);

Here we no longer have any ctrl-deps, but there is a data-dep chain int s = load(&y); -> s = ITE(r==0, 42, s); -> store(&x, s); connecting the load of y and the store to x. This disallows reordering them regardless of the result of r==0.
The difference in behavior can be observed in the full example given in #362.
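
For reference, here is a minimal sketch of both variants as compilable C11, assuming relaxed atomics stand in for load/store. The function names (store_with_branch, store_with_select, run, check_variants_agree) are made up for illustration. Note that at the C source level the two forms are semantically identical, and a compiler is free to turn either one into a branch or into a CMOV/CSEL, so this only shows the two shapes and that they agree sequentially. It does not pin down which dependencies the hardware sees.

```c
#include <assert.h>
#include <stdatomic.h>

/* Shared locations x and y from the example above. */
static atomic_int x, y;

/* Branching shape: the overwrite s = 42 sits behind a conditional. */
static int store_with_branch(void)
{
    int r = atomic_load_explicit(&x, memory_order_relaxed);
    int s = atomic_load_explicit(&y, memory_order_relaxed);
    if (r == 0)
        s = 42;
    atomic_store_explicit(&x, s, memory_order_relaxed);
    return s;
}

/* Select shape: the ternary plays the role of ITE(r==0, 42, s).
 * Whether this becomes a CMOV/CSEL is up to the compiler. */
static int store_with_select(void)
{
    int r = atomic_load_explicit(&x, memory_order_relaxed);
    int s = atomic_load_explicit(&y, memory_order_relaxed);
    s = (r == 0) ? 42 : s;
    atomic_store_explicit(&x, s, memory_order_relaxed);
    return s;
}

/* Hypothetical helper: initialise x and y, then run one variant. */
static int run(int (*variant)(void), int x0, int y0)
{
    atomic_store_explicit(&x, x0, memory_order_relaxed);
    atomic_store_explicit(&y, y0, memory_order_relaxed);
    return variant();
}

/* Single-threaded check: both shapes store the same value. */
static void check_variants_agree(void)
{
    assert(run(store_with_branch, 0, 7) == 42);
    assert(run(store_with_select, 0, 7) == 42);
    assert(run(store_with_branch, 1, 7) == 7);
    assert(run(store_with_select, 1, 7) == 7);
}
```

Single-threaded, the two are indistinguishable. The difference discussed here only shows up in the dependency relations of a concurrent execution.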

Now, LLVM may generate those ITE instructions in its IR, and currently we keep the instructions as such.
However, it is not clear that the branchless-ness is preserved when the code is lowered to hardware.
In particular, I think for PowerPC and RISC-V this may not be the case.
We might want to revise how we compile to those architectures, or at least allow the user to force branching compilation via an option.
Another option would be to always create branching code and avoid ITE altogether.
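
To make the "force branching via an option" idea concrete, here is a toy sketch of what such a lowering switch could look like. Everything in it is hypothetical: the lower_mode enum, the lower_ite function and the pseudo-assembly it prints are invented for illustration and do not correspond to any existing code or option in the tool.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical lowering modes, e.g. selected by a user option. */
typedef enum { LOWER_KEEP_ITE, LOWER_FORCE_BRANCH } lower_mode;

/* Writes made-up pseudo-assembly for `dst = ITE(cond, a, b)` into buf.
 * Returns the value of snprintf (number of characters needed). */
static int lower_ite(lower_mode mode, const char *dst, const char *cond,
                     const char *a, const char *b, char *buf, size_t n)
{
    if (mode == LOWER_KEEP_ITE) {
        /* Branchless: dst depends on cond, a and b via data-deps only. */
        return snprintf(buf, n, "%s <- select %s, %s, %s\n", dst, cond, a, b);
    }
    /* Branching: each assignment to dst is ctrl-dependent on cond and
     * data-dependent only on the operand of the taken side. */
    return snprintf(buf, n,
                    "if %s goto L_then\n"
                    "%s <- %s\n"
                    "goto L_end\n"
                    "L_then:\n"
                    "%s <- %s\n"
                    "L_end:\n",
                    cond, dst, b, dst, a);
}
```

With LOWER_FORCE_BRANCH, s = ITE(r==0, 42, s) would come out as an explicit branch, which matches the first code example above and would also be usable for targets without a conditional-move instruction.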

@ThomasHaas
Collaborator Author

As a short follow-up: We definitely need to avoid ITE for all language-level compilation targets, because those certainly do not have such instructions.
