Try to mitigate performance degradation when moving from 32- to 64-bit offset types when using bit-packed tile states in decoupled look-back #2136

Closed
Tracked by #1454
elstehle opened this issue Jul 31, 2024 · 1 comment
elstehle commented Jul 31, 2024

In #2055, we experimented with bit-packed tile states for the decoupled look-back of algorithms whose look-back has to carry a value of the offset type.
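For readers unfamiliar with the term, the following is a minimal sketch of what bit-packing a tile state could look like. It assumes that "bit-packed" means the tile's status flag and its partial offset share a single 64-bit word, so that both can be read or written with one memory access. The layout (2 status bits, 62 value bits) and all names (`tile_status`, `pack`, `unpack_status`, `unpack_offset`) are hypothetical and are not taken from #2055 or from CUB.

```cpp
#include <cstdint>

// Hypothetical layout, NOT CUB's actual one:
// top 2 bits hold the tile status, low 62 bits hold the offset.
enum class tile_status : std::uint64_t
{
  invalid   = 0, // tile has not published anything yet
  partial   = 1, // word holds the tile's own aggregate
  inclusive = 2  // word holds the inclusive prefix up to this tile
};

constexpr int status_bits          = 2;
constexpr int value_bits           = 64 - status_bits;
constexpr std::uint64_t value_mask = (std::uint64_t{1} << value_bits) - 1;

// Combine status and offset into one 64-bit word.
// Offsets wider than 62 bits are truncated by the mask.
constexpr std::uint64_t pack(tile_status status, std::uint64_t offset)
{
  return (static_cast<std::uint64_t>(status) << value_bits) | (offset & value_mask);
}

// Recover the status from the top bits.
constexpr tile_status unpack_status(std::uint64_t word)
{
  return static_cast<tile_status>(word >> value_bits);
}

// Recover the offset from the low bits.
constexpr std::uint64_t unpack_offset(std::uint64_t word)
{
  return word & value_mask;
}
```

Under this assumed layout, an offset above the 32-bit range such as `5000000000` still round-trips through `pack` and `unpack_offset`, at the cost of giving up the top two bits of the value range.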

While overall performance for 64-bit offset types improved when using bit-packed tile states instead of regular tile states, it still lags noticeably behind the performance of 32-bit offset types.

We want to investigate where the remaining performance degradation comes from. One possibility to mitigate that performance degradation is to use two different offset types within the relevant algorithms: (1) one that is used for indexing items within a tile and (2) one that is used for indexing within global memory.
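The two-offset-type idea can be illustrated with a sequential, host-side sketch. It is not the CUB implementation and not the code from any PR: the aliases `local_offset_t` and `global_offset_t`, the constant `tile_items`, and the function `select_even` are hypothetical names chosen for illustration. The point is only that indexing inside a tile stays in 32 bits, while the running prefix and every global memory index use 64 bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; this is not CUB's API.
using local_offset_t  = std::int32_t; // (1) indexes items within one tile
using global_offset_t = std::int64_t; // (2) indexes items in global memory

constexpr local_offset_t tile_items = 4; // deliberately tiny tile for illustration

// Copies the even values of `in` to the front of `out`, tile by tile, and
// returns how many items were selected. `out` must be at least as large as `in`.
global_offset_t select_even(const std::vector<int>& in, std::vector<int>& out)
{
  const global_offset_t num_items = static_cast<global_offset_t>(in.size());

  // Running prefix carried from tile to tile: must be 64-bit.
  global_offset_t num_selected_before = 0;

  for (global_offset_t tile_base = 0; tile_base < num_items; tile_base += tile_items)
  {
    // Number of valid items in this (possibly partial) tile always fits in 32 bits.
    const global_offset_t remaining = num_items - tile_base;
    const local_offset_t valid =
      remaining < tile_items ? static_cast<local_offset_t>(remaining) : tile_items;

    // All per-tile bookkeeping uses the narrow type.
    local_offset_t num_selected_in_tile = 0;
    for (local_offset_t i = 0; i < valid; ++i)
    {
      const int item = in[static_cast<std::size_t>(tile_base + i)];
      if (item % 2 == 0)
      {
        // Widen only when forming the global output index.
        out[static_cast<std::size_t>(num_selected_before + num_selected_in_tile)] = item;
        ++num_selected_in_tile;
      }
    }

    // Fold the narrow per-tile count into the wide running prefix.
    num_selected_before += num_selected_in_tile;
  }
  return num_selected_before;
}
```

In a GPU kernel the same split would presumably keep the per-thread and per-tile arithmetic in 32-bit registers and widen only at the point where a global address is formed. Whether that recovers the lost performance is exactly what this issue sets out to measure.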

elstehle changed the title from "Try to mitigate performance degradation when moving from 32- to 64-bit offset types in DeviceSelect" to "Try to mitigate performance degradation when moving from 32- to 64-bit offset types when using bit-packed tile states in decoupled look-back" on Jul 31, 2024
elstehle self-assigned this on Jul 31, 2024
elstehle commented

For future reference, a draft PR is posted here: elstehle#3. Despite efforts to mitigate the slowdowns, the worst-case slowdown from using 64-bit (i64) instead of 32-bit (i32) offset types is still 1.35x, even with the bit-packed tile state.
