Add atomic_ref support for 8 and 16b types. #2255

Status: Open — wants to merge 30 commits into base: main
Commits (30):

- 9941a44 Support fetch_add and CAS on 8/16b (wmaxey, Jul 25, 2024)
- bfd97ee Add 16b test (wmaxey, Jul 25, 2024)
- 24bb639 Fix issues found when enabling 8/16b in a heterogeneous test, PTX see… (wmaxey, Aug 2, 2024)
- f93e690 Remove 16b cas and use only 32b cas. (wmaxey, Aug 8, 2024)
- ae24c8e Get several tests passing for 8/16b atomics (wmaxey, Aug 8, 2024)
- 1af8667 Remove todo and ifdefs from tests covering 8b/16b atomics (wmaxey, Aug 16, 2024)
- ff06fa1 Fix bug in 16b atomic load (wmaxey, Aug 16, 2024)
- 8daaad1 Move store close to fetch_update since it is a derived primitive (wmaxey, Aug 16, 2024)
- f588332 Fix bug in minmax due to s64 overload missing for arithmetic types (wmaxey, Aug 16, 2024)
- 60e8c25 Add more 8/16b tests for atomic_ref (wmaxey, Aug 16, 2024)
- e9a79f4 Fixup remove debug prints (wmaxey, Aug 16, 2024)
- b713109 Cleanup bitmask hell, fix bug where lower mask was ignored (wmaxey, Aug 17, 2024)
- 1c2627d Add test covering interleaved CAS onto same atomic window (wmaxey, Aug 17, 2024)
- c0f52c8 Fixup documentation mistake. (wmaxey, Aug 17, 2024)
- 6d1beec Make atomics enable_if uses match rest of libcudacxx. (wmaxey, Aug 19, 2024)
- d5f8928 Verify fetch_add sequential load behavior in 8b/16b atomics (wmaxey, Aug 19, 2024)
- c9ca506 Remove 8b/16b add PTX tests (wmaxey, Aug 20, 2024)
- 6f5d0b8 Optimize fetch_update CAS loops (wmaxey, Aug 21, 2024)
- b7b944e Fix name of preset for PTX codegen test (wmaxey, Aug 21, 2024)
- 2381c42 Fix signed/unsigned comparison (wmaxey, Aug 21, 2024)
- 07d1077 Fix atomics codegen tests not being built (wmaxey, Aug 21, 2024)
- a2d19d1 Fix CMake target for libcudacxx ptx tests. (wmaxey, Aug 21, 2024)
- 70aa4a3 Make dump_and_check executable again (wmaxey, Aug 27, 2024)
- e08ed80 Work around inconsistent parsing of [[[ in FileCheck versions (wmaxey, Aug 28, 2024)
- 1000597 Make min/max match algorith.min/max. (wmaxey, Sep 4, 2024)
- f48a49b Merge branch 'main' into fea/atomic_ref_8_16_bit_support (wmaxey, Sep 10, 2024)
- b7baeef Work around NVCC 11.X using different syntax for inline ptx (wmaxey, Sep 11, 2024)
- c59b3ab Fix warnings in the codegen tests. (wmaxey, Sep 11, 2024)
- 642b487 Use PTX 16b ld/st instead of 32b CAS (wmaxey, Sep 11, 2024)
- b1901a2 Switch 8b ld/st to 16b ld (wmaxey, Sep 11, 2024)
CMakePresets.json (5 changes: 3 additions & 2 deletions)

@@ -357,7 +357,8 @@
         "libcudacxx.test.internal_headers",
         "libcudacxx.test.public_headers",
         "libcudacxx.test.public_headers_host_only",
-        "libcudacxx.test.lit.precompile"
+        "libcudacxx.test.lit.precompile",
+        "libcudacxx.test.atomics.ptx"
       ]
     },

@@ -479,7 +480,7 @@
       ],
       "filter": {
         "exclude": {
-          "name": "^libcudacxx\\.test\\.(lit|atomics\\.codegen\\.diff)$"
+          "name": "^libcudacxx\\.test\\.lit$"
         }
       }
     },
libcudacxx/codegen/generators/compare_and_swap.h (1 change: 0 additions & 1 deletion)

@@ -83,7 +83,6 @@ static inline _CCCL_DEVICE bool __cuda_atomic_compare_exchange(
 };

 constexpr size_t supported_sizes[] = {
-  16,
   32,
   64,
   128,
libcudacxx/codegen/generators/exchange.h (1 change: 0 additions & 1 deletion)

@@ -81,7 +81,6 @@ static inline _CCCL_DEVICE void __cuda_atomic_exchange(
 };

 constexpr size_t supported_sizes[] = {
-  16,
   32,
   64,
   128,