
Add additional protections against undefined uses of CUDA extended lambdas #1004

jrhemstad opened this issue Dec 6, 2022 · 2 comments
Labels
good first issue Good for newcomers. libcu++ For all items related to libcu++

Comments


jrhemstad commented Dec 6, 2022

CUDA extended lambdas (lambdas annotated with __device__ or __host__ __device__) are a useful convenience, but they come with several restrictions that can cause silent or confusing failures in users' code.

For many of these restrictions, libcu++ is powerless to do anything to help users, but where possible, libcu++ should make an effort to detect invalid uses of extended lambdas and emit a more helpful diagnostic.

Some of the most pertinent restrictions stem from the fact that nvcc replaces extended lambdas with a placeholder type whose operator() is not equivalent to the lambda definition:

  1. As described above, the CUDA compiler replaces a __device__ extended lambda defined in a host function with a placeholder type defined in namespace scope. This placeholder type does not define an operator() function equivalent to the original lambda declaration. An attempt to determine the return type or parameter types of the operator() function may therefore work incorrectly in host code, as the code processed by the host compiler will be semantically different from the input code processed by the CUDA compiler. However, it is OK to introspect the return type or parameter types of the operator() function within device code. Note that this restriction does not apply to __host__ __device__ extended lambdas.
  2. As described previously, the CUDA compiler replaces an extended __device__ or __host__ __device__ lambda expression with an instance of a placeholder type in the code sent to the host compiler. This placeholder type may define C++ special member functions (e.g. constructor, destructor). As a result, some standard C++ type traits may return different results for the closure type of the extended lambda in the CUDA frontend compiler versus the host compiler.
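To make restriction 2 concrete, here is a hedged illustration (plain C++, not nvcc output): a trait that holds for an ordinary closure can, for an extended lambda, answer differently in the code nvcc hands to the host compiler.

```cpp
#include <type_traits>

int scale(int x) {
  int n = 2;
  auto lam = [n](int v) { return v * n; };
  // With a plain host compiler this closure is trivially copyable, so the
  // assertion below holds. For an extended lambda compiled by nvcc, the
  // placeholder type handed to the host compiler may define its own special
  // member functions, and the same trait can answer differently there.
  static_assert(std::is_trivially_copyable_v<decltype(lam)>,
                "plain C++ closure with trivially copyable captures");
  return lam(x);
}
```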

nvcc provides the __nv_is_extended_device_lambda_closure_type(X) and __nv_is_extended_host_device_lambda_closure_type(X) built-in traits to detect a __device__ or __host__ __device__ lambda at compile time. This enables libcu++ to detect invalid uses of its constructs and emit compile-time diagnostics.
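A minimal sketch of how those builtins can be wrapped in a portable trait. The fallback branch and the `is_extended_lambda` name are my own; under a plain host compiler no type can be an extended lambda closure, so the fallback is simply false.

```cpp
#include <type_traits>

// The builtins only exist under nvcc with --extended-lambda; fall back to
// false elsewhere so the sketch compiles with a plain host compiler.
#if defined(__CUDACC__) && defined(__CUDACC_EXTENDED_LAMBDA__)
#  define IS_EXT_DEVICE_LAMBDA(T) __nv_is_extended_device_lambda_closure_type(T)
#  define IS_EXT_HOST_DEVICE_LAMBDA(T) __nv_is_extended_host_device_lambda_closure_type(T)
#else
#  define IS_EXT_DEVICE_LAMBDA(T) false
#  define IS_EXT_HOST_DEVICE_LAMBDA(T) false
#endif

// True when T is the closure type of either kind of extended lambda.
template <class T>
struct is_extended_lambda
    : std::bool_constant<IS_EXT_DEVICE_LAMBDA(T) ||
                         IS_EXT_HOST_DEVICE_LAMBDA(T)> {};
```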

For example, one of the restrictions on extended lambdas is that you cannot query their return type in host code, so in NVIDIA/libcudacxx#284 we updated cuda::std::invoke_result_t to emit a compile-time error when used in host code.
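The shape of that guard can be sketched as follows. `guarded_invoke_result` is a hypothetical name for illustration, not the libcu++ spelling; the idea is to reject a device-only extended lambda in host code before std::invoke_result_t silently computes the wrong answer.

```cpp
#include <type_traits>

// Hedged sketch of the kind of guard added in NVIDIA/libcudacxx#284:
// in host code under nvcc (__CUDA_ARCH__ not defined), refuse to query
// the return type of a __device__ extended lambda.
template <class F, class... Args>
struct guarded_invoke_result {
#if defined(__CUDACC__) && defined(__CUDACC_EXTENDED_LAMBDA__) && !defined(__CUDA_ARCH__)
  static_assert(!__nv_is_extended_device_lambda_closure_type(F),
                "the return type of a __device__ extended lambda cannot be "
                "queried from host code");
#endif
  using type = std::invoke_result_t<F, Args...>;
};

template <class F, class... Args>
using guarded_invoke_result_t = typename guarded_invoke_result<F, Args...>::type;
```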

As mentioned in restriction 17, there are several other type traits where we should make similar changes to those made for cuda::std::invoke_result (note that, unlike invoke_result, the following traits should be guarded against both __device__ and __host__ __device__ lambdas).

Tasks

  1. feature request
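A possible pattern for those traits, hedged and with illustrative names only (not libcu++ internals): unlike the invoke_result guard, the static_assert rejects both kinds of extended lambda.

```cpp
#include <type_traits>

// Reject both __device__ and __host__ __device__ extended lambda closure
// types; expands to a harmless static_assert under a plain host compiler.
#if defined(__CUDACC__) && defined(__CUDACC_EXTENDED_LAMBDA__)
#  define CHECK_NOT_EXTENDED_LAMBDA(T)                                        \
     static_assert(!__nv_is_extended_device_lambda_closure_type(T) &&         \
                   !__nv_is_extended_host_device_lambda_closure_type(T),      \
                   "this trait gives unreliable answers for extended lambda " \
                   "closure types")
#else
#  define CHECK_NOT_EXTENDED_LAMBDA(T) static_assert(true, "")
#endif

// Example: a guarded drop-in for one such trait (name hypothetical).
template <class T>
struct guarded_is_trivially_copyable : std::is_trivially_copyable<T> {
  CHECK_NOT_EXTENDED_LAMBDA(T);
};
```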

I also suspect there are changes we can/should make to things in <functional> like cuda::std::invoke, but that will require additional investigation.

@jrhemstad jrhemstad added the good first issue Good for newcomers. label Dec 6, 2022
@jrhemstad jrhemstad added thrust For all items related to Thrust. libcu++ For all items related to libcu++ and removed thrust For all items related to Thrust. labels Feb 22, 2023
@jarmak-nv jarmak-nv transferred this issue from NVIDIA/libcudacxx Nov 8, 2023
@Barasakar

Hello everyone, I’m still a bit new to open source contributions, and out of the four unassigned "good first issues," this one seems manageable (I hope). I’ve successfully built the dev container and am ready to get started on this issue.

I noticed someone mentioned the dev Discord server in another issue. I’m wondering if the server is still active and if contributors are allowed to join?

@pauleonix
Contributor

See https://discord.com/invite/nvidiadeveloper; it has a CCCL room.
