[adag] Avoid deserialization during CompiledDAGRef's deallocation #47614
Comments
@jeffreyjeffreywang would you be interested in taking this?
@rkooo567 Yup, definitely! Hey @stephanie-wang @ruisearch42, I wanted to clarify a few things before I proceed with this issue. You've suggested releasing the value of a CompiledDAGRef if
ray/python/ray/experimental/compiled_dag_ref.py
Lines 87 to 89 in 44dd9a7
With this code, another issue arises: attempting to load a Python library during program exit (when the CompiledDAGRef is destructed) will fail. Please refer to #47305 (comment) for more context.
I'm thinking about removing the custom destructor entirely, but I wanted to understand the implications before doing so.
I think this is because Python cannot guarantee all modules exist when
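To illustrate the shutdown concern in general terms, here is a minimal sketch of a destructor that skips its cleanup while the interpreter is finalizing. `GuardedRef` and `on_release` are hypothetical names invented for this example; this is not the code in compiled_dag_ref.py and not necessarily the fix that was adopted.

```python
import sys


class GuardedRef:
    """Hypothetical stand-in for a ref object with a custom destructor."""

    def __init__(self, on_release):
        # on_release is an assumed callback that frees the underlying resource.
        self._on_release = on_release

    def __del__(self, _is_finalizing=sys.is_finalizing):
        # sys.is_finalizing is bound as a default argument so the destructor
        # does not depend on a module-global lookup at shutdown.
        if _is_finalizing():
            # Modules may already be torn down; avoid anything that imports.
            return
        self._on_release()


calls = []
ref = GuardedRef(lambda: calls.append("released"))
del ref  # in CPython the destructor runs as soon as the last reference is dropped
```

The design choice here is to do no import-dependent work at exit at all, on the assumption that the process is going away anyway.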
With this code, the Python object is retrieved and then immediately goes out of scope. If there are any native buffers underneath, they will also be released.
Thank you @ruisearch42, could you give me an example/repro where native buffers are used and this destruction is therefore necessary? I'd like to measure whether the deserialization is necessary. If it is, we still need to solve the module import issue. We might want to move the deserialization (stepping through the remaining steps in the DAG) to
This is a duplicate of #46909. Will close both bugs once this issue is addressed.
Thanks, @jeffreyjeffreywang, for the great questions! The deserialization is necessary because the native buffer is reused for future data. If the reader does not explicitly read and release the buffer, the buffer cannot be reused for future values. You can reproduce it by returning a numpy array as the DAG output; since numpy arrays are zero-copy, the buffer will be held until the numpy array in Python goes out of scope. Note that you do not need to deserialize the data in order to release the buffer. We just need to make sure to call the
We do a similar custom destructor for when ObjectRefs and actors go out of scope, so I think you can reuse a similar codepath to avoid the destruction ordering problem, see here.
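As a rough standard-library analogy for why a zero-copy view pins its buffer (this is not Ray's channel implementation, and it uses `bytearray`/`memoryview` in place of a native buffer and a numpy array), the following sketch shows a buffer that cannot be resized until the view over it is released:

```python
# A bytearray stands in for the reusable native buffer.
buf = bytearray(b"first value")

# A memoryview is a zero-copy view over the same memory,
# playing the role of the numpy array returned to the reader.
view = memoryview(buf)

try:
    # Resizing is refused while a view is still exported.
    buf.extend(b"!")
    blocked = False
except BufferError:
    blocked = True

# Releasing the view does not require reading or copying the data.
view.release()

# Now the buffer can be resized and reused for the next value.
buf[:] = b"second value"
```

The point carried over from the comment above is that releasing the holder is what frees the buffer; reading the contents is not needed for that.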
Thank you, Stephanie, for the thorough explanation. I'll dig a bit deeper and publish a PR.
Thanks @jeffreyjeffreywang! Btw, are you in the OSS Ray Slack? We have a regular sync-up, and you are more than welcome to join!
@rkooo567 Yeah, I just joined a couple of days ago. Thanks for the invite, I'll keep an eye out for the next sync-up and hopefully I'll be able to join! 😄
@anyscalesam can you make sure @jeffreyjeffreywang is invited to the next sync? Thank you!
What happened + What you expected to happen
Even when the user never calls ray.get, ray.get is still invoked and deserialization still happens when the DAG ref is deallocated, because of the following code.
ray/python/ray/experimental/compiled_dag_ref.py
Line 80 in 4aea49f
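The behavior described above, and the alternative discussed in the comments, can be sketched with a toy model. `FakeChannel`, `SketchRef`, `read`, and `release` are hypothetical names made up for illustration; they are not Ray APIs, and the real destructor in compiled_dag_ref.py may differ.

```python
class FakeChannel:
    """Hypothetical channel that counts reads (deserialization) and releases."""

    def __init__(self):
        self.deserialize_calls = 0
        self.release_calls = 0

    def read(self):
        # Stands in for the expensive deserialization step.
        self.deserialize_calls += 1
        return "value"

    def release(self):
        # Stands in for freeing the buffer without reading it.
        self.release_calls += 1


class SketchRef:
    """Hypothetical ref whose destructor releases instead of calling get()."""

    def __init__(self, channel):
        self._channel = channel
        self._consumed = False

    def get(self):
        if self._consumed:
            raise ValueError("get() may only be called once on this ref")
        self._consumed = True
        value = self._channel.read()
        self._channel.release()
        return value

    def __del__(self):
        # Only release if the caller never fetched the value;
        # no deserialization happens on this path.
        if not self._consumed:
            self._channel.release()


channel = FakeChannel()
ref = SketchRef(channel)
del ref  # the caller never called get()
```

In this model, dropping an unread ref releases the buffer exactly once and never deserializes, which is the outcome the issue title asks for.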
Versions / Dependencies
ray master
Reproduction script
Issue Severity
None