[core][aDAG] asyncio run hangs upon shutdown #47685

Open
rkooo567 opened this issue Sep 16, 2024 · 4 comments · May be fixed by #47702
Labels
accelerated-dag bug Something that is supposed to be working; but isn't P0 Issues that should be fixed in short order

Comments

@rkooo567
Contributor

What happened + What you expected to happen

When I run the script below, it hangs. I don't know the root cause yet, but it is probably related to shutdown.

Versions / Dependencies

master

Reproduction script

import asyncio
import ray
from ray.dag import InputNode, MultiOutputNode

async def main():
    @ray.remote
    class A:
        def f(self, i):
            return i

    a = A.remote()
    b = A.remote()

    with InputNode() as inp:
        x = a.f.bind(inp)
        y = b.f.bind(inp)
        dag = MultiOutputNode([x, y])

    adag = dag.experimental_compile(enable_asyncio=True)
    refs = await adag.execute_async(1)
    outputs = []
    # works
    for ref in refs:
        outputs.append(await ref)
    # doesn't work.
    # outputs = await asyncio.gather(*refs)
    print(outputs)

asyncio.run(main())

Issue Severity

None

@rkooo567 added the bug, triage, P0, and accelerated-dag labels and removed the triage label on Sep 16, 2024
@jeffreyjeffreywang
Contributor

It seems like the program doesn't hang anymore if adag.teardown() is called after print(outputs).
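
A minimal sketch of that workaround, assuming only the calls that already appear in the reproduction script (execute_async, awaiting each ref, teardown). StandInDag and run_once are hypothetical names, used here so the snippet runs without a Ray cluster; they are not part of Ray.

```python
import asyncio


class StandInDag:
    """Hypothetical stand-in for a compiled aDAG; not part of Ray."""

    def __init__(self):
        self.torn_down = False

    async def execute_async(self, value):
        # Mimic the repro script: return one awaitable per DAG output.
        async def echo():
            return value

        return [echo(), echo()]

    def teardown(self):
        self.torn_down = True


async def run_once(adag, value):
    """Execute the DAG once and always tear it down before returning."""
    try:
        refs = await adag.execute_async(value)
        # Await the refs one by one, the path the issue reports as working.
        return [await ref for ref in refs]
    finally:
        # Explicit teardown instead of relying on the destructor.
        adag.teardown()


adag = StandInDag()
outputs = asyncio.run(run_once(adag, 1))
print(outputs)
```

With Ray, the object returned by dag.experimental_compile(enable_asyncio=True) would presumably be passed to run_once in place of StandInDag(). The finally block makes the teardown happen even if awaiting a ref raises.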

@jeffreyjeffreywang
Contributor

Perhaps we can tear down the aDAG implicitly when it goes out of scope, which would be more intuitive for the client.
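
Until something like that exists, scope-bound teardown can be approximated on the caller's side with a context manager. This is only a sketch: teardown_on_exit and Recorder are hypothetical names, not Ray APIs, and the only assumption about the wrapped object is that it has a teardown() method.

```python
from contextlib import contextmanager


@contextmanager
def teardown_on_exit(adag):
    """Hypothetical helper: call adag.teardown() when the with-block exits."""
    try:
        yield adag
    finally:
        adag.teardown()


class Recorder:
    """Hypothetical stand-in exposing only teardown(), for demonstration."""

    def __init__(self):
        self.calls = 0

    def teardown(self):
        self.calls += 1


dag = Recorder()
try:
    with teardown_on_exit(dag):
        # Teardown should still run when the block fails.
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass
print(dag.calls)
```

Because the helper is a plain (synchronous) context manager, it can also wrap awaits inside an async function, for example around the execute_async call in the reproduction script.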

@rkooo567
Contributor Author

Actually, teardown is already called in the destructor of the CompiledDag class. I don't know why it is not being triggered properly.
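
As general Python background, and not a diagnosis of this bug: on CPython a destructor only runs once nothing references the object anymore. The sketch below shows one common way that fails to happen, where a background thread started by the object keeps it alive. Worker is a hypothetical class; whether CompiledDag is affected by anything similar is an assumption that would need to be checked.

```python
import threading


class Worker:
    """Hypothetical class that relies on __del__ to tear itself down."""

    def __init__(self):
        self.torn_down = threading.Event()
        self._stop = threading.Event()
        # target=self._loop is a bound method, so the running thread
        # holds a strong reference to this object.
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        self._stop.wait()

    def teardown(self):
        self._stop.set()
        self.torn_down.set()

    def __del__(self):
        self.teardown()


w = Worker()
torn_down, stop = w.torn_down, w._stop
del w  # drops the caller's reference only; the thread still holds one
destructor_ran = torn_down.is_set()
print(destructor_ran)
stop.set()  # let the background thread exit so the object can be freed
```

On CPython this should report that the destructor has not run after del, which is why an explicit teardown call is the more predictable option when background threads or event loops are involved.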

@rkooo567 rkooo567 linked a pull request Sep 17, 2024 that will close this issue
8 tasks
@rkooo567
Contributor Author

Potential fix here: #47685 (comment). Waiting for CI to see if it passes all tests.


2 participants