
Pathos for multiprocessing? #330

Open
mcharytoniuk opened this issue Jun 17, 2024 · 4 comments

Comments

@mcharytoniuk
Hello!

My application works fine when using fork, but pickle causes serialization issues when using spawn.
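For context, here is a minimal reproduction of the kind of failure spawn triggers (the handler here is hypothetical, just to illustrate the class of object involved):

```python
import pickle

# A handler defined as a lambda/closure: fine under "fork" (the child
# simply inherits it), but "spawn" must pickle it to send it over,
# and lambdas are not picklable.
handler = lambda request: request.upper()

try:
    pickle.dumps(handler)
except pickle.PicklingError as exc:
    print(f"cannot pickle handler: {exc}")
```

The same applies to locks, open sockets, CUDA handles, and anything else that only works under fork by inheritance.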

Would you be open to switching from multiprocessing to pathos?

If so, I will be happy to contribute it. I just wanted to ask upfront to be sure it's ok. : )

Best wishes

@gi0baro
Member

gi0baro commented Jun 21, 2024

Hey @mcharytoniuk, thank you for reaching out before opening a PR.

While I'm not against introducing additional support for pickling resources, I'm not exactly thrilled about adding several extra dependencies to Granian: this would implicitly impose the same deps on every framework/library/project using Granian, which is not (in my opinion) a good experience.

Also, given that the default Granian loader imports the target application on worker startup, I don't really get why fork is needed in the first place: under default circumstances you shouldn't be sharing resources between processes.
I get the whole reduce-memory-usage theme, but at the same time I cannot guarantee things will work as expected when resources get shared between workers, even considering Granian uses its own internal mutability strategy that doesn't rely on Python and the GIL. There's a reason the design expects the target to be an importable resource and not a Python object; it's not just that we were too lazy to do otherwise :)
The whole shared-memory topic will, in my opinion, be revisited once Python 3.13 lands, since we could then actually avoid the GIL everywhere (which also requires the entire application and all its dependencies to be no-GIL compatible). For now it just doesn't feel safe enough to me, especially if it means adding several dependencies to Granian.

Given all this, I won't object to an implementation that just provides the necessary pickle hacks to move objects between workers, as long as it doesn't rely on several 3rd-party dependencies and documents the risks connected to this usage.
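One dependency-free shape such a "pickle hack" could take (the names here are illustrative, not Granian's actual API) is to pickle an import path instead of the object itself, via `__reduce__`:

```python
import importlib
import pickle


class ImportableTarget:
    """Wraps a module-level callable so that pickling sends only its
    import path; the receiving process re-imports it locally."""

    def __init__(self, module: str, attr: str):
        self.module = module
        self.attr = attr
        self._obj = getattr(importlib.import_module(module), attr)

    def __call__(self, *args, **kwargs):
        return self._obj(*args, **kwargs)

    def __reduce__(self):
        # Only the two strings cross the process boundary; the child
        # rebuilds the wrapper (and re-imports the callable) on load.
        return (self.__class__, (self.module, self.attr))


target = ImportableTarget("json", "dumps")
clone = pickle.loads(pickle.dumps(target))
print(clone({"ok": True}))  # '{"ok": true}'
```

The obvious risk, which would need documenting, is that the two processes must agree on the importable environment: anything constructed at runtime still can't cross the boundary this way.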

@mcharytoniuk
Author

mcharytoniuk commented Jun 21, 2024

@gi0baro Would you be okay with factoring multiprocessing into a strategy, so it can be replaced with something like pathos in the libraries that use Granian? That way Granian won't gain an additional dependency, and custom multiprocessing implementations become possible when necessary.
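Roughly, the factoring I have in mind looks like this (all names hypothetical, just a sketch of the strategy shape, not a proposal for Granian's actual interface):

```python
from multiprocessing import get_context
from typing import Any, Callable, Protocol


class WorkerSpawner(Protocol):
    """Hypothetical seam: anything that can start a worker process."""

    def spawn(self, target: Callable[..., Any], args: tuple) -> Any: ...


class StdlibSpawner:
    """Default strategy backed by the stdlib multiprocessing module."""

    def __init__(self, method: str = "spawn"):
        self._ctx = get_context(method)

    def spawn(self, target, args=()):
        proc = self._ctx.Process(target=target, args=args)
        proc.start()
        return proc


# A pathos-backed spawner would implement the same surface, swapping
# pickle for dill, without Granian itself depending on either.
```

The server would accept any `WorkerSpawner` at startup and keep the stdlib one as the zero-dependency default.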

Also, when it comes to use cases: I want to experiment with hosting a PyTorch ML model. I want to load it into VRAM and only then fork the child processes, so each one can access a handle through some mutex lock. It's an experiment, true, but I have some other ideas that would benefit from being able to swap multiprocessing for something else.
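The preload-then-fork pattern I'm after looks roughly like this (a plain in-memory object stands in for the actual model, which would live in VRAM; all names are illustrative):

```python
import multiprocessing as mp

ctx = mp.get_context("fork")  # preloading only pays off under fork

# Stand-in for an expensive load (in the real case, model weights).
MODEL = {"weights": list(range(1_000_000))}
lock = ctx.Lock()


def worker(result_queue):
    # Under "fork" the child inherits MODEL without re-loading it;
    # the lock serializes access to the shared handle.
    with lock:
        result_queue.put(len(MODEL["weights"]))


if __name__ == "__main__":
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(queue,)) for _ in range(2)]
    for p in procs:
        p.start()
    print([queue.get(timeout=10) for _ in procs])  # [1000000, 1000000]
    for p in procs:
        p.join()
```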

@gi0baro
Member

gi0baro commented Jun 22, 2024

@gi0baro Would you be okay with factoring multiprocessing into a strategy, so it can be replaced with something like pathos in the libraries that use Granian?

Looks reasonable.

Also, when it comes to use cases: I want to experiment with hosting a PyTorch ML model. I want to load it into VRAM and only then fork the child processes, so each one can access a handle through some mutex lock. It's an experiment, true, but I have some other ideas that would benefit from being able to swap multiprocessing for something else.

Given that use case, I highly doubt spawning multiple Granian workers will produce any benefit.
A single worker can handle tens of thousands of RPS, so unless your inference time is <= 0.1ms, Granian will never be the bottleneck there. Especially if you have a lock: you will still have just a single process able to interact with the model, which is presumably where your application spends the vast majority of its time. Sure, you can still have deserialization/serialization handled in parallel across the different processes, but again: what's the weight of that in the entire request-response time? If it weighs 0.1%, you won't ever notice the difference end-to-end, as even with a 10x performance boost on that part, the other 99.9% of the request time will still be there.
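The back-of-envelope version of this argument is just Amdahl's law (the numbers below are illustrative, matching the 0.1% / 10x figures above):

```python
# Amdahl-style estimate: if (de)serialization is 0.1% of request time
# and parallelizing it gives a 10x speedup on that part alone,
# the end-to-end request time barely moves.
serialization_share = 0.001
speedup_on_that_part = 10

new_total = (1 - serialization_share) + serialization_share / speedup_on_that_part
print(f"end-to-end speedup: {1 / new_total:.4f}x")  # ~1.0009x
```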

@mcharytoniuk
Author

@gi0baro OK, I will prepare a PR that factors multiprocessing into a strategy then. :)

I agree with your point about performance: Granian won't be the bottleneck. I want to experiment with it some more and explore the options. I'm a fan of having the entire application in a single codebase, and Granian lets me do just that.

Thanks for the answer!
