oneshot services causes boot failures due to lack of timeout #68

Open
gauravjuvekar opened this issue May 7, 2024 · 0 comments


nvidia-mig-manager.service is Type=oneshot.
DefaultTimeoutStartSec is not applied to oneshot services (their start timeout is disabled by default), so the entire system fails to boot: it gets stuck waiting for nvidia-mig-manager.service to complete.

Boot failure is worse than a failed / degraded service.

A TimeoutStartSec should be added to this unit so that the system can at least boot in a degraded state (for debugging / recovery without out-of-band access via BMC / IPMI / KVM).
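
For illustration, a drop-in along these lines could bound the start-up time. This is only a sketch: the drop-in file name `timeout.conf` and the 120-second value are assumptions, not something tested against this service, and the right value depends on how long MIG configuration legitimately takes.

```ini
# /etc/systemd/system/nvidia-mig-manager.service.d/timeout.conf
# (hypothetical drop-in file name)
[Service]
# Type=oneshot disables the start timeout by default, so set one explicitly.
# 120s is an assumed placeholder value, not a tested recommendation.
TimeoutStartSec=120
```

After `systemctl daemon-reload`, a start that exceeds the limit should leave the unit in a failed state instead of blocking boot, so the system stays reachable for debugging.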

The root cause may be #11, but adding a timeout would make boot more resilient either way.

image
