Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DeepSpeed integration breaks existing DeepSpeed logic #183

Open
muellerzr opened this issue Aug 16, 2024 · 2 comments
Open

DeepSpeed integration breaks existing DeepSpeed logic #183

muellerzr opened this issue Aug 16, 2024 · 2 comments

Comments

@muellerzr
Copy link

muellerzr commented Aug 16, 2024

What's the issue, what's expected?:

There are attributes inside of regular deepspeed.runtime that are missing in this repo, and the monkey-patch doesn't cover, such as:

from deepspeed.runtime.lr_schedules import VALID_LR_SCHEDULES

How to reproduce it?:

from msamp import deepspeed
deepspeed.runtime.lr_schedules.VALID_LR_SCHEDULES

Log message or shapshot?:

Additional information:

Still trying to figure out a good solution here to make sure old namespaces can remain intact (as calling normal deepspeed later has consequences during training)

@muellerzr
Copy link
Author

Generally I think this repo might want to recommend users do something like:

# Order doesn't really matter
import deepspeed
from msamp import deepspeed as msamp_deepspeed

Since otherwise they may face issues like this.

Not per-se a problem, but just something users should know about

@muellerzr
Copy link
Author

For completeness, here's what my solution wound up being in the end:

import deepspeed
ds_initialize = deepspeed.initialize
if use_msamp:
    # MS-AMP requires DeepSpeed patches
    from msamp import deepspeed as msamp_deepspeed
    ds_initialize = msamp_deepspeed.initialize

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant