
Skipping ERROR caught in nll = model.optimize_parameters(current_step): svd_cuda: the updating process of SBDSDC did not converge (error: 23) #20

Open
flybiubiu opened this issue Feb 19, 2021 · 3 comments

Comments

@flybiubiu

Thanks to the author! Training at x4 works fine.
But when I train at x8, I get the following error:
Skipping ERROR caught in nll = model.optimize_parameters(current_step):
svd_cuda: the updating process of SBDSDC did not converge (error: 23)

Python 3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

import torch
print(torch.__version__)
1.7.1+cu110
print(torch.version.cuda)
11.0

print(torch.backends.cudnn.version())
8005

············································································································
My GPU is a 3090. I ran the setup code and found that the CUDA version was not compatible. After that I reinstalled with: pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio===0.7.2

The error shows up at around 10000 iterations.

@RedRAINXXXX

I encountered the same problem as you. Once this error occurs, all subsequent data triggers the same error.

@JingzheLyp

Hi, I encountered the same problem as you. Have you solved the problem? @flybiubiu, @RedRAINXXXX

@RedRAINXXXX

> Hi, I encountered the same problem as you. Have you solved the problem? @flybiubiu, @RedRAINXXXX

This is perhaps because the learning rate is too high. You can try a warm-up, or simply lower the learning rate.
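
For anyone who wants to try the warm-up idea, here is a minimal sketch of a linear warm-up multiplier in plain Python. The function name `warmup_factor`, the `warmup_steps` value of 1000, and the `base_lr` of 1e-4 are all my own assumptions for illustration, not values taken from this repo's config, so adjust them to your setup.

```python
def warmup_factor(step, warmup_steps=1000):
    """Linear warm-up multiplier for the learning rate.

    Ramps from 1/warmup_steps at step 0 up to 1.0 at step
    warmup_steps - 1, then stays at 1.0 afterwards.
    warmup_steps=1000 is a hypothetical default, not a value from the repo.
    """
    if warmup_steps <= 0:
        return 1.0  # warm-up disabled
    return min(1.0, (step + 1) / warmup_steps)


if __name__ == "__main__":
    base_lr = 1e-4  # hypothetical base learning rate
    for step in (0, 499, 999, 5000):
        # Effective learning rate = base learning rate * warm-up multiplier
        print(step, base_lr * warmup_factor(step))
```

If the training code uses a standard PyTorch optimizer, a function like this can be passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR`, which multiplies the optimizer's initial learning rate by the returned factor on each scheduler step. Whether this actually removes the SVD convergence error is untested here; it is only one way to apply the suggestion above.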
