Out of memory when fine-tuning ICX2.5 #399
Thanks for your reply. Fine-tuning with LoRA solves the OOM error. However, even when I reduce
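As a rough illustration of why LoRA cuts trainable memory so sharply, here is a back-of-the-envelope count of trainable parameters for a single linear layer (the hidden size and rank below are hypothetical, not IXC2.5's actual values):

```python
def full_trainable(d_in, d_out):
    # A full fine-tune updates every weight of the d_out x d_in matrix.
    return d_in * d_out

def lora_trainable(d_in, d_out, r):
    # LoRA freezes W and learns two low-rank factors:
    # A (r x d_in) and B (d_out x r).
    return r * d_in + d_out * r

d = 4096                              # hypothetical hidden size
full = full_trainable(d, d)           # 16,777,216 weights
lora = lora_trainable(d, d, r=8)      # 65,536 weights
print(f"LoRA trains {lora / full:.2%} of this layer's weights")
```

Lowering the LoRA rank `r` shrinks the trainable count (and the optimizer-state memory that scales with it) linearly.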
@yuhangzang @cool-xuan The methods you provided are very useful for avoiding OOM at startup; I have tried them. However, OOM now appears suddenly after running dozens of steps. I have no idea which parameter configuration is actually effective.
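One common cause of OOM that appears only after many steps is allocator fragmentation rather than a genuinely too-large batch. A sketch worth trying (this uses PyTorch's documented `PYTORCH_CUDA_ALLOC_CONF` environment variable; the script name is the one from this thread):

```shell
# Reduce fragmentation-related OOMs that show up after many steps.
# expandable_segments is available in recent PyTorch releases;
# max_split_size_mb:128 is the older knob with a similar goal.
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
# then launch training as usual, e.g.:
# bash finetune.sh
```

If the OOM persists, logging `torch.cuda.memory_reserved()` every few steps can show whether reserved memory is creeping upward over time.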
Do not forget to install flash-attention 2. You may need 8 A100 80G GPUs for full-parameter tuning. If you use LoRA fine-tuning, you can also decrease the value of
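The 8×A100 figure is easy to sanity-check with a standard rule of thumb for mixed-precision Adam training (the 7B parameter count below is an assumption for illustration, not a confirmed size for this model):

```python
GiB = 1024 ** 3

def full_finetune_bytes(n_params):
    # Rough per-parameter cost for mixed-precision Adam:
    #   bf16 weights (2) + bf16 grads (2) + fp32 master weights (4)
    #   + Adam first moment (4) + second moment (4) = 16 bytes.
    return 16 * n_params

n = 7_000_000_000  # assumed model size for illustration
print(f"~{full_finetune_bytes(n) / GiB:.0f} GiB before activations")
```

Even before counting activations, the optimizer state alone exceeds a single 80 GB card, which is why the states must be sharded across many GPUs (e.g. with DeepSpeed ZeRO) for full-parameter tuning.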
Following your fine-tuning instructions, my finetune.sh is as follows:
All batch sizes are set to 1 and only two images are encoded for each conversation.
I run this script on 4 A100 GPUs with 80 GB memory each, and hit out of memory in the first iteration.
All packages match your environment in docs/install.md, except torch=2.10 and cuda=12.1.
Any advice for this weird OOM?