Out of memory with an image smaller than the one you said you upscaled without tiling #5

Open
jonathancolledge opened this issue Jan 21, 2024 · 1 comment

@jonathancolledge

Hi,
I have a 3090 with 24 GB of VRAM, and I tried a 1265 x 846 image and got the error below:
(Of note, installation was a bit tricky: I had to use the fixes for long file paths as per the other issues, and pip could not find a matching torch build, so I had to use: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121)

Is there something else I did wrong?

Loading pipeline components...: 100%|████████████████████████████████████████████████████| 6/6 [00:01<00:00, 3.71it/s]
Resizing image to a square...
Determining background color...
Background color is... (255, 255, 255, 255)
Exporting image tile: image_0.png
0%| | 0/75 [00:14<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\gradio\routes.py", line 321, in run_predict
    output = await app.blocks.process_api(
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\gradio\blocks.py", line 1015, in process_api
    result = await self.call_function(fn_index, inputs, iterator, request)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\gradio\blocks.py", line 856, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "C:\Users\jonat\sd-x4-wui\gradio_gui.py", line 8, in upscale_image
    output_image = upscaler.upscale_image(image, int(rows), int(cols),int(seed), prompt,negative_prompt,xformers_input,cpu_offload_input,attention_slicing_input,enable_custom_sliders,guidance,iterations)
  File "C:\Users\jonat\sd-x4-wui\upscaler.py", line 86, in upscale_image
    ups_tile = pipeline(prompt=prompt,negative_prompt=negative_prompt, image=x.convert("RGB"),generator=generator).images[0]
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion_upscale.py", line 775, in __call__
    noise_pred = self.unet(
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\diffusers\models\unet_2d_condition.py", line 1177, in forward
    sample = upsample_block(
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\diffusers\models\unet_2d_blocks.py", line 2354, in forward
    hidden_states = attn(
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\diffusers\models\transformer_2d.py", line 392, in forward
    hidden_states = block(
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\diffusers\models\attention.py", line 393, in forward
    ff_output = self.ff(norm_hidden_states, scale=lora_scale)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\diffusers\models\attention.py", line 665, in forward
    hidden_states = module(hidden_states, scale)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\jonat\sd-x4-wui\sdupx4\lib\site-packages\diffusers\models\activations.py", line 103, in forward
    return hidden_states * self.gelu(gate)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 3.05 GiB. GPU 0 has a total capacty of 24.00 GiB of which 2.11 GiB is free. Of the allocated memory 20.28 GiB is allocated by PyTorch, and 46.82 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
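
For reference, the failure is in the UNet feed-forward GELU, which tries to allocate another 3.05 GiB on top of the ~20 GiB already held by PyTorch, so the tile being denoised is too big for the VRAM that remains. Below is a minimal sketch of the memory-saving knobs the error message and the WUI checkboxes point at; it assumes the stock diffusers StableDiffusionUpscalePipeline with fp16 weights and may not match exactly how upscaler.py wires these options:

import os

# Suggested by the OOM message itself: cap allocator block size to reduce
# fragmentation. Must be set before torch initialises CUDA.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:512")

import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    torch_dtype=torch.float16,  # fp16 roughly halves weight/activation memory
)
pipe = pipe.to("cuda")

# Compute attention in slices instead of one large matmul (slower, less VRAM).
pipe.enable_attention_slicing()

# Use xformers' memory-efficient attention if the wheel actually imports.
try:
    pipe.enable_xformers_memory_efficient_attention()
except Exception as exc:
    print(f"xformers unavailable, falling back to default attention: {exc}")

# Heaviest saver: keep sub-models on the CPU and move each to the GPU only
# while it runs. If you enable this, drop the pipe.to("cuda") call above.
# pipe.enable_model_cpu_offload()

# "image_0.png" is the tile exported in the log above; smaller tiles
# (more rows/cols in the WUI) shrink the activation that blew up here.
low_res = Image.open("image_0.png").convert("RGB")
upscaled = pipe(prompt="", image=low_res).images[0]
upscaled.save("image_0_x4.png")

With 24 GB, a 1265 x 846 input most likely needs more rows/cols than the default so that each tile stays within the remaining budget.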

@jonathancolledge (Author)

I can't get it to install according to the instructions, so I fiddled about. This is my latest install with conda, where I am hoping everything installed OK and I get all the optimisations. It is currently running on a 1024 x 684 image, but I think it will run out of memory at the last step, saving the image. VRAM usage is bouncing between 5 GB and 18 GB in use:

git clone https://github.com/Subarasheese/sd-x4-wui

cd sd-x4-wui

conda create -n sdup python=3.10

git config --system core.longpaths true

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

pip install https://huggingface.co/r4ziel/xformers_pre_built/resolve/main/triton-2.0.0-cp310-cp310-win_amd64.whl

pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu121

pip3 install accelerate

I edited requirements.txt to only have the following:

Pillow == 9.4.0
diffusers
gradio == 3.15.0
split_image == 2.0.1
transformers

Then I ran with

python gradio_gui.py
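
As a hedged aside: the list of commands above never shows a conda activate sdup, so it is worth confirming which environment the pip installs and gradio_gui.py actually use. A quick sanity check (assuming the sdup env created above is the one that is active) would confirm the CUDA build of torch and xformers really landed there:

import torch

# Report the torch build and whether it can see the GPU at all.
print("torch:", torch.__version__, "| CUDA build:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("device:", props.name,
          "| total VRAM (GiB):", round(props.total_memory / 1024**3, 1))

# Confirm xformers is importable; otherwise the xformers checkbox does nothing.
try:
    import xformers
    print("xformers:", xformers.__version__)
except ImportError:
    print("xformers is NOT importable in this environment")

If "CUDA available" prints False or xformers fails to import, the pip installs most likely went into a different environment than the one the WUI runs in, and the memory optimisations will not take effect.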
