
OSError: Error no file named diffusion_pytorch_model.bin found in directory pretrained_models/MagicAnimate/appearance_encoder #10

Open
leochoo opened this issue Dec 11, 2023 · 4 comments
leochoo commented Dec 11, 2023

My pretrained_models/MagicAnimate/appearance_encoder directory is empty after running install.ps1.

Running ./run_gui.ps1 then fails with the error below:

(venv) PS C:\Users\leona\dev\magic-animate-for-windows> ./run_gui.ps1
C:\Users\leona\dev\magic-animate-for-windows\magicanimate\pipelines\pipeline_animation.py:43: FutureWarning: Importing `DiffusionPipeline` or `ImagePipelineOutput` from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
  from diffusers.pipeline_utils import DiffusionPipeline
Initializing MagicAnimate Pipeline...
### missing keys: 1246; 
### unexpected keys: 51;
Traceback (most recent call last):
  File "C:\Users\leona\dev\magic-animate-for-windows\demo\gradio_animate.py", line 19, in <module>
    animator = MagicAnimate()
  File "C:\Users\leona\dev\magic-animate-for-windows\demo\animate.py", line 90, in __init__
    self.appearance_encoder = AppearanceEncoderModel.from_pretrained(
  File "C:\Users\leona\dev\magic-animate-for-windows\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 618, in from_pretrained
    model_file = _get_model_file(
  File "C:\Users\leona\dev\magic-animate-for-windows\venv\lib\site-packages\diffusers\utils\hub_utils.py", line 284, in _get_model_file
    raise EnvironmentError(
OSError: Error no file named diffusion_pytorch_model.bin found in directory pretrained_models/MagicAnimate/appearance_encoder.

I am following this guide for installation:
https://www.youtube.com/watch?v=jHfxCD0W5es&t=320s
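For context on the OSError itself, here is a minimal sketch (my own helper, not the actual diffusers internals) of the file lookup that produces it: diffusers' from_pretrained expects a weights file named diffusion_pytorch_model.bin (or the .safetensors variant) inside the model directory, so an empty appearance_encoder folder triggers exactly this message.

```python
from pathlib import Path

# Illustrative sketch of the weights lookup behind the OSError above.
# The helper name is mine; diffusers does this inside _get_model_file.
WEIGHT_NAMES = ("diffusion_pytorch_model.safetensors", "diffusion_pytorch_model.bin")

def find_model_file(model_dir):
    """Return the path to the first recognized weights file in model_dir."""
    d = Path(model_dir)
    for name in WEIGHT_NAMES:
        candidate = d / name
        if candidate.is_file():
            return candidate
    # No weights file present: this mirrors the error in the traceback.
    raise OSError(
        f"Error no file named {WEIGHT_NAMES[-1]} found in directory {model_dir}."
    )
```

Run against an empty directory, this raises the same OSError as the traceback, which is consistent with install.ps1 having failed to download the appearance-encoder weights.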

leochoo commented Dec 11, 2023

I tried manually downloading diffusion_pytorch_model.bin from this repo,
https://huggingface.co/bdsqlsz/stable-diffusion-v1-5/tree/main/vae
but it still fails, now with the error below:

C:\Users\leona\dev\magic-animate-for-windows\magicanimate\pipelines\pipeline_animation.py:43: FutureWarning: Importing `DiffusionPipeline` or `ImagePipelineOutput` from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
  from diffusers.pipeline_utils import DiffusionPipeline
Initializing MagicAnimate Pipeline...
### missing keys: 1246; 
### unexpected keys: 51;
Traceback (most recent call last):
  File "C:\Users\leona\dev\magic-animate-for-windows\demo\gradio_animate.py", line 19, in <module>
    animator = MagicAnimate()
  File "C:\Users\leona\dev\magic-animate-for-windows\demo\animate.py", line 90, in __init__
    self.appearance_encoder = AppearanceEncoderModel.from_pretrained(
  File "C:\Users\leona\dev\magic-animate-for-windows\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 646, in from_pretrained
    raise ValueError(
ValueError: Cannot load <class 'magicanimate.models.appearance_encoder.AppearanceEncoderModel'> from pretrained_models/MagicAnimate/appearance_encoder because the following keys are missing: 
 mid_block.attentions.0.transformer_blocks.0.attn1.to_v.weight, down_blocks.2.attentions.1.transformer_blocks.0.ff.net.2.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, mid_block.attentions.0.norm.bias, mid_block.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.1.attentions.0.proj_in.weight, up_blocks.1.resnets.1.conv_shortcut.weight, down_blocks.0.attentions.0.proj_out.weight, up_blocks.0.resnets.1.conv1.weight, up_blocks.3.resnets.2.norm2.bias, mid_block.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, up_blocks.2.attentions.1.transformer_blocks.0.norm3.bias, up_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.1.resnets.0.norm2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.3.resnets.0.conv_shortcut.bias, up_blocks.2.attentions.1.transformer_blocks.0.norm2.bias, up_blocks.3.attentions.1.norm.bias, mid_block.resnets.1.time_emb_proj.weight, up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.0.resnets.0.conv2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_q.weight, down_blocks.0.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.2.resnets.0.norm2.weight, up_blocks.2.resnets.2.conv1.weight, mid_block.attentions.0.proj_in.weight, up_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.bias, mid_block.resnets.1.conv1.weight, mid_block.resnets.0.norm1.bias, up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight, conv_in.weight, up_blocks.3.resnets.1.norm1.bias, up_blocks.1.resnets.0.conv2.bias, up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, down_blocks.0.resnets.0.time_emb_proj.bias, down_blocks.0.attentions.0.transformer_blocks.0.ff.net.2.bias, down_blocks.3.resnets.0.conv2.weight, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_q.weight, down_blocks.0.resnets.1.conv2.weight, up_blocks.2.attentions.0.proj_out.weight, up_blocks.2.resnets.0.conv2.bias, 
up_blocks.3.resnets.2.norm1.bias, down_blocks.0.attentions.0.transformer_blocks.0.norm3.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, up_blocks.3.attentions.2.transformer_blocks.0.norm1.weight, mid_block.resnets.1.conv1.bias, down_blocks.2.attentions.1.transformer_blocks.0.ff.net.2.weight, down_blocks.3.resnets.0.norm2.bias, down_blocks.2.attentions.1.transformer_blocks.0.norm3.bias, down_blocks.0.resnets.0.norm2.bias, up_blocks.3.attentions.2.norm.bias, up_blocks.3.resnets.1.conv2.bias, down_blocks.2.resnets.0.conv_shortcut.bias, up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_k.weight, down_blocks.0.resnets.1.norm1.bias, down_blocks.2.resnets.1.conv1.bias, up_blocks.1.upsamplers.0.conv.weight, down_blocks.1.resnets.1.time_emb_proj.bias, up_blocks.3.resnets.1.conv1.bias, up_blocks.1.attentions.2.transformer_blocks.0.norm2.weight, mid_block.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.2.attentions.0.transformer_blocks.0.norm2.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_q.weight, down_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn1.to_q.weight, down_blocks.0.attentions.1.proj_out.bias, down_blocks.3.resnets.0.norm1.bias, down_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.3.attentions.1.proj_in.bias, down_blocks.0.resnets.1.time_emb_proj.bias, up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.bias, up_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.bias, up_blocks.1.resnets.2.norm1.weight, down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.1.attentions.1.transformer_blocks.0.norm1.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_k.weight, 
up_blocks.0.resnets.1.time_emb_proj.weight, down_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.0.resnets.2.conv1.weight, down_blocks.0.resnets.1.norm1.weight, up_blocks.0.resnets.1.norm1.weight, mid_block.resnets.1.norm1.weight, down_blocks.1.resnets.1.conv2.bias, down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_q.weight, down_blocks.0.attentions.1.proj_in.weight, down_blocks.2.attentions.1.proj_in.bias, down_blocks.3.resnets.1.norm1.weight, up_blocks.2.resnets.0.conv2.weight, down_blocks.1.attentions.0.proj_in.bias, down_blocks.2.resnets.0.conv1.bias, down_blocks.0.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.2.attentions.2.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.1.attentions.1.transformer_blocks.0.norm2.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.2.attentions.2.transformer_blocks.0.ff.net.2.bias, up_blocks.2.resnets.1.norm2.bias, up_blocks.3.resnets.0.norm1.weight, mid_block.resnets.0.conv2.bias, up_blocks.2.resnets.1.norm2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.3.attentions.1.proj_out.bias, down_blocks.2.downsamplers.0.conv.weight, down_blocks.0.resnets.0.norm1.weight, down_blocks.2.attentions.1.proj_out.weight, up_blocks.0.upsamplers.0.conv.weight, up_blocks.1.attentions.1.transformer_blocks.0.norm1.weight, up_blocks.3.attentions.0.norm.bias, down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_q.weight, up_blocks.1.resnets.0.conv2.weight, up_blocks.3.attentions.0.transformer_blocks.0.norm2.bias, up_blocks.1.resnets.2.norm2.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.2.attentions.1.transformer_blocks.0.norm1.bias, up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_q.weight, mid_block.attentions.0.transformer_blocks.0.ff.net.2.weight, 
up_blocks.3.attentions.0.transformer_blocks.0.norm3.weight, down_blocks.3.resnets.1.time_emb_proj.bias, up_blocks.1.resnets.1.norm1.bias, down_blocks.3.resnets.0.conv2.bias, mid_block.resnets.1.conv2.bias, up_blocks.2.attentions.2.transformer_blocks.0.norm1.bias, up_blocks.1.resnets.1.conv1.bias, up_blocks.2.attentions.1.proj_out.bias, down_blocks.2.resnets.0.conv1.weight, up_blocks.2.resnets.0.norm1.weight, up_blocks.1.resnets.0.conv1.bias, down_blocks.3.resnets.1.conv2.weight, down_blocks.1.attentions.0.transformer_blocks.0.norm3.bias, up_blocks.2.attentions.0.transformer_blocks.0.norm3.bias, up_blocks.2.resnets.1.conv_shortcut.weight, down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight, up_blocks.1.resnets.1.time_emb_proj.weight, up_blocks.1.resnets.2.norm2.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.2.resnets.0.conv1.bias, down_blocks.2.attentions.1.transformer_blocks.0.norm1.weight, down_blocks.0.resnets.0.norm1.bias, down_blocks.1.attentions.1.proj_in.weight, up_blocks.1.resnets.2.conv2.weight, up_blocks.2.attentions.1.transformer_blocks.0.ff.net.2.bias, down_blocks.0.attentions.1.transformer_blocks.0.norm1.bias, up_blocks.2.attentions.2.transformer_blocks.0.norm3.weight, down_blocks.0.attentions.0.norm.weight, up_blocks.2.resnets.0.time_emb_proj.bias, up_blocks.2.upsamplers.0.conv.bias, mid_block.attentions.0.norm.weight, down_blocks.2.attentions.1.norm.weight, mid_block.resnets.1.norm1.bias, up_blocks.0.resnets.1.conv2.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.bias, down_blocks.2.resnets.0.conv_shortcut.weight, down_blocks.2.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight, up_blocks.1.attentions.2.norm.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, 
down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_v.weight, down_blocks.1.resnets.1.norm1.weight, up_blocks.0.resnets.2.norm1.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_q.weight, up_blocks.0.resnets.2.conv1.bias, up_blocks.2.attentions.0.norm.weight, up_blocks.3.resnets.2.time_emb_proj.bias, up_blocks.1.attentions.0.transformer_blocks.0.norm1.weight, down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.3.resnets.0.time_emb_proj.weight, up_blocks.0.resnets.2.norm1.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.3.resnets.2.norm2.weight, up_blocks.1.resnets.0.time_emb_proj.weight, up_blocks.3.resnets.0.conv_shortcut.weight, up_blocks.1.resnets.2.conv2.bias, down_blocks.2.attentions.1.transformer_blocks.0.norm2.bias, up_blocks.2.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.1.downsamplers.0.conv.bias, mid_block.resnets.1.conv2.weight, down_blocks.0.attentions.0.proj_in.bias, down_blocks.2.attentions.0.norm.bias, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_k.weight, up_blocks.3.attentions.0.norm.weight, up_blocks.1.attentions.2.transformer_blocks.0.norm3.weight, up_blocks.0.resnets.0.norm1.weight, down_blocks.3.resnets.1.time_emb_proj.weight, down_blocks.1.resnets.1.norm1.bias, down_blocks.0.attentions.0.proj_in.weight, up_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_v.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.2.resnets.1.conv2.weight, up_blocks.3.attentions.0.proj_out.bias, up_blocks.0.upsamplers.0.conv.bias, up_blocks.0.resnets.2.conv_shortcut.weight, down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_v.weight, 
up_blocks.3.attentions.0.proj_out.weight, up_blocks.3.attentions.2.proj_in.weight, up_blocks.3.resnets.1.conv1.weight, mid_block.resnets.0.conv2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_v.weight, down_blocks.2.attentions.0.transformer_blocks.0.norm2.bias, up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_out.0.bias, mid_block.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight, up_blocks.3.attentions.0.transformer_blocks.0.ff.net.2.bias, mid_block.resnets.0.norm2.weight, up_blocks.3.attentions.0.proj_in.bias, up_blocks.3.resnets.2.norm1.weight, down_blocks.1.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.0.resnets.2.conv2.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_k.weight, up_blocks.1.attentions.1.proj_in.bias, up_blocks.1.attentions.2.proj_in.bias, down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight, up_blocks.0.resnets.0.conv2.bias, up_blocks.1.resnets.0.norm1.bias, up_blocks.3.resnets.0.time_emb_proj.bias, up_blocks.0.resnets.1.norm1.bias, down_blocks.2.attentions.1.proj_out.bias, up_blocks.3.attentions.1.transformer_blocks.0.ff.net.2.weight, up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_q.weight, mid_block.attentions.0.transformer_blocks.0.norm1.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_v.weight, up_blocks.1.attentions.2.transformer_blocks.0.norm1.bias, up_blocks.3.resnets.1.conv2.weight, down_blocks.1.attentions.0.transformer_blocks.0.norm2.weight, up_blocks.0.resnets.1.conv_shortcut.weight, up_blocks.3.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, down_blocks.2.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_v.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_q.weight, 
up_blocks.0.resnets.1.time_emb_proj.bias, up_blocks.1.attentions.2.proj_out.weight, up_blocks.2.attentions.2.proj_in.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.0.resnets.0.conv2.bias, up_blocks.2.resnets.0.time_emb_proj.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, mid_block.attentions.0.transformer_blocks.0.attn2.to_v.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight, up_blocks.2.resnets.2.time_emb_proj.weight, up_blocks.3.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.3.resnets.2.conv_shortcut.bias, up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.3.attentions.2.proj_in.bias, up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_k.weight, down_blocks.3.resnets.0.time_emb_proj.bias, down_blocks.0.attentions.1.norm.bias, down_blocks.2.attentions.0.proj_in.weight, up_blocks.0.resnets.0.conv1.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_q.weight, up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.1.resnets.0.conv2.bias, up_blocks.1.attentions.0.proj_in.bias, up_blocks.1.attentions.0.proj_out.bias, up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_q.weight, up_blocks.1.resnets.2.time_emb_proj.bias, up_blocks.1.resnets.0.conv_shortcut.bias, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_v.weight, up_blocks.0.resnets.0.norm2.bias, down_blocks.0.resnets.0.conv1.bias, down_blocks.1.resnets.0.norm2.bias, down_blocks.1.attentions.1.transformer_blocks.0.norm3.weight, down_blocks.0.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.2.resnets.1.conv_shortcut.bias, up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_k.weight, up_blocks.2.resnets.1.conv1.weight, up_blocks.3.resnets.1.norm1.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_q.weight, 
down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_k.weight, up_blocks.3.resnets.1.time_emb_proj.weight, up_blocks.1.attentions.1.norm.bias, down_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.0.resnets.0.norm1.bias, down_blocks.0.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_v.weight, mid_block.resnets.0.time_emb_proj.bias, down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_q.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.3.attentions.0.transformer_blocks.0.norm3.bias, down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_k.weight, up_blocks.2.resnets.1.norm1.bias, up_blocks.1.attentions.1.transformer_blocks.0.norm3.weight, down_blocks.0.resnets.1.conv1.weight, down_blocks.0.downsamplers.0.conv.bias, down_blocks.2.attentions.0.transformer_blocks.0.norm3.weight, down_blocks.0.resnets.1.norm2.bias, down_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.bias, down_blocks.2.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight, up_blocks.2.resnets.2.norm2.bias, down_blocks.0.resnets.1.norm2.weight, down_blocks.2.resnets.0.conv2.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_q.weight, up_blocks.2.resnets.0.conv1.weight, up_blocks.0.resnets.0.norm2.weight, down_blocks.1.resnets.0.norm1.weight, up_blocks.1.attentions.2.proj_out.bias, up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_v.weight, up_blocks.1.resnets.1.conv2.weight, up_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, down_blocks.2.resnets.0.time_emb_proj.bias, up_blocks.1.upsamplers.0.conv.bias, up_blocks.2.upsamplers.0.conv.weight, up_blocks.2.resnets.2.norm1.weight, 
down_blocks.0.attentions.1.proj_in.bias, up_blocks.2.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.2.attentions.1.proj_out.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.3.attentions.1.proj_out.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.2.attentions.1.transformer_blocks.0.norm3.weight, up_blocks.1.attentions.0.proj_out.weight, up_blocks.2.resnets.2.norm1.bias, up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, mid_block.attentions.0.proj_in.bias, up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.2.attentions.0.proj_in.bias, up_blocks.3.attentions.1.transformer_blocks.0.norm2.bias, up_blocks.2.attentions.1.transformer_blocks.0.norm1.weight, down_blocks.1.resnets.0.conv1.bias, down_blocks.3.resnets.1.norm1.bias, up_blocks.2.attentions.2.proj_out.bias, down_blocks.2.attentions.1.transformer_blocks.0.norm2.weight, down_blocks.1.resnets.1.conv2.weight, down_blocks.1.attentions.0.proj_in.weight, up_blocks.2.attentions.2.transformer_blocks.0.norm3.bias, up_blocks.3.resnets.1.norm2.bias, up_blocks.0.resnets.1.conv2.bias, up_blocks.1.resnets.2.time_emb_proj.weight, up_blocks.3.attentions.1.transformer_blocks.0.ff.net.2.bias, down_blocks.0.attentions.0.norm.bias, up_blocks.2.attentions.2.transformer_blocks.0.norm2.bias, up_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.3.resnets.2.conv1.weight, up_blocks.3.resnets.0.conv1.bias, down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.weight, up_blocks.2.attentions.2.transformer_blocks.0.norm1.weight, up_blocks.1.resnets.2.conv1.bias, down_blocks.2.resnets.0.norm1.weight, up_blocks.2.attentions.2.norm.bias, down_blocks.1.resnets.0.conv_shortcut.bias, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_out.0.weight, 
down_blocks.0.resnets.0.norm2.weight, up_blocks.2.resnets.1.time_emb_proj.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_q.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_k.weight, down_blocks.0.resnets.1.conv2.bias, down_blocks.2.resnets.1.norm1.bias, down_blocks.1.attentions.0.transformer_blocks.0.norm2.bias, up_blocks.1.attentions.2.transformer_blocks.0.norm1.weight, down_blocks.1.resnets.1.norm2.weight, up_blocks.1.attentions.1.proj_out.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_q.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_k.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_q.weight, down_blocks.0.attentions.1.transformer_blocks.0.ff.net.2.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, down_blocks.1.resnets.0.conv_shortcut.weight, down_blocks.1.attentions.1.transformer_blocks.0.norm2.weight, down_blocks.2.resnets.0.conv2.bias, up_blocks.2.attentions.2.proj_in.weight, down_blocks.1.downsamplers.0.conv.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.1.resnets.2.conv1.weight, down_blocks.2.resnets.1.conv2.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.attentions.2.transformer_blocks.0.ff.net.2.weight, up_blocks.2.attentions.1.norm.bias, mid_block.attentions.0.transformer_blocks.0.ff.net.2.bias, down_blocks.0.attentions.1.transformer_blocks.0.norm1.weight, up_blocks.2.attentions.1.transformer_blocks.0.norm2.weight, down_blocks.1.attentions.1.proj_out.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight, up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, up_blocks.3.attentions.2.norm.weight, up_blocks.1.attentions.1.transformer_blocks.0.norm3.bias, down_blocks.1.attentions.0.norm.weight, down_blocks.1.resnets.1.time_emb_proj.weight, up_blocks.2.resnets.0.norm2.bias, 
down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_q.weight, down_blocks.2.attentions.1.transformer_blocks.0.norm1.bias, up_blocks.0.resnets.1.norm2.bias, down_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.weight, down_blocks.1.attentions.1.transformer_blocks.0.norm2.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.3.resnets.1.norm2.bias, up_blocks.1.attentions.2.transformer_blocks.0.norm2.bias, up_blocks.1.attentions.2.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.2.attentions.0.transformer_blocks.0.norm3.weight, up_blocks.3.attentions.0.transformer_blocks.0.norm2.weight, down_blocks.1.resnets.1.norm2.bias, up_blocks.3.resnets.1.norm2.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn1.to_v.weight, up_blocks.1.attentions.0.transformer_blocks.0.norm3.bias, mid_block.attentions.0.proj_out.bias, up_blocks.3.attentions.1.transformer_blocks.0.norm3.bias, up_blocks.3.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.0.downsamplers.0.conv.weight, down_blocks.3.resnets.0.norm2.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_q.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_k.weight, up_blocks.3.attentions.2.transformer_blocks.0.norm1.bias, up_blocks.2.resnets.1.conv2.bias, up_blocks.2.attentions.2.proj_out.weight, up_blocks.3.resnets.1.time_emb_proj.bias, down_blocks.2.attentions.0.transformer_blocks.0.norm3.bias, up_blocks.1.attentions.0.transformer_blocks.0.norm3.weight, up_blocks.1.attentions.0.transformer_blocks.0.norm2.weight, up_blocks.3.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.2.resnets.2.time_emb_proj.bias, down_blocks.0.resnets.1.conv1.bias, up_blocks.3.resnets.0.conv1.weight, down_blocks.2.resnets.0.time_emb_proj.weight, up_blocks.1.attentions.1.transformer_blocks.0.norm2.bias, up_blocks.2.attentions.1.transformer_blocks.0.norm3.weight, up_blocks.1.resnets.2.conv_shortcut.weight, 
up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.0.attentions.1.norm.weight, up_blocks.2.attentions.1.proj_in.weight, up_blocks.3.resnets.2.conv1.bias, down_blocks.3.resnets.1.conv1.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_v.weight, down_blocks.2.resnets.1.norm1.weight, up_blocks.2.attentions.2.norm.weight, up_blocks.3.attentions.1.transformer_blocks.0.norm3.weight, up_blocks.3.resnets.1.conv_shortcut.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_q.weight, up_blocks.1.resnets.0.norm2.bias, up_blocks.2.resnets.0.norm1.bias, down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.3.attentions.1.transformer_blocks.0.norm1.weight, up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_q.weight, up_blocks.3.attentions.1.transformer_blocks.0.norm1.bias, mid_block.attentions.0.transformer_blocks.0.norm3.bias, up_blocks.2.resnets.2.conv2.weight, down_blocks.1.attentions.1.proj_out.bias, up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_q.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.0.attentions.1.transformer_blocks.0.norm2.bias, down_blocks.0.attentions.0.transformer_blocks.0.ff.net.2.weight, down_blocks.3.resnets.0.conv1.bias, up_blocks.1.attentions.2.transformer_blocks.0.norm3.bias, up_blocks.1.resnets.1.time_emb_proj.bias, mid_block.resnets.0.norm1.weight, down_blocks.2.resnets.1.norm2.weight, up_blocks.2.resnets.2.conv_shortcut.bias, down_blocks.1.attentions.1.norm.weight, up_blocks.1.resnets.0.conv1.weight, up_blocks.3.resnets.2.time_emb_proj.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_v.weight, up_blocks.1.resnets.1.norm1.weight, down_blocks.2.attentions.0.transformer_blocks.0.norm2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_q.weight, down_blocks.2.resnets.1.conv1.weight, up_blocks.2.attentions.0.transformer_blocks.0.norm2.weight, 
up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.2.resnets.1.conv2.bias, down_blocks.2.attentions.0.proj_out.weight, down_blocks.1.attentions.1.norm.bias, up_blocks.0.resnets.2.time_emb_proj.weight, up_blocks.2.attentions.2.transformer_blocks.0.ff.net.0.proj.bias, mid_block.attentions.0.transformer_blocks.0.attn1.to_q.weight, mid_block.resnets.1.time_emb_proj.bias, up_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_v.weight, up_blocks.2.resnets.1.norm1.weight, up_blocks.1.resnets.0.time_emb_proj.bias, up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.resnets.1.conv2.bias, up_blocks.1.resnets.2.norm1.bias, up_blocks.3.attentions.1.proj_in.weight, up_blocks.3.resnets.0.norm2.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.2.downsamplers.0.conv.bias, up_blocks.2.resnets.1.conv1.bias, mid_block.resnets.0.conv1.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_k.weight, down_blocks.1.attentions.0.proj_out.bias, down_blocks.0.resnets.0.conv1.weight, down_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.weight, down_blocks.0.attentions.1.transformer_blocks.0.ff.net.2.weight, down_blocks.2.resnets.1.norm2.bias, down_blocks.1.resnets.1.conv1.bias, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_v.weight, down_blocks.0.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, mid_block.attentions.0.transformer_blocks.0.norm3.weight, mid_block.attentions.0.transformer_blocks.0.attn1.to_k.weight, up_blocks.1.resnets.1.norm2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight, conv_in.bias, down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_q.weight, up_blocks.3.attentions.0.proj_in.weight, down_blocks.1.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.1.attentions.2.proj_in.weight, 
up_blocks.2.resnets.2.conv_shortcut.weight, up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_k.weight, up_blocks.2.attentions.0.norm.bias, up_blocks.3.resnets.2.conv2.weight, down_blocks.2.resnets.0.norm1.bias, down_blocks.1.attentions.0.proj_out.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_q.weight, time_embedding.linear_1.weight, up_blocks.2.resnets.2.conv2.bias, down_blocks.3.resnets.0.time_emb_proj.weight, up_blocks.0.resnets.0.time_emb_proj.weight, down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight, down_blocks.1.resnets.0.conv2.weight, down_blocks.1.resnets.0.norm1.bias, down_blocks.1.attentions.0.transformer_blocks.0.norm3.weight, down_blocks.0.attentions.1.proj_out.weight, up_blocks.1.attentions.0.norm.weight, up_blocks.3.resnets.2.conv_shortcut.weight, down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.1.resnets.0.time_emb_proj.bias, up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_v.weight, up_blocks.3.resnets.2.conv2.bias, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_k.weight, up_blocks.1.resnets.1.norm2.bias, down_blocks.2.attentions.1.proj_in.weight, down_blocks.0.attentions.0.transformer_blocks.0.norm2.weight, up_blocks.0.resnets.0.conv2.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_v.weight, up_blocks.0.resnets.1.norm2.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_q.weight, time_embedding.linear_2.bias, down_blocks.1.resnets.1.conv1.weight, mid_block.attentions.0.transformer_blocks.0.norm2.weight, up_blocks.3.resnets.1.conv_shortcut.weight, mid_block.resnets.0.conv1.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn1.to_k.weight, down_blocks.2.attentions.0.proj_out.bias, down_blocks.3.resnets.1.conv1.bias, up_blocks.3.resnets.0.conv2.bias, mid_block.resnets.1.norm2.bias, 
up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.2.resnets.0.norm2.bias, up_blocks.2.resnets.1.time_emb_proj.bias, mid_block.attentions.0.transformer_blocks.0.norm2.bias, down_blocks.1.resnets.0.norm2.weight, up_blocks.2.attentions.1.proj_in.bias, up_blocks.2.attentions.1.norm.weight, up_blocks.2.resnets.2.conv1.bias, up_blocks.2.attentions.2.transformer_blocks.0.norm2.weight, up_blocks.0.resnets.2.norm2.bias, up_blocks.2.attentions.0.proj_in.bias, up_blocks.0.resnets.1.conv_shortcut.bias, down_blocks.0.attentions.0.transformer_blocks.0.norm2.bias, down_blocks.1.attentions.1.transformer_blocks.0.norm1.bias, up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight, down_blocks.3.resnets.1.conv2.bias, mid_block.resnets.0.norm2.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_v.weight, up_blocks.3.resnets.0.norm1.bias, up_blocks.0.resnets.2.conv2.weight, up_blocks.2.attentions.0.proj_out.bias, up_blocks.3.attentions.1.norm.weight, mid_block.attentions.0.transformer_blocks.0.attn2.to_q.weight, up_blocks.0.resnets.0.time_emb_proj.bias, down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight, down_blocks.0.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.3.resnets.0.norm2.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_v.weight, up_blocks.1.attentions.1.transformer_blocks.0.norm1.bias, down_blocks.2.resnets.0.norm2.weight, down_blocks.1.resnets.0.time_emb_proj.weight, up_blocks.0.resnets.0.conv1.bias, down_blocks.1.attentions.1.proj_in.bias, up_blocks.1.resnets.0.norm1.weight, up_blocks.3.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, 
up_blocks.1.attentions.1.proj_out.bias, down_blocks.0.attentions.1.transformer_blocks.0.norm3.weight, up_blocks.1.attentions.1.proj_in.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, mid_block.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.0.resnets.0.conv_shortcut.bias, up_blocks.0.resnets.1.conv1.bias, down_blocks.2.attentions.1.norm.bias, up_blocks.1.attentions.2.transformer_blocks.0.ff.net.0.proj.weight, mid_block.resnets.0.time_emb_proj.weight, down_blocks.1.attentions.1.transformer_blocks.0.norm3.bias, mid_block.attentions.0.transformer_blocks.0.attn2.to_k.weight, up_blocks.2.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, down_blocks.3.resnets.1.norm2.weight, up_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, time_embedding.linear_1.bias, down_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.3.resnets.0.conv1.weight, down_blocks.0.resnets.0.time_emb_proj.weight, up_blocks.1.attentions.0.transformer_blocks.0.norm2.bias, up_blocks.1.attentions.1.norm.weight, down_blocks.2.attentions.0.norm.weight, up_blocks.1.resnets.1.conv_shortcut.bias, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, down_blocks.2.resnets.1.time_emb_proj.bias, up_blocks.0.resnets.2.time_emb_proj.bias, up_blocks.1.attentions.0.norm.bias, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.2.resnets.2.norm2.weight, down_blocks.0.attentions.1.transformer_blocks.0.norm2.weight, up_blocks.0.resnets.0.conv_shortcut.weight, down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.3.resnets.0.conv2.weight, up_blocks.3.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.0.resnets.2.conv_shortcut.bias, mid_block.resnets.1.norm2.weight, down_blocks.1.resnets.0.conv1.weight, up_blocks.1.resnets.0.conv_shortcut.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_k.weight, 
up_blocks.2.attentions.1.transformer_blocks.0.ff.net.2.weight, mid_block.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.1.attentions.0.norm.bias, up_blocks.2.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_k.weight, up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_v.weight, down_blocks.3.resnets.0.norm1.weight, up_blocks.3.attentions.1.transformer_blocks.0.norm2.weight, up_blocks.2.resnets.0.conv_shortcut.bias, up_blocks.0.resnets.2.norm2.weight, down_blocks.0.attentions.1.transformer_blocks.0.norm3.bias, up_blocks.1.attentions.2.norm.bias, up_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.weight, up_blocks.2.attentions.2.transformer_blocks.0.ff.net.2.weight, up_blocks.2.resnets.0.conv_shortcut.weight, mid_block.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.0.attentions.0.transformer_blocks.0.norm3.bias, mid_block.attentions.0.proj_out.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight, down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.bias, up_blocks.1.resnets.2.conv_shortcut.bias, time_embedding.linear_2.weight, down_blocks.2.resnets.1.time_emb_proj.weight, up_blocks.1.resnets.1.conv1.weight, up_blocks.2.attentions.0.proj_in.weight, down_blocks.0.attentions.0.proj_out.bias, down_blocks.0.resnets.1.time_emb_proj.weight, down_blocks.2.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.1.attentions.2.transformer_blocks.0.ff.net.2.bias, up_blocks.3.attentions.0.transformer_blocks.0.ff.net.2.weight.
 Please make sure to pass `low_cpu_mem_usage=False` and `device_map=None` if you want to randomly initialize those weights or else make sure your checkpoint file is correct.
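The `### missing keys` / `### unexpected keys` counts above come from comparing the model's parameter names against the checkpoint's. A checkpoint from a different model (e.g. a VAE, whose keys all start with `encoder.`/`decoder.`) shares almost no names with the appearance encoder, so nearly every key ends up reported as missing. A toy illustration of that set arithmetic (not the actual diffusers code, and the key names below are made up for the example):

```python
def key_diff(model_keys, ckpt_keys):
    """Report which parameter names the checkpoint lacks (missing) and
    which it contains that the model does not expect (unexpected)."""
    missing = sorted(set(model_keys) - set(ckpt_keys))
    unexpected = sorted(set(ckpt_keys) - set(model_keys))
    return missing, unexpected

# An encoder expecting UNet-style names vs. a VAE-style checkpoint:
# almost nothing overlaps, so both lists are large.
model_keys = ["conv_in.weight", "time_embedding.linear_1.weight"]
vae_keys = ["encoder.conv_in.weight", "decoder.conv_out.weight"]
missing, unexpected = key_diff(model_keys, vae_keys)
```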

@sdbds
Owner

sdbds commented Dec 12, 2023

Because download speeds vary, the first model download can be slow.
We recommend deleting the pretrained_models/MagicAnimate folder, re-running the install script, and then waiting for the download to finish completely.
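Before relaunching run_gui.ps1, you can verify the re-download actually produced the weight file named in the traceback (an interrupted download can leave the folder empty or the file zero bytes). A minimal check, using only the path confirmed by the error message:

```python
from pathlib import Path

def encoder_weights_present(root="pretrained_models/MagicAnimate"):
    """True only if appearance_encoder/diffusion_pytorch_model.bin
    exists under `root` and is non-empty."""
    weights = Path(root) / "appearance_encoder" / "diffusion_pytorch_model.bin"
    return weights.is_file() and weights.stat().st_size > 0
```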

@sdbds
Owner

sdbds commented Dec 12, 2023

I tried manually downloading diffusion_pytorch_model.bin from https://huggingface.co/bdsqlsz/stable-diffusion-v1-5/tree/main/vae, but it still fails with the error below.

C:\Users\leona\dev\magic-animate-for-windows\magicanimate\pipelines\pipeline_animation.py:43: FutureWarning: Importing `DiffusionPipeline` or `ImagePipelineOutput` from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
  from diffusers.pipeline_utils import DiffusionPipeline
Initializing MagicAnimate Pipeline...
### missing keys: 1246; 
### unexpected keys: 51;
Traceback (most recent call last):
  File "C:\Users\leona\dev\magic-animate-for-windows\demo\gradio_animate.py", line 19, in <module>
    animator = MagicAnimate()
  File "C:\Users\leona\dev\magic-animate-for-windows\demo\animate.py", line 90, in __init__
    self.appearance_encoder = AppearanceEncoderModel.from_pretrained(
  File "C:\Users\leona\dev\magic-animate-for-windows\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 646, in from_pretrained
    raise ValueError(
ValueError: Cannot load <class 'magicanimate.models.appearance_encoder.AppearanceEncoderModel'> from pretrained_models/MagicAnimate/appearance_encoder because the following keys are missing: 
 mid_block.attentions.0.transformer_blocks.0.attn1.to_v.weight, down_blocks.2.attentions.1.transformer_blocks.0.ff.net.2.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, mid_block.attentions.0.norm.bias, mid_block.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.1.attentions.0.proj_in.weight, up_blocks.1.resnets.1.conv_shortcut.weight, down_blocks.0.attentions.0.proj_out.weight, up_blocks.0.resnets.1.conv1.weight, up_blocks.3.resnets.2.norm2.bias, mid_block.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, up_blocks.2.attentions.1.transformer_blocks.0.norm3.bias, up_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.1.resnets.0.norm2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.3.resnets.0.conv_shortcut.bias, up_blocks.2.attentions.1.transformer_blocks.0.norm2.bias, up_blocks.3.attentions.1.norm.bias, mid_block.resnets.1.time_emb_proj.weight, up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.0.resnets.0.conv2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_q.weight, down_blocks.0.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.2.resnets.0.norm2.weight, up_blocks.2.resnets.2.conv1.weight, mid_block.attentions.0.proj_in.weight, up_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.bias, mid_block.resnets.1.conv1.weight, mid_block.resnets.0.norm1.bias, up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight, conv_in.weight, up_blocks.3.resnets.1.norm1.bias, up_blocks.1.resnets.0.conv2.bias, up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, down_blocks.0.resnets.0.time_emb_proj.bias, down_blocks.0.attentions.0.transformer_blocks.0.ff.net.2.bias, down_blocks.3.resnets.0.conv2.weight, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_q.weight, down_blocks.0.resnets.1.conv2.weight, up_blocks.2.attentions.0.proj_out.weight, up_blocks.2.resnets.0.conv2.bias, 
up_blocks.3.resnets.2.norm1.bias, down_blocks.0.attentions.0.transformer_blocks.0.norm3.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, up_blocks.3.attentions.2.transformer_blocks.0.norm1.weight, mid_block.resnets.1.conv1.bias, down_blocks.2.attentions.1.transformer_blocks.0.ff.net.2.weight, down_blocks.3.resnets.0.norm2.bias, down_blocks.2.attentions.1.transformer_blocks.0.norm3.bias, down_blocks.0.resnets.0.norm2.bias, up_blocks.3.attentions.2.norm.bias, up_blocks.3.resnets.1.conv2.bias, down_blocks.2.resnets.0.conv_shortcut.bias, up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_k.weight, down_blocks.0.resnets.1.norm1.bias, down_blocks.2.resnets.1.conv1.bias, up_blocks.1.upsamplers.0.conv.weight, down_blocks.1.resnets.1.time_emb_proj.bias, up_blocks.3.resnets.1.conv1.bias, up_blocks.1.attentions.2.transformer_blocks.0.norm2.weight, mid_block.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.2.attentions.0.transformer_blocks.0.norm2.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_q.weight, down_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn1.to_q.weight, down_blocks.0.attentions.1.proj_out.bias, down_blocks.3.resnets.0.norm1.bias, down_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.3.attentions.1.proj_in.bias, down_blocks.0.resnets.1.time_emb_proj.bias, up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.bias, up_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.bias, up_blocks.1.resnets.2.norm1.weight, down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.1.attentions.1.transformer_blocks.0.norm1.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_k.weight, 
up_blocks.0.resnets.1.time_emb_proj.weight, down_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.0.resnets.2.conv1.weight, down_blocks.0.resnets.1.norm1.weight, up_blocks.0.resnets.1.norm1.weight, mid_block.resnets.1.norm1.weight, down_blocks.1.resnets.1.conv2.bias, down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_q.weight, down_blocks.0.attentions.1.proj_in.weight, down_blocks.2.attentions.1.proj_in.bias, down_blocks.3.resnets.1.norm1.weight, up_blocks.2.resnets.0.conv2.weight, down_blocks.1.attentions.0.proj_in.bias, down_blocks.2.resnets.0.conv1.bias, down_blocks.0.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.2.attentions.2.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.1.attentions.1.transformer_blocks.0.norm2.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.2.attentions.2.transformer_blocks.0.ff.net.2.bias, up_blocks.2.resnets.1.norm2.bias, up_blocks.3.resnets.0.norm1.weight, mid_block.resnets.0.conv2.bias, up_blocks.2.resnets.1.norm2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.3.attentions.1.proj_out.bias, down_blocks.2.downsamplers.0.conv.weight, down_blocks.0.resnets.0.norm1.weight, down_blocks.2.attentions.1.proj_out.weight, up_blocks.0.upsamplers.0.conv.weight, up_blocks.1.attentions.1.transformer_blocks.0.norm1.weight, up_blocks.3.attentions.0.norm.bias, down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_q.weight, up_blocks.1.resnets.0.conv2.weight, up_blocks.3.attentions.0.transformer_blocks.0.norm2.bias, up_blocks.1.resnets.2.norm2.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.2.attentions.1.transformer_blocks.0.norm1.bias, up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_q.weight, mid_block.attentions.0.transformer_blocks.0.ff.net.2.weight, 
up_blocks.3.attentions.0.transformer_blocks.0.norm3.weight, down_blocks.3.resnets.1.time_emb_proj.bias, up_blocks.1.resnets.1.norm1.bias, down_blocks.3.resnets.0.conv2.bias, mid_block.resnets.1.conv2.bias, up_blocks.2.attentions.2.transformer_blocks.0.norm1.bias, up_blocks.1.resnets.1.conv1.bias, up_blocks.2.attentions.1.proj_out.bias, down_blocks.2.resnets.0.conv1.weight, up_blocks.2.resnets.0.norm1.weight, up_blocks.1.resnets.0.conv1.bias, down_blocks.3.resnets.1.conv2.weight, down_blocks.1.attentions.0.transformer_blocks.0.norm3.bias, up_blocks.2.attentions.0.transformer_blocks.0.norm3.bias, up_blocks.2.resnets.1.conv_shortcut.weight, down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight, up_blocks.1.resnets.1.time_emb_proj.weight, up_blocks.1.resnets.2.norm2.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.2.resnets.0.conv1.bias, down_blocks.2.attentions.1.transformer_blocks.0.norm1.weight, down_blocks.0.resnets.0.norm1.bias, down_blocks.1.attentions.1.proj_in.weight, up_blocks.1.resnets.2.conv2.weight, up_blocks.2.attentions.1.transformer_blocks.0.ff.net.2.bias, down_blocks.0.attentions.1.transformer_blocks.0.norm1.bias, up_blocks.2.attentions.2.transformer_blocks.0.norm3.weight, down_blocks.0.attentions.0.norm.weight, up_blocks.2.resnets.0.time_emb_proj.bias, up_blocks.2.upsamplers.0.conv.bias, mid_block.attentions.0.norm.weight, down_blocks.2.attentions.1.norm.weight, mid_block.resnets.1.norm1.bias, up_blocks.0.resnets.1.conv2.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.bias, down_blocks.2.resnets.0.conv_shortcut.weight, down_blocks.2.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight, up_blocks.1.attentions.2.norm.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, 
down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_v.weight, down_blocks.1.resnets.1.norm1.weight, up_blocks.0.resnets.2.norm1.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_q.weight, up_blocks.0.resnets.2.conv1.bias, up_blocks.2.attentions.0.norm.weight, up_blocks.3.resnets.2.time_emb_proj.bias, up_blocks.1.attentions.0.transformer_blocks.0.norm1.weight, down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.3.resnets.0.time_emb_proj.weight, up_blocks.0.resnets.2.norm1.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.3.resnets.2.norm2.weight, up_blocks.1.resnets.0.time_emb_proj.weight, up_blocks.3.resnets.0.conv_shortcut.weight, up_blocks.1.resnets.2.conv2.bias, down_blocks.2.attentions.1.transformer_blocks.0.norm2.bias, up_blocks.2.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.1.downsamplers.0.conv.bias, mid_block.resnets.1.conv2.weight, down_blocks.0.attentions.0.proj_in.bias, down_blocks.2.attentions.0.norm.bias, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_k.weight, up_blocks.3.attentions.0.norm.weight, up_blocks.1.attentions.2.transformer_blocks.0.norm3.weight, up_blocks.0.resnets.0.norm1.weight, down_blocks.3.resnets.1.time_emb_proj.weight, down_blocks.1.resnets.1.norm1.bias, down_blocks.0.attentions.0.proj_in.weight, up_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_v.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.2.resnets.1.conv2.weight, up_blocks.3.attentions.0.proj_out.bias, up_blocks.0.upsamplers.0.conv.bias, up_blocks.0.resnets.2.conv_shortcut.weight, down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_v.weight, 
up_blocks.3.attentions.0.proj_out.weight, up_blocks.3.attentions.2.proj_in.weight, up_blocks.3.resnets.1.conv1.weight, mid_block.resnets.0.conv2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_v.weight, down_blocks.2.attentions.0.transformer_blocks.0.norm2.bias, up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_out.0.bias, mid_block.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight, up_blocks.3.attentions.0.transformer_blocks.0.ff.net.2.bias, mid_block.resnets.0.norm2.weight, up_blocks.3.attentions.0.proj_in.bias, up_blocks.3.resnets.2.norm1.weight, down_blocks.1.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.0.resnets.2.conv2.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_k.weight, up_blocks.1.attentions.1.proj_in.bias, up_blocks.1.attentions.2.proj_in.bias, down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight, up_blocks.0.resnets.0.conv2.bias, up_blocks.1.resnets.0.norm1.bias, up_blocks.3.resnets.0.time_emb_proj.bias, up_blocks.0.resnets.1.norm1.bias, down_blocks.2.attentions.1.proj_out.bias, up_blocks.3.attentions.1.transformer_blocks.0.ff.net.2.weight, up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_q.weight, mid_block.attentions.0.transformer_blocks.0.norm1.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_v.weight, up_blocks.1.attentions.2.transformer_blocks.0.norm1.bias, up_blocks.3.resnets.1.conv2.weight, down_blocks.1.attentions.0.transformer_blocks.0.norm2.weight, up_blocks.0.resnets.1.conv_shortcut.weight, up_blocks.3.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, down_blocks.2.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_v.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_q.weight, 
up_blocks.0.resnets.1.time_emb_proj.bias, up_blocks.1.attentions.2.proj_out.weight, up_blocks.2.attentions.2.proj_in.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.0.resnets.0.conv2.bias, up_blocks.2.resnets.0.time_emb_proj.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, mid_block.attentions.0.transformer_blocks.0.attn2.to_v.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight, up_blocks.2.resnets.2.time_emb_proj.weight, up_blocks.3.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.3.resnets.2.conv_shortcut.bias, up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.3.attentions.2.proj_in.bias, up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_k.weight, down_blocks.3.resnets.0.time_emb_proj.bias, down_blocks.0.attentions.1.norm.bias, down_blocks.2.attentions.0.proj_in.weight, up_blocks.0.resnets.0.conv1.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_q.weight, up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.1.resnets.0.conv2.bias, up_blocks.1.attentions.0.proj_in.bias, up_blocks.1.attentions.0.proj_out.bias, up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_q.weight, up_blocks.1.resnets.2.time_emb_proj.bias, up_blocks.1.resnets.0.conv_shortcut.bias, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_v.weight, up_blocks.0.resnets.0.norm2.bias, down_blocks.0.resnets.0.conv1.bias, down_blocks.1.resnets.0.norm2.bias, down_blocks.1.attentions.1.transformer_blocks.0.norm3.weight, down_blocks.0.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.2.resnets.1.conv_shortcut.bias, up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_k.weight, up_blocks.2.resnets.1.conv1.weight, up_blocks.3.resnets.1.norm1.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_q.weight, 
down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_k.weight, up_blocks.3.resnets.1.time_emb_proj.weight, up_blocks.1.attentions.1.norm.bias, down_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.0.resnets.0.norm1.bias, down_blocks.0.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_v.weight, mid_block.resnets.0.time_emb_proj.bias, down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_q.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.3.attentions.0.transformer_blocks.0.norm3.bias, down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_k.weight, up_blocks.2.resnets.1.norm1.bias, up_blocks.1.attentions.1.transformer_blocks.0.norm3.weight, down_blocks.0.resnets.1.conv1.weight, down_blocks.0.downsamplers.0.conv.bias, down_blocks.2.attentions.0.transformer_blocks.0.norm3.weight, down_blocks.0.resnets.1.norm2.bias, down_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.bias, down_blocks.2.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight, up_blocks.2.resnets.2.norm2.bias, down_blocks.0.resnets.1.norm2.weight, down_blocks.2.resnets.0.conv2.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_q.weight, up_blocks.2.resnets.0.conv1.weight, up_blocks.0.resnets.0.norm2.weight, down_blocks.1.resnets.0.norm1.weight, up_blocks.1.attentions.2.proj_out.bias, up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_v.weight, up_blocks.1.resnets.1.conv2.weight, up_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, down_blocks.2.resnets.0.time_emb_proj.bias, up_blocks.1.upsamplers.0.conv.bias, up_blocks.2.upsamplers.0.conv.weight, up_blocks.2.resnets.2.norm1.weight, 
down_blocks.0.attentions.1.proj_in.bias, up_blocks.2.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.2.attentions.1.proj_out.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.3.attentions.1.proj_out.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.2.attentions.1.transformer_blocks.0.norm3.weight, up_blocks.1.attentions.0.proj_out.weight, up_blocks.2.resnets.2.norm1.bias, up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, mid_block.attentions.0.proj_in.bias, up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.2.attentions.0.proj_in.bias, up_blocks.3.attentions.1.transformer_blocks.0.norm2.bias, up_blocks.2.attentions.1.transformer_blocks.0.norm1.weight, down_blocks.1.resnets.0.conv1.bias, down_blocks.3.resnets.1.norm1.bias, up_blocks.2.attentions.2.proj_out.bias, down_blocks.2.attentions.1.transformer_blocks.0.norm2.weight, down_blocks.1.resnets.1.conv2.weight, down_blocks.1.attentions.0.proj_in.weight, up_blocks.2.attentions.2.transformer_blocks.0.norm3.bias, up_blocks.3.resnets.1.norm2.bias, up_blocks.0.resnets.1.conv2.bias, up_blocks.1.resnets.2.time_emb_proj.weight, up_blocks.3.attentions.1.transformer_blocks.0.ff.net.2.bias, down_blocks.0.attentions.0.norm.bias, up_blocks.2.attentions.2.transformer_blocks.0.norm2.bias, up_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.3.resnets.2.conv1.weight, up_blocks.3.resnets.0.conv1.bias, down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.weight, up_blocks.2.attentions.2.transformer_blocks.0.norm1.weight, up_blocks.1.resnets.2.conv1.bias, down_blocks.2.resnets.0.norm1.weight, up_blocks.2.attentions.2.norm.bias, down_blocks.1.resnets.0.conv_shortcut.bias, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_out.0.weight, 
down_blocks.0.resnets.0.norm2.weight, up_blocks.2.resnets.1.time_emb_proj.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_q.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_k.weight, down_blocks.0.resnets.1.conv2.bias, down_blocks.2.resnets.1.norm1.bias, down_blocks.1.attentions.0.transformer_blocks.0.norm2.bias, up_blocks.1.attentions.2.transformer_blocks.0.norm1.weight, down_blocks.1.resnets.1.norm2.weight, up_blocks.1.attentions.1.proj_out.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_q.weight, up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_k.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_q.weight, down_blocks.0.attentions.1.transformer_blocks.0.ff.net.2.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, down_blocks.1.resnets.0.conv_shortcut.weight, down_blocks.1.attentions.1.transformer_blocks.0.norm2.weight, down_blocks.2.resnets.0.conv2.bias, up_blocks.2.attentions.2.proj_in.weight, down_blocks.1.downsamplers.0.conv.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.1.resnets.2.conv1.weight, down_blocks.2.resnets.1.conv2.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.attentions.2.transformer_blocks.0.ff.net.2.weight, up_blocks.2.attentions.1.norm.bias, mid_block.attentions.0.transformer_blocks.0.ff.net.2.bias, down_blocks.0.attentions.1.transformer_blocks.0.norm1.weight, up_blocks.2.attentions.1.transformer_blocks.0.norm2.weight, down_blocks.1.attentions.1.proj_out.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight, up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, up_blocks.3.attentions.2.norm.weight, up_blocks.1.attentions.1.transformer_blocks.0.norm3.bias, down_blocks.1.attentions.0.norm.weight, down_blocks.1.resnets.1.time_emb_proj.weight, up_blocks.2.resnets.0.norm2.bias, 
down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_q.weight, down_blocks.2.attentions.1.transformer_blocks.0.norm1.bias, up_blocks.0.resnets.1.norm2.bias, down_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.weight, down_blocks.1.attentions.1.transformer_blocks.0.norm2.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.3.resnets.1.norm2.bias, up_blocks.1.attentions.2.transformer_blocks.0.norm2.bias, up_blocks.1.attentions.2.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.2.attentions.0.transformer_blocks.0.norm3.weight, up_blocks.3.attentions.0.transformer_blocks.0.norm2.weight, down_blocks.1.resnets.1.norm2.bias, up_blocks.3.resnets.1.norm2.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn1.to_v.weight, up_blocks.1.attentions.0.transformer_blocks.0.norm3.bias, mid_block.attentions.0.proj_out.bias, up_blocks.3.attentions.1.transformer_blocks.0.norm3.bias, up_blocks.3.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.0.downsamplers.0.conv.weight, down_blocks.3.resnets.0.norm2.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_q.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_k.weight, up_blocks.3.attentions.2.transformer_blocks.0.norm1.bias, up_blocks.2.resnets.1.conv2.bias, up_blocks.2.attentions.2.proj_out.weight, up_blocks.3.resnets.1.time_emb_proj.bias, down_blocks.2.attentions.0.transformer_blocks.0.norm3.bias, up_blocks.1.attentions.0.transformer_blocks.0.norm3.weight, up_blocks.1.attentions.0.transformer_blocks.0.norm2.weight, up_blocks.3.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.2.resnets.2.time_emb_proj.bias, down_blocks.0.resnets.1.conv1.bias, up_blocks.3.resnets.0.conv1.weight, down_blocks.2.resnets.0.time_emb_proj.weight, up_blocks.1.attentions.1.transformer_blocks.0.norm2.bias, up_blocks.2.attentions.1.transformer_blocks.0.norm3.weight, up_blocks.1.resnets.2.conv_shortcut.weight, 
up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.0.attentions.1.norm.weight, up_blocks.2.attentions.1.proj_in.weight, up_blocks.3.resnets.2.conv1.bias, down_blocks.3.resnets.1.conv1.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_v.weight, down_blocks.2.resnets.1.norm1.weight, up_blocks.2.attentions.2.norm.weight, up_blocks.3.attentions.1.transformer_blocks.0.norm3.weight, up_blocks.3.resnets.1.conv_shortcut.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_q.weight, up_blocks.1.resnets.0.norm2.bias, up_blocks.2.resnets.0.norm1.bias, down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, up_blocks.3.attentions.1.transformer_blocks.0.norm1.weight, up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_q.weight, up_blocks.3.attentions.1.transformer_blocks.0.norm1.bias, mid_block.attentions.0.transformer_blocks.0.norm3.bias, up_blocks.2.resnets.2.conv2.weight, down_blocks.1.attentions.1.proj_out.bias, up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_q.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.0.attentions.1.transformer_blocks.0.norm2.bias, down_blocks.0.attentions.0.transformer_blocks.0.ff.net.2.weight, down_blocks.3.resnets.0.conv1.bias, up_blocks.1.attentions.2.transformer_blocks.0.norm3.bias, up_blocks.1.resnets.1.time_emb_proj.bias, mid_block.resnets.0.norm1.weight, down_blocks.2.resnets.1.norm2.weight, up_blocks.2.resnets.2.conv_shortcut.bias, down_blocks.1.attentions.1.norm.weight, up_blocks.1.resnets.0.conv1.weight, up_blocks.3.resnets.2.time_emb_proj.weight, up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_v.weight, up_blocks.1.resnets.1.norm1.weight, down_blocks.2.attentions.0.transformer_blocks.0.norm2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_q.weight, down_blocks.2.resnets.1.conv1.weight, up_blocks.2.attentions.0.transformer_blocks.0.norm2.weight, 
up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.2.resnets.1.conv2.bias, down_blocks.2.attentions.0.proj_out.weight, down_blocks.1.attentions.1.norm.bias, up_blocks.0.resnets.2.time_emb_proj.weight, up_blocks.2.attentions.2.transformer_blocks.0.ff.net.0.proj.bias, mid_block.attentions.0.transformer_blocks.0.attn1.to_q.weight, mid_block.resnets.1.time_emb_proj.bias, up_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_v.weight, up_blocks.2.resnets.1.norm1.weight, up_blocks.1.resnets.0.time_emb_proj.bias, up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.resnets.1.conv2.bias, up_blocks.1.resnets.2.norm1.bias, up_blocks.3.attentions.1.proj_in.weight, up_blocks.3.resnets.0.norm2.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.2.downsamplers.0.conv.bias, up_blocks.2.resnets.1.conv1.bias, mid_block.resnets.0.conv1.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_k.weight, down_blocks.1.attentions.0.proj_out.bias, down_blocks.0.resnets.0.conv1.weight, down_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.weight, down_blocks.0.attentions.1.transformer_blocks.0.ff.net.2.weight, down_blocks.2.resnets.1.norm2.bias, down_blocks.1.resnets.1.conv1.bias, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_v.weight, down_blocks.0.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, mid_block.attentions.0.transformer_blocks.0.norm3.weight, mid_block.attentions.0.transformer_blocks.0.attn1.to_k.weight, up_blocks.1.resnets.1.norm2.weight, up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight, conv_in.bias, down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_q.weight, up_blocks.3.attentions.0.proj_in.weight, down_blocks.1.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.1.attentions.2.proj_in.weight, 
up_blocks.2.resnets.2.conv_shortcut.weight, up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_k.weight, up_blocks.2.attentions.0.norm.bias, up_blocks.3.resnets.2.conv2.weight, down_blocks.2.resnets.0.norm1.bias, down_blocks.1.attentions.0.proj_out.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_q.weight, time_embedding.linear_1.weight, up_blocks.2.resnets.2.conv2.bias, down_blocks.3.resnets.0.time_emb_proj.weight, up_blocks.0.resnets.0.time_emb_proj.weight, down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight, down_blocks.1.resnets.0.conv2.weight, down_blocks.1.resnets.0.norm1.bias, down_blocks.1.attentions.0.transformer_blocks.0.norm3.weight, down_blocks.0.attentions.1.proj_out.weight, up_blocks.1.attentions.0.norm.weight, up_blocks.3.resnets.2.conv_shortcut.weight, down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.1.resnets.0.time_emb_proj.bias, up_blocks.3.attentions.1.transformer_blocks.0.attn1.to_v.weight, up_blocks.3.resnets.2.conv2.bias, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_k.weight, up_blocks.1.resnets.1.norm2.bias, down_blocks.2.attentions.1.proj_in.weight, down_blocks.0.attentions.0.transformer_blocks.0.norm2.weight, up_blocks.0.resnets.0.conv2.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_v.weight, up_blocks.0.resnets.1.norm2.weight, up_blocks.2.attentions.2.transformer_blocks.0.attn1.to_q.weight, time_embedding.linear_2.bias, down_blocks.1.resnets.1.conv1.weight, mid_block.attentions.0.transformer_blocks.0.norm2.weight, up_blocks.3.resnets.1.conv_shortcut.weight, mid_block.resnets.0.conv1.weight, down_blocks.2.attentions.1.transformer_blocks.0.attn1.to_k.weight, down_blocks.2.attentions.0.proj_out.bias, down_blocks.3.resnets.1.conv1.bias, up_blocks.3.resnets.0.conv2.bias, mid_block.resnets.1.norm2.bias, 
up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.2.resnets.0.norm2.bias, up_blocks.2.resnets.1.time_emb_proj.bias, mid_block.attentions.0.transformer_blocks.0.norm2.bias, down_blocks.1.resnets.0.norm2.weight, up_blocks.2.attentions.1.proj_in.bias, up_blocks.2.attentions.1.norm.weight, up_blocks.2.resnets.2.conv1.bias, up_blocks.2.attentions.2.transformer_blocks.0.norm2.weight, up_blocks.0.resnets.2.norm2.bias, up_blocks.2.attentions.0.proj_in.bias, up_blocks.0.resnets.1.conv_shortcut.bias, down_blocks.0.attentions.0.transformer_blocks.0.norm2.bias, down_blocks.1.attentions.1.transformer_blocks.0.norm1.bias, up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight, down_blocks.3.resnets.1.conv2.bias, mid_block.resnets.0.norm2.bias, down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight, up_blocks.1.attentions.0.transformer_blocks.0.norm1.bias, up_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.2.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_v.weight, up_blocks.3.resnets.0.norm1.bias, up_blocks.0.resnets.2.conv2.weight, up_blocks.2.attentions.0.proj_out.bias, up_blocks.3.attentions.1.norm.weight, mid_block.attentions.0.transformer_blocks.0.attn2.to_q.weight, up_blocks.0.resnets.0.time_emb_proj.bias, down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_out.0.bias, down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight, down_blocks.0.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.3.resnets.0.norm2.bias, up_blocks.1.attentions.1.transformer_blocks.0.attn1.to_v.weight, up_blocks.1.attentions.1.transformer_blocks.0.norm1.bias, down_blocks.2.resnets.0.norm2.weight, down_blocks.1.resnets.0.time_emb_proj.weight, up_blocks.0.resnets.0.conv1.bias, down_blocks.1.attentions.1.proj_in.bias, up_blocks.1.resnets.0.norm1.weight, up_blocks.3.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, 
up_blocks.1.attentions.1.proj_out.bias, down_blocks.0.attentions.1.transformer_blocks.0.norm3.weight, up_blocks.1.attentions.1.proj_in.weight, down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, mid_block.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.0.resnets.0.conv_shortcut.bias, up_blocks.0.resnets.1.conv1.bias, down_blocks.2.attentions.1.norm.bias, up_blocks.1.attentions.2.transformer_blocks.0.ff.net.0.proj.weight, mid_block.resnets.0.time_emb_proj.weight, down_blocks.1.attentions.1.transformer_blocks.0.norm3.bias, mid_block.attentions.0.transformer_blocks.0.attn2.to_k.weight, up_blocks.2.attentions.1.transformer_blocks.0.ff.net.0.proj.weight, down_blocks.3.resnets.1.norm2.weight, up_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, time_embedding.linear_1.bias, down_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.bias, down_blocks.3.resnets.0.conv1.weight, down_blocks.0.resnets.0.time_emb_proj.weight, up_blocks.1.attentions.0.transformer_blocks.0.norm2.bias, up_blocks.1.attentions.1.norm.weight, down_blocks.2.attentions.0.norm.weight, up_blocks.1.resnets.1.conv_shortcut.bias, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_out.0.bias, down_blocks.2.resnets.1.time_emb_proj.bias, up_blocks.0.resnets.2.time_emb_proj.bias, up_blocks.1.attentions.0.norm.bias, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_out.0.weight, up_blocks.2.resnets.2.norm2.weight, down_blocks.0.attentions.1.transformer_blocks.0.norm2.weight, up_blocks.0.resnets.0.conv_shortcut.weight, down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.3.resnets.0.conv2.weight, up_blocks.3.attentions.0.transformer_blocks.0.ff.net.0.proj.weight, up_blocks.0.resnets.2.conv_shortcut.bias, mid_block.resnets.1.norm2.weight, down_blocks.1.resnets.0.conv1.weight, up_blocks.1.resnets.0.conv_shortcut.weight, up_blocks.1.attentions.0.transformer_blocks.0.attn1.to_k.weight, 
up_blocks.2.attentions.1.transformer_blocks.0.ff.net.2.weight, mid_block.attentions.0.transformer_blocks.0.attn1.to_out.0.weight, down_blocks.1.attentions.0.norm.bias, up_blocks.2.attentions.0.transformer_blocks.0.norm1.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_k.weight, up_blocks.3.attentions.0.transformer_blocks.0.attn1.to_v.weight, down_blocks.3.resnets.0.norm1.weight, up_blocks.3.attentions.1.transformer_blocks.0.norm2.weight, up_blocks.2.resnets.0.conv_shortcut.bias, up_blocks.0.resnets.2.norm2.weight, down_blocks.0.attentions.1.transformer_blocks.0.norm3.bias, up_blocks.1.attentions.2.norm.bias, up_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.weight, up_blocks.2.attentions.2.transformer_blocks.0.ff.net.2.weight, up_blocks.2.resnets.0.conv_shortcut.weight, mid_block.attentions.0.transformer_blocks.0.attn2.to_out.0.weight, down_blocks.0.attentions.0.transformer_blocks.0.norm3.bias, mid_block.attentions.0.proj_out.weight, down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight, down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.bias, up_blocks.1.resnets.2.conv_shortcut.bias, time_embedding.linear_2.weight, down_blocks.2.resnets.1.time_emb_proj.weight, up_blocks.1.resnets.1.conv1.weight, up_blocks.2.attentions.0.proj_in.weight, down_blocks.0.attentions.0.proj_out.bias, down_blocks.0.resnets.1.time_emb_proj.weight, down_blocks.2.attentions.1.transformer_blocks.0.ff.net.0.proj.bias, up_blocks.1.attentions.2.transformer_blocks.0.ff.net.2.bias, up_blocks.3.attentions.0.transformer_blocks.0.ff.net.2.weight.
 Please make sure to pass `low_cpu_mem_usage=False` and `device_map=None` if you want to randomly initialize those weights or else make sure your checkpoint file is correct.
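The OSError in the original report simply means the weights file never landed on disk after `install.ps1`. A quick sanity check before launching the GUI can confirm this (a minimal sketch; the directory path is taken from the traceback, and the `.safetensors` filename is an assumption about the alternate format diffusers also accepts):

```python
import os

# Weight filenames diffusers looks for; the .safetensors variant is an
# assumption -- newer checkpoints often ship that format instead of .bin.
WEIGHT_NAMES = ("diffusion_pytorch_model.bin", "diffusion_pytorch_model.safetensors")

def has_model_weights(model_dir: str) -> bool:
    """Return True if model_dir exists and holds a recognized weights file."""
    if not os.path.isdir(model_dir):
        return False
    files = set(os.listdir(model_dir))
    return any(name in files for name in WEIGHT_NAMES)

if __name__ == "__main__":
    path = "pretrained_models/MagicAnimate/appearance_encoder"
    print(path, "ok" if has_model_weights(path) else "is missing its weights file")
```

If this reports a missing weights file, re-downloading the checkpoint (rather than fighting `low_cpu_mem_usage`/`device_map`) is the likely fix.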

Do you have enough memory?

leochoo (Author) commented Dec 17, 2023

@sdbds My device has 64GB RAM and 12GB VRAM, so that should be more than enough, I think?

Okay, let me try re-downloading.
(I already tried a fresh install multiple times, but that did not work.)
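For the re-download, the fetch can be scripted instead of re-running `install.ps1`. This is a sketch, not the project's own installer: the repo id `zcxu-eric/MagicAnimate` and the expected subdirectory names are assumptions based on the public MagicAnimate checkpoint layout, so adjust them to match your setup.

```python
import os

# Subdirectories the MagicAnimate checkpoint is expected to contain
# (assumed layout; verify against the Hugging Face repo before relying on it).
EXPECTED_SUBDIRS = ("appearance_encoder", "densepose_controlnet", "temporal_attention")

def missing_subdirs(root: str) -> list:
    """Return the expected MagicAnimate subdirectories absent under root."""
    return [d for d in EXPECTED_SUBDIRS if not os.path.isdir(os.path.join(root, d))]

if __name__ == "__main__":
    # snapshot_download resumes partial downloads, so an interrupted
    # install.ps1 run can be completed rather than restarted from scratch.
    from huggingface_hub import snapshot_download

    target = "pretrained_models/MagicAnimate"
    snapshot_download(repo_id="zcxu-eric/MagicAnimate", local_dir=target)
    print("still missing:", missing_subdirs(target))
```

An empty `appearance_encoder` directory, as in the original report, would show up in the `still missing` list after a failed or interrupted download.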
