Problem with trex_inference.py 'size mismatch for encoder.sentence_encoder.byte_combine.projection.weight' #44

Open
caroline-mrs opened this issue May 2, 2024 · 1 comment
Assignees
Labels
bug Something isn't working

Comments

@caroline-mrs

I'm unable to run trex_inference.py; it fails with the following error:

[D] Loading function traces
[D] Loading Trex model
Traceback (most recent call last):
  File "trex_inference.py", line 102, in <module>
    main(input_pairs, input_traces, model_checkpoint_dir,data_bin_dir, output_dir)
  File "trex_inference.py", line 60, in main
    trex = TrexModel.from_pretrained(f'checkpoints/similarity',
  File "/fairseq/models/trex/model.py", line 308, in from_pretrained
    x = hub_utils.from_pretrained(
  File "/fairseq/hub_utils.py", line 73, in from_pretrained
    models, args, task = checkpoint_utils.load_model_ensemble_and_task(
  File "/fairseq/checkpoint_utils.py", line 389, in load_model_ensemble_and_task
    model.load_state_dict(state["model"], strict=strict, model_cfg=cfg.model)
  File "/fairseq/models/fairseq_model.py", line 125, in load_state_dict
    return super().load_state_dict(new_state_dict, strict)
  File "/opt/conda/envs/trex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2189, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for TrexModel:
        Unexpected key(s) in state_dict: "encoder.sentence_encoder.embed_bytes.weight". 
        size mismatch for encoder.sentence_encoder.byte_combine.convolutions.0.weight: copying a param with shape torch.Size([64, 768, 1]) from checkpoint, the shape in current model is torch.Size([4, 1, 1]).
        size mismatch for encoder.sentence_encoder.byte_combine.convolutions.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([4]).
        size mismatch for encoder.sentence_encoder.byte_combine.convolutions.1.weight: copying a param with shape torch.Size([128, 768, 2]) from checkpoint, the shape in current model is torch.Size([8, 1, 2]).
        size mismatch for encoder.sentence_encoder.byte_combine.convolutions.1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([8]).
        size mismatch for encoder.sentence_encoder.byte_combine.convolutions.2.weight: copying a param with shape torch.Size([192, 768, 3]) from checkpoint, the shape in current model is torch.Size([12, 1, 3]).
        size mismatch for encoder.sentence_encoder.byte_combine.convolutions.2.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([12]).
        size mismatch for encoder.sentence_encoder.byte_combine.highway.layers.0.weight: copying a param with shape torch.Size([768, 384]) from checkpoint, the shape in current model is torch.Size([48, 24]).
        size mismatch for encoder.sentence_encoder.byte_combine.highway.layers.0.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([48]).
        size mismatch for encoder.sentence_encoder.byte_combine.highway.layers.1.weight: copying a param with shape torch.Size([768, 384]) from checkpoint, the shape in current model is torch.Size([48, 24]).
        size mismatch for encoder.sentence_encoder.byte_combine.highway.layers.1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([48]).
        size mismatch for encoder.sentence_encoder.byte_combine.projection.weight: copying a param with shape torch.Size([768, 384]) from checkpoint, the shape in current model is torch.Size([768, 24]).

Could someone please help me resolve this issue? I would really appreciate it.

@jimmy-sonny jimmy-sonny self-assigned this May 3, 2024
@jimmy-sonny jimmy-sonny added the bug Something isn't working label May 3, 2024
@jimmy-sonny
Contributor

Hi @caroline-mrs, thank you for opening this issue.

I cannot reproduce the error you are facing. Can you please try again with the updated Trex Dockerfile (related to Issue #56)?

Are you using the docker container as described in the Trex README?

Can you please provide the detailed steps and instructions that you are following?
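
While collecting those details, it may help to summarize which parameters differ. The snippet below is a minimal sketch, not part of the Trex repository: `summarize_mismatches` is a hypothetical helper that parses the `size mismatch` lines of a PyTorch `load_state_dict` error and lists the checkpoint shape next to the shape in the freshly built model. It assumes the message format shown in the traceback above and uses only the Python standard library.

```python
import re

# One "size mismatch" line from a PyTorch load_state_dict error message.
# Assumption: the wording matches the traceback posted in this issue.
MISMATCH = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]*)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]*)\]\)"
)


def parse_shape(text):
    """Turn '768, 384' into (768, 384); an empty string becomes ()."""
    return tuple(int(part) for part in text.split(",") if part.strip())


def summarize_mismatches(log_text):
    """Return (parameter name, checkpoint shape, model shape) for each mismatch."""
    return [
        (m["name"], parse_shape(m["ckpt"]), parse_shape(m["model"]))
        for m in MISMATCH.finditer(log_text)
    ]


if __name__ == "__main__":
    # Two lines copied from the error reported above.
    sample = (
        "size mismatch for encoder.sentence_encoder.byte_combine.highway.layers.0.bias: "
        "copying a param with shape torch.Size([768]) from checkpoint, "
        "the shape in current model is torch.Size([48]).\n"
        "size mismatch for encoder.sentence_encoder.byte_combine.projection.weight: "
        "copying a param with shape torch.Size([768, 384]) from checkpoint, "
        "the shape in current model is torch.Size([768, 24]).\n"
    )
    for name, ckpt, model in summarize_mismatches(sample):
        print(f"{name}: checkpoint={ckpt} model={model}")
```

If the mismatched layers all differ in a consistent way, that would suggest the model is being constructed with different architecture settings than the checkpoint was trained with. This is only a hint for narrowing things down, not a diagnosis.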
