Foldmason not running on large set of proteins #11

Open
rrw1007 opened this issue Sep 4, 2024 · 11 comments


@rrw1007

rrw1007 commented Sep 4, 2024

Expected Behavior

I am running foldmason with the command below:
easy-msa /workspace/protein/structs /workspace/results_foldmason/protein/result /workspace/results_foldmason/protein/tmpFolder --report-mode 1 --precluster --max-seq-len 4000

I have about 2000 proteins, each approximately 280 amino acids long.

Current Behavior

I am getting memory errors.

Steps to Reproduce (for bugs)

Just run easy-msa on a large set of sequences.

Foldseek Output (for bugs)

I get the output below (last few lines):

Size of the sequence database: 3588
Size of the alignment database: 3588
Number of clusters: 1487

Writing results 0h 0m 0s 0ms
Time for merging to clu: 0h 0m 0s 428ms
Time for processing: 0h 0m 36s 725ms
Error: structuremsa died
Segmentation fault (core dumped)

Context

The --max-seq-len parameter doesn't seem to make a difference. I'm still getting the memory error.

Your Environment

I've been running foldmason via a Docker image built from the Dockerfile in the repository. I am running on a Kubernetes cluster with 64 GB of RAM and 6 CPUs allocated.

@rrw1007 rrw1007 changed the title from "Foldmason not running on set of 2076 proteins" to "Foldmason not running on large set of proteins" on Sep 4, 2024
@gamcil
Collaborator

gamcil commented Sep 5, 2024

Do you also get the same behaviour with pre-clustering disabled (--precluster 0)?

@rrw1007
Author

rrw1007 commented Sep 5, 2024 via email

@milot-mirdita
Member

How did you build the container? I just realized that we have not been automatically building containers.

@milot-mirdita
Member

@gamcil committed a fix earlier today. Could you check if the issue is still happening for you? You can download precompiled binaries at https://mmseqs.com/foldmason.

@rrw1007
Author

rrw1007 commented Sep 5, 2024

> How did you build the container? I just realized that we have not been automatically building containers.

I used the Dockerfile provided in the repository.

@rrw1007
Author

rrw1007 commented Sep 5, 2024

> @gamcil committed a fix earlier today. Could you check if the issue is still happening for you? You can download precompiled binaries at https://mmseqs.com/foldmason.

Running this now. It seems to have gone further than before. I removed --report-mode 1. Could that be causing the issue?

@rrw1007
Author

rrw1007 commented Sep 5, 2024

> @gamcil committed a fix earlier today. Could you check if the issue is still happening for you? You can download precompiled binaries at https://mmseqs.com/foldmason.

Running this now. It seems to have gone further than before. I removed --report-mode 1. Could that be causing the issue?

I can confirm that running the precompiled binaries with the command
./foldmason easy-msa /workspace/protein/structs /workspace/results_foldmason/protein/result /workspace/results_foldmason/protein/tmpFolder --precluster
completed without any errors. So the issue might be caused by including the --report-mode 1 parameter.

@gamcil
Collaborator

gamcil commented Sep 6, 2024

Were you getting segfaults also with --report-mode 1?

@rrw1007
Author

rrw1007 commented Sep 6, 2024

> Were you getting segfaults also with --report-mode 1?

It seems like the problem is --report-mode 1. If I remove that and run the command, I don't get segfaults. If I include it, I get segfaults.
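
A minimal sketch of how the two variants described above could be compared side by side. The helper names (`classify_exit`, `run_variant`, `compare_report_mode`) and the output directory layout are hypothetical; only `easy-msa`, `--precluster`, and `--report-mode 1` come from this thread. It relies on the shell convention that a process killed by signal N exits with status 128+N, so a segmentation fault (SIGSEGV, signal 11) would typically show up as 139. If the top-level process handles the child failure and exits on its own, a plain nonzero code would be reported instead.

```shell
#!/usr/bin/env bash

# Translate an exit status into a short description.
# Statuses above 128 conventionally mean "killed by signal (status - 128)".
classify_exit() {
    local code=$1
    if [ "$code" -eq 0 ]; then
        echo "ok"
    elif [ "$code" -gt 128 ]; then
        echo "killed by signal $((code - 128))"
    else
        echo "failed with exit code $code"
    fi
}

# Run a command, capture its output in a log file, and print one summary line.
# Usage: run_variant LABEL LOGFILE COMMAND [ARGS...]
run_variant() {
    local label=$1 logfile=$2
    shift 2
    local code=0
    "$@" >"$logfile" 2>&1 || code=$?
    echo "${label}: $(classify_exit "$code")"
}

# Run easy-msa twice on the same input: once without and once with --report-mode 1.
# Each run gets its own result prefix and tmp folder (hypothetical layout).
compare_report_mode() {
    local bin=$1 structs=$2 out=$3
    mkdir -p "$out"
    run_variant "without --report-mode" "$out/plain.log" \
        "$bin" easy-msa "$structs" "$out/plain_result" "$out/plain_tmp" --precluster
    run_variant "with --report-mode 1" "$out/report.log" \
        "$bin" easy-msa "$structs" "$out/report_result" "$out/report_tmp" --precluster --report-mode 1
}
```

After sourcing the script, it could be invoked as `compare_report_mode ./foldmason /workspace/protein/structs /workspace/results_foldmason/protein/compare`, where the last path is an assumed scratch location. The two log files can then be attached to the issue.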

@milot-mirdita
Member

Would it be possible to share the inputs so that we can try to reproduce the new issue?

@rrw1007
Author

rrw1007 commented Sep 6, 2024

> Would it be possible to share the inputs so that we can try to reproduce the new issue?

I tried it again with --report-mode 1 and it seems to be working. Thank you for the assistance. I'll reach out in case I face any problems.
