Replace EFS with S3 #577

Draft
wants to merge 24 commits into
base: master

Conversation

munishchouhan
Member

This PR will replace usage of the file system with S3.

@munishchouhan munishchouhan linked an issue Jul 24, 2024 that may be closed by this pull request
@munishchouhan munishchouhan marked this pull request as draft July 24, 2024 23:03
@munishchouhan munishchouhan added the WIP Work In Progress label Jul 24, 2024
@munishchouhan munishchouhan self-assigned this Jul 29, 2024
@munishchouhan
Member Author

I am getting this error while running the new buildkit image on k8s:

12:08PM WRN Error running in a new user namespace - fork/exec /usr/bin/fusion: invalid argument

12:08PM WRN cannot apply Nextflow profile :: unknown remote store for '' work prefix
/usr/bin/fusermount3: fuse device not found, try 'modprobe fuse' first
12:08PM FTL mount.go:251 > mounting filesystem error="fusermount exited with code 256\n"

@pditommaso
Contributor

@fntlnz is the guru

@fntlnz

fntlnz commented Aug 23, 2024

@munishchouhan something is off accessing the /dev/fuse device. How are you running this?

@fntlnz

fntlnz commented Aug 23, 2024

Found it: you need to add

--device /dev/fuse
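
As a rough sketch of how to confirm the flag took effect, the check below can be run inside the container. The helper name `has_char_device` is illustrative only, and the `<image>` and `<command>` placeholders in the final comment are not taken from this PR.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given path exists and is a
# character device (the FUSE device node is one).
has_char_device() {
    [ -c "$1" ]
}

# Inside the container:
if has_char_device /dev/fuse; then
    echo "/dev/fuse is available"
else
    echo "/dev/fuse is missing; was the container started with --device /dev/fuse?"
fi

# On the host, the flag goes on the run command, for example:
#   docker run --rm --device /dev/fuse <image> <command>
```

If the device is missing, the "fuse device not found" message from fusermount3 shown earlier is the expected symptom.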

@fntlnz

fntlnz commented Aug 23, 2024

If this needs to be done in Kubernetes there are two possible solutions:

@pditommaso
Contributor

Better to use the fuse plugin

@munishchouhan
Copy link
Member Author

I have started the DaemonSet:

(base) munish.chouhan@Munishs-MacBook-Pro ~ % kubectl get DaemonSets -n kube-system
NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
fuse-device-plugin-daemonset   1         1         1       1            1           <none>          18m

And added the limit as per https://github.com/nextflow-io/k8s-fuse-plugin/blob/master/README.md#usage
here:

requests.limits(Map.of("nextflow.io/fuse", new Quantity("1")))

But I am still getting the same error:

2:12PM WRN Error running in a new user namespace - fork/exec /usr/bin/fusion: invalid argument

2:12PM WRN cannot apply Nextflow profile :: unknown remote store for '' work prefix
/usr/bin/fusermount3: mount failed: Operation not permitted
2:12PM FTL mount.go:251 > mounting filesystem error="fusermount exited with code 256\n"

cc @jordeu

@jordeu
Member

jordeu commented Aug 27, 2024

From the invalid argument message I suspect that it's more related to an invalid Fusion command invocation. What is the full command line that is executed?

@jordeu
Member

jordeu commented Aug 27, 2024

or, can you share an example image that will be executed?

@munishchouhan
Member Author

or, can you share an example image that will be executed?

I am testing with this wave-cli example:
wave --conda-package bwa --wave-endpoint http://localhost:9090

@munishchouhan
Member Author

From the invalid argument message I suspect that it's more related to an invalid Fusion command invocation. What is the full command line that is executed?

The same command works with Docker, but I will debug further to check this.

@jordeu
Member

jordeu commented Aug 27, 2024

From the invalid argument message I suspect that it's more related to an invalid Fusion command invocation. What is the full command line that is executed?

The same command works with Docker, but I will debug further to check this.

But with docker, there is no need to clone the namespace, so Fusion behaves differently. If you can get the full command line that is executed I can check if there is something strange.

@munishchouhan
Member Author

From the invalid argument message I suspect that it's more related to an invalid Fusion command invocation. What is the full command line that is executed?

The same command works with Docker, but I will debug further to check this.

But with docker, there is no need to clone the namespace, so Fusion behaves differently. If you can get the full command line that is executed I can check if there is something strange.

Do you want the command line for Docker or k8s?

@jordeu
Member

jordeu commented Aug 27, 2024

Do you want the command line for Docker or k8s?

It's better if it's the k8s, but they should be the same, so if it's easier for you, send me the docker one.

@jordeu
Member

jordeu commented Aug 27, 2024

Also, the environment variables are important. Send me the docker inspect and we can assume that it's the same on k8s.
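
One possible way to collect that, sketched under the assumption that the docker CLI is available on the host: print the container's command and environment, masking credential-like values before sharing. The helper names `redact_env` and `show_container_invocation` are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: mask the values of credential-like variables in
# KEY=VALUE lines read from stdin (names containing SECRET, TOKEN,
# PASSWORD, or ACCESS_KEY). Other lines pass through unchanged.
redact_env() {
    sed -E 's/^([A-Za-z0-9_]*(SECRET|TOKEN|PASSWORD|ACCESS_KEY)[A-Za-z0-9_]*)=.*/\1=<redacted>/'
}

# Print the command and the redacted environment of a container.
# Requires the docker CLI; the container name or ID is passed as $1.
show_container_invocation() {
    docker inspect --format '{{json .Config.Cmd}}' "$1"
    docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$1" | redact_env
}

# Usage:
#   show_container_invocation <container-id>
```

Redacting matters here because the environment includes `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.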

@munishchouhan
Member Author

It's better if it's the k8s, but they should be the same, so if it's easier for you, send me the docker one.

Here is the Docker command; I will work on getting you the k8s info.

docker run --rm --privileged \
-e AWS_ACCESS_KEY_ID=<AWS_ACCESS_KEY_ID> \
-e AWS_SECRET_ACCESS_KEY=<AWS_SECRET_ACCESS_KEY> \
-e DOCKER_CONFIG=/fusion/s3/s3-bucket/workspace/9a3c69098bd07c17_1 \
--platform linux/amd64 cr.seqera.io/public/wave/buildkit:ef67f15426f36b72 \
buildctl-daemonless.sh build \
--frontend dockerfile.v0 \
--local dockerfile=/fusion/s3/s3-bucket/workspace/9a3c69098bd07c17_1 \
--opt filename=Containerfile \
--local context=/fusion/s3/s3-bucket/workspace/9a3c69098bd07c17_1/context \
--output type=image,name=docker.io/hrma017/dev:bwa--9a3c69098bd07c17,push=true,oci-mediatypes=true \
--opt platform=linux/amd64 \
--export-cache type=registry,image-manifest=true,ref=docker.io/hrma017/cache:9a3c69098bd07c17,mode=max,ignore-error=true,oci-mediatypes=true,compression=gzip,force-compression=false \
--import-cache type=registry,ref=docker.io/hrma017/cache:9a3c69098bd07c17
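
Note that this command runs with `--privileged`, which grants the container all capabilities and access to host devices. Mounting a filesystem generally requires `CAP_SYS_ADMIN`, so if the k8s pod is not privileged, that difference could be related to the "Operation not permitted" message from fusermount3. This is an assumption, not a confirmed diagnosis. A minimal sketch for checking the capability inside either container follows; the helper name `has_cap_sys_admin` is illustrative.

```shell
#!/bin/sh
# CAP_SYS_ADMIN is capability number 21 on Linux.
CAP_SYS_ADMIN_BIT=21

# Hypothetical helper: succeeds when the hex capability mask in $1
# (as shown on the CapEff line of /proc/<pid>/status) has CAP_SYS_ADMIN set.
has_cap_sys_admin() {
    mask=$(( 0x$1 ))
    [ $(( (mask >> CAP_SYS_ADMIN_BIT) & 1 )) -eq 1 ]
}

# Inside the container (Linux only): read the effective capability mask
# of the current process and report whether CAP_SYS_ADMIN is present.
if [ -r /proc/self/status ]; then
    cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
    if [ -n "$cap_eff" ]; then
        if has_cap_sys_admin "$cap_eff"; then
            echo "CAP_SYS_ADMIN present (CapEff=$cap_eff)"
        else
            echo "CAP_SYS_ADMIN missing (CapEff=$cap_eff)"
        fi
    fi
fi
```

Running the same check in the Docker container and in the k8s pod would show whether the two environments really are equivalent in this respect.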

@munishchouhan
Member Author

Also, the environment variables are important. Send me the docker inspect and we can assume that it's the same on k8s.

Here is the inspect output as a file:
docker_inspect.txt

@jordeu
Member

jordeu commented Aug 27, 2024

What is this inspect file? In the inspect output I see a different command from the one in the docker command line:

"Cmd": [
    "trivy",
    "--quiet",
    "image",
    "--timeout",
    "10m",
    "--format",
    "json",
    "--output",
    "/fusion/s3/s3-bucket/workspace/scan-9f342c61b284/report.json",
    "docker.io/hrma017/dev:bwa--9a3c69098bd07c17"
],

@munishchouhan
Member Author

@jordeu apologies, I shared the scan inspect; here is the one for the build:
docker_build_inspect.json

@munishchouhan
Member Author

hi @jordeu
Any advice or pointers to move forward with k8s-fuse-plugin?

@munishchouhan munishchouhan added blocker and removed WIP Work In Progress labels Sep 18, 2024
Development

Successfully merging this pull request may close these issues.

Allow the use of S3 bucket to host container build assets
4 participants