Enabling ChecksumMode when calling getObject increases the response time #6497

Open
2 tasks
trivikr opened this issue Sep 19, 2024 · 0 comments
Labels
feature-request New feature or enhancement. May require GitHub community feedback. p2 This is a standard priority issue queued This issues is on the AWS team's backlog


trivikr commented Sep 19, 2024

Describe the feature

Enabling ChecksumMode when calling getObject increases the response time.

This happens because the response stream is fully consumed to validate the checksum during the API call.
This was done in #5043 to ensure that a checksum validation error is thrown during the API call. However, since we return a stream, checksum validation should be deferred until the stream is consumed by the caller.

Use Case

Test case

import { S3 } from "@aws-sdk/client-s3"; // v3.654.0
import { equal } from "assert";

const client = new S3();
const Bucket = "test-checksum-mode"; // Replace with your test bucket name.
const Key = "hello-world.txt";
const Body = "Hello World\n".repeat(100_000); // File of size ~1 MB.
const ChecksumAlgorithm = "CRC32";

// The putObject call can be commented out on subsequent runs when benchmarking getObject.
await client.putObject({ Bucket, Key, Body, ChecksumAlgorithm });

console.time("getObject");
const response = await client.getObject({
  Bucket,
  Key,
  // ChecksumMode: "ENABLED",
});
console.timeEnd("getObject");

equal(Body, await response.Body.transformToString());

API call times compared by file size/repetitions

| File size | Repeat count for putObject | getObject without ChecksumMode | getObject with ChecksumMode |
| --- | --- | --- | --- |
| ~1 MB | 100_000 | 27 ms | 85 ms |
| ~10 MB | 1_000_000 | 28 ms | 450 ms |
| ~100 MB | 10_000_000 | 39 ms | 3676 ms |

The absolute numbers will differ depending on your network speed, but the gap between the two modes will remain.

Proposed Solution

Write a checksum stream wrapper class that consumes the stream for validation only when the end user consumes it.
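A minimal sketch of what such a wrapper could look like in Node.js, to make the idea concrete. This is not the SDK's implementation: the class name `ChecksumValidatingStream` and the helper `updateCrc32` are hypothetical, and the sketch assumes the expected value is the base64 encoding of the big-endian 32-bit CRC32, as S3 reports it for the CRC32 algorithm. The CRC is updated as chunks pass through, and a mismatch only surfaces as a stream error once the consumer has read to the end.

```javascript
const { Transform } = require("node:stream");

// Lookup table for the standard CRC-32 polynomial (reflected form 0xEDB88320).
const CRC32_TABLE = new Uint32Array(256).map((_, n) => {
  let c = n;
  for (let k = 0; k < 8; k++) c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  return c >>> 0;
});

// Hypothetical helper: feeds one chunk into a running CRC-32 state.
// The state must start at 0xFFFFFFFF and is inverted once at the end.
function updateCrc32(state, chunk) {
  for (const byte of chunk) {
    state = CRC32_TABLE[(state ^ byte) & 0xff] ^ (state >>> 8);
  }
  return state >>> 0;
}

// Hypothetical wrapper: passes data through untouched and validates the
// checksum lazily, only after the consumer has pulled the last chunk.
class ChecksumValidatingStream extends Transform {
  constructor(expectedChecksum) {
    super();
    this.expectedChecksum = expectedChecksum; // base64 string (assumed format)
    this.state = 0xffffffff;
  }

  _transform(chunk, _encoding, callback) {
    this.state = updateCrc32(this.state, chunk);
    callback(null, chunk); // forward the chunk unchanged
  }

  _flush(callback) {
    const digest = Buffer.alloc(4);
    digest.writeUInt32BE((this.state ^ 0xffffffff) >>> 0);
    const actual = digest.toString("base64");
    callback(
      actual === this.expectedChecksum
        ? null
        : new Error(
            `Checksum mismatch: expected ${this.expectedChecksum}, got ${actual}`
          )
    );
  }
}
```

With a wrapper along these lines, the getObject call could return as soon as the response headers arrive, and a validation failure would be raised from whatever the caller uses to read the body (for example `transformToString()` in the test case above). That shift in where the error is thrown is also why this change might be breaking.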

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

SDK version used

3.654.0

Environment details (OS name and version, etc.)

v20.10.0
