Throughput improvements to the AWS SDK S3 client were released in v3.649.0 #6423

Closed
1 task done
kuhe opened this issue Aug 30, 2024 · 2 comments
Labels
announcement This is an announcement issue feature-request New feature or enhancement. May require GitHub community feedback. p1 This is a high priority issue

Comments

Contributor

kuhe commented Aug 30, 2024

Describe the feature

Investigate and improve throughput performance of the S3Client in this SDK.

Use Case

Improve throughput of parallel or high-volume operations, including S3 HeadObject.

Proposed Solution

  • improve request handler creation time
  • avoid FS / loadNodeConfig calls in the request path
  • improve endpoint resolution time
  • defer calls to socket.setKeepAlive and other socket event listener attachments

Acknowledgements

  • I may be able to implement this feature request

SDK version used

3.600.0+

Environment details (OS name and version, etc.)

Node.js

Contributor Author

kuhe commented Sep 10, 2024

The set of changes was released in https://github.com/aws/aws-sdk-js-v3/releases/tag/v3.649.0.

https://www.npmjs.com/package/@aws-sdk/client-s3/v/3.649.0

There is an additional client configuration field related to this.

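  import { S3Client } from "@aws-sdk/client-s3";

  // Opt in to caching the resolved middleware stack per Client+Command.
  const s3 = new S3Client({
    cacheMiddleware: true
  });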

cacheMiddleware caches the middleware resolver function stack per Client+Command combination.

This can be enabled for a performance boost when, for example, making 400 HeadObject requests with the same S3Client. The caveat is that any modification of the middlewareStack on the client or Command after the first request will be ignored.
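To make the caveat concrete, here is a minimal toy model of caching a resolved handler per command name. It is only a sketch: `ToyClient`, `Handler`, `Middleware` and `tag` are hypothetical names invented for illustration and do not reflect the SDK's actual internals. It only shows why a middleware added after the first request has no effect once the resolved stack is cached.

```typescript
// Toy model only: these names are hypothetical and are not the SDK's internals.
type Handler = (input: string) => string;
type Middleware = (next: Handler) => Handler;

class ToyClient {
  readonly middlewareStack: Middleware[] = [];
  private readonly cache = new Map<string, Handler>();
  private readonly cacheMiddleware: boolean;

  constructor(cacheMiddleware: boolean) {
    this.cacheMiddleware = cacheMiddleware;
  }

  // Compose the stack into one handler; this is the work the cache avoids repeating.
  private resolve(): Handler {
    const terminal: Handler = (input) => input;
    return this.middlewareStack.reduceRight<Handler>((next, mw) => mw(next), terminal);
  }

  send(commandName: string, input: string): string {
    if (!this.cacheMiddleware) return this.resolve()(input);
    let handler = this.cache.get(commandName);
    if (!handler) {
      handler = this.resolve();
      this.cache.set(commandName, handler);
    }
    return handler(input);
  }
}

// A middleware that appends a label to the request string.
const tag = (label: string): Middleware => (next) => (input) => next(`${input}+${label}`);

const cached = new ToyClient(true);
cached.middlewareStack.push(tag("a"));
const first = cached.send("HeadObject", "req");
cached.middlewareStack.push(tag("b")); // added after the first request
const second = cached.send("HeadObject", "req"); // still uses the cached handler

const uncached = new ToyClient(false);
uncached.middlewareStack.push(tag("a"));
uncached.send("HeadObject", "req");
uncached.middlewareStack.push(tag("b"));
const third = uncached.send("HeadObject", "req"); // stack is re-resolved, so "b" applies
```

In this toy, `second` does not pick up the `b` middleware while `third` does, which mirrors the trade-off described above: set up all middleware before the first request if you enable caching.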

@kuhe kuhe removed the pending-release This issue will be fixed by an approved PR that hasn't been released yet. label Sep 10, 2024
Contributor Author

kuhe commented Sep 10, 2024

With v3.649.0 and cacheMiddleware=true, the v3 SDK is expected to run S3 HeadObject at a throughput similar to the v2 SDK, e.g. with a parallel batch size of 10 and a total of 400 objects.

sdk v2 x 55.51 ops/sec ±0.57% (87 runs sampled)
sdk v3 x 55.62 ops/sec ±0.75% (87 runs sampled)
Fastest is sdk v3
- sdk v2 is 1.00x slower

Using https://github.com/bolt-juri-gavshin/aws-sdk-v3-vs-v2-perf-comparison

Note: the node-http2 requestHandler should not be used with S3, so I've excluded it from the sample.
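For readers who want to reproduce a workload of this shape (batches of 10 parallel calls over 400 objects), a rough sketch follows. It is not the code of the linked benchmark repository: `runInBatches` and `fakeHeadObject` are hypothetical names, and the stand-in task replaces the real S3 call so the sketch runs without AWS credentials.

```typescript
// Hypothetical helper: runs `task` over `items`, at most `batchSize` at a time.
async function runInBatches<T, R>(
  items: readonly T[],
  batchSize: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // Calls within a batch run in parallel; the next batch starts once this one resolves.
    results.push(...(await Promise.all(batch.map(task))));
  }
  return results;
}

// Stand-in for a HeadObject call (assumption: a real run would send a command here).
let inFlight = 0;
let maxInFlight = 0;
async function fakeHeadObject(key: string): Promise<{ key: string }> {
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  await Promise.resolve(); // yield once to simulate asynchronous I/O
  inFlight--;
  return { key };
}

const keys = Array.from({ length: 400 }, (_, i) => `object-${i}`);
const done = runInBatches(keys, 10, fakeHeadObject);
```

In a real measurement the task would instead be something like `(Key) => s3.send(new HeadObjectCommand({ Bucket, Key }))`, reusing one S3Client created with `cacheMiddleware: true`; the bucket name and keys are yours to supply.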

Disclaimer: due to the greater number of features in the JSv3 SDK and the fact that the v2 SDK sacrifices a lot of correctness (e.g. endpoint resolution) for simplicity, the v2 SDK will still be faster than the v3 SDK in many cases. However, JSv2 has entered maintenance mode and development efforts are focused on this iteration of the AWS SDK for JavaScript.

@kuhe kuhe added closing-soon This issue will automatically close in 4 days unless further comments are made. announcement This is an announcement issue and removed closing-soon This issue will automatically close in 4 days unless further comments are made. labels Sep 10, 2024
@kuhe kuhe changed the title Throughput improvements to the AWS SDK S3 client Throughput improvements to the AWS SDK S3 client were released in v3.649.0 Sep 12, 2024
@kuhe kuhe pinned this issue Sep 12, 2024
@kuhe kuhe closed this as completed Sep 17, 2024