Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "unsafe" AsyncRequestBody constructors for better performance when concurrent modification isn't a worry #3925

Conversation

StephenFlavin
Copy link
Contributor

@StephenFlavin StephenFlavin commented Apr 20, 2023

The static methods in AsyncRequestBody do some defensive copying and transforming of types that hamper performance and utilise more memory than needed if the user has already ensured either that concurrent modification won't happen or that it doesn't matter, this pr addresses that with new AsyncRequestBody constructors.

This also fixes an unnecessary copy being made in AsyncRequestBody#fromByteBuffer (#1735) and AsyncRequestBody#fromString as a side effect.

Motivation and Context

Memory consumption and performance.

Related Issues:

Modifications

Adds AsyncRequestBody#fromBytesUnsafe(byte[]), AsyncRequestBody#fromByteBufferUnsafe(ByteBuffer) and AsyncRequestBody#fromByteBuffersUnsafe(ByteBuffer...) as no copy alternatives to the existing methods.

Adds AsyncRequestBody#fromByteBuffers(ByteBuffer...), AsyncRequestBody#fromRemainingByteBuffer, AsyncRequestBody#fromRemainingByteBufferUnsafe, AsyncRequestBody#fromRemainingByteBuffers and AsyncRequestBody#fromRemainingByteBuffersUnsafe methods as implementations that respect the current buffer positions.

Updates AsyncRequestBodyTest to junit 5.

Removes ByteArrayAsyncRequestBody

Adds ByteBuffersAsyncRequestBody with static builders for ByteBuffer, ByteBuffer[] and byte[].

  • calls to ByteBuffersAsyncRequestBody#subscribe are repeatable (not sure if this is a requirement but would be a breaking change vs ByteArrayAsyncRequestBody)
  • ByteBuffersAsyncRequestBody#subscribe creates a thread safe subscription.
  • ByteBuffer[] allows a single file > Integer.MAX_VALUE to be transferred.
  • A benefit of the implementation is ByteBuffersAsyncRequestBody doesn't copy an off heap ByteBuffer until it's requested by the subscription so depending on the subscriber less data will be on heap at a time.

Adds BinaryUtils#immutableCopyOf, BinaryUtils#immutableCopyOfRemaining and BinaryUtils#toNonDirectBuffer convenience methods.

Testing

Unit tests

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document
  • Local run of mvn install succeeds Unstable tests on master #3926
  • My code follows the code style of this project
  • My change requires a change to the Javadoc documentation
  • I have updated the Javadoc documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.
  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • I confirm that this pull request can be released under the Apache 2 license

@StephenFlavin StephenFlavin force-pushed the skip-redundant-copies-and-transforms-in-asyncrequestbody branch from 1f41881 to 697eb81 Compare April 20, 2023 14:54
@StephenFlavin StephenFlavin force-pushed the skip-redundant-copies-and-transforms-in-asyncrequestbody branch 2 times, most recently from 0f9c7f4 to f8862e2 Compare April 21, 2023 13:09
@StephenFlavin StephenFlavin marked this pull request as ready for review April 21, 2023 13:10
@StephenFlavin StephenFlavin requested a review from a team as a code owner April 21, 2023 13:10
@StephenFlavin StephenFlavin force-pushed the skip-redundant-copies-and-transforms-in-asyncrequestbody branch 2 times, most recently from e1164ae to 29281b9 Compare April 21, 2023 13:16
@debora-ito debora-ito added the needs-review This issue or PR needs review from the team. label Apr 25, 2023
@StephenFlavin StephenFlavin force-pushed the skip-redundant-copies-and-transforms-in-asyncrequestbody branch 4 times, most recently from cfb4737 to 65f8c7c Compare April 28, 2023 08:51
@StephenFlavin StephenFlavin force-pushed the skip-redundant-copies-and-transforms-in-asyncrequestbody branch from 65f8c7c to 71d4ce4 Compare May 17, 2023 12:07
@StephenFlavin
Copy link
Contributor Author

StephenFlavin commented May 17, 2023

@millems I've updated the PR with your suggestions and some additions (mainly new util methods in BinaryUtils)
PR title and description updated

@StephenFlavin StephenFlavin changed the title Skip defensive copies and transforms in AsyncRequestBody Add "unsafe" AsyncRequestBody constructors for better performance when concurrent modification isn't a worry May 17, 2023
@StephenFlavin StephenFlavin force-pushed the skip-redundant-copies-and-transforms-in-asyncrequestbody branch 5 times, most recently from 7b0cc5c to 4f64fd9 Compare May 17, 2023 12:39
@StephenFlavin StephenFlavin force-pushed the skip-redundant-copies-and-transforms-in-asyncrequestbody branch 3 times, most recently from 97ec660 to 21b3c4b Compare May 18, 2023 09:21
@StephenFlavin StephenFlavin force-pushed the skip-redundant-copies-and-transforms-in-asyncrequestbody branch 3 times, most recently from 1fd1420 to beb7833 Compare May 22, 2023 18:46
@millems
Copy link
Contributor

millems commented May 23, 2023

@StephenFlavin Awesome, let me kick off the suites again!

@StephenFlavin
Copy link
Contributor Author

It looks like some of the builds have failed, is there a way for me to see the output @millems?
I'm assuming this is because of the flakey tests I mentioned

@millems
Copy link
Contributor

millems commented May 25, 2023

@StephenFlavin Sorry, I was off yesterday. I wish there was a way for you to see the output. We're still iterating on the security of our testing solution, trying to figure out how to expose the logs.

It looks like test failures in :protocol-tests, and :codegen-generated-classes-test this time. Those modules do functional tests of the SDK clients, so they sometimes catch issues that appear in integration of components. Are you able to reproduce those errors yourself by building the modules? Otherwise I can pull the specific test failures.

@StephenFlavin
Copy link
Contributor Author

👋 no worries, I've not been able to get enough of the build to finish to be able to run these specific tests, I'll keep plugging away at it when I have time, I have a feeling some of my local setup with python and the fact that I'm running on an m1 mac is making things a bit more painful than it needs to be.

@StephenFlavin
Copy link
Contributor Author

there does appear to just be some unstable tests on master e.g. c015a19

@StephenFlavin
Copy link
Contributor Author

StephenFlavin commented May 30, 2023

I've been able to get them running, :protocol-tests the only failing tests I have are integration tests because I don't have IAM credentials I can use re: https://github.com/aws/aws-sdk-java-v2/blob/master/docs/GettingStarted.md#integration-tests

The :codegen-generated-classes-test tests do seem to have genuine failures in AsyncStreamingCoreMetricsTest, I'll have a closer look at this one.

Request content was only 0 bytes, but the specified content-length was 10 bytes.

@StephenFlavin
Copy link
Contributor Author

I've not had a chance to spend time on this, will try to get it done this week

@StephenFlavin StephenFlavin force-pushed the skip-redundant-copies-and-transforms-in-asyncrequestbody branch 2 times, most recently from 2177bf9 to f6bc363 Compare June 9, 2023 11:25
@StephenFlavin
Copy link
Contributor Author

StephenFlavin commented Jun 9, 2023

@millems test failures fixed.

The issues was ByteBuffersAsyncRequestBody didn't support multiple subscribers properly as it was just passing along the same ByteBuffer which it was holding a reference to, so if one subscriber read the buffer the other subscriber would attempt to read the same buffer but it had already been consumed, the Subscriber#onNext call now does buffer.asReadOnlyBuffer() and I've added a test to verify this requirement in future.

This is a change in comparison to how ByteArrayAsyncRequestBody worked as it would have been passing along mutable copies but I don't think this is an issue.

To support this I had to fix an issue in ChunkBuffer.java where it was expecting ByteBuffer#array to work which is an optional operation for ByteBuffer implementations.

@StephenFlavin StephenFlavin force-pushed the skip-redundant-copies-and-transforms-in-asyncrequestbody branch from f6bc363 to a2a3067 Compare June 9, 2023 14:19
@millems
Copy link
Contributor

millems commented Jun 9, 2023

Re-running tests.

@sonarqubecloud
Copy link

sonarqubecloud bot commented Jun 9, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 26 Code Smells

92.5% 92.5% Coverage
0.0% 0.0% Duplication

@millems millems merged commit be45eb0 into aws:master Jun 9, 2023
@millems
Copy link
Contributor

millems commented Jun 9, 2023

Thank you for your contribution! It has been merged and will go out with the next regular release.

wendigo added a commit to wendigo/trino that referenced this pull request Jun 21, 2023
davidh44 added a commit that referenced this pull request Jun 21, 2023
* Fixed issue with leased connection leaks when threads executing HTTP … (#4066)

* Fixed issue with leased connection leaks when threads executing HTTP connections with Apache HttpClient were interrupted while the connection was in progress.

* Added logic in MakeHttpRequestStage to check and abort request if interrupted

* Add test cases for UrlConnectionHttpClient

* Moved the fix to AfterTransmissionExecutionInterceptorsStage to just close the stream instaed of aborting the reqyest in MakeHttpRequestStage

* Removing test cases related to UrlConnectionHttp since adding depenency in protocol-test for urlConnectionClient cause failues since it uses default Client all the places

* Updated after Zoe's comments

* Now it's possible to configure NettyNioAsyncHttpClient for non blocking DNS (#3990)

* Now it's possible to configure NettyNioAsyncHttpClient in order to use a
non blocking DNS resolver.

* Add package mapping for netty-resolver-dns.

---------

Co-authored-by: Matthew Miller <millem@amazon.com>

* Amazon Connect Service Update: This release adds search APIs for Prompts, Quick Connects and Hours of Operations, which can be used to search for those resources within a Connect Instance.

* AWS Certificate Manager Private Certificate Authority Update: Document-only update to refresh CLI documentation for AWS Private CA. No change to the service.

* Release 2.20.83. Updated CHANGELOG.md, README.md and all pom.xml.

* Add "unsafe" AsyncRequestBody constructors for byte[] and ByteBuffers (#3925)

* Update to next snapshot version: 2.20.84-SNAPSHOT

* Use WeakHashMap in IdleConenctionReaper  (#4087)

* Use WeakHashMap in IdleConenctionReaper to not prevent connection manager from getting GC'd

* Checkstyle fix

* Update S3IntegrationTestBase.java (#4079)

* Amazon Rekognition Update: This release adds support for improved accuracy with user vector in Amazon Rekognition Face Search. Adds new APIs: AssociateFaces, CreateUser, DeleteUser, DisassociateFaces, ListUsers, SearchUsers, SearchUsersByImage. Also adds new face metadata that can be stored: user vector.

* Amazon DynamoDB Update: Documentation updates for DynamoDB

* Amazon FSx Update: Amazon FSx for NetApp ONTAP now supports joining a storage virtual machine (SVM) to Active Directory after the SVM has been created.

* Amazon SageMaker Service Update: Sagemaker Neo now supports compilation for inferentia2 (ML_INF2) and Trainium1 (ML_TRN1) as available targets. With these devices, you can run your workloads at highest performance with lowest cost. inferentia2 (ML_INF2) is available in CMH and Trainium1 (ML_TRN1) is available in IAD currently

* AWS Amplify UI Builder Update: AWS Amplify UIBuilder is launching Codegen UI, a new feature that enables you to generate your amplify uibuilder components and forms.

* Amazon OpenSearch Service Update: This release adds support for SkipUnavailable connection property for cross cluster search

* Amazon DynamoDB Streams Update: Documentation updates for DynamoDB Streams

* Updated endpoints.json and partitions.json.

* Release 2.20.84. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.85-SNAPSHOT

* docs: add scrocquesel as a contributor for code (#4091)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

---------

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>
Co-authored-by: Debora N. Ito <476307+debora-ito@users.noreply.github.com>

* AWS CloudTrail Update: This feature allows users to view dashboards for CloudTrail Lake event data stores.

* AWS WAFV2 Update: You can now detect and block fraudulent account creation attempts with the new AWS WAF Fraud Control account creation fraud prevention (ACFP) managed rule group AWSManagedRulesACFPRuleSet.

* AWS Well-Architected Tool Update: AWS Well-Architected now supports Profiles that help customers prioritize which questions to focus on first by providing a list of prioritized questions that are better aligned with their business goals and outcomes.

* Amazon Lightsail Update: This release adds pagination for the Get Certificates API operation.

* Amazon Verified Permissions Update: GA release of Amazon Verified Permissions.

* EC2 Image Builder Update: Change the Image Builder ImagePipeline dateNextRun field to more accurately describe the data.

* Amazon CodeGuru Security Update: Initial release of Amazon CodeGuru Security APIs

* Amazon Simple Storage Service Update: Integrate double encryption feature to SDKs.

* Elastic Disaster Recovery Service Update: Added APIs to support network replication and recovery using AWS Elastic Disaster Recovery.

* AWS SimSpace Weaver Update: This release fixes using aws-us-gov ARNs in API calls and adds documentation for snapshot APIs.

* AWS SecurityHub Update: Add support for Security Hub Automation Rules

* Amazon Elastic Compute Cloud Update: This release introduces a new feature, EC2 Instance Connect Endpoint, that enables you to connect to a resource over TCP, without requiring the resource to have a public IPv4 address.

* Updated endpoints.json and partitions.json.

* Release 2.20.85. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.86-SNAPSHOT

* Create secondary indices based on table bean annotations (#3923) (#4004)

* Create secondary indices based on table bean annotations (#3923)

* detect and group indices present in table schema into LSIs and GSIs
* pass request with indices information appended further

* Remove specifying provisioned throughput for GSIs (#3923)

* If there's no information about the billing mode of the new table,
  then it'll be using the PAY_PER_REQUEST one. It means that all
  GSIs related to this table will be doing the same and there's
  no need to hard code any provisioned throughput like it was done

* Allow passing empty indices list to CreateTableOperation (#3923)

* CreateTableRequest cannot handle empty list of indices of any type. It
  throws exception when given such a list. At the same time, it nicely
  handles the cases when indices lists are null. Make sure then that
  when empty indices list is passed CreateTableOperation, then in the
  CreateTableRequest it's just reflected as null.

---------

Co-authored-by: Adrian Chlebosz <Adrian.Chlebosz@stepstone.com>
Co-authored-by: Olivier L Applin <olapplin@amazon.com>

* Add EnhancedType parameters to static builder methods of StaticTableSchema and StaticImmitableTableSchema (#4077)

* Amazon Elastic File System Update: Documentation updates for EFS.

* Amazon GuardDuty Update: Updated descriptions for some APIs.

* Amazon Location Service Update: Amazon Location Service adds categories to places, including filtering on those categories in searches. Also, you can now add metadata properties to your geofences.

* AWS Audit Manager Update: This release introduces 2 Audit Manager features: CSV exports and new manual evidence options. You can now export your evidence finder results in CSV format. In addition, you can now add manual evidence to a control by entering free-form text or uploading a file from your browser.

* Updated endpoints.json and partitions.json.

* Release 2.20.86. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.87-SNAPSHOT

* EnumAttributeConverter: enums can be identified by toString() or name(). toString() is the default for backward compatibility (#3971)

Co-authored-by: Zoe Wang <33073555+zoewangg@users.noreply.github.com>

* AWS Application Discovery Service Update: Add Amazon EC2 instance recommendations export

* AWS Account Update: Improve pagination support for ListRegions

* Amazon Simple Storage Service Update: This release adds SDK support for request-payer request header and request-charged response header in the "GetBucketAccelerateConfiguration", "ListMultipartUploads", "ListObjects", "ListObjectsV2" and "ListObjectVersions" S3 APIs.

* Amazon Connect Service Update: Updates the *InstanceStorageConfig APIs to support a new ResourceType: SCREEN_RECORDINGS to enable screen recording and specify the storage configurations for publishing the recordings. Also updates DescribeInstance and ListInstances APIs to include InstanceAccessUrl attribute in the API response.

* AWS Identity and Access Management Update: Documentation updates for AWS Identity and Access Management (IAM).

* Release 2.20.87. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.88-SNAPSHOT

* Fix the StackOverflowException in WaiterExecutor in case of large retries count. (#3956)

* Move checksum calculation from afterMarshalling to modifyHttpRequest (#4108)

* Update HttpChecksumRequiredInterceptor

* Update HttpChecksumInHeaderInterceptor

* Update tests and remove constant

* Add back constant to resolve japicmp

* Add back javadocs

* docs: add dave-fn as a contributor for code (#4092)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

* Removing unnecessary vscode file

---------

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>
Co-authored-by: Debora N. Ito <476307+debora-ito@users.noreply.github.com>

* Amazon Route 53 Domains Update: Update MaxItems upper bound to 1000 for ListPricesRequest

* Amazon EC2 Container Service Update: Documentation only update to address various tickets.

* AWS CloudFormation Update: Specify desired CloudFormation behavior in the event of ChangeSet execution failure using the CreateChangeSet OnStackFailure parameter

* AWS Price List Service Update: This release updates the PriceListArn regex pattern.

* AWS Glue Update: This release adds support for creating cross region table/database resource links

* Amazon Elastic Compute Cloud Update: API changes to AWS Verified Access to include data from trust providers in logs

* Amazon SageMaker Service Update: Amazon Sagemaker Autopilot releases CreateAutoMLJobV2 and DescribeAutoMLJobV2 for Autopilot customers with ImageClassification, TextClassification and Tabular problem type config support.

* Release 2.20.88. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.89-SNAPSHOT

* AWS Lambda Update: This release adds RecursiveInvocationException to the Invoke API and InvokeWithResponseStream API.

* AWS Config Update: Updated ResourceType enum with new resource types onboarded by AWS Config in May 2023.

* Amazon Appflow Update: This release adds new API to reset connector metadata cache

* Amazon Elastic Compute Cloud Update: Adds support for targeting Dedicated Host allocations by assetIds in AWS Outposts

* Amazon Redshift Update: Added support for custom domain names for Redshift Provisioned clusters. This feature enables customers to create a custom domain name and use ACM to generate fully secure connections to it.

* Updated endpoints.json and partitions.json.

* Release 2.20.89. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.90-SNAPSHOT

* Move QueryParametersToBodyInterceptor to front of interceptor chain (#4109)

* Move QueryParametersToBodyInterceptor to front of interceptor chain

* Move customization.config interceptors to front of interceptor chain - for query protocols

* Refactoring

* Add codegen tests

* Refactoring

* Refactoring

---------

Co-authored-by: John Viegas <70235430+joviegas@users.noreply.github.com>
Co-authored-by: Martin <mart256@gmail.com>
Co-authored-by: Matthew Miller <millem@amazon.com>
Co-authored-by: AWS <>
Co-authored-by: aws-sdk-java-automation <43143862+aws-sdk-java-automation@users.noreply.github.com>
Co-authored-by: Stephen Flavin <stephenflavin@outlook.com>
Co-authored-by: Zoe Wang <33073555+zoewangg@users.noreply.github.com>
Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>
Co-authored-by: Debora N. Ito <476307+debora-ito@users.noreply.github.com>
Co-authored-by: Adrian Chlebosz <36669019+breader124@users.noreply.github.com>
Co-authored-by: Adrian Chlebosz <Adrian.Chlebosz@stepstone.com>
Co-authored-by: Olivier L Applin <olapplin@amazon.com>
Co-authored-by: Benjamin Maizels <36682168+bmaizels@users.noreply.github.com>
Co-authored-by: flitt <55669857+flittev@users.noreply.github.com>
cenedhryn added a commit that referenced this pull request Jun 29, 2023
@StephenFlavin
Copy link
Contributor Author

@millems or @debora-ito,
This PR resolves #1735 so that can be closed 🙂
Also would it be possible to be added to the all-contributors thing 🙏

@millems
Copy link
Contributor

millems commented Jul 11, 2023

Sorry, hadn't realized we didn't add you!

@StephenFlavin
Copy link
Contributor Author

np, thanks 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-review This issue or PR needs review from the team.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants