Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't lock segment from subsegment #306

Merged
merged 2 commits into from
Oct 7, 2021

Conversation

anuraaga
Copy link
Contributor

Issue #, if available: #303

Description of changes:

High mutability and circular refrences between segments and subsegments create a prime landscape for thread issues. This tries to solve #303 by removing locking on the codepaths from DefaultStreamingStrategy - the streaming strategy is used when processing subsegments which may be on a different thread than the segment, which could then try to stream subsegments.

This should avoid deadlocks, I don't know if it's possible to be totally confident without removing streaming strategy completely - alway streaming, removing the circular references between segments / subsegments completely, would give confidence they can't deadlock. Probably best to shift to OTel instead though.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov-commenter
Copy link

codecov-commenter commented Sep 30, 2021

Codecov Report

Merging #306 (58b37ab) into master (0fe77bb) will increase coverage by 0.04%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #306      +/-   ##
============================================
+ Coverage     58.87%   58.92%   +0.04%     
- Complexity     1205     1206       +1     
============================================
  Files           131      131              
  Lines          5072     5066       -6     
  Branches        593      593              
============================================
- Hits           2986     2985       -1     
+ Misses         1809     1806       -3     
+ Partials        277      275       -2     
Impacted Files Coverage Δ
...n/java/com/amazonaws/xray/entities/EntityImpl.java 87.05% <100.00%> (-0.21%) ⬇️
.../java/com/amazonaws/xray/entities/SegmentImpl.java 87.34% <100.00%> (-0.16%) ⬇️
...zonaws/xray/strategy/DefaultStreamingStrategy.java 96.29% <100.00%> (+3.70%) ⬆️
...ava/com/amazonaws/xray/AWSXRayRecorderBuilder.java 69.29% <0.00%> (+0.87%) ⬆️
...va/com/amazonaws/xray/entities/SubsegmentImpl.java 76.05% <0.00%> (+4.22%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0fe77bb...58b37ab. Read the comment docs.

Comment on lines 200 to 201
this.subsegments = new ArrayList<>();
this.subsegments = new CopyOnWriteArrayList<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it could get very expensive for people adding lots of subsegments... I know we don't want to spend the time for the most performant solution, but this seems like a pretty substantial degradation that would impact nearly all customers for more of an edge-case deadlock. Is there any other workaround for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for something fairly standard like 5 subsegments, this is probably faster than something like a synchronized list.

For a hundred segments it does look slow - but so does the normal ArrayList since we call remove on it which results in quadratic behavior anyways.

I think to be optimal we would need to use a concurrent hash set, but unfortunately we need to support getSubsegments().add(). It is possible to wrap a set into a list (with slow performances of get(I)) I think, it could be an idea for if there are reports of regression in the wild.

Copy link
Contributor

@willarmiros willarmiros Oct 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok gotcha, I guess we're also still calling remove so there's going to definitely be some degradation there. Can we add a quick additional benchmark test for creating say 100 subsegments just to confirm it's not degrading performance too badly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd rather close out this PR without having to too big iterations. I went and switched to synchronizedList since it'll be more similar to what we have, just a separate fine-grained lock instead of the entire segment lock.

@anuraaga anuraaga force-pushed the no-segment-lock-from-subsegment branch from 80f0781 to 5125ec3 Compare October 7, 2021 08:08
@anuraaga anuraaga force-pushed the no-segment-lock-from-subsegment branch from 5125ec3 to 58b37ab Compare October 7, 2021 08:15
Copy link
Contributor

@willarmiros willarmiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@willarmiros willarmiros merged commit d5ccae2 into aws:master Oct 7, 2021
willarmiros pushed a commit to willarmiros/aws-xray-sdk-java that referenced this pull request Aug 13, 2022
wangzlei pushed a commit that referenced this pull request Aug 22, 2022
* Revert "Don't lock segment from subsegment (#306)"

This reverts commit d5ccae2.

* Revert "Keep track of emitted in emitter. (#279)"

This reverts commit d66d69a.

* Revert "Remove subsegments lock (#278)"

This reverts commit adfb395.

* Revert "Lock all accesses to entities. (#250)"

This reverts commit f0a3b3a.

* add back errorprone annotations

* fixed shouldPropagate

* added back getSubsegmentsCopy

* added back compareAndSetEmitted

* whitespace fix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants