-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimize the read lock caused by computeIfAbsent and synchronized in high concurrency #9980
Conversation
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: zhangyangshuo.
|
this.cache = (operatorConvention, supplier) -> { | ||
Object value = cache.get(operatorConvention); | ||
if (value == null) { | ||
value = cache.computeIfAbsent(operatorConvention, key -> supplier.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised this makes a difference.
Edit: In a microbenchmark I find this indeed to be about 1/3 of the cost when the key is present. In this case we never remove any entries, so it is safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During large-scale data testing, we found that this code has performance problems. When modifying, we also refer to the transformation idea of computeifabsent method of mybatis. The following is the modification of mybatis:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linked JDK bug was fixed in JDK 9 (see https://bugs.openjdk.java.net/browse/JDK-8161372) Trino requires JDK 11 at-least. Is there some way to reproduce your test to observe the locking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My microbenchmark was with JDK 11.
Edit: There is still some extra locking needed to enforce the exactly-once execution of the compute part when the key is concurrently removed.
Hence when you know keys are never removed get() followed by computeIfAbsent might be faster.
public synchronized MethodHandle get() | ||
// optimize the read lock caused by synchronized | ||
// public synchronized MethodHandle get() | ||
public MethodHandle get() | ||
{ | ||
if (adapted == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adapted
would have to be declared volatile
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello, this is to avoid locking when reading data. During the TB level data test, it is found that this code has a blocked lock. If modified, it can provide certain performance when reading data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. But it still needs to be volatile
to be correct. Since that has a slight performance implication (a memory barrier for each access), it might be prudent to redo the perf test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. without volatile
, this code is not safe according to Java Memory Model semantics. The thread may perceive operations in the wrong order and see a partially constructed object. This is the classic "double checked locking" pattern.
this.cache = (operatorConvention, supplier) -> { | ||
Object value = cache.get(operatorConvention); | ||
if (value == null) { | ||
value = cache.computeIfAbsent(operatorConvention, key -> supplier.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
according to doc:
Some attempted update
* operations on this map by other threads may be blocked while
* computation is in progress, so the computation should be short
* and simple.
looking at java.util.concurrent.ConcurrentHashMap#computeIfAbsent
it shouldn't lock if element is already present, so once cache is filled, there should be no locking.
Why do you think this change matters in non-benchmark executions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During large-scale data testing, we found that this code has performance problems. When modifying, we also refer to the transformation idea of computeifabsent method of mybatis. The following is the modification of mybatis:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During large-scale data testing, we found that this code has performance problems.
How did you do the testing?
io.trino.spi.type.TypeOperators#cache
should not be changed after a while since all operators would be cached. Hence, there should be no locking
👋 @philippzhang - this PR is inactive and doesn't seem to be under development. If you'd like to continue work on this at any point in the future, feel free to re-open. |
We test on hundreds of trino based on TB data, support sql query through YJP (YourKit-JavaProfiler), and find that there is a blocked lock of threads. This is the image query performance. After removing the lock, the query performance has a certain improvement. Let’s look at the code. It is found that these locks can not be locked when reading data, and the lock is retained when writing data.
before change code:

after change code:

eg:
sql SELECT SUM(a), SUM(b), COUNT(c), SUM(pv), SUM(id) FROM table where event_day = '$day' and eg_code = $code