optimize the read lock caused by computeIfAbsent and synchronized in high concurrency #9980

philippzhang · 2021-11-17T15:24:27Z

We run hundreds of Trino instances over TB-scale data to serve SQL queries. Profiling with YJP (YourKit Java Profiler) showed threads blocked on a lock.

[Image: query performance in the profiler]

After removing the lock, query performance improved somewhat. Looking at the code, we found that these locks are not needed when reading data; the lock is kept when writing data.

Code before the change:

Code after the change:

Example query:
SELECT SUM(a), SUM(b), COUNT(c), SUM(pv), SUM(id) FROM table WHERE event_day = '$day' AND eg_code = $code

cla-bot · 2021-11-17T15:24:29Z

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: zhangyangshuo.
This is most likely caused by a git client misconfiguration; please make sure to:

Check whether your git client is configured with an email to sign commits: git config --list | grep email
If not, set it up using: git config --global user.email email@example.com
Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

lhofhansl · 2021-11-18T01:30:35Z

core/trino-spi/src/main/java/io/trino/spi/type/TypeOperators.java

+        this.cache = (operatorConvention, supplier) -> {
+            Object value = cache.get(operatorConvention);
+            if (value == null) {
+                value = cache.computeIfAbsent(operatorConvention, key -> supplier.get());


I'm surprised this makes a difference.

Edit: In a microbenchmark I find this indeed to be about 1/3 of the cost when the key is present. In this case we never remove any entries, so it is safe.

philippzhang · 2021-11-18

During large-scale data testing, we found that this code has performance problems. For this change we also followed the approach MyBatis took when reworking its computeIfAbsent calls. The MyBatis change is:

mybatis/mybatis-3#2223

hashhar · 2021-11-18

The linked JDK bug was fixed in JDK 9 (see https://bugs.openjdk.java.net/browse/JDK-8161372). Trino requires at least JDK 11. Is there some way to reproduce your test to observe the locking?

lhofhansl · 2021-11-18

My microbenchmark was with JDK 11.

Edit: There is still some extra locking needed to enforce the exactly-once execution of the compute part when the key is concurrently removed.
Hence, when you know keys are never removed, get() followed by computeIfAbsent might be faster.
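
The pattern discussed in this thread can be sketched as below. This is only an illustration, assuming a cache whose entries are never removed; `FastPathCache` and its `get` method are hypothetical names and not Trino's actual code. Whether the extra `get()` helps in practice depends on the JDK version and the workload.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper (not Trino's class): a cache whose entries are never removed.
final class FastPathCache<K, V>
{
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();

    V get(K key, Supplier<V> supplier)
    {
        // Fast path: a plain read, sufficient once the key has been cached.
        V value = cache.get(key);
        if (value == null) {
            // Slow path: computeIfAbsent runs the supplier at most once per absent key.
            value = cache.computeIfAbsent(key, ignored -> supplier.get());
        }
        return value;
    }

    public static void main(String[] args)
    {
        FastPathCache<String, Integer> cache = new FastPathCache<>();
        AtomicInteger computations = new AtomicInteger();
        cache.get("a", computations::incrementAndGet);
        cache.get("a", computations::incrementAndGet);
        // The second lookup is served by the fast path, so the supplier ran once.
        System.out.println(computations.get());
    }
}
```

The fast path is only safe to reason about this simply because nothing calls `remove` on the map; with removals, a value returned by `get()` could already be stale.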

lhofhansl · 2021-11-18T01:31:29Z

core/trino-spi/src/main/java/io/trino/spi/type/TypeOperators.java

-        public synchronized MethodHandle get()
+        // optimize the read lock caused by synchronized
+        // public synchronized MethodHandle get()
+        public MethodHandle get()
        {
            if (adapted == null) {


adapted would have to be declared volatile, no?

philippzhang · 2021-11-18

Hello, this change avoids locking when reading data. In our TB-scale test, we found threads blocked on this lock. With the change, reads perform somewhat better.

lhofhansl · 2021-11-18

Yep. But it still needs to be volatile to be correct. Since that has a slight performance implication (a memory barrier for each access), it might be prudent to redo the perf test.

martint · 2021-11-18

Indeed. Without volatile, this code is not safe according to Java Memory Model semantics. A thread may perceive operations in the wrong order and see a partially constructed object. This is the classic "double-checked locking" pattern.
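
For reference, a correct form of double-checked locking can be sketched as follows. This is a generic illustration and not the code from this PR; `LazyValue` and `factory` are hypothetical names standing in for the lazily adapted field in the diff above.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a lazily initialized field guarded by double-checked locking.
final class LazyValue<T>
{
    private final Supplier<T> factory;
    // volatile is what makes the unsynchronized read in get() safe.
    private volatile T value;

    LazyValue(Supplier<T> factory)
    {
        this.factory = factory;
    }

    T get()
    {
        T result = value; // single volatile read on the fast path
        if (result == null) {
            synchronized (this) {
                result = value; // re-check under the lock
                if (result == null) {
                    result = factory.get();
                    value = result; // volatile write publishes the fully built object
                }
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        LazyValue<String> lazy = new LazyValue<>(() -> "adapted");
        System.out.println(lazy.get());
    }
}
```

Readers skip the monitor once the value is set, at the cost of a volatile read per access, which is the overhead the perf test would need to account for.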

sopel39 · 2021-11-18T09:54:52Z

core/trino-spi/src/main/java/io/trino/spi/type/TypeOperators.java

+        this.cache = (operatorConvention, supplier) -> {
+            Object value = cache.get(operatorConvention);
+            if (value == null) {
+                value = cache.computeIfAbsent(operatorConvention, key -> supplier.get());


According to the documentation:

"Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple."

Looking at java.util.concurrent.ConcurrentHashMap#computeIfAbsent, it shouldn't lock if the element is already present, so once the cache is filled, there should be no locking.

Why do you think this change matters in non-benchmark executions?

philippzhang · 2021-11-18

During large-scale data testing, we found that this code has performance problems. For this change we also followed the approach MyBatis took when reworking its computeIfAbsent calls. The MyBatis change is:

mybatis/mybatis-3#2223

sopel39 · 2021-11-18

"During large-scale data testing, we found that this code has performance problems."

How did you do the testing?
io.trino.spi.type.TypeOperators#cache should stop changing after a while, since all operators will have been cached. Hence, there should be no locking.
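
One rough way to probe this question outside of Trino is a small harness that reads an already warm ConcurrentHashMap from several threads using both access patterns. The sketch below is an assumption about how such a test could look, not how the original measurements were taken; `WarmCacheProbe`, the key count, and the iteration counts are all illustrative, and a JMH benchmark would give more trustworthy numbers.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.stream.IntStream;

// Hypothetical harness; names and sizes are illustrative, and this is not a JMH benchmark.
final class WarmCacheProbe
{
    static final int KEYS = 64;

    // Runs `lookups` reads in each of `workers` parallel tasks and returns elapsed nanoseconds.
    static long time(Function<Integer, String> lookup, int workers, int lookups)
    {
        long start = System.nanoTime();
        IntStream.range(0, workers).parallel().forEach(worker -> {
            for (int i = 0; i < lookups; i++) {
                if (lookup.apply(i % KEYS) == null) {
                    throw new IllegalStateException("cache should be warm");
                }
            }
        });
        return System.nanoTime() - start;
    }

    public static void main(String[] args)
    {
        ConcurrentHashMap<Integer, String> cache = new ConcurrentHashMap<>();
        for (int key = 0; key < KEYS; key++) {
            cache.put(key, "op-" + key); // warm the cache so every lookup is a hit
        }
        Function<Integer, String> direct = key -> cache.computeIfAbsent(key, k -> "op-" + k);
        Function<Integer, String> getFirst = key -> {
            String value = cache.get(key);
            return value != null ? value : cache.computeIfAbsent(key, k -> "op-" + k);
        };
        System.out.println("computeIfAbsent only:      " + time(direct, 8, 1_000_000) + " ns");
        System.out.println("get, then computeIfAbsent: " + time(getFirst, 8, 1_000_000) + " ns");
    }
}
```

Timings from a loop like this are only indicative, since JIT warm-up and the size of the common fork-join pool both affect the result.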

bitsondatadev · 2022-11-02T11:59:25Z

👋 @philippzhang - this PR is inactive and doesn't seem to be under development. If you'd like to continue work on this at any point in the future, feel free to re-open.

optimize the read lock caused by computeIfAbsent and synchronized

a49805c

lhofhansl reviewed Nov 18, 2021


hashhar requested a review from sopel39 November 18, 2021 06:25

sopel39 reviewed Nov 18, 2021


findepi mentioned this pull request Jan 20, 2022

Small code cleanup #10690

Merged

bitsondatadev closed this Nov 2, 2022
