Threading deadlock with Edge unique indexes #5875
Comments
Hi @TeamXceleratorDev,
After some more investigation, this seems to be related to #5821.
OK, in any case please send us a full thread dump so we can fix it.
Here is a sample stack using jconsole from my loading app. It actually seems similar to issue #5433. I am testing now with 2.1.13 GA. The issue still exists. It only happens when I have a multi-threaded edge insertion with a unique index on an edge property. Name: Thread-3 Stack trace:
Here is the thread dump from Orient itself:
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.73-b02 mixed mode):
"OrientDB HTTP Connection /0:0:0:0:0:0:0:1:2480<-/0:0:0:0:0:0:0:1:49691" #399 daemon prio=5 os_prio=31 tid=0x00007fcdd6552800 nid=0x621b runnable [0x0000000122d29000]
"OrientDB WAL Flush Task (redbox)" #398 daemon prio=5 os_prio=31 tid=0x00007fcdee421800 nid=0x5d23 waiting on condition [0x000000011e0a9000]
"OrientDB <- BinaryClient (/127.0.0.1:49597)" #367 daemon prio=5 os_prio=31 tid=0x00007fcdca2b5800 nid=0x3c13 waiting on condition [0x0000000124e67000]
"OrientDB <- BinaryClient (/127.0.0.1:49596)" #366 daemon prio=5 os_prio=31 tid=0x00007fcdd662d800 nid=0x7407 runnable [0x0000000124d63000]
"OrientDB <- BinaryClient (/127.0.0.1:49595)" #365 daemon prio=5 os_prio=31 tid=0x00007fcde2851000 nid=0x6c0b waiting on condition [0x0000000124c61000]
"OrientDB <- BinaryClient (/127.0.0.1:49594)" #364 daemon prio=5 os_prio=31 tid=0x00007fcdebee8800 nid=0x6e0b waiting on condition [0x0000000124b5e000]
"OrientDB <- BinaryClient (/127.0.0.1:49593)" #363 daemon prio=5 os_prio=31 tid=0x00007fcdf504e800 nid=0x6b0b waiting on condition [0x0000000124a5b000]
"OrientDB <- BinaryClient (/127.0.0.1:49590)" #362 daemon prio=5 os_prio=31 tid=0x00007fcdfa9d5800 nid=0x640b waiting on condition [0x0000000124958000]
"OrientDB <- BinaryClient (/127.0.0.1:49591)" #361 daemon prio=5 os_prio=31 tid=0x00007fcde2864000 nid=0x720b waiting on condition [0x0000000124855000]
"OrientDB <- BinaryClient (/127.0.0.1:49589)" #360 daemon prio=5 os_prio=31 tid=0x00007fcde0e58000 nid=0x630b waiting on condition [0x00000001244ad000]
"OrientDB <- BinaryClient (/127.0.0.1:49592)" #359 daemon prio=5 os_prio=31 tid=0x00007fcdf2018000 nid=0x5f13 waiting on condition [0x00000001243aa000]
"OrientDB <- BinaryClient (/127.0.0.1:49588)" #358 daemon prio=5 os_prio=31 tid=0x00007fcdeedf8800 nid=0x3f23 waiting on condition [0x0000000124131000]
"OrientDB Auditing Logging Thread - redbox" #27 prio=5 os_prio=31 tid=0x00007fcdf231f800 nid=0x6003 waiting on condition [0x0000000124234000]
"OrientDB Write Cache Flush Task (redbox)" #25 daemon prio=10 os_prio=31 tid=0x00007fcdf2b95800 nid=0x3b07 waiting on condition [0x0000000123c2e000]
"DestroyJavaVM" #20 prio=5 os_prio=31 tid=0x00007fcdf200b800 nid=0xe07 waiting on condition [0x0000000000000000]
"Timer-1" #18 daemon prio=5 os_prio=31 tid=0x00007fcdf31bc800 nid=0x5b03 in Object.wait() [0x0000000123a28000]
"OrientDB ONetworkProtocolHttpDb listen at 0.0.0.0:2480-2490" #16 prio=5 os_prio=31 tid=0x00007fcdf2aa0000 nid=0x5903 runnable [0x0000000123725000]
"OrientDB ONetworkProtocolBinary listen at 0.0.0.0:2424-2430" #14 prio=5 os_prio=31 tid=0x00007fcdf2a9f000 nid=0x5703 runnable [0x0000000123087000]
"Timer-0" #11 daemon prio=5 os_prio=31 tid=0x00007fcdf1b9e000 nid=0x5503 in Object.wait() [0x0000000122f03000]
"Service Thread" #9 daemon prio=9 os_prio=31 tid=0x00007fcdf280d800 nid=0x5103 runnable [0x0000000000000000]
"C1 CompilerThread3" #8 daemon prio=9 os_prio=31 tid=0x00007fcdf3017800 nid=0x4f03 waiting on condition [0x0000000000000000]
"C2 CompilerThread2" #7 daemon prio=9 os_prio=31 tid=0x00007fcdf2808800 nid=0x4d03 waiting on condition [0x0000000000000000]
"C2 CompilerThread1" #6 daemon prio=9 os_prio=31 tid=0x00007fcdf1837800 nid=0x4b03 waiting on condition [0x0000000000000000]
"C2 CompilerThread0" #5 daemon prio=9 os_prio=31 tid=0x00007fcdf1836000 nid=0x4903 waiting on condition [0x0000000000000000]
"Signal Dispatcher" #4 daemon prio=9 os_prio=31 tid=0x00007fcdf1860800 nid=0x3d0b waiting on condition [0x0000000000000000]
"Finalizer" #3 daemon prio=8 os_prio=31 tid=0x00007fcdf2015800 nid=0x2a03 in Object.wait() [0x000000011df60000]
"Reference Handler" #2 daemon prio=10 os_prio=31 tid=0x00007fcdf2806000 nid=0x2803 in Object.wait() [0x000000011de5d000]
"VM Thread" os_prio=31 tid=0x00007fcdf185b800 nid=0x2603 runnable
"GC task thread#0 (ParallelGC)" os_prio=31 tid=0x00007fcdf2803800 nid=0xb07 runnable
"GC task thread#1 (ParallelGC)" os_prio=31 tid=0x00007fcdf3000000 nid=0x90b runnable
"GC task thread#2 (ParallelGC)" os_prio=31 tid=0x00007fcdf3001000 nid=0x717 runnable
"GC task thread#3 (ParallelGC)" os_prio=31 tid=0x00007fcdf3001800 nid=0x51b runnable
"GC task thread#4 (ParallelGC)" os_prio=31 tid=0x00007fcdf3002000 nid=0x313 runnable
"GC task thread#5 (ParallelGC)" os_prio=31 tid=0x00007fcdf3002800 nid=0x113 runnable
"GC task thread#6 (ParallelGC)" os_prio=31 tid=0x00007fcdf180b000 nid=0x1507 runnable
"GC task thread#7 (ParallelGC)" os_prio=31 tid=0x00007fcdf1809000 nid=0x2403 runnable
"VM Periodic Task Thread" os_prio=31 tid=0x00007fcdf1861800 nid=0x5303 waiting on condition
JNI global references: 1761
Heap
Hi,
Also, a small note: could you post large code snippets or thread dumps as a Gist rather than directly in an issue comment?
Hi guys,
My apologies, we are tied up with a few other efforts at the moment. I will try to provide more detail in a few days. I also spent time recently reading through the source code involved with locking, based on the stack trace. I think I understand the basics. This issue only occurs under a multi-threaded load test with edges + unique indexes. It is some form of race condition. I have never run Orient in a debugger before. I bet that would show the issue right away.
@TeamXceleratorDev our theory is that one of the write locks was acquired but never released. To find where that happens, we need the data we asked for above. Also please note that this is a very specific case; look at these lines in the thread dump:
They mean that the database thinks there is not enough free space for data processing and is trying to compact its files. Anyway, once we have all the data, I think we can fix this issue quickly.
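To illustrate the "write lock acquired but never released" theory, here is a minimal Java sketch. It is not OrientDB code: it uses the JDK's `ReentrantReadWriteLock` as a stand-in for `OReadersWriterSpinLock`, and the class and method names (`LeakedWriteLockDemo`, `leakyUpdate`, `safeUpdate`, `lockStillUsable`) are hypothetical. It only shows how an exception between acquire and release can leave a write lock held forever, so that every later writer waits.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LeakedWriteLockDemo {

    // Hypothetical buggy pattern: if work.run() throws, unlock() is never reached.
    static void leakyUpdate(ReentrantReadWriteLock lock, Runnable work) {
        lock.writeLock().lock();
        work.run();
        lock.writeLock().unlock();
    }

    // Safe pattern: the lock is released in finally, even when the work fails.
    static void safeUpdate(ReentrantReadWriteLock lock, Runnable work) {
        lock.writeLock().lock();
        try {
            work.run();
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Runs a failing update on a worker thread, then checks from the calling
    // thread whether the write lock can still be acquired within 200 ms.
    public static boolean lockStillUsable(boolean leaky) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        Runnable failing = () -> {
            throw new IllegalStateException("simulated failure during update");
        };
        Thread worker = new Thread(() -> {
            try {
                if (leaky) {
                    leakyUpdate(lock, failing);
                } else {
                    safeUpdate(lock, failing);
                }
            } catch (IllegalStateException expected) {
                // The failure itself is handled; only the lock state matters here.
            }
        });
        try {
            worker.start();
            worker.join();
            boolean acquired = lock.writeLock().tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing the lock", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("leaky path, lock usable afterwards: " + lockStillUsable(true));
        System.out.println("safe path, lock usable afterwards: " + lockStillUsable(false));
    }
}
```

In the leaky case the worker thread has already exited, yet the lock stays held, which would match the symptom of other threads stuck in an acquire call. Whether this is what actually happens inside OrientDB is exactly what the requested thread dump and logs are needed to confirm.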
It seems that I have encountered the same issue. Here is the Orient part of the locked thread stack:
Guys, I am closing this issue as a duplicate of #5821. Let's discuss it in a single place.
@ygoraly as I mentioned above, that is only a single thread; we need a full thread dump because ODB has several background threads. Also, as I mentioned above too :-), could you post such long stack traces as gists to keep the conversation readable? In any case, I would really appreciate it if you could send us a thread dump, server logs, and a heap dump; the more data we have, the faster we can fix this issue.
Thread is stuck on OReadersWriterSpinLock.acquireWriteLock, line 131:
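For anyone unsure how to capture every thread rather than a single stack, here is a minimal sketch using only the standard `java.lang.management` API. The class name `ThreadDumpSketch` is hypothetical, and it only sees threads of the JVM it runs in, so it fits an embedded setup or a loading app; for a standalone server, an external tool such as `jstack` against the server process is the usual route.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

    // Builds a text dump of all live threads in the current JVM.
    public static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();

        // true, true: also collect locked monitors and ownable synchronizers.
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" state=").append(info.getThreadState());
            if (info.getLockName() != null) {
                out.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                out.append(" owned by \"").append(info.getLockOwnerName()).append('"');
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }

        // Returns null when the JVM detects no deadlock cycle. This only covers
        // monitors and java.util.concurrent ownable synchronizers, so a thread
        // stuck on a custom spin lock would likely not be reported here.
        long[] deadlocked = mx.findDeadlockedThreads();
        out.append("deadlocked threads: ")
           .append(deadlocked == null ? 0 : deadlocked.length)
           .append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

The output can be written to a file and attached as a gist, which keeps the issue readable while still showing the background threads alongside the stuck one.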
In 2.1.12, it is possible to create a thread deadlock when multiple threads insert into an Edge class that has a property covered by a unique index. Performance slows down and then comes to a halt. I will try to attach an example database later which can reproduce the issue. The workaround for now is simply to remove the unique index.