DB halt when read/write simultaneously same class through different connections #6507

Closed
1 task done
zhengguozhu opened this issue Aug 3, 2016 · 27 comments

@zhengguozhu

OrientDB Version, operating system, or hardware.

  • v2.2.6

Operating System

  • Linux

Expected behavior and actual behavior

The server constantly writes this WARNI log: Auto transaction starting is turned off for the graph, but already started transaction is left open. Commit it manually or consider disabling auto transactions while creating the graph or its factory. [OrientGraph]

OrientDB then halts after about 24 hours.

Steps to reproduce the problem

@andrii0lomakin
Member

Hi, could you send us a full thread dump taken when OrientDB halts?

@andrii0lomakin
Member

Also, it seems you switched off automatic starting of transactions. Did you commit the transactions at the end? It looks like the transaction grew so big that you started to see huge GC pauses.

@zhengguozhu
Author

When the system halts, no thread dump is produced. How do I turn it on?
Do I need to change the code to commit explicitly, or tune the GC parameters?

@zhengguozhu
Author

The hardware and software environment are:

  • 4 CPU
  • 16G MEM
  • 40G SSD HD
  • CentOS 7.2
  • JDK 1.8.0

The data volume is about 400M as a .tar.gz file of the database folder. A single class may consist of 300,000 records. As the data volume is relatively small, hardware usage is always very low: memory consumption is below 15% and CPU consumption is below 5%.

When the OrientDB halt happens, the command 'orientdb.sh stop' cannot stop the Java process; I must use 'kill -9' to kill it.

@zhengguozhu changed the title from 'WARNI: Auto transaction starting is turned off for the graph' to 'DB halt when read/write simultaneously same class through different connections' on Aug 19, 2016
@zhengguozhu
Author

Updated the issue title to 'DB halt when read/write simultaneously same class through different connections'.

Steps to reproduce:

  1. Create a Node.js program that reads records of a class through orientjs.
  2. Create a Node.js program that writes records into the same class through orientjs.
  3. Run the two programs at the same time.
  4. The DB server halts and fails to shut down through 'orientdb.sh stop'.

@andrii0lomakin
Member

Hi, because this is a multithreading issue (we actually fixed a similar one two days ago), we need a thread dump to fix it. Could you send us a thread dump taken when the DB halts? To create one, you can use the following command: jstack -l JAVA_PID > jstack.out
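
For a suspected deadlock it can help to take a few dumps several seconds apart, so that threads stuck in the same place stand out. A minimal sketch, assuming jstack from the JDK is on the PATH; the function name, its arguments and the output file names are illustrative and not part of OrientDB:

```shell
# Sketch only: function name, arguments and file naming are illustrative assumptions.
# Requires jstack (shipped with the JDK) on PATH.
capture_thread_dumps() {
  pid="$1"             # process id of the OrientDB server JVM
  count="${2:-3}"      # how many dumps to take
  interval="${3:-5}"   # seconds to wait between dumps
  out_dir="${4:-.}"    # directory that receives jstack.N.out
  i=1
  while [ "$i" -le "$count" ]; do
    # -l is the flag from the command quoted in the thread; it adds lock details
    jstack -l "$pid" > "$out_dir/jstack.$i.out" || return 1
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
}
```

Usage, with JAVA_PID set to the server's process id: `capture_thread_dumps "$JAVA_PID" 3 5 /tmp`.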

@andrii0lomakin
Member

@zhengguozhu if you have any problems getting the stack trace, let me know in this issue and I will try to help you.

@andrii0lomakin
Member

Hi @zhengguozhu, could you send us the thread dump?

@zhengguozhu
Author

I adjusted the server settings to:

JAVA_OPTS="-XX:+PerfDisableSharedMem"
ORIENTDB_OPTS_MEMORY="-Xms512m -Xmx1024m -Dstorage.diskCache.bufferSize=9000"
JAVA_OPTS_SCRIPT="-Djna.nosys=true -XX:+HeapDumpOnOutOfMemoryError -XX:MaxDirectMemorySize=15g -Djava.awt.headless=true -Dfile.encoding=UTF8 -Drhino.opt.level=9"
ORIENTDB_SETTINGS="-Denvironment.dumpCfgAtStartup=true"

It is not easy to reproduce the issue now.
Is it related to issue #6548?
I will try to reproduce by changing settings.

@andrii0lomakin
Member

@zhengguozhu thank you very much for your effort. We fixed several deadlocks in 2.2.7 and 2.2.8, so I hope your issue is fixed too. But at any hint of a deadlock (when the database appears frozen from your point of view), a thread dump is really helpful for identifying and fixing the problem.

@zhengguozhu
Author

JAVA_OPTS="-XX:+PerfDisableSharedMem"
JAVA_OPTS_SCRIPT="-Djna.nosys=true -XX:+HeapDumpOnOutOfMemoryError -XX:MaxDirectMemorySize=15g -Djava.awt.headless=true -Dfile.encoding=UTF8 -Drhino.opt.level=9"
ORIENTDB_SETTINGS="-Denvironment.dumpCfgAtStartup=true"

I removed the above settings and was able to reproduce the issue.

@zhengguozhu
Author

I just tried adding the settings back and was able to reproduce the problem again.
I will try upgrading to 2.2.8 and test.

@andrii0lomakin
Member

@zhengguozhu is your server still locked? If so, could you create a heap dump too? If not, could you reproduce the deadlock again (sorry for that) and send me a thread dump and a heap dump? To create the heap dump, please execute jmap -J-d64 -dump:format=b,file=<heap_dump_filename> <pid>. Thank you very much for your help.

@zhengguozhu
Author

I am able to reproduce with 2.2.8.

@zhengguozhu
Author

The heap dump file is too big to send (181M).

@andrii0lomakin
Member

@zhengguozhu I can share my Google Drive with you. May I? It is very important, though, that the thread dump and the heap dump are taken at the same time.
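
One way to keep the two dumps paired is to take them back to back under a shared timestamp. A minimal sketch, assuming jstack and jmap from the JDK are on the PATH; the function name and file names are illustrative, and the jstack and jmap flags are the ones quoted earlier in this thread:

```shell
# Sketch only: function name and file naming are illustrative assumptions.
# Requires jstack and jmap (shipped with the JDK) on PATH.
capture_dumps() {
  pid="$1"           # process id of the frozen OrientDB server JVM
  out_dir="${2:-.}"  # directory that receives both dump files
  stamp=$(date +%Y%m%d-%H%M%S)
  # Take the thread dump first because it is the quicker of the two.
  jstack -l "$pid" > "$out_dir/jstack-$stamp.out" || return 1
  # Heap dump with the flags quoted in the thread.
  jmap -J-d64 -dump:format=b,file="$out_dir/heap-$stamp.hprof" "$pid" || return 1
  # Print the shared timestamp so the matching pair is easy to find.
  echo "$stamp"
}
```

Usage: `capture_dumps "$JAVA_PID" /tmp` writes `jstack-<stamp>.out` and `heap-<stamp>.hprof` into /tmp and prints the stamp.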

@zhengguozhu
Author

Alright, I will create a user for you on my server so that you can fetch the files with scp.
How can I send you a private message?

@zhengguozhu
Author

@iaa, I have sent it to your Gmail.

@andrii0lomakin
Member

@zhengguozhu the heap dump download is in progress, but the speed is very slow. Meanwhile, could you confirm that you use the standard server distribution with default settings?

@andrii0lomakin
Member

@zhengguozhu if you look at the server log, do you see any logged exceptions?

@zhengguozhu
Author

zhengguozhu commented Aug 25, 2016

The server is the standard distribution with the settings below:
JAVA_OPTS="-XX:+PerfDisableSharedMem"
ORIENTDB_OPTS_MEMORY="-Xms512m -Xmx512m -Dstorage.diskCache.bufferSize=4000"
JAVA_OPTS_SCRIPT="-Djna.nosys=true -XX:+HeapDumpOnOutOfMemoryError -XX:MaxDirectMemorySize=6g -Djava.awt.headless=true -Dfile.encoding=UTF8 -Drhino.opt.level=9"
ORIENTDB_SETTINGS="-Denvironment.dumpCfgAtStartup=true"

No logged exceptions.

Let me know once you have downloaded the dump file.

@andrii0lomakin
Member

@zhengguozhu thank you very much for your help. I found the cause of the issue and will fix it in a couple of hours. I really appreciate that you spent your time helping us diagnose it.

andrii0lomakin added a commit that referenced this issue Aug 26, 2016
andrii0lomakin added a commit that referenced this issue Aug 26, 2016
@andrii0lomakin
Member

@zhengguozhu fixed. Could you check it in the latest 2.2.x branch? To build the project, simply run mvn clean install -DskipTests. The DB distribution will be in the distribution/target directory.
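
The build steps can be sketched as a small helper. This assumes git and Maven are installed and that the sources are fetched from the orientechnologies/orientdb repository on GitHub; the function name, the shallow clone and the working-directory argument are illustrative choices, not something prescribed in this thread:

```shell
# Sketch only: function name and arguments are illustrative assumptions.
# The body runs in a subshell, so the caller's working directory is unchanged.
build_orientdb_branch() (
  branch="${1:-2.2.x}"   # branch named in the comment above
  work_dir="${2:-.}"     # where the sources are cloned
  cd "$work_dir" || exit 1
  git clone --branch "$branch" --depth 1 \
    https://github.com/orientechnologies/orientdb.git || exit 1
  cd orientdb || exit 1
  # Build command quoted in the thread; tests are skipped to shorten the build.
  mvn clean install -DskipTests || exit 1
  # Print the directory where the distribution is expected to appear.
  echo "$PWD/distribution/target"
)
```

Usage: `build_orientdb_branch 2.2.x /tmp/build` prints the path of the distribution/target directory once the build succeeds.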

@zhengguozhu
Author

I can't build on my machine. Any clue about the following error?

constituent[36]: file:/usr/share/maven/lib/maven-wagon_http-shaded.jar

constituent[37]: file:/usr/share/maven/lib/maven-wagon_provider-api.jar

Exception in thread "main" java.lang.InternalError
at sun.security.ec.SunEC.initialize(Native Method)
at sun.security.ec.SunEC.access$000(SunEC.java:49)
at sun.security.ec.SunEC$1.run(SunEC.java:61)
at sun.security.ec.SunEC$1.run(SunEC.java:58)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.ec.SunEC.(SunEC.java:58)

@andrii0lomakin
Member

@zhengguozhu
Author

I ran the Maven build successfully and tested the SNAPSHOT distribution; the problem is fixed.
Thank you very much.

@andrii0lomakin
Member

@zhengguozhu thank you very much for your time and issue report.
