Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Significant performance degradation on a server with big RAM #5352

Closed
PhantomYdn opened this issue Nov 20, 2015 · 25 comments
Closed

Significant performance degradation on a server with big RAM #5352

PhantomYdn opened this issue Nov 20, 2015 · 25 comments
Assignees

Comments

@PhantomYdn
Copy link
Contributor

Guys,

We got a server with 128Gb RAM and 32 CPUs. But OrientDB is running even slower than on 1 CPU/1Gb account on digital ocean. The mostly it's observable during db startup: it's doing something for a couple of minutes (vs. few minutes on small environment). According to thread dump it's freezing here:

"main" prio=10 tid=0x00007f4e2c00a000 nid=0x18a3 runnable [0x00007f4e33a93000]
   java.lang.Thread.State: RUNNABLE
    at java.io.RandomAccessFile.write0(Native Method)
    at java.io.RandomAccessFile.write(RandomAccessFile.java:500)
    at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1071)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog.writeMasterRecord(ODiskWriteAheadLog.java:1216)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog.log(ODiskWriteAheadLog.java:884)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OAbstractWriteAheadLog.logFullCheckpointStart(OAbstractWriteAheadLog.java:67)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.makeFullCheckpoint(OAbstractPaginatedStorage.java:1631)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.synch(OAbstractPaginatedStorage.java:1205)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared$1.call(OSchemaShared.java:1217)
    at com.orientechnologies.orient.core.db.OScenarioThreadLocal.executeAsDistributed(OScenarioThreadLocal.java:67)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.saveInternal(OSchemaShared.java:1210)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:640)
    at com.orientechnologies.orient.core.metadata.schema.OClassImpl.releaseSchemaWriteLock(OClassImpl.java:1808)
    at com.orientechnologies.orient.core.metadata.schema.OClassImpl.releaseSchemaWriteLock(OClassImpl.java:1803)
    at com.orientechnologies.orient.core.metadata.schema.OPropertyImpl.releaseSchemaWriteLock(OPropertyImpl.java:1109)
    at com.orientechnologies.orient.core.metadata.schema.OPropertyImpl.setCustom(OPropertyImpl.java:757)
    at com.orientechnologies.orient.core.metadata.schema.OPropertyImpl.setCustom(OPropertyImpl.java:68)

It's interesting, but JVM instance limited just be few Gb.

Any ideas?

@PhantomYdn
Copy link
Contributor Author

Forgot to mention timings also for db creaton: it depends but commonly it also took >1 min to create just empty database.
OrientDB version 2.1.5

@andrii0lomakin
Copy link
Member

Hi,
It is funny. I will explain reason.
Now we need to force sync data for each class creation because you have huge amount of memory it will take a lot to make of time to force sync whole cache. Is it possible to create classes before head to avoid such behaviour ?

@PhantomYdn
Copy link
Contributor Author

Andrey,

Not sure that I understood your explanation... But let me mention that runtime working with schema (creation/deletion of classes and properties) is common use case for Orienteer.

Aaa.... Just checked performance of classes/properties creation: it takes about 3 sec in avg. That's crazy...

@andrii0lomakin
Copy link
Member

I think I have solution for you.
I will set max limit of memory for write cache it should solve your problems.
Meantime could you limit disk cache size ?

@PhantomYdn
Copy link
Contributor Author

Andrey,
Sounds good. But it looks like short-term solution. Am I right?
And can we please discuss reasons of that in more details by skype?
I can't beleive that we have to limit OrientDB to make it faster:) And even more: as I said we limited OrientDB just by 2gb of RAM (by -Xms2g -Xmx2g) and empty db takes less then 10Mb on hard-drive: so problem looks really strange from 3rd side.

@andrii0lomakin
Copy link
Member

Sure, let's do it on Monday.

On Fri, Nov 20, 2015 at 7:48 PM Ilia Naryzhny notifications@github.com
wrote:

Andrey,
Sounds good. But it looks like short-term solution. Am I right?
And can we please discuss reasons of that in more details by skype?
I can't beleive that we have to limit OrientDB to make it faster:) And
even more: as I said we limited OrientDB just by 2gb of RAM (by -Xms2g
-Xmx2g) and empty db takes less then 10Mb on hard-drive: so problem looks
really strange from 3rd side.


Reply to this email directly or view it on GitHub
#5352 (comment)
.

Best regards,
Andrey Lomakin, R&D lead.
OrientDB Ltd

twitter:@Andrey_Lomakin linkedin:https://ua.linkedin.com/in/andreylomakin

@lvca lvca modified the milestones: 2.1.x (next hotfix), 2.1.6 Nov 23, 2015
@lvca
Copy link
Member

lvca commented Dec 4, 2015

Any feedback on this?

@andrii0lomakin
Copy link
Member

We agreed with Iya to work on this right after 2.2

On Fri, Dec 4, 2015 at 11:38 AM Luca Garulli notifications@github.com
wrote:

Any feedback on this?


Reply to this email directly or view it on GitHub
#5352 (comment)
.

Best regards,
Andrey Lomakin, R&D lead.
OrientDB Ltd

twitter:@Andrey_Lomakin linkedin:https://ua.linkedin.com/in/andreylomakin

@andrii0lomakin
Copy link
Member

HI,
Could you try now from build in develop branch ?

@PhantomYdn
Copy link
Contributor Author

Should I make some configuration changes? According to diff there are some config properties...

@andrii0lomakin
Copy link
Member

Hi Ilia,
No you do not have to change default settings.

On Fri, Dec 25, 2015 at 11:06 PM Ilia Naryzhny notifications@github.com
wrote:

Should I make some configuration changes? According to diff there are some
config properties...


Reply to this email directly or view it on GitHub
#5352 (comment)
.

Best regards,
Andrey Lomakin, R&D lead.
OrientDB Ltd

twitter:@Andrey_Lomakin linkedin:https://ua.linkedin.com/in/andreylomakin

@andrii0lomakin
Copy link
Member

Ilya, any updates on this issue ?

@PhantomYdn
Copy link
Contributor Author

Not yet - sorry. Pretty hard to "bring" SNAPSHOT version (which is not compatible with latest Orienteer for now) to production.
Any due dates for that? When are you going to have some RC for 2.2?

@PhantomYdn
Copy link
Contributor Author

Still no ability to test on prod for 2.2-SNAPSHOT.

But there is significant degradation of this issue in 2.1.13 (probably introduced somewhere here: 2.1.10-2.1.13). If before it took ~20 sec for schema update - now it can take up to 2 minutes(!!!) for simple update like adding new class.

@andrii0lomakin
Copy link
Member

@PhantomYdn could you profile such update and send us yourkit snapshot if possible ? I do not think that issues are related , so would be good to send to me for example by email.

@PhantomYdn
Copy link
Contributor Author

During class creation it hanging here:

"qtp81628611-111" #111 prio=5 os_prio=0 tid=0x00007fc8cffc2800 nid=0x61ad runnable [0x00007fca2a1c3000]
   java.lang.Thread.State: RUNNABLE
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.acquireSchemaReadLock(OSchemaShared.java:618)
        at com.orientechnologies.orient.core.metadata.schema.OClassImpl.acquireSchemaReadLock(OClassImpl.java:1866)
        at com.orientechnologies.orient.core.metadata.schema.OPropertyImpl.acquireSchemaReadLock(OPropertyImpl.java:1093)
        at com.orientechnologies.orient.core.metadata.schema.OPropertyImpl.getLinkedType(OPropertyImpl.java:398)
        at com.orientechnologies.orient.core.metadata.schema.OImmutableProperty.<init>(OImmutableProperty.java:60)
        at com.orientechnologies.orient.core.metadata.schema.OImmutableClass.<init>(OImmutableClass.java:86)
        at com.orientechnologies.orient.core.metadata.schema.OImmutableSchema.<init>(OImmutableSchema.java:39)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.saveInternal(OSchemaShared.java:1228)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:642)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:631)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.doCreateClass(OSchemaShared.java:1050)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:415)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:400)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:275)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaProxy.createClass(OSchemaProxy.java:71)

And then a catched:

"qtp81628611-117" #117 prio=5 os_prio=0 tid=0x00007fc928004800 nid=0x6351 runnable [0x00007fca29378000]
   java.lang.Thread.State: RUNNABLE
        at java.io.RandomAccessFile.write0(Native Method)
        at java.io.RandomAccessFile.write(RandomAccessFile.java:489)
        at java.io.RandomAccessFile.writeInt(RandomAccessFile.java:1031)
        at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog.writeMasterRecord(ODiskWriteAheadLog.java:1210)
        at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog.log(ODiskWriteAheadLog.java:884)
        at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OAbstractWriteAheadLog.logFullCheckpointStart(OAbstractWriteAheadLog.java:67)
        at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.makeFullCheckpoint(OAbstractPaginatedStorage.java:1738)
        at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.synch(OAbstractPaginatedStorage.java:1288)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared$1.call(OSchemaShared.java:1219)
        at com.orientechnologies.orient.core.db.OScenarioThreadLocal.executeAsDistributed(OScenarioThreadLocal.java:71)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.saveInternal(OSchemaShared.java:1212)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:642)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:631)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.doCreateClass(OSchemaShared.java:1050)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:415)

During property creation:

"qtp81628611-113" #113 prio=5 os_prio=0 tid=0x00007fc8ef9f5000 nid=0x626d runnable [0x00007fca2a1c3000]
   java.lang.Thread.State: RUNNABLE
        at java.io.RandomAccessFile.write0(Native Method)
        at java.io.RandomAccessFile.write(RandomAccessFile.java:489)
        at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1049)
        at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog.writeMasterRecord(ODiskWriteAheadLog.java:1212)
        at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog.log(ODiskWriteAheadLog.java:884)
        at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OAbstractWriteAheadLog.logFullCheckpointStart(OAbstractWriteAheadLog.java:67)
        at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.makeFullCheckpoint(OAbstractPaginatedStorage.java:1738)
        at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.synch(OAbstractPaginatedStorage.java:1288)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared$1.call(OSchemaShared.java:1219)
        at com.orientechnologies.orient.core.db.OScenarioThreadLocal.executeAsDistributed(OScenarioThreadLocal.java:71)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.saveInternal(OSchemaShared.java:1212)
        at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:642)
        at com.orientechnologies.orient.core.metadata.schema.OClassImpl.releaseSchemaWriteLock(OClassImpl.java:1883)
        at com.orientechnologies.orient.core.metadata.schema.OClassImpl.releaseSchemaWriteLock(OClassImpl.java:1878)
        at com.orientechnologies.orient.core.metadata.schema.OClassImpl.addProperty(OClassImpl.java:2387)
        at com.orientechnologies.orient.core.metadata.schema.OClassImpl.createProperty(OClassImpl.java:769)

Looks realy strange... It's unclear: is it just one call for random file access or there multiple invocations. But so drastical degradation was not here before.

@andrii0lomakin
Copy link
Member

Well it is clear for me, and already fixed in issue which we discuss now
:-)).

On Tue, Mar 22, 2016 at 7:23 PM Ilia Naryzhny notifications@github.com
wrote:

During class creation it hanging here:

"qtp81628611-111" #111 prio=5 os_prio=0 tid=0x00007fc8cffc2800 nid=0x61ad runnable [0x00007fca2a1c3000]
java.lang.Thread.State: RUNNABLE
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.acquireSchemaReadLock(OSchemaShared.java:618)
at com.orientechnologies.orient.core.metadata.schema.OClassImpl.acquireSchemaReadLock(OClassImpl.java:1866)
at com.orientechnologies.orient.core.metadata.schema.OPropertyImpl.acquireSchemaReadLock(OPropertyImpl.java:1093)
at com.orientechnologies.orient.core.metadata.schema.OPropertyImpl.getLinkedType(OPropertyImpl.java:398)
at com.orientechnologies.orient.core.metadata.schema.OImmutableProperty.(OImmutableProperty.java:60)
at com.orientechnologies.orient.core.metadata.schema.OImmutableClass.(OImmutableClass.java:86)
at com.orientechnologies.orient.core.metadata.schema.OImmutableSchema.(OImmutableSchema.java:39)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.saveInternal(OSchemaShared.java:1228)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:642)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:631)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.doCreateClass(OSchemaShared.java:1050)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:415)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:400)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:275)
at com.orientechnologies.orient.core.metadata.schema.OSchemaProxy.createClass(OSchemaProxy.java:71)

And then a catched:

"qtp81628611-117" #117 prio=5 os_prio=0 tid=0x00007fc928004800 nid=0x6351 runnable [0x00007fca29378000]
java.lang.Thread.State: RUNNABLE
at java.io.RandomAccessFile.write0(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:489)
at java.io.RandomAccessFile.writeInt(RandomAccessFile.java:1031)
at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog.writeMasterRecord(ODiskWriteAheadLog.java:1210)
at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog.log(ODiskWriteAheadLog.java:884)
at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OAbstractWriteAheadLog.logFullCheckpointStart(OAbstractWriteAheadLog.java:67)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.makeFullCheckpoint(OAbstractPaginatedStorage.java:1738)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.synch(OAbstractPaginatedStorage.java:1288)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared$1.call(OSchemaShared.java:1219)
at com.orientechnologies.orient.core.db.OScenarioThreadLocal.executeAsDistributed(OScenarioThreadLocal.java:71)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.saveInternal(OSchemaShared.java:1212)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:642)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:631)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.doCreateClass(OSchemaShared.java:1050)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:415)

During property creation:

"qtp81628611-113" #113 prio=5 os_prio=0 tid=0x00007fc8ef9f5000 nid=0x626d runnable [0x00007fca2a1c3000]
java.lang.Thread.State: RUNNABLE
at java.io.RandomAccessFile.write0(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:489)
at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1049)
at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog.writeMasterRecord(ODiskWriteAheadLog.java:1212)
at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.ODiskWriteAheadLog.log(ODiskWriteAheadLog.java:884)
at com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OAbstractWriteAheadLog.logFullCheckpointStart(OAbstractWriteAheadLog.java:67)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.makeFullCheckpoint(OAbstractPaginatedStorage.java:1738)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.synch(OAbstractPaginatedStorage.java:1288)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared$1.call(OSchemaShared.java:1219)
at com.orientechnologies.orient.core.db.OScenarioThreadLocal.executeAsDistributed(OScenarioThreadLocal.java:71)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.saveInternal(OSchemaShared.java:1212)
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.releaseSchemaWriteLock(OSchemaShared.java:642)
at com.orientechnologies.orient.core.metadata.schema.OClassImpl.releaseSchemaWriteLock(OClassImpl.java:1883)
at com.orientechnologies.orient.core.metadata.schema.OClassImpl.releaseSchemaWriteLock(OClassImpl.java:1878)
at com.orientechnologies.orient.core.metadata.schema.OClassImpl.addProperty(OClassImpl.java:2387)
at com.orientechnologies.orient.core.metadata.schema.OClassImpl.createProperty(OClassImpl.java:769)

Looks realy strange... It's unclear: is it just one call for random file
access or there multiple invocations. But so drastical degradation was not
here before.


You are receiving this because you were assigned.

Reply to this email directly or view it on GitHub
#5352 (comment)

Best regards,
Andrey Lomakin, R&D lead.
OrientDB Ltd

twitter:@Andrey_Lomakin linkedin:https://ua.linkedin.com/in/andreylomakin

@PhantomYdn
Copy link
Contributor Author

That's great:) Thanks! But when we will have some fix? In 2.2 or 2.1.14? And as I said, things became worsen in latest 2.1.X fixes... Any workaround for 2.1?

@andrii0lomakin
Copy link
Member

Well sorry we do not provide radical changes in design of underlying engine
in hotfixes.
((.

On Wed, Mar 23, 2016 at 11:02 AM Ilia Naryzhny notifications@github.com
wrote:

That's great:) Thanks! But when we will have some fix? In 2.2 or 2.1.14?
And as I said, things became worsen in latest 2.1.X fixes... Any workaround
for 2.1?


You are receiving this because you were assigned.

Reply to this email directly or view it on GitHub
#5352 (comment)

Best regards,
Andrey Lomakin, R&D lead.
OrientDB Ltd

twitter:@Andrey_Lomakin linkedin:https://ua.linkedin.com/in/andreylomakin

@smolinari
Copy link
Contributor

So the fix is in 2.2, right?

Scott

@andrii0lomakin
Copy link
Member

Yes, exactly.

@andrii0lomakin
Copy link
Member

@PhantomYdn Did you have a chance to try it on 2.2 rc1 ?

@andrii0lomakin andrii0lomakin modified the milestones: 2.2.1, 2.2.0 GA May 10, 2016
@andrii0lomakin andrii0lomakin modified the milestones: 2.2.1, 2.2.x (next hotfix) May 23, 2016
@andrii0lomakin
Copy link
Member

Close issue because of no reply. If you reproduce given issue and have information which is requested in issue feel free to reopen it.

@PhantomYdn
Copy link
Contributor Author

Was not able to mobe due to #6157 . Will be able to say something after full transion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

5 participants