Error creating a new database in a cluster with one node (orientdb version 2.1.6) #5375

Closed
sandhya-inbetween opened this issue Nov 25, 2015 · 7 comments
Comments

@sandhya-inbetween

Hi, we intend to set up an OrientDB cluster (3 nodes) running in Docker containers on three different VMs (a multi-host OrientDB cluster with Docker). However, we are facing issues while creating a new database, even with a single node. This is a critical issue for us, as we are very close to the first beta release of our product.

Error occurs in version: 2.1.6
Docker container OS: Ubuntu 14.04
Steps to reproduce:

  1. Start node #1 using dserver.sh.
  2. Send a curl request to create a new database.
  3. The above call results in errors on OrientDB node #1.
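
For readers who want to reproduce step 2, here is a minimal sketch of the request. It assumes OrientDB's HTTP API on its default port 2480 and the `POST /database/<name>/<storage>` form. The host, the `plocal` storage type, and the credentials are placeholders, not values from this report; the database name `Test1` is taken from the stack trace in this issue.

```python
import base64
import urllib.request


def build_create_db_request(host, db_name, storage="plocal",
                            user="root", password="root_password", port=2480):
    """Build (but do not send) an HTTP POST asking the server to create a database.

    The user, password and storage defaults are hypothetical placeholders.
    """
    url = f"http://{host}:{port}/database/{db_name}/{storage}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, data=b"", method="POST")
    req.add_header("Authorization", f"Basic {token}")
    return req


if __name__ == "__main__":
    request = build_create_db_request("localhost", "Test1")
    print(request.get_method(), request.full_url)
    # To actually send it against a running server:
    # urllib.request.urlopen(request, timeout=30)
```

The request is only built here, not sent, so the sketch can be inspected without a running server.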

Error stacktrace:

Received: {node1448441646293=waiting-for-response} [ODistributedResponseManager]
2015-11-25 08:54:57:864 WARNI [node1448441646293] Quorum 1 not reached for request (id=0 from=node1448441646293 task=record_read(#5:1) user=#5:0). Elapsed=15027ms No server in conflict. Received: {node1448441646293=waiting-for-response} [ODistributedResponseManager]
2015-11-25 08:54:57:864 SEVER failed to convert to OUser Error on retrieving record #5:0 (cluster: ouser) [ODistributedWorker]Error on fetching record during browsing. The record has been skipped
com.orientechnologies.orient.core.exception.ODatabaseException: Error on retrieving record #5:1 (cluster: ouser)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:1849)
at com.orientechnologies.orient.core.tx.OTransactionNoTx.loadRecord(OTransactionNoTx.java:92)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:1592)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.load(ODatabaseDocumentTx.java:121)
at com.orientechnologies.orient.core.iterator.OIdentifiableIterator.readCurrentRecord(OIdentifiableIterator.java:287)
at com.orientechnologies.orient.core.iterator.ORecordIteratorClusters.hasNext(ORecordIteratorClusters.java:160)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.fetchFromTarget(OCommandExecutorSQLSelect.java:1422)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.executeSearch(OCommandExecutorSQLSelect.java:469)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.execute(OCommandExecutorSQLSelect.java:427)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLDelegate.execute(OCommandExecutorSQLDelegate.java:90)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.executeCommand(OAbstractPaginatedStorage.java:1538)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.command(OAbstractPaginatedStorage.java:1519)
at com.orientechnologies.orient.server.distributed.ODistributedStorage.command(ODistributedStorage.java:268)
at com.orientechnologies.orient.core.sql.query.OSQLQuery.run(OSQLQuery.java:72)
at com.orientechnologies.orient.core.sql.query.OSQLSynchQuery.run(OSQLSynchQuery.java:85)
at com.orientechnologies.orient.core.query.OQueryAbstract.execute(OQueryAbstract.java:33)
at com.orientechnologies.orient.core.metadata.security.OSecurityShared.getAllUsers(OSecurityShared.java:300)
at com.orientechnologies.orient.core.metadata.security.OSecurityProxy.getAllUsers(OSecurityProxy.java:127)
at com.orientechnologies.orient.server.network.protocol.http.command.post.OServerCommandPostDatabase.sendDatabaseInfo(OServerCommandPostDatabase.java:155)
at com.orientechnologies.orient.server.network.protocol.http.command.post.OServerCommandPostDatabase.execute(OServerCommandPostDatabase.java:83)
at com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpAbstract.service(ONetworkProtocolHttpAbstract.java:180)
at com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpAbstract.execute(ONetworkProtocolHttpAbstract.java:627)
at com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:77)
Caused by: com.orientechnologies.orient.server.distributed.ODistributedException: Error on executing distributed request (id=0 from=node1448441646293 task=record_read(#5:1) user=#5:0) against database 'Test1.[ouser]' to nodes [node1448441646293]
at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.send2Nodes(OHazelcastDistributedDatabase.java:189)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.sendRequest(OHazelcastPlugin.java:359)
at com.orientechnologies.orient.server.distributed.ODistributedStorage.readRecord(ODistributedStorage.java:592)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx$SimpleRecordReader.readRecord(ODatabaseDocumentTx.java:3193)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeReadRecord(ODatabaseDocumentTx.java:1816)
... 22 more
Caused by: com.orientechnologies.orient.server.distributed.ODistributedException: Quorum 1 not reached for request (id=0 from=node1448441646293 task=record_read(#5:1) user=#5:0). Elapsed=15027ms No server in conflict. Received: {node1448441646293=waiting-for-response}
at com.orientechnologies.orient.server.distributed.ODistributedResponseManager.manageConflicts(ODistributedResponseManager.java:585)
at com.orientechnologies.orient.server.distributed.ODistributedResponseManager.getFinalResponse(ODistributedResponseManager.java:349)
at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.waitForResponse(OHazelcastDistributedDatabase.java:423)
at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.send2Nodes(OHazelcastDistributedDatabase.java:186)
... 26 more

2015-11-25 08:55:00:867 WARNI [node1448441646293] timeout (3000ms) on waiting for synchronous responses from nodes=[node1448441646293] responsesSoFar=[] request=id=2 from=node1448441646293 task=record_read(#5:1) [OHazelcastDistributedDatabase]
2015-11-25 08:55:00:867 WARNI [node1448441646293] no response received from local node about request id=2 from=node1448441646293 task=record_read(#5:1) [ODistributedResponseManager]
2015-11-25 08:55:00:868 WARNI [node1448441646293] detected 1 node(s) in timeout or in conflict and quorum (1) has not been reached, rolling back changes for request (id=2 from=node1448441646293 task=record_read(#5:1)) [ODistributedResponseManager]

Our configuration files are as below:

Hazelcast Configuration file:

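Group name: orientdb
Group password: orientdb
Network port: 2434
Multicast group: 235.1.1.1
Multicast port: 2434
TCP-IP members:
192.168.134.79:2434
192.168.134.81:2434
192.168.135.25:2434
Interface: 192.168.134.81
Executor pool size: 16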

default-distributed-db-config.json
{
  "autoDeploy": true,
  "hotAlignment": true,
  "executionMode": "asynchronous",
  "readQuorum": 2,
  "writeQuorum": 2,
  "failureAvailableNodesLessQuorum": false,
  "readYourWrites": true,
  "servers": {
    "*": "master"
  },
  "clusters": {
    "internal": {
    },
    "index": {
    },
    "*": {
      "servers": ["<NEW_NODE>"]
    }
  }
}

@sandhya-inbetween
Author

We are able to reproduce the same error on Windows 7 as well.

@sandhya-inbetween
Author

We just realized that in the scenario above we had set readQuorum to 2 but started only one node, which is why this error occurred. We have now started 3 nodes and tried to create the database again. The database gets created, but we are getting the following error on one of the three nodes.
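
The mismatch described here (a quorum of 2 with only one node running) can be checked before starting a cluster. The following is a minimal sketch of that arithmetic only; the function name is hypothetical and it does not reflect how OrientDB evaluates quorums internally.

```python
def unreachable_quorums(config, online_nodes):
    """Return the quorum settings that exceed the number of online nodes."""
    problems = {}
    for key in ("readQuorum", "writeQuorum"):
        required = config.get(key)
        # A quorum larger than the number of running nodes can never be met.
        if isinstance(required, int) and required > online_nodes:
            problems[key] = required
    return problems


if __name__ == "__main__":
    config = {"readQuorum": 2, "writeQuorum": 2}
    print(unreachable_quorums(config, online_nodes=1))
    print(unreachable_quorums(config, online_nodes=3))
```

With the settings from this report, both quorums are flagged for a single node and neither is flagged for three nodes.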

Steps to Reproduce:

1. Start node #1, node #2, and node #3 using dserver.sh.
2. Send a curl request to create a new database.
3. The error occurs on one of the nodes.

Error stacktrace in one node:

2015-11-25 17:23:36:914 INFO [node1448375802075] class 'ORole', creation of new local cluster 'orole_node1448375802075' (id=-1) [OHazelcastPlugin]
2015-11-25 17:23:56:979 WARNI [node1448375802075] timeout (20017ms) on waiting for synchronous responses from nodes=[node1448452328356] responsesSoFar=[] request=id=5 from=node1448375802075 task=command_sql(create cluster orole_node1448375802075) [OHazelcastDistributedDatabase]
2015-11-25 17:23:56:979 WARNI [node1448375802075] detected 1 node(s) in timeout or in conflict and quorum (1) has not been reached, rolling back changes for request (id=5 from=node1448375802075 task=command_sql(create cluster orole_node1448375802075)) [ODistributedResponseManager]
2015-11-25 17:23:56:979 WARNI [node1448375802075] Quorum 1 not reached for request (id=5 from=node1448375802075 task=command_sql(create cluster orole_node1448375802075)). Elapsed=20025ms No server in conflict. Received: {node1448452328356=waiting-for-response} [ODistributedResponseManager]
2015-11-25 17:23:57:019 SEVER [node1448375802075] error on creating cluster 'orole_node1448375802075' in class 'ORole': [OHazelcastPlugin][192.168.135.25]:2434 [orientdb] [3.5.3] Error while logging processing event
com.orientechnologies.orient.server.distributed.ODistributedException: com.orientechnologies.orient.server.distributed.ODistributedException: Error on creating cluster 'orole_node1448375802075' in class 'ORole'
at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.configureDatabase(OHazelcastDistributedDatabase.java:241)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installDatabaseFromNetwork(OHazelcastPlugin.java:1122)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.requestDatabase(OHazelcastPlugin.java:964)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installDatabase(OHazelcastPlugin.java:901)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installNewDatabases(OHazelcastPlugin.java:1459)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.checkDatabaseEvent(OHazelcastPlugin.java:1269)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.entryAdded(OHazelcastPlugin.java:677)
at com.hazelcast.map.impl.MapListenerAdaptors$1$1.onEvent(MapListenerAdaptors.java:63)
at com.hazelcast.map.impl.InternalMapListenerAdapter.onEvent(InternalMapListenerAdapter.java:51)
at com.hazelcast.map.impl.MapEventPublishingService.callListener(MapEventPublishingService.java:90)
at com.hazelcast.map.impl.MapEventPublishingService.dispatchEntryEventData(MapEventPublishingService.java:102)
at com.hazelcast.map.impl.MapEventPublishingService.dispatchEvent(MapEventPublishingService.java:46)
at com.hazelcast.map.impl.MapEventPublishingService.dispatchEvent(MapEventPublishingService.java:33)
at com.hazelcast.map.impl.MapService.dispatchEvent(MapService.java:91)
at com.hazelcast.map.impl.MapService.dispatchEvent(MapService.java:61)
at com.hazelcast.spi.impl.eventservice.impl.EventPacketProcessor.process(EventPacketProcessor.java:53)
at com.hazelcast.spi.impl.eventservice.impl.RemoteEventPacketProcessor.run(RemoteEventPacketProcessor.java:38)
at com.hazelcast.util.executor.StripedExecutor$Worker.process(StripedExecutor.java:190)
at com.hazelcast.util.executor.StripedExecutor$Worker.run(StripedExecutor.java:174)
Caused by: com.orientechnologies.orient.server.distributed.ODistributedException: Error on creating cluster 'orole_node1448375802075' in class 'ORole'
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installLocalClusterPerClass(OHazelcastPlugin.java:1622)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installDbClustersForLocalNode(OHazelcastPlugin.java:1291)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin$2.call(OHazelcastPlugin.java:1125)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin$2.call(OHazelcastPlugin.java:1122)
at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.configureDatabase(OHazelcastDistributedDatabase.java:239)
... 18 more
Caused by: com.orientechnologies.orient.server.distributed.ODistributedException: Error on executing distributed request (id=5 from=node1448375802075 task=command_sql(create cluster orole_node1448375802075)) against database 'Test1.[]' to nodes [node1448452328356]
at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.send2Nodes(OHazelcastDistributedDatabase.java:189)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.sendRequest(OHazelcastPlugin.java:359)
at com.orientechnologies.orient.server.distributed.ODistributedStorage.command(ODistributedStorage.java:307)
at com.orientechnologies.orient.server.distributed.ODistributedStorage.addCluster(ODistributedStorage.java:1269)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.addCluster(ODatabaseDocumentTx.java:1263)
at com.orientechnologies.orient.core.metadata.schema.OClassImpl.createClusterIfNeeded(OClassImpl.java:2091)
at com.orientechnologies.orient.core.metadata.schema.OClassImpl.addCluster(OClassImpl.java:1058)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installLocalClusterPerClass(OHazelcastPlugin.java:1615)
... 22 more
Caused by: com.orientechnologies.orient.server.distributed.ODistributedException: Quorum 1 not reached for request (id=5 from=node1448375802075 task=command_sql(create cluster orole_node1448375802075)). Elapsed=20025ms No server in conflict. Received: {node1448452328356=waiting-for-response}
at com.orientechnologies.orient.server.distributed.ODistributedResponseManager.manageConflicts(ODistributedResponseManager.java:585)
at com.orientechnologies.orient.server.distributed.ODistributedResponseManager.getFinalResponse(ODistributedResponseManager.java:349)
at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.waitForResponse(OHazelcastDistributedDatabase.java:423)
at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.send2Nodes(OHazelcastDistributedDatabase.java:186)
... 29 more

@lvca
Member

lvca commented Nov 28, 2015

Having writeQuorum=2 and readQuorum=2 makes no sense, because writeQuorum already guarantees consistency on write. Could you please try setting readQuorum=1 and retry?
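
The suggested change can be applied to the JSON shown earlier in this issue. Below is a minimal sketch, assuming the config is a plain JSON object like the `default-distributed-db-config.json` posted above; the helper name is hypothetical, and the edit can just as well be made by hand.

```python
import json


def with_read_quorum(config_text, read_quorum=1):
    """Return the distributed config JSON with readQuorum replaced, other keys untouched."""
    config = json.loads(config_text)
    config["readQuorum"] = read_quorum
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    original = '{"readQuorum": 2, "writeQuorum": 2, "readYourWrites": true}'
    print(with_read_quorum(original))
```

The server would need to pick up the edited file; how and when OrientDB reloads this configuration is not covered by this sketch.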

@lvca lvca added this to the 2.1.x (next hotfix) milestone Nov 28, 2015
@gayatri-inbetween

Hi lvca,

I am working on the same issue, which was reported above by a teammate.
We tried to build the 2.1.x branch, but the Ant build fails with the following error:

orientdb/distributed/src/main/java/com/orientechnologies/orient/server/hazelcast/OHazelcastDistributedMessageService.java:236: error: cannot find symbol
[javac] if (d.getServiceName().equals(QueueService.SERVICE_NAME))
[javac] ^
[javac] symbol: variable QueueService
[javac] location: class OHazelcastDistributedMessageService
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 2 errors
[javac] 4 warnings

BUILD FAILED
/orientdb/build.xml:133: The following error occurred while executing this line:
/orientdb/build.xml:21: The following error occurred while executing this line:
/orientdb/_base/base-build.xml:44: Compile failed; see the compiler error output for details.

@lvca
Member

lvca commented Dec 4, 2015

Don't use Ant; use Maven:

mvn clean install

@sandhya-inbetween
Author

We tested the suggested solutions on version 2.1.7 and it works.
Thank you.

@lvca lvca closed this as completed Dec 9, 2015
@lvca
Member

lvca commented Dec 9, 2015

Cool, closing it.

@lvca lvca modified the milestones: 2.1.7, 2.1.x (next hotfix) Dec 9, 2015