2.1-RC3: Error on executing distributed request for counter: Quorum 2 not reached for request (but it is?) #4316

alexpmorris · 2015-06-06T18:26:18Z

I was trying to put together a short Java demo/test case for a distributed chat application, and I keep hitting the following error with OrientDB 2.1-RC3. Write quorum is set to 2. The responses listed in the error indicate that both nodes returned the same value, yet the error persists. I also tried changing the conflict strategy on the class to version, content, and automerge, all with the same result.

Also note that although an exception is thrown each time, the field IS still incremented correctly each time:

com.orientechnologies.orient.server.distributed.ODistributedException: Error on executing distributed request (id=82 from=db000 task=command_sql(UPDATE ChatCounters INCREMENT counter=1 RETURN AFTER $current.counter WHERE name='counter1') user=#5:0) against database 'chat.[]' to nodes [db001, db000]
--> com.orientechnologies.orient.server.distributed.ODistributedException: Quorum 2 not reached for request (id=82 from=db000 task=command_sql(UPDATE ChatCounters INCREMENT counter=1 RETURN AFTER $current.counter WHERE name='counter1') user=#5:0). Elapsed=15ms Servers in timeout/conflict are: - db001: [{value:33}] Received: {db000=[{value:33}], db001=[{value:33}]}

com.orientechnologies.orient.server.distributed.ODistributedException: Error on executing distributed request (id=82 from=db000 task=command_sql(UPDATE ChatCounters INCREMENT counter=1 RETURN AFTER $current.counter WHERE name='counter1') user=#5:0) against database 'chat.[]' to nodes [db001, db000]
--> com.orientechnologies.orient.server.distributed.ODistributedException: Quorum 2 not reached for request (id=82 from=db000 task=command_sql(UPDATE ChatCounters INCREMENT counter=1 RETURN AFTER $current.counter WHERE name='counter1') user=#5:0). Elapsed=15ms Servers in timeout/conflict are: - db001: [{value:34}] Received: {db000=[{value:34}], db001=[{value:34}]}
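
To make the apparent contradiction concrete: the "Received" map in each exception shows both nodes returning the same payload. Below is a minimal sketch of a quorum check, assuming quorum is decided by counting the largest group of responses that are equal by value. This is an illustration only, not OrientDB's actual ODistributedResponseManager logic; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: NOT OrientDB code. Assumes quorum means
// "at least N nodes returned responses that are equal by value".
public class QuorumSketch {

    /** Returns the size of the largest group of nodes with equal responses. */
    public static int largestAgreeingGroup(Map<String, ?> responsesByNode) {
        Map<Object, Integer> counts = new HashMap<>();
        int best = 0;
        for (Object response : responsesByNode.values()) {
            // Count how many nodes returned this exact (value-equal) response.
            int n = counts.merge(response, 1, Integer::sum);
            best = Math.max(best, n);
        }
        return best;
    }

    /** True when enough nodes agree on the same response. */
    public static boolean quorumReached(Map<String, ?> responsesByNode, int quorum) {
        return largestAgreeingGroup(responsesByNode) >= quorum;
    }

    public static void main(String[] args) {
        // Mirrors: Received: {db000=[{value:33}], db001=[{value:33}]}
        Map<String, List<Map<String, Integer>>> received = new LinkedHashMap<>();
        received.put("db000", List.of(Map.of("value", 33)));
        received.put("db001", List.of(Map.of("value", 33)));
        System.out.println(quorumReached(received, 2)); // prints true
    }
}
```

Under that assumption, the responses in the error message would satisfy a quorum of 2, which is why the exception looks wrong.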

lvca · 2015-06-12T11:51:10Z

Could you please try with last 2.1-SNAPSHOT?

alexpmorris · 2015-06-12T17:14:03Z

I tried with the latest snapshot and the problem is still there. HOWEVER, the latest snapshot listed is orientdb-community-2.1-20150610.173559-189-distribution.zip

So I'll try again when the next update shows up for today's date or better.

alexpmorris · 2015-06-16T23:52:53Z

Just tried the latest version from orientdb-community-2.1-20150616.215843-194-distribution.zip

Unfortunately, same result, and I still received the same error message.

alexpmorris · 2015-06-17T00:04:46Z

I found some more warning messages that could perhaps help better diagnose the problem:

2015-06-16 19:56:35:088 WARNING [db000] detected 1 node(s) in timeout or in conflict and quorum (2) has not been reached, rolling back changes for request (id=10 from=db000 task=command_sql(UPDATE ChatCounters INCREMENT counter=1 RETURN AFTER $current.counter WHERE name='counter1') user=#5:0) [ODistributedResponseManager]

2015-06-16 19:56:35:088 WARNING [db000] Quorum 2 not reached for request (id=10 from=db000 task=command_sql(UPDATE ChatCounters INCREMENT counter=1 RETURN AFTER $current.counter WHERE name='counter1') user=#5:0). Elapsed=16ms Servers in timeout/conflict are: - db000: [{value:52}] Received: {db000=[{value:52}], db001=[{value:52}]} [ODistributedResponseManager]

2015-06-16 19:56:35:088 WARNING [db000] sending undo message for request (id=10 from=db000 task=command_sql(UPDATE ChatCounters INCREMENT counter=1 RETURN AFTER $current.counter WHERE name='counter1') user=#5:0) to server db000 [ODistributedResponseManager]

2015-06-16 19:56:35:088 WARNING [db000] sending undo message for request (id=10 from=db000 task=command_sql(UPDATE ChatCounters INCREMENT counter=1 RETURN AFTER $current.counter WHERE name='counter1') user=#5:0) to server db001 [ODistributedResponseManager]

2015-06-16 19:56:35:088 SEVERE Internal server error: com.orientechnologies.orient.server.distributed.ODistributedException: Error on executing distributed request (id=10 from=db000 task=command_sql(UPDATE ChatCounters INCREMENT counter=1 RETURN AFTER $current.counter WHERE name='counter1') user=#5:0) against database 'chat.[]' to nodes [db001, db000] --> com.orientechnologies.orient.server.distributed.ODistributedException: Quorum 2 not reached for request (id=10 from=db000 task=command_sql(UPDATE ChatCounters INCREMENT counter=1 RETURN AFTER $current.counter WHERE name='counter1') user=#5:0). Elapsed=16ms Servers in timeout/conflict are: - db000: [{value:52}] Received: {db000=[{value:52}], db001=[{value:52}]} [ONetworkProtocolHttpDb]

alexpmorris · 2015-06-17T00:15:36Z

After some additional testing, it seems as if the problem is related to this part of the SQL query:

RETURN AFTER $current.counter

because when I queried just the following (without asking for the new value to be returned), I received NO error and everything appeared to work correctly:

UPDATE ChatCounters INCREMENT counter=1 WHERE name='counter1'

lvca · 2015-11-19T16:47:35Z

As a workaround, could you use this?

UPDATE ChatCounters INCREMENT counter=1 WHERE name='counter1' RETURN AFTER $current

And get the counter from the record?

alexpmorris · 2015-11-20T05:56:52Z

I tried banging away a bunch of times with the latest 2.2 snapshot, and it didn't give me the error this time for either "RETURN AFTER $current" or "RETURN AFTER $current.counter". One thing though, in case it's a bug, if RETURN is after WHERE as in your statement, I get an error:

FAILS:
UPDATE ChatCounters INCREMENT counter=1 WHERE name='counter1' RETURN AFTER $current

com.orientechnologies.orient.core.sql.OCommandSQLParsingException: Error on parsing command at position #0: Encountered " <RETURN> "RETURN "" at line 1, column 63.
 Was expecting one of:
 <EOF> 
 <AND> ...
 <OR> ...
 <LIMIT> ...
 <TIMEOUT> ...
 <LOCK> ...
 ";" ...

This seems to work now (as does "RETURN AFTER $current"):
UPDATE ChatCounters INCREMENT counter=1 RETURN AFTER $current.counter WHERE name='counter1'

lvca · 2015-11-22T19:10:45Z

My mistake: RETURN must go before WHERE, as per the docs.
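
For anyone assembling this statement in client code, here is a minimal sketch of the clause order that parsed successfully in this thread (RETURN before WHERE). The helper class and its parameters are hypothetical, and plain string concatenation is used only for illustration; it does no escaping.

```java
// Hypothetical helper: shows only the clause order reported to work above.
public class CounterUpdate {

    /** Builds UPDATE ... INCREMENT ... RETURN AFTER ... WHERE ... in that order. */
    public static String incrementAndReturn(String className, String field, int delta,
                                            String returnExpr, String whereClause) {
        return "UPDATE " + className
            + " INCREMENT " + field + "=" + delta
            + " RETURN AFTER " + returnExpr   // RETURN must come before WHERE
            + " WHERE " + whereClause;
    }

    public static void main(String[] args) {
        // Prints the statement that worked in the 2.2 snapshot test above.
        System.out.println(incrementAndReturn(
            "ChatCounters", "counter", 1, "$current.counter", "name='counter1'"));
    }
}
```

Passing "$current" instead of "$current.counter" as the return expression gives the whole-record variant suggested as a workaround.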

alexpmorris changed the title ~~Error on executing distributed request for counter: Quorum 2 not reached for request (but it is?)~~ 2.1-RC3: Error on executing distributed request for counter: Quorum 2 not reached for request (but it is?) Jun 6, 2015

lvca added the bug label Jun 12, 2015

lvca self-assigned this Jun 12, 2015

lvca added this to the 2.1 GA milestone Jun 12, 2015

lvca added the waiting reply label Jun 12, 2015

prjhub removed waiting reply labels Jun 12, 2015

lvca modified the milestones: 2.1 GA, 2.1.1 Aug 5, 2015

lvca modified the milestones: 2.1.1, 2.1.x (next hotfix) Aug 31, 2015

lvca closed this as completed Nov 22, 2015

lvca modified the milestones: 2.2, 2.1.x (next hotfix) Nov 22, 2015
