Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use standard semantics for retried auto-id requests #47311

Merged
merged 5 commits into from
Oct 1, 2019

Conversation

ywelsch
Copy link
Contributor

@ywelsch ywelsch commented Sep 30, 2019

Adds support for handling auto-id requests with optype CREATE. Also simplifies the code handling this by using the standard indexing path when dealing with possible retry conflicts.

Relates #47169

@ywelsch ywelsch added >non-issue :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. v8.0.0 v7.5.0 labels Sep 30, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ywelsch for working on this. My primary comment here is if we should treat op_type=create with auto-id identically to op_type=index with auto-id? More details in comments.

@@ -267,9 +267,6 @@ public void testVersionMapAfterAutoIDDocument() throws IOException {
try (Engine.Searcher searcher = engine.acquireSearcher("test")) {
assertEquals(2, searcher.getIndexReader().numDocs());
}
if (operation.origin() == PRIMARY) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow why this is not still valid to assert?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an artifact of an earlier iteration where I had changed things up more fundamentally. I've reinstated this check again in 76066ee.

} else {
plan = IndexingStrategy.optimizedAppendOnly(1L);
}
final boolean canOptimizeAddDocument = canOptimizeAddDocument(index);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this means that a double delivery of an auto-id with op_type=create will fail with a version conflict? I know this is similar to regular if_seqno update, but I would prefer to not do that in the auto-id case, since we know that it is safe? Especially because this makes it simpler to later advocate to change POST index/_doc to imply op_type=create.

We can still do the code simplification, we should just avoid doing the isVersionConflictForWrites check when canOptimizeAddDocument=true.

Another version of it is to skip the write if the doc is already there? Would be kind of nicer towards someone doing update-by-query concurrently with this (the updated-counts would be slightly more correct, since we no longer risk a retry overwriting something that was already updated). However, I think we then risk returning a non-fsync'ed result and we should probably defer this to a follow-up if we want to do this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this will mean that double-delivery of an indexing operation with auto-id and optype create will fail with a version conflict, which is similar to an indexing operation without auto-id but with optype create. I agree that we should try to handle internal retries better (in both situations, with the auto-id case perhaps being simpler), but I also do not want to continue being lenient when it comes to overriding newer docs due to retries, in particular if the optype is create (even more so if we associate an index privilege with this).
My plan was to explore something along the lines of your last paragraph as a follow-up, but I do not see this as a blocker to release the given change here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

} else {
plan = IndexingStrategy.optimizedAppendOnly(1L);
}
final boolean canOptimizeAddDocument = canOptimizeAddDocument(index);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@ywelsch ywelsch merged commit 9993cf3 into elastic:master Oct 1, 2019
ywelsch added a commit to ywelsch/elasticsearch that referenced this pull request Oct 2, 2019
Adds support for handling auto-id requests with optype CREATE. Also simplifies the code
handling this by using the standard indexing path when dealing with possible retry conflicts.

Relates elastic#47169
ywelsch added a commit that referenced this pull request Oct 2, 2019
Adds support for handling auto-id requests with optype CREATE. Also simplifies the code
handling this by using the standard indexing path when dealing with possible retry conflicts.

Relates #47169
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >non-issue v7.5.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants