-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use sequence numbers to identify out of order delivery in replicas & recovery #24060
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM I left some minors that I would love to be addressed but no need for another round on my end.
import static org.elasticsearch.common.lucene.uid.Versions.NOT_FOUND; | ||
|
||
/** Utility class to resolve the Lucene doc ID, version, seqNo and primaryTerms for a given uid. */ | ||
public class VersionsAndSeqNoResolver { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this class final? and does it need to be public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
* </ul> | ||
*/ | ||
public static DocIdAndVersion loadDocIdAndVersion(IndexReader reader, Term term) throws IOException { | ||
assert term.field().equals(UidFieldMapper.NAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh please give us a message here what field it is... it will save some CPU cycles if shit hits the fan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good one.
* </ul> | ||
*/ | ||
public static DocIdAndSeqNo loadDocIdAndSeqNo(IndexReader reader, Term term) throws IOException { | ||
assert term.field().equals(UidFieldMapper.NAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
if (versionValue != null) { | ||
if (op.seqNo() > versionValue.getSeqNo() || | ||
(op.seqNo() == versionValue.getSeqNo() && op.primaryTerm() > versionValue.getTerm())) | ||
status = OpVsLuceneDocStatus.OP_NEWER; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in general I find it odd that we have OP_STALE_OR_EQUAL
instead of OP_NEWER_OR_EQUAL
which is more natural then you have only one state where indexing would be fatal but I can see that this is an optimization to not index the doc again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes indeed. In the case of equality it should be identical and we can skip tokenization etc.
@@ -601,7 +640,14 @@ private IndexingStrategy planIndexingAsNonPrimary(Index index) throws IOExceptio | |||
// unlike the primary, replicas don't really care to about creation status of documents | |||
// this allows to ignore the case where a document was found in the live version maps in | |||
// a delete state and return false for the created flag in favor of code simplicity | |||
final OpVsLuceneDocStatus opVsLucene = compareOpToLuceneDocBasedOnVersions(index); | |||
final OpVsLuceneDocStatus opVsLucene; | |||
if (index.seqNo() != SequenceNumbersService.UNASSIGNED_SEQ_NO) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please leave a comment here when this unassigned seq no can occur... and when it can go away
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left an extra comment on the assertion in the else clause:
// This can happen if the primary is still on an old node and send traffic without seq# or we recover from translog
// created by an old version.
assert config().getIndexSettings().getIndexVersionCreated().before(Version.V_6_0_0_alpha1_UNRELEASED) :
"index is newly created but op has no sequence numbers. op: " + index;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
@@ -32,8 +32,15 @@ | |||
/** the version of the document. used for versioned indexed operations and as a BWC layer, where no seq# are set yet */ | |||
private final long version; | |||
|
|||
VersionValue(long version) { | |||
/** the seq number of the operation that last changed the associated uuid */ | |||
private final long seqNo; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can also just use the members no need to getters... I like it when it's struct like for internal things like this. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
works for me. Less fluff is good.
@s1monw all your feedback is addressed. Thx. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR was very easy to read; I'm glad that it was factored out of the first PR for this reason. I left some very minor comments. It LGTM and I don't need to do another review.
@@ -43,15 +46,18 @@ | |||
* in more than one document! It will only return the first one it | |||
* finds. */ | |||
|
|||
final class PerThreadIDAndVersionLookup { | |||
final class PerThreadIDAndVersionSeqNoLookup { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "and" is in an awkward place now, maybe PerThreadIDVersionAndSeqNoLookup
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed and fixed.
@@ -113,4 +121,21 @@ private int getDocID(BytesRef id, Bits liveDocs) throws IOException { | |||
return DocIdSetIterator.NO_MORE_DOCS; | |||
} | |||
} | |||
|
|||
/** Return null if id is not found. */ | |||
public DocIdAndSeqNo lookupSequenceNo(BytesRef id, Bits liveDocs, LeafReaderContext context) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think lookupSeqNo
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this method can be package private? And maybe the other lookup methods in this class too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed on all accounts.
CloseableThreadLocal<PerThreadIDAndVersionSeqNoLookup> other = lookupStates.putIfAbsent(key, ctl); | ||
if (other == null) { | ||
// Our CTL won, we must remove it when the | ||
// core is closed: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that you can put this comment all on one line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it was like that before, but we are here now. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
reader.addCoreClosedListener(removeLookupState); | ||
} else { | ||
// Another thread beat us to it: just use | ||
// their CTL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that you can put this comment all on one line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it was like that before, but we are here now. 😄
} | ||
} | ||
|
||
/** returns 0 if the primary term is not found */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add link to IndexMetaData#primaryTerm
as it has a note about the primary term on an operational shard always being strictly positive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good one.
@@ -416,6 +417,43 @@ public GetResult get(Get get, Function<String, Searcher> searcherFactory) throws | |||
LUCENE_DOC_NOT_FOUND | |||
} | |||
|
|||
private OpVsLuceneDocStatus compareOpToLuceneDocBasedOnSeqNo(final Operation op) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's so nice to finally see this arriving.
@@ -416,6 +417,43 @@ public GetResult get(Get get, Function<String, Searcher> searcherFactory) throws | |||
LUCENE_DOC_NOT_FOUND | |||
} | |||
|
|||
private OpVsLuceneDocStatus compareOpToLuceneDocBasedOnSeqNo(final Operation op) throws IOException { | |||
assert op.seqNo() != SequenceNumbersService.UNASSIGNED_SEQ_NO : "resolving ops based seq# but no seqNo is found"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: based
-> based on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed. Thx.
} | ||
|
||
/** returns 0 if the primary term is not found */ | ||
public long lookUpPrimaryTerm(int docID) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can it be package private?
public final long version; | ||
public final LeafReaderContext context; | ||
|
||
public DocIdAndVersion(int docId, long version, LeafReaderContext context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this constructor can be package private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
public final long seqNo; | ||
public final LeafReaderContext context; | ||
|
||
public DocIdAndSeqNo(int docId, long seqNo, LeafReaderContext context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this constructor can be package private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Punch it.
Thx @jasontedor @s1monw |
* master: Use sequence numbers to identify out of order delivery in replicas & recovery (elastic#24060) Remove customization of ES_USER and ES_GROUP
Internal indexing requests in Elasticsearch may be processed out of order and repeatedly. This is important during recovery and due to concurrency in replicating requests between primary and replicas. As such, a replica/recovering shard needs to be able to identify that an incoming request contains information that is old and thus need not be processed. The current logic is based on external version. This is sadly not sufficient. This PR moves the logic to rely on sequences numbers and primary terms which give the semantics we need.
Relates to #10708