Introduce primary/replica mode for GlobalCheckPointTracker #25468

Merged

jasontedor merged 15 commits into elastic:master from ywelsch:enhance/harden-globalcheckpoint-tracker

Jul 7, 2017

Contributor

ywelsch commented Jun 29, 2017 •

This PR refactors the GlobalCheckpointTracker to make it more resilient. The main idea is to make explicit what state is actually captured and how that state is updated through replication, cluster state updates, etc. It also fixes the issue where the local checkpoint information is not updated when a shard becomes primary. The primary relocation handoff becomes very simple too: we can just copy over the internal state verbatim.

The PR is still missing some tests, which I will address soon. I am opening it as is mainly to get initial feedback.
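For readers new to the idea, here is a minimal sketch of what "primary mode" could look like. It is not the code in this PR: the class `TrackerSketch`, its `Entry` type, the `UNASSIGNED` sentinel and every method name are assumptions made for illustration. The sketch assumes that, in primary mode, the global checkpoint is the minimum local checkpoint over all in-sync copies, and that an in-sync copy with an unknown local checkpoint blocks advancement.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model for illustration; not the Elasticsearch class.
class TrackerSketch {
    // Assumed sentinel meaning "no local checkpoint known yet".
    static final long UNASSIGNED = -2L;

    static final class Entry {
        long localCheckpoint;
        final boolean inSync;

        Entry(long localCheckpoint, boolean inSync) {
            this.localCheckpoint = localCheckpoint;
            this.inSync = inSync;
        }
    }

    private final Map<String, Entry> localCheckpoints = new HashMap<>();
    private boolean primaryMode = false;
    private long globalCheckpoint = UNASSIGNED;

    // Start tracking a shard copy identified by its allocation ID.
    synchronized void track(String allocationId, boolean inSync) {
        localCheckpoints.putIfAbsent(allocationId, new Entry(UNASSIGNED, inSync));
    }

    // Switch to primary mode and record the primary's own local checkpoint.
    synchronized void activatePrimaryMode(String allocationId, long localCheckpoint) {
        if (primaryMode) {
            throw new IllegalStateException("already in primary mode");
        }
        Entry own = localCheckpoints.get(allocationId);
        if (own == null || !own.inSync) {
            throw new IllegalStateException("primary [" + allocationId + "] is not tracked as in-sync");
        }
        own.localCheckpoint = localCheckpoint;
        primaryMode = true;
        recompute();
    }

    // Primary mode only: a copy reports a new local checkpoint.
    synchronized void updateLocalCheckpoint(String allocationId, long localCheckpoint) {
        if (!primaryMode) {
            throw new IllegalStateException("local checkpoints are only tracked in primary mode");
        }
        Entry entry = localCheckpoints.get(allocationId);
        if (entry != null && localCheckpoint > entry.localCheckpoint) {
            entry.localCheckpoint = localCheckpoint;
            recompute();
        }
    }

    // Replica mode only: accept the global checkpoint announced by the primary.
    synchronized void updateGlobalCheckpointOnReplica(long announced) {
        if (primaryMode) {
            throw new IllegalStateException("a primary computes its own global checkpoint");
        }
        globalCheckpoint = Math.max(globalCheckpoint, announced);
    }

    synchronized long getGlobalCheckpoint() {
        return globalCheckpoint;
    }

    // Minimum over in-sync copies; an unknown checkpoint blocks advancement.
    private void recompute() {
        long min = Long.MAX_VALUE;
        for (Entry entry : localCheckpoints.values()) {
            if (entry.inSync) {
                if (entry.localCheckpoint == UNASSIGNED) {
                    return;
                }
                min = Math.min(min, entry.localCheckpoint);
            }
        }
        if (min != Long.MAX_VALUE && min > globalCheckpoint) {
            globalCheckpoint = min;
        }
    }

    public static void main(String[] args) {
        TrackerSketch tracker = new TrackerSketch();
        tracker.track("primary", true);
        tracker.track("replica", true);
        tracker.activatePrimaryMode("primary", 5L);
        System.out.println(tracker.getGlobalCheckpoint()); // -2: replica checkpoint still unknown
        tracker.updateLocalCheckpoint("replica", 3L);
        System.out.println(tracker.getGlobalCheckpoint()); // 3: minimum of 5 and 3
    }
}
```

Keeping the mode as an explicit flag means each update path can reject calls that do not belong to the current mode, which is the kind of hardening the description above refers to.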


          Introduce primary/replica mode for GlobalCheckPointTracker

75d135f

ywelsch added :Sequence IDs >enhancement v6.0.0 labels

ywelsch requested a review from jasontedor

June 29, 2017 10:10

bleskes mentioned this pull request

Sequence Numbers related work slated for 6.0.0 #25355

Closed

9 tasks

ywelsch added 2 commits

June 30, 2017 10:36


          Additional assertion that primary context will not contain entries th…

a96bcd5

…at block GCP advancement


          Merge remote-tracking branch 'elastic/master' into enhance/harden-glo…

c02bf56

…balcheckpoint-tracker

ywelsch requested a review from bleskes

July 1, 2017 11:59

ywelsch added 7 commits

July 4, 2017 20:02


          remove masterInitializing and masterInSync

ad6d23a


          fix tests

1aad69e


          make sure invariants are not violated by initializeWithPrimaryContext


          Add BWC layer for primary context handoff

45ecd88


          minor fixes

58dcfb5


          don't let shards on pre-6.0 nodes block global checkpoint advancement

798056c


          checkstyle

f413758

ywelsch mentioned this pull request

Forward compatibility for primary context handoff on 6.x #25545

Merged

bleskes suggested changes

Contributor

bleskes left a comment

I left a bunch of nits around assertion messages. There is one important test coverage ask for GlobalCheckpointTracker#initializeWithPrimaryContext, which caused me to mark this as "request changes". All the rest LGTM. Thanks @ywelsch

core/src/main/java/org/elasticsearch/index/seqno/GlobalCheckpointTracker.java Outdated

+                      /**
+                       * during relocation handoff there are no entries blocking global checkpoint advancement
+                       */
+                      assert !handOffInProgress || pendingInSync.isEmpty();

Contributor

bleskes Jul 6, 2017

can we add a message that says what the pending in sync aIds are?
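As an illustration of this request, Java's `assert condition : detail` form attaches a detail message that is only evaluated when the assertion fails. The sketch below is an assumption about how such a message could be built, not the change that was made in the PR; `AssertMessageSketch`, `pendingInSyncMessage` and the message wording are hypothetical.

```java
import java.util.Set;

// Hypothetical sketch; names and message wording are illustrative only.
class AssertMessageSketch {

    // Builds the detail message listing the allocation IDs that are still pending in-sync.
    static String pendingInSyncMessage(Set<String> pendingInSync) {
        return "entries blocking global checkpoint advancement during relocation handoff: " + pendingInSync;
    }

    // Returns true so that callers can write "assert invariant(...)".
    static boolean invariant(boolean handOffInProgress, Set<String> pendingInSync) {
        // The expression after the colon is only evaluated if the assertion fails
        // (and assertions are enabled with -ea).
        assert !handOffInProgress || pendingInSync.isEmpty() : pendingInSyncMessage(pendingInSync);
        return true;
    }
}
```

With a message like this, a failing test run shows the offending allocation IDs directly instead of a bare `AssertionError`.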

core/src/main/java/org/elasticsearch/index/seqno/GlobalCheckpointTracker.java Outdated

+                      /**
+                       * the computed global checkpoint is always up-to-date
+                       */
+                      assert !primaryMode || globalCheckpoint == computeGlobalCheckPoint(pendingInSync, localCheckpoints.values(), globalCheckpoint);

Contributor

bleskes Jul 6, 2017

can we add a message with the globalCheckpoint and the result of the computation?

core/src/main/java/org/elasticsearch/index/seqno/GlobalCheckpointTracker.java Outdated

+                          /**
+                           * blocking global checkpoint advancement only happens for shards that are not in-sync
+                           */
+                          assert !pendingInSync.contains(entry.getKey()) || !entry.getValue().inSync;

Contributor

bleskes Jul 6, 2017

can we add a message that indicates which aID it is?

core/src/main/java/org/elasticsearch/index/seqno/GlobalCheckpointTracker.java Outdated

-                   * Notifies the service of the current allocation ids in the cluster state. This method trims any shards that have been removed.
+                   * Initializes the global checkpoint tracker in primary mode (see {@link #primaryMode}. Called on primary activation or promotion.
+                   */
+                  public synchronized void initializeAsPrimary(final String allocationId, final long localCheckpoint) {

Contributor

bleskes Jul 6, 2017

how would you feel about naming this method (and its counterpart) activatePrimaryMode? I was confused a couple of times, as the terms initialize and primary are already used in the IndexShard context (a primary relocation target is a primary shard and is already initializing long before the method is called).

core/src/main/java/org/elasticsearch/index/seqno/GlobalCheckpointTracker.java Outdated

+                  public synchronized void initializeAsPrimary(final String allocationId, final long localCheckpoint) {
+                      assert invariant();
+                      assert primaryMode == false;
+                      assert localCheckpoints.get(allocationId) != null && localCheckpoints.get(allocationId).inSync &&

Contributor

bleskes Jul 6, 2017

can we add a message with localCheckpoints.get(allocationId) and allocationId ?

core/src/main/java/org/elasticsearch/index/shard/IndexShard.java Outdated

+                          persistMetadata(path, indexSettings, newRouting, currentRouting, logger);
+                          if (shardRouting.primary()) {
+                              assert Thread.holdsLock(mutex);

Contributor

bleskes Jul 6, 2017

this can go away, since you inlined the method

core/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java

                               }
                           }
+                          cancellableThreads.execute(() -> runUnderOperationPermit(() -> shard.initiateTracking(request.targetAllocationId())));

Contributor

bleskes Jul 6, 2017

question, why did you choose to do it after phase1 ?

Contributor Author

ywelsch Jul 7, 2017

It only becomes relevant that we're properly tracking the target shard when starting the engine on the target shard. With this we ensure that we don't miss any local checkpoint updates from the target shard.

core/src/test/java/org/elasticsearch/index/replication/IndexLevelReplicationTests.java

@@ -164,9 +165,10 @@ public void testCheckpointsAdvance() throws Exception {
                                */
                               final Matcher<Long> globalCheckpointMatcher;
                               if (shardRouting.primary()) {
-                                  globalCheckpointMatcher = numDocs == 0 ? equalTo(unassignedSeqNo) : equalTo(numDocs - 1L);
+                                  globalCheckpointMatcher = numDocs == 0 ? equalTo(SequenceNumbersService.NO_OPS_PERFORMED) : equalTo(numDocs - 1L);

Contributor

bleskes Jul 6, 2017

++. It is good that this is fixed and we start with no ops performed.

core/src/test/java/org/elasticsearch/index/seqno/GlobalCheckpointTrackerTests.java Outdated

    
                      /*
                       * Now we will add an allocation ID to each of active and initializing and ensure they propagate through. Using different lengths
                       * than we have been using above ensures that we can not collide with a previous allocation ID
                       */
                      newActiveAllocationIds.add(randomAlphaOfLength(32));
                      // TODO: fix this: newActiveAllocationIds.add(initializingIds.iterator().next());

Contributor

bleskes Jul 6, 2017

what do you mean?

Contributor Author

ywelsch Jul 7, 2017

I've removed this line, as it was an illegal operation (adding a fresh in-sync allocation id while the tracker was in primary mode).

Contributor

bleskes Jul 7, 2017

clear. thx.

core/src/test/java/org/elasticsearch/index/seqno/GlobalCheckpointTrackerTests.java

                       assertThat(tracker.getGlobalCheckpoint(), equalTo((long) nextActiveLocalCheckpoint));
                   }
-                  public void testPrimaryContextOlderThanAppliedClusterState() {

Contributor

bleskes Jul 6, 2017

I think we need equivalent tests of initializeWithPrimaryContext and its relation to appliedClusterStateVersion?


          review comments

bc34c7f

Contributor Author

ywelsch commented Jul 7, 2017

@bleskes I've addressed all comments. Have another look.


          checkstyle

36f66f0

bleskes approved these changes

Contributor

bleskes left a comment

LGTM. Thanks for the thorough test.

core/src/test/java/org/elasticsearch/index/seqno/GlobalCheckpointTrackerTests.java Outdated


		activatePrimary(clusterState, oldPrimary);
		for (int i = 0; i < randomInt(10); i++) {

Contributor

bleskes Jul 7, 2017

nit: this samples the randomInt again and again.. much less likely to get to 10.
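To make the nit concrete: a call in the loop condition is re-evaluated before every iteration, so the bound is redrawn each time and long runs become unlikely. The sketch below contrasts that with drawing the bound once. It uses `java.util.Random` as a stand-in for the test framework's `randomInt`, which is assumed here to return a value from 0 to `max` inclusive; the class and method names are hypothetical, and the `calls` counter exists only to show how often the bound is sampled.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; randomInt here is a stand-in, not the test framework's method.
class LoopBoundSketch {

    // Uniform value in [0, max], counting how often it is called.
    static int randomInt(Random random, int max, AtomicInteger calls) {
        calls.incrementAndGet();
        return random.nextInt(max + 1);
    }

    // Bound is redrawn on every condition check: the loop stops as soon as
    // any single draw is <= i, so reaching 10 iterations is very unlikely.
    static int resampledBound(Random random, AtomicInteger calls) {
        int iterations = 0;
        for (int i = 0; i < randomInt(random, 10, calls); i++) {
            iterations++;
        }
        return iterations;
    }

    // Bound is drawn once, so every count from 0 to 10 is equally likely.
    static int hoistedBound(Random random, AtomicInteger calls) {
        int iterations = 0;
        final int bound = randomInt(random, 10, calls);
        for (int i = 0; i < bound; i++) {
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        Random random = new Random();
        AtomicInteger calls = new AtomicInteger();
        int iterations = resampledBound(random, calls);
        System.out.println("resampled: " + iterations + " iterations, " + calls.get() + " draws");
        calls.set(0);
        iterations = hoistedBound(random, calls);
        System.out.println("hoisted: " + iterations + " iterations, " + calls.get() + " draws");
    }
}
```

In the resampled variant the number of draws is always one more than the number of iterations; in the hoisted variant it is always exactly one.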


          randomInt

0ff1c4d

ywelsch added a commit that referenced this pull request


          Forward compatibility for primary context handoff on 6.x (#25545)

cf34daf

Companion PR for #25468

Relates to #25355

jasontedor added 2 commits

July 7, 2017 13:22


          Fix naming

499a8fe


          Avoid dangling Javadoc warnings in IDE

ec60946

jasontedor approved these changes

Member

jasontedor left a comment

LGTM.

jasontedor merged commit baa87db into elastic:master

This was referenced Jul 7, 2017

After recovering, a primary should update its knowledge of its own local checkpoint #25415

Closed

[CI[ SearchScrollIT#testCloseAndReopenOrDeleteWithActiveScroll failure #25465

Closed

bleskes mentioned this pull request

Add Sequence Numbers to write operations #10708

Closed

64 tasks

colings86 added v6.0.0-beta1 and removed v6.0.0 labels

clintongormley added :Distributed Indexing/Engine and removed :Sequence IDs labels

Labels

:Distributed Indexing/Engine >enhancement v6.0.0-beta1