
Gracefully shutdown elasticsearch #96363

Merged

Conversation

@stu-elastic (Contributor) commented May 25, 2023

Early in shutdown, stop listening for HTTP requests and gracefully close all HTTP connections.

Adds the http.shutdown_grace_period setting: the maximum amount of time to wait for in-flight HTTP requests to finish. After that time, any remaining HTTP channels are closed.

Graceful shutdown procedure (sketched in code below):

  1. Stop listening for new HTTP connections
  2. Tell all new requests to add Connection: close response header and close the channel after the request.
  3. Wait up to the grace period for all open connections to close
  4. If grace period expired, close all remaining connections
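
For illustration, here is a rough sketch of these steps in code. The helper names are hypothetical, not the actual implementation; only http.shutdown_grace_period, FutureUtils.get, ElasticsearchTimeoutException, and gracefullyCloseConnections() appear in the real change, and the real logic lives in the transport's doStop().

    // Illustrative sketch only; helper method names are hypothetical.
    void doStop() {
        stopListeningForNewConnections();   // 1. refuse new TCP connections
        gracefullyCloseConnections();       // 2. new requests get "Connection: close" and their channel is closed afterwards
        try {
            // 3. wait up to http.shutdown_grace_period for in-flight requests to drain
            FutureUtils.get(allClientsClosedListener, shutdownGracePeriodMillis, TimeUnit.MILLISECONDS);
        } catch (ElasticsearchTimeoutException e) {
            // 4. grace period expired: force-close whatever is still open
            logger.warn("timed out after [{}]ms waiting for clients to close connections", shutdownGracePeriodMillis);
            forceCloseRemainingConnections();   // hypothetical helper
        }
    }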

Fixes: #96147

@stu-elastic added the WIP and :Core/Infra/Node Lifecycle (Node startup, bootstrapping, and shutdown) labels May 25, 2023
@stu-elastic marked this pull request as ready for review May 25, 2023 21:40
@rjernst (Member) left a comment

The approach looks good to me. I think we'll want some new tests to check the edge cases.

ThreadPool threadPool = injector.getInstance(ThreadPool.class);
HttpServerTransport httpServerTransport = injector.getInstance(HttpServerTransport.class);

Future<?> httpServerStopped = threadPool.generic().submit(httpServerTransport::stop);
Member:

I think we can do this in a dedicated thread created here. We don't need to balance the work of this thread against other work, and this isn't a generic task in the system; it is specific to shutdown.

Contributor Author (stu-elastic):

Changed to using a newly created thread.

FutureUtils.get(allClientsClosedListener, shutdownGracePeriodMillis, TimeUnit.MILLISECONDS);
closed = true;
} catch (ElasticsearchTimeoutException t) {
logger.trace(
Member:

Should this be a warning? I would think we would want more visibility into whether this grace period was exhausted and connections are being closed forcefully.

Contributor Author (stu-elastic):

Changed to warning.


try {
allClientsClosedListener.get();
logger.warn("STU: done force closing clients");
Member:

leftover debugging?

Contributor Author (stu-elastic):

Removed.

@ChrisHegarty (Contributor):

The approach seems sound to me. Just a few minor clarifications (without looking too deeply at the code)

  1. Stop listening for new HTTP connections

👍

  2. Tell all new requests to add Connection: close response header and close the channel after the request.

Since the listening socket has been closed there are no new requests. This is all outstanding in-flight requests, correct? If so, then that makes perfect sense to me.

  3. Wait up to the grace period for all open connections to close
  4. If grace period expired, close all remaining connections

👍 .. and I assume that terminates outstanding requests in a forceful way, i.e. the TCP socket just gets closed. Good.

@stu-elastic (Contributor Author) commented May 26, 2023

Since the listening socket has been closed there are no new requests. This is all outstanding in-flight requests, correct?

There's some subtlety based on the actual implementation.

After step 1, no new TCP connections can be established.

All new HTTP requests must come in on established connections.

If we've started handling a request before getting shut down, we will not close the connection after that request, as the DefaultRestChannel has already been created.

The next request will get a new DefaultRestChannel that will close the connection after that request.

We could do slightly better so long as we had not finished sending the HTTP headers, but I don't think it's worth the complexity.

.. and I assume that terminates outstanding requests in a forceful way, i.e. the TCP socket just gets closed. Good.

Yup.
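
For illustration, a minimal sketch of the per-request close behavior described above, assuming a hypothetical shuttingDown flag; this is not the actual DefaultRestChannel code.

    // Sketch only: a channel created after shutdown has started marks the response
    // and closes the underlying connection once the response has been sent.
    if (shuttingDown) {
        restResponse.addHeader("Connection", "close");   // hint to the client not to reuse the connection
    }
    sendResponse(restResponse);                          // finish the in-flight request normally
    if (shuttingDown) {
        httpChannel.close();                             // then close the TCP connection
    }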

@ChrisHegarty (Contributor):

Thanks for the explanation @stu-elastic 👍

@stu-elastic (Contributor Author):

OK, looks like this is a good approach. I will add some tests, remove RFC from the title, and remove the WIP label, and then we can proceed with review.

@Tim-Brooks (Contributor) left a comment

This approach looks good to me.

@stu-elastic changed the title from "RFC: Gracefully shutdown elasticsearch" to "Gracefully shutdown elasticsearch" Jun 1, 2023
@stu-elastic added the >feature label and removed the WIP label Jun 1, 2023
@elasticsearchmachine added the Team:Core/Infra (Meta label for core/infra team) label Jun 1, 2023
@elasticsearchmachine (Collaborator):

Hi @stu-elastic, I've created a changelog YAML for you.

@elasticsearchmachine (Collaborator):

Pinging @elastic/es-core-infra (Team:Core/Infra)

@stu-elastic (Contributor Author):

@rjernst (and other interested reviewers) I've added tests, please take another pass.

@rjernst (Member) left a comment

Thanks for adding tests! I have a couple thoughts.

/**
* Gracefully shut down. If {@link HttpTransportSettings#SETTING_HTTP_SERVER_SHUTDOWN_GRACE_PERIOD} is zero, the default, then
* forcefully close all open connections immediately.
* Serially run through the following step:
Member:

nit: steps

Contributor Author (stu-elastic):

pluralized.

httpServerTransport.stop();
return null;
});
new Thread(stopper).start();
Member:

We should give the thread a name, so that a thread dump while shutting down will be understandable.

Contributor Author (stu-elastic):

Named http-server-transport-stop
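
Roughly, the change looks like the sketch below; it is illustrative rather than the exact diff (the real code wraps the stop call in a FutureTask).

    // Stop the HTTP transport on a dedicated, named thread so it is easy to spot
    // in a thread dump taken during shutdown.
    Thread stopper = new Thread(httpServerTransport::stop, "http-server-transport-stop");
    stopper.start();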

transport.incomingRequest(httpRequest, httpChannel);

transport.doStop();
assertThat("stop listening first", transport.stopListeningForNewConnectionsOrder(), is(1));
Member:

Relying on the exact order internal methods are called seems fragile. Could we instead take a more black-box approach to the transport by testing the behavior we expect? That is, e.g., open a new connection with the transport, but mock out the dispatcher so that you can hang the response when desired. I realize it's more complicated than that in practice, but I think testing method calls relies too heavily on low-level implementation details, which will make any future modifications to this code difficult.

Contributor Author (stu-elastic):

Removed the helper functions. The only one I kept was gracefullyCloseConnections(), which was already there (but I moved it closer to its use in doStop()). The tests still need that hook.

Contributor Author (stu-elastic):

Everything else is doing black box testing now.
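
For reference, the rough shape of those black-box tests, assembled from the fragments quoted in this review; TestHttpChannel and TestHttpRequest are test fixtures, and the assertions shown are illustrative.

    TestHttpChannel httpChannel = new TestHttpChannel();
    transport.serverAcceptedChannel(httpChannel);           // simulate an accepted client connection
    transport.incomingRequest(httpRequest, httpChannel);    // drive a request through the transport
    transport.doStop();                                     // stop, then assert only on observable behavior
    assertFalse(httpChannel.isOpen());                      // e.g. the connection ends up closed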

@rjernst (Member) left a comment

Thanks for the iterations on the tests @stu-elastic. I left a few more nits.

More broadly from reading through the tests now, I want to check what the expected behavior would be in the following scenario:

  • connection is opened
  • request sent and returned
  • stop is called

Will stop return quickly, or wait for the timeout? I think the latter, which is IMO a problem. The shutdown timeout could be long (lots of I/O could be necessary to move shards around). We can't expect any more requests to be received after SIGTERM (assume in the normal case the node has already been removed from receiving new requests). It's certainly something we can iterate on, and I believe you've mentioned this edge case before, but I wanted to call it out more clearly (and verify it is actually as I stated).

TestHttpChannel httpChannel = new TestHttpChannel();
transport.serverAcceptedChannel(httpChannel);
transport.incomingRequest(httpRequest, httpChannel);
// idle connection
Member:

nit: isn't this an in flight request, not an idle connection?

Contributor Author (stu-elastic):

After transport.incomingRequest(httpRequest, httpChannel) returns, the httpChannel is idle. We'd need to block in the transport dispatcher's dispatchRequest for the httpChannel to be active.

Contributor Author (stu-elastic):

I added testStopForceClosesConnectionDuringRequest to perform that test.

transport.serverAcceptedChannel(httpChannel);
transport.incomingRequest(new TestHttpRequest(HttpRequest.HttpVersion.HTTP_1_1, RestRequest.Method.GET, "/"), httpChannel);

TestHttpChannel idleChannel = new TestHttpChannel();
Member:

nit: same comment as before, this channel has an open request, it's not sitting idle (no in flight requests)?

Contributor Author (stu-elastic):

Same as above, after incomingRequest completes, the channel is idle.

}).start();

try {
assertTrue(transport.gracePeriodCalled.await(10, TimeUnit.SECONDS));
Member:

nit: gracePeriodCalled -> gracefullyCloseCalled?

Contributor Author (stu-elastic):

Changed


assertFalse(transport.testHttpServerChannel.isOpen());
assertFalse(idleChannel.isOpen());
assertFalse(httpChannel.isOpen());
Member:

Can this be asserted after the incomingRequest above, since that would have sent back the connection close header?

Contributor Author (stu-elastic):

Moved.

new HttpServerTransport.Dispatcher() {
@Override
public void dispatchRequest(RestRequest request, RestChannel channel, ThreadContext threadContext) {
channel.sendResponse(emptyResponse(RestStatus.OK));
Member:

I think this is where some of my confusion above on idle terminology comes from. We should be able to block a request (hang here) to mimic an in flight request.

@stu-elastic (Contributor Author) commented Jun 5, 2023:

Added testStopForceClosesConnectionDuringRequest which does that.

transport.incomingRequest(httpRequest, httpChannel);
assertFalse(httpChannel.isOpen());

// TestHttpChannel will throw if closed twice, so this ensures close is not called.
Member:

nit - this technically breaks the contract of Closeable, which says that close can be called multiple times...

Member:

Opened #96564 for discussion to change this.

Contributor Author (stu-elastic):

@thecoop is there any action in this PR?

@stu-elastic (Contributor Author):

the expected behavior would be in the following scenario:
connection is opened
request sent and returned
stop is called

Will stop return quickly, or wait for the timeout? I think the latter, which is IMO a problem.

There is a timeout for the entire shutdown process and a different timeout, http.shutdown_grace_period, for shutting down the HTTP server.

In the case outlined above, we will wait for http.shutdown_grace_period to shut down the HTTP server, and the rest of the shutdown (shuffling shards, etc.) will continue until it finishes or the (presumably much longer) timeout for the entire shutdown expires.

The shutdown timeout could be long (lots of io could be necessary to move shards around).

That is a different timeout, which is for the entire process.
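
For illustration, the grace period is just a node setting; a sketch of setting it programmatically, where the 30s value is an arbitrary example.

    // Example only: wait up to 30 seconds for in-flight HTTP requests before force-closing channels.
    Settings settings = Settings.builder()
        .put("http.shutdown_grace_period", "30s")
        .build();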

@rjernst (Member) commented Jun 5, 2023

While it's true that this is a separate timeout, it still must be configured as at least the maximum time allowed for a single request. Let's say that is 5 minutes (which seems small, since we have async searches). There could conceivably be little shard migration work to do (e.g. a slow indexing rate, so not much to push/pull on the old/new primary). Yet having to wait for this timeout means each node must wait at least that long to shut down.

I'm wondering if there is a way to reject the next requests on open channels after current requests finish. Another possibility might be a separate, smaller timeout to wait after the current request finishes. I'm sure there are other possibilities too.

@stu-elastic (Contributor Author):

Let's say that is 5 minutes (which seems small, since we have async searches). There could conceivably be little shard migration work to do (eg slow indexing rate so not much to push/pull on the old/new primary). Yet having to wait this timeout means each node must wait at least that long to shutdown.

It's not clear to me that this is different from a request that just started that takes five minutes.

@stu-elastic (Contributor Author):

I'm wondering if there a way to reject the next requests on open channels after current requests finish. Another possibility might be a separate, smaller timeout, to wait after the current request finishes. I'm sure there are other possibilities too.

We could introduce a barrier that is checked before every request on the channel. However, that is necessarily more expensive than today (new memory barrier per request) and it's not clear that there is much benefit.

Given that we have already agreed to this approach, I suggest any functionality of that sort be discussed and implemented as a follow up.

@rjernst (Member) left a comment

Thanks for all the iterations on the tests. This looks good, assuming a followup for detecting idle connections.

My only last suggestion is about the test threads. I don't think we need to block the dispatcher; just don't write a response to the channel, exactly as would happen for a non-blocking, in-progress request that exceeds the timeout.

try (TestHttpServerTransport transport = new TestHttpServerTransport(gracePeriod(10), () -> {
inDispatch.countDown();
try {
blockingDispatch.await();
Member:

I may have added some confusion here in my earlier comment about blocking. What I meant to say was, if we don't write a response in the dispatcher, then the channel has a pending request. Until we are tracking the pending requests, we won't be able to assert on them, but in the meantime simply not sending a response should better mimic the real non-blocking behavior Elasticsearch has, and it should set this test up for more realistic checking of idle connection tracking as a followup.

Contributor Author (stu-elastic):

Done.
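
A sketch of what that looks like, mirroring the dispatcher fragment quoted earlier; it is illustrative, not the exact test code.

    new HttpServerTransport.Dispatcher() {
        @Override
        public void dispatchRequest(RestRequest request, RestChannel channel, ThreadContext threadContext) {
            // intentionally never call channel.sendResponse(...): the request stays in flight
            // on the channel without blocking any thread, mimicking a slow non-blocking request
        }
        // dispatchBadRequest(...) omitted for brevity
    }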

).start();
try {
inDispatch.await();
} catch (InterruptedException ie) {
Member:

nit: just add throws Exception to the test method, and catches like this won't be necessary; any exception will fail the test

Contributor Author (stu-elastic):

added.

@stu-elastic merged commit 18e0fea into elastic:main Jun 6, 2023
stu-elastic added a commit that referenced this pull request Jun 29, 2023
Close all idle connections before waiting for outstanding requests during graceful shutdown.

Rejects new connections when shutting down.

After stop has been called, the channel is closed after an in-flight HTTP request finishes. The client will not receive the `Connection: close` header, added in #96363, because we no longer accept new requests after stop has been called.
Labels
:Core/Infra/Node Lifecycle (Node startup, bootstrapping, and shutdown), >feature, Team:Core/Infra (Meta label for core/infra team), v8.9.0
Development

Successfully merging this pull request may close these issues.

Feature request: graceful stop/restart
6 participants