
Gracefully shutdown elasticsearch #96363

Merged

Conversation

@stu-elastic (Contributor) commented May 25, 2023

Early in shutdown, stop listening for HTTP requests and gracefully close all HTTP connections.

Adds the http.shutdown_grace_period setting: the maximum amount of time to wait for in-flight HTTP requests to finish. After that time, any remaining HTTP channels are closed.

Graceful shutdown procedure (sketched in code below):

  1. Stop listening for new HTTP connections
  2. Tell all new requests to add Connection: close response header and close the channel after the request.
  3. Wait up to the grace period for all open connections to close
  4. If grace period expired, close all remaining connections
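
For illustration, here is a rough sketch of these steps in code. The helper names are hypothetical, not the actual implementation; only http.shutdown_grace_period, FutureUtils.get, ElasticsearchTimeoutException, and gracefullyCloseConnections() appear in the real change, and the real logic lives in the transport's doStop().

    // Illustrative sketch only; helper method names are hypothetical.
    void doStop() {
        stopListeningForNewConnections();   // 1. refuse new TCP connections
        gracefullyCloseConnections();       // 2. new requests get "Connection: close" and their channel is closed afterwards
        try {
            // 3. wait up to http.shutdown_grace_period for in-flight requests to drain
            FutureUtils.get(allClientsClosedListener, shutdownGracePeriodMillis, TimeUnit.MILLISECONDS);
        } catch (ElasticsearchTimeoutException e) {
            // 4. grace period expired: force-close whatever is still open
            logger.warn("timed out after [{}]ms waiting for clients to close connections", shutdownGracePeriodMillis);
            forceCloseRemainingConnections();   // hypothetical helper
        }
    }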

Fixes: #96147

@stu-elastic added the WIP and :Core/Infra/Node Lifecycle (Node startup, bootstrapping, and shutdown) labels May 25, 2023
@stu-elastic marked this pull request as ready for review May 25, 2023 21:40
@rjernst (Member) left a comment

The approach looks good to me. I think we'll want some new tests to check the edge cases.

ThreadPool threadPool = injector.getInstance(ThreadPool.class);
HttpServerTransport httpServerTransport = injector.getInstance(HttpServerTransport.class);

Future<?> httpServerStopped = threadPool.generic().submit(httpServerTransport::stop);
Member:

I think we can do this in a dedicated thread created here. We don't need to balance the work of this thread against other work, and this isn't a generic task in the system; it is specific to shutdown.

Contributor Author (stu-elastic):

Changed to using a newly created thread.

FutureUtils.get(allClientsClosedListener, shutdownGracePeriodMillis, TimeUnit.MILLISECONDS);
closed = true;
} catch (ElasticsearchTimeoutException t) {
logger.trace(
Member:

Should this be a warning? I would think we would want more visibility into whether this grace period was exhausted and connections are being closed forcefully.

Contributor Author (stu-elastic):

Changed to warning.


try {
allClientsClosedListener.get();
logger.warn("STU: done force closing clients");
Member:

leftover debugging?

Contributor Author (stu-elastic):

Removed.

@ChrisHegarty (Contributor):

The approach seems sound to me. Just a few minor clarifications (without looking too deeply at the code)

  1. Stop listening for new HTTP connections

👍

  2. Tell all new requests to add Connection: close response header and close the channel after the request.

Since the listening socket has been closed there are no new requests. This is all outstanding in-flight requests, correct? If so, then that makes perfect sense to me.

  3. Wait up to the grace period for all open connections to close
  4. If grace period expired, close all remaining connections

👍 .. and I assume that terminates outstanding requests in a forceful way, i.e. the TCP socket just gets closed. Good.

@stu-elastic (Contributor Author) commented May 26, 2023

Since the listening socket has been closed there are no new requests. This is all outstanding in-flight requests, correct?

There's some subtlety based on the actual implementation.

After step 1, no new TCP connections can be established.

All new HTTP requests must come in on established connections.

If we've started handling a request before getting shut down, we will not close the connection after that request, as the DefaultRestChannel has already been created.

The next request will get a new DefaultRestChannel that will close the connection after that request.

We could do slightly better so long as we had not finished sending the HTTP headers, but I don't think it's worth the complexity.

.. and I assume that terminates outstanding requests in a forceful way, i.e. the TCP socket just gets closed. Good.

Yup.
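
For illustration, a minimal sketch of the per-request close behavior described above, assuming a hypothetical shuttingDown flag; this is not the actual DefaultRestChannel code.

    // Sketch only: a channel created after shutdown has started marks the response
    // and closes the underlying connection once the response has been sent.
    if (shuttingDown) {
        restResponse.addHeader("Connection", "close");   // hint to the client not to reuse the connection
    }
    sendResponse(restResponse);                          // finish the in-flight request normally
    if (shuttingDown) {
        httpChannel.close();                             // then close the TCP connection
    }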

@ChrisHegarty (Contributor):

Thanks for the explanation @stu-elastic 👍

@stu-elastic (Contributor Author):

OK, looks like this is a good approach. I will add some tests, remove RFC from the title, and remove the WIP label, and then we can proceed with review.

@Tim-Brooks (Contributor) left a comment

This approach looks good to me.

@stu-elastic changed the title from "RFC: Gracefully shutdown elasticsearch" to "Gracefully shutdown elasticsearch" Jun 1, 2023
@stu-elastic added the >feature label and removed the WIP label Jun 1, 2023
@elasticsearchmachine added the Team:Core/Infra (Meta label for core/infra team) label Jun 1, 2023
@elasticsearchmachine (Collaborator):

Hi @stu-elastic, I've created a changelog YAML for you.

@elasticsearchmachine (Collaborator):

Pinging @elastic/es-core-infra (Team:Core/Infra)

@stu-elastic (Contributor Author):

@rjernst (and other interested reviewers) I've added tests, please take another pass.

@rjernst (Member) left a comment

Thanks for adding tests! I have a couple thoughts.

/**
* Gracefully shut down. If {@link HttpTransportSettings#SETTING_HTTP_SERVER_SHUTDOWN_GRACE_PERIOD} is zero, the default, then
* forcefully close all open connections immediately.
* Serially run through the following step:
Member:

nit: steps

Contributor Author (stu-elastic):

pluralized.

httpServerTransport.stop();
return null;
});
new Thread(stopper).start();
Member:

We should give the thread a name, so that a thread dump while shutting down will be understandable.

Contributor Author (stu-elastic):

Named http-server-transport-stop
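
Roughly, the change looks like the sketch below; it is illustrative rather than the exact diff (the real code wraps the stop call in a FutureTask).

    // Stop the HTTP transport on a dedicated, named thread so it is easy to spot
    // in a thread dump taken during shutdown.
    Thread stopper = new Thread(httpServerTransport::stop, "http-server-transport-stop");
    stopper.start();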

transport.incomingRequest(httpRequest, httpChannel);

transport.doStop();
assertThat("stop listening first", transport.stopListeningForNewConnectionsOrder(), is(1));
Member:

Relying on the exact order internal methods are called seems fragile. Could we instead take a more black-box approach to the transport by testing the behavior we expect? That is, e.g., open a new connection with the transport, but mock out the dispatcher so that you can hang the response when desired. I realize it's more complicated than that in practice, but I think testing method calls relies too heavily on low-level implementation details, which will make any future modifications to this code difficult.

Contributor Author (stu-elastic):

Removed the helper functions. The only one I kept was gracefullyCloseConnections(), which was already there (but I moved it closer to its use in doStop()). The tests still need that hook.

Contributor Author (stu-elastic):

Everything else is doing black box testing now.
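
For reference, the rough shape of those black-box tests, assembled from the fragments quoted in this review; TestHttpChannel and TestHttpRequest are test fixtures, and the assertions shown are illustrative.

    TestHttpChannel httpChannel = new TestHttpChannel();
    transport.serverAcceptedChannel(httpChannel);           // simulate an accepted client connection
    transport.incomingRequest(httpRequest, httpChannel);    // drive a request through the transport
    transport.doStop();                                     // stop, then assert only on observable behavior
    assertFalse(httpChannel.isOpen());                      // e.g. the connection ends up closed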

@rjernst (Member) left a comment

Thanks for the iterations on the tests @stu-elastic. I left a few more nits.

More broadly from reading through the tests now, I want to check what the expected behavior would be in the following scenario:

  • connection is opened
  • request sent and returned
  • stop is called

Will stop return quickly, or wait for the timeout? I think the latter, which is IMO a problem. The shutdown timeout could be long (lots of I/O could be necessary to move shards around). We can't expect any more requests to be received after SIGTERM (assume in the normal case the node has already been removed from receiving new requests). It's certainly something we can iterate on, and I believe you've mentioned this edge case before, but I wanted to call it out more clearly (and verify it is actually as I stated).

TestHttpChannel httpChannel = new TestHttpChannel();
transport.serverAcceptedChannel(httpChannel);
transport.incomingRequest(httpRequest, httpChannel);
// idle connection
Member:

nit: isn't this an in flight request, not an idle connection?

Contributor Author (stu-elastic):

After transport.incomingRequest(httpRequest, httpChannel) returns, the httpChannel is idle. We'd need to block in the transport dispatcher's dispatchRequest for the httpChannel to be active.

Contributor Author (stu-elastic):

I added testStopForceClosesConnectionDuringRequest to perform that test.

transport.serverAcceptedChannel(httpChannel);
transport.incomingRequest(new TestHttpRequest(HttpRequest.HttpVersion.HTTP_1_1, RestRequest.Method.GET, "/"), httpChannel);

TestHttpChannel idleChannel = new TestHttpChannel();
Member:

nit: same comment as before, this channel has an open request, it's not sitting idle (no in flight requests)?

Contributor Author (stu-elastic):

Same as above, after incomingRequest completes, the channel is idle.

}).start();

try {
assertTrue(transport.gracePeriodCalled.await(10, TimeUnit.SECONDS));
Member:

nit: gracePeriodCalled -> gracefullyCloseCalled?

Contributor Author (stu-elastic):

Changed


assertFalse(transport.testHttpServerChannel.isOpen());
assertFalse(idleChannel.isOpen());
assertFalse(httpChannel.isOpen());
Member:

Can this be asserted after the incomingRequest above, since that would have sent back the connection close header?

Contributor Author (stu-elastic):

Moved.

new HttpServerTransport.Dispatcher() {
@Override
public void dispatchRequest(RestRequest request, RestChannel channel, ThreadContext threadContext) {
channel.sendResponse(emptyResponse(RestStatus.OK));
Member:

I think this is where some of my confusion above on idle terminology comes from. We should be able to block a request (hang here) to mimic an in flight request.

@stu-elastic (Contributor Author) commented Jun 5, 2023:

Added testStopForceClosesConnectionDuringRequest which does that.

transport.incomingRequest(httpRequest, httpChannel);
assertFalse(httpChannel.isOpen());

// TestHttpChannel will throw if closed twice, so this ensures close is not called.
Member:

nit - this technically breaks the contract of Closeable, which says that close can be called multiple times...

Member:

Opened #96564 for discussion to change this.

Contributor Author (stu-elastic):

@thecoop is there any action in this PR?

@stu-elastic (Contributor Author):

the expected behavior would be in the following scenario:
connection is opened
request sent and returned
stop is called

Will stop return quickly, or wait for the timeout? I think the latter, which is IMO a problem.

There is a timeout for the entire shutdown process and a different timeout, http.shutdown_grace_period, for shutting down the HTTP server.

In the case outlined above, we will wait for http.shutdown_grace_period to shut down the HTTP server, and the rest of the shutdown (shuffling shards, etc.) will continue until it finishes or the (presumably much longer) timeout for the entire shutdown expires.

The shutdown timeout could be long (lots of io could be necessary to move shards around).

That is a different timeout, which is for the entire process.
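
For illustration, the grace period is just a node setting; a sketch of setting it programmatically, where the 30s value is an arbitrary example.

    // Example only: wait up to 30 seconds for in-flight HTTP requests before force-closing channels.
    Settings settings = Settings.builder()
        .put("http.shutdown_grace_period", "30s")
        .build();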

@rjernst (Member) commented Jun 5, 2023

While it's true that this is a separate timeout, it still must be configured as at least the maximum time allowed for a single request. Let's say that is 5 minutes (which seems small, since we have async searches). There could conceivably be little shard migration work to do (e.g. a slow indexing rate, so not much to push/pull on the old/new primary). Yet having to wait for this timeout means each node must wait at least that long to shut down.

I'm wondering if there is a way to reject the next requests on open channels after current requests finish. Another possibility might be a separate, smaller timeout to wait after the current request finishes. I'm sure there are other possibilities too.

@stu-elastic (Contributor Author):

Let's say that is 5 minutes (which seems small, since we have async searches). There could conceivably be little shard migration work to do (eg slow indexing rate so not much to push/pull on the old/new primary). Yet having to wait this timeout means each node must wait at least that long to shutdown.

It's not clear to me that this is different from a request that just started that takes five minutes.

@stu-elastic (Contributor Author):

I'm wondering if there a way to reject the next requests on open channels after current requests finish. Another possibility might be a separate, smaller timeout, to wait after the current request finishes. I'm sure there are other possibilities too.

We could introduce a barrier that is checked before every request on the channel. However, that is necessarily more expensive than today (new memory barrier per request) and it's not clear that there is much benefit.

Given that we have already agreed to this approach, I suggest any functionality of that sort be discussed and implemented as a follow up.

@rjernst (Member) left a comment

Thanks for all the iterations on the tests. This looks good, assuming a followup for detecting idle connections.

My only last suggestion is about the test threads. I don't think we need to block the dispatcher; just don't write a response to the channel, exactly as would happen for a non-blocking, in-progress request that exceeds the timeout.

try (TestHttpServerTransport transport = new TestHttpServerTransport(gracePeriod(10), () -> {
inDispatch.countDown();
try {
blockingDispatch.await();
Member:

I may have added some confusion here in my earlier comment about blocking. What I meant to say was, if we don't write a response in the dispatcher, then the channel has a pending request. Until we are tracking the pending requests, we won't be able to assert on them, but in the meantime simply not sending a response should better mimic the real non-blocking behavior Elasticsearch has, and it should set this test up for more realistic checking of idle connection tracking as a followup.

Contributor Author (stu-elastic):

Done.
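
A sketch of what that looks like, mirroring the dispatcher fragment quoted earlier; it is illustrative, not the exact test code.

    new HttpServerTransport.Dispatcher() {
        @Override
        public void dispatchRequest(RestRequest request, RestChannel channel, ThreadContext threadContext) {
            // intentionally never call channel.sendResponse(...): the request stays in flight
            // on the channel without blocking any thread, mimicking a slow non-blocking request
        }
        // dispatchBadRequest(...) omitted for brevity
    }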

).start();
try {
inDispatch.await();
} catch (InterruptedException ie) {
Member:

nit: just add throws Exception to the test method, and catches like this won't be necessary; any exception will fail the test

Contributor Author (stu-elastic):

added.

@stu-elastic merged commit 18e0fea into elastic:main Jun 6, 2023
stu-elastic added a commit that referenced this pull request Jun 29, 2023
Close all idle connections before waiting for outstanding requests during graceful shutdown.

Rejects new connections when shutting down.

After stop has been called, the channel is closed after an in-flight HTTP request finishes. The client will not receive the `Connection: close` header, added in #96363, because we no longer accept new requests after stop has been called.
Labels
:Core/Infra/Node Lifecycle (Node startup, bootstrapping, and shutdown), >feature, Team:Core/Infra (Meta label for core/infra team), v8.9.0
Development

Successfully merging this pull request may close these issues.

Feature request: graceful stop/restart
6 participants