Avoid killing server before it had chance to shut down gracefully #1660
Comments
So if the exit notification is a notification
I conclude that the client should never expect a response for the exit notification. Or did I get the spec wrong? |
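For readers following along, here is a minimal sketch of the distinction being discussed, assuming the standard LSP base protocol framing (a `Content-Length` header followed by a JSON-RPC 2.0 body). The helper name `encode_message` is hypothetical and is not taken from the plugin. A request carries an `id` and therefore gets a response; a notification has no `id`, so there is nothing for the client to wait on at the protocol level.

```python
import json
from typing import Any, Dict, Optional


def encode_message(method: str,
                   params: Optional[Dict[str, Any]] = None,
                   request_id: Optional[int] = None) -> bytes:
    """Frame a JSON-RPC 2.0 message the way the LSP base protocol does.

    Hypothetical helper for illustration only. Passing request_id makes
    the message a request; omitting it makes it a notification.
    """
    payload: Dict[str, Any] = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        payload["params"] = params
    if request_id is not None:
        payload["id"] = request_id
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = "Content-Length: {}\r\n\r\n".format(len(body)).encode("ascii")
    return header + body


# "shutdown" is a request: it has an id, so the server must answer it.
shutdown_request = encode_message("shutdown", request_id=1)
# "exit" is a notification: no id, so no response will ever arrive.
exit_notification = encode_message("exit")
```

This only shows why no response to `exit` can be expected; whether the client should still wait for the process itself to terminate is a separate question, which is what the rest of the thread is about.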
@predragnikolic That is a reasonable conclusion to draw from there - if it is applied to notifications in general. I am not proposing for the client to wait for any response to any notification, but very specifically to wait for the server to shut down gracefully after the exit notification. I do believe that the server should not simply be killed right after the notification is sent, because killing it before it has processed the notification defeats the purpose of sending it, IMO. |
And when I say wait, I mean wait for a sub-second period, with a timeout, after sending the notification - it still seems reasonable to SIGKILL a server which was given the chance and didn't shut itself down gracefully. Relatedly, I believe that SIGINT should be attempted before SIGKILL, since on Unix SIGKILL is usually a last resort and not a standard way of shutting a process down, but that might be a separate discussion and not necessarily related to this. |
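The escalation proposed here could be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's code: the function name `stop_process` and the `grace` parameter are hypothetical, and sending `SIGINT` through `Popen.send_signal` is assumed to run on a POSIX system (on Windows that call raises `ValueError` for this signal).

```python
import signal
import subprocess
import sys
from typing import Optional


def stop_process(process: subprocess.Popen, grace: float = 1.0) -> Optional[int]:
    """Give the process a chance to exit, then escalate SIGINT -> SIGKILL.

    Hypothetical helper; assumes the "exit" notification was already sent.
    Returns the exit code of the process.
    """
    try:
        # Step 1: wait for the process to stop itself.
        return process.wait(grace)
    except subprocess.TimeoutExpired:
        pass
    try:
        # Step 2: ask politely; the process can still run its handlers.
        process.send_signal(signal.SIGINT)
        return process.wait(grace)
    except subprocess.TimeoutExpired:
        pass
    # Step 3: last resort. Still wait afterwards to avoid a zombie process.
    process.kill()
    return process.wait()


if __name__ == "__main__":
    # A child that would otherwise sleep for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exit code:", stop_process(child, grace=0.5))
```

With this shape, `SIGKILL` is only reached when both the voluntary exit and the `SIGINT` attempt have timed out, so a well-behaved server never sees it.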
When the server is closed gracefully, we should be waiting at least 1 second after notifying with “exit”. Does this not happen? |
See Line 119 in 82120a8
|
Oh, interesting 👀 , I must have missed that while reading through what appeared to be relevant code here Lines 175 to 176 in 988cd02
Lines 224 to 236 in 988cd02
I do not believe this is happening, but I'm also having a hard time verifying it purely from the server side, as catching SIGKILL is impossible. I can try to add some debugging around the line you mentioned and see if the plugin is hitting it before sending the signal. |
In the case of ST app shutdown, we are given a blocking exit handler, so unfortunately we must kill all subprocesses immediately (although I have a few ideas around this). Graceful shutdown of language servers happens when you close the last document applicable to an LS. After 3 seconds we do the shutdown-exit dance. |
I see, FWIW I was able to reproduce this on Darwin by closing the window via I can try to reproduce by leaving the window open, but closing all tabs then. |
Even in case of the 3 second timeout after closing last relevant tab it doesn't seem to shut down the server gracefully:
|
I downloaded the latest terraform-ls binary and am using this config:
"terraform": {
    "command": [
        "$home/Downloads/terraform-ls", "serve"
    ],
    "selector": "source.terraform",
    "enabled": true
},
and used this hello world example: https://github.com/gruntwork-io/terratest/tree/master/examples/terraform-hello-world-example
Using the following diff:
diff --git a/plugin/core/transports.py b/plugin/core/transports.py
index 542b4b4..d2768e6 100644
--- a/plugin/core/transports.py
+++ b/plugin/core/transports.py
@@ -1,4 +1,4 @@
-from .logging import exception_log, debug
+from .logging import exception_log, debug, trace
 from .types import TCP_CONNECT_TIMEOUT
 from .types import TransportConfig
 from .typing import Dict, Any, Optional, IO, Protocol, List, Callable, Tuple
@@ -113,23 +113,33 @@ class JsonRpcTransport(Transport):
     def _end(self, exception: Optional[Exception]) -> None:
         exit_code = 0
+        trace()
         if not exception:
             try:
                 # Allow the process to stop itself.
+                trace()
                 exit_code = self._process.wait(1)
+                self._process = None
+                trace()
             except (AttributeError, ProcessLookupError, subprocess.TimeoutExpired):
+                trace()
                 pass
         if self._process:
             try:
                 # The process didn't stop itself. Terminate!
+                trace()
                 self._process.kill()
                 # still wait for the process to die, or zombie processes might be the result
                 # Ignore the exit code in this case, it's going to be something non-zero because we sent SIGKILL.
                 self._process.wait()
+                trace()
             except (AttributeError, ProcessLookupError):
+                trace()
                 pass
             except Exception as ex:
+                trace()
                 exception = ex  # TODO: Old captured exception is overwritten
+        trace()
         def invoke() -> None:
             callback_object = self._callback_object()
the following is printed to the console after three seconds after closing the last applicable tab:
It's not visible in text but I can also see a one second waiting period between the trace of line 120 and line 125. So everything seems to be correct from the client's POV. I do notice that, even running Would it perhaps be an idea to do all of the tear down in the |
Also, according to the spec:
so you really have all of your logging data at the point of receiving the shutdown request, as any other request/notification of the client is considered invalid. |
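To make the server-side expectation concrete, here is a small sketch of the lifecycle rules as I understand them from the LSP specification: teardown happens while handling `shutdown`, further requests after `shutdown` are answered with an `InvalidRequest` error (code -32600), and `exit` should end the process with code 0 if `shutdown` was received before and 1 otherwise. The class and method names are hypothetical, and the logic is written in Python only to match the diff above; it is not how terraform-ls is implemented.

```python
from typing import Any, Callable, Dict, Optional

INVALID_REQUEST = -32600  # JSON-RPC 2.0 "Invalid Request" error code


class ServerLifecycle:
    """Hypothetical sketch of shutdown/exit handling on the server side."""

    def __init__(self, teardown: Callable[[], None]) -> None:
        self._teardown = teardown
        self.shutdown_received = False
        self.exit_code: Optional[int] = None  # set once "exit" arrives

    def handle(self, message: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        method = message.get("method")
        if method == "exit":
            # Nothing slow should happen here: only pick the exit code.
            self.exit_code = 0 if self.shutdown_received else 1
            return None
        if self.shutdown_received:
            # After shutdown, requests are invalid; notifications are dropped.
            if "id" in message:
                return {"jsonrpc": "2.0", "id": message["id"],
                        "error": {"code": INVALID_REQUEST,
                                  "message": "server is shutting down"}}
            return None
        if method == "shutdown":
            # All expensive cleanup (flushing logs, writing profiles) goes here.
            self._teardown()
            self.shutdown_received = True
            return {"jsonrpc": "2.0", "id": message["id"], "result": None}
        return None  # handling of all other methods is omitted in this sketch
```

A server loop built this way would call `sys.exit(lifecycle.exit_code)` as soon as `exit_code` is set, which keeps the work done after `exit` close to zero and makes a short client-side grace period sufficient.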
Describe the bug
I was trying to debug a memory consumption issue in terraform-ls, and typically the memory profile is written upon (graceful) exit of the server. I realized that the memory profile is never being created because the server is just killed (i.e. SIGKILL-ed), so it has no time to do anything.
It appears that the client (LSP plugin) does send the exit notification as expected, but (contrary to the LSP spec) doesn't wait for the server to exit gracefully, or even to start processing that notification.
Here I admit the spec can be a little bit vague, but it does say:
The way I'd interpret that is "client asks and expects server to exit", whereas the implementation seems to be "client asks, but doesn't expect the server to exit".
I am aware that most of the cleanup should be done as part of shutdown, and in the case of terraform-ls it indeed is.
From the client's perspective it's sensible to expect exit to complete within a very short time period, as all cleanup that takes time should already be finished by then. However, it seems to me the server is not given any time to do anything, so it practically never exits with 0.
Sublime Text LSP
VS Code (Terraform extension using the official LSP client library)
Admittedly the implementation there also doesn't seem to follow the spec exactly, as it first sends EOF and then never gets to send the exit notification. However, with the EOF the server is at least given a chance to stop itself gracefully.
To Reproduce
Steps to reproduce the behavior:
terraform-ls
tail -f /path/to/log-file
*.tf files in
shutdown request and exit notification
Expected behavior
Server is given some short time to exit itself and only if it doesn't exit, SIGINT is attempted and only then SIGKILL.
Environment (please complete the following information):
Additional context
This is a snippet of my settings for the LS: