Add on_disconnect scopeguard for cancelled requests #965
Conversation
Thanks!
dropshot/src/server.rs (outdated)
// In the case the client disconnects early, the scopeguard allows us
// to perform extra housekeeping before this task is dropped.
let on_disconnect = guard((), |_| {
    warn!(request_log, "client disconnected before response returned");
Being really nitty, but that's because these log messages are most often used when debugging really tricky issues, so I want to make sure they're precise about what they say:
Suggested change:
- warn!(request_log, "client disconnected before response returned");
+ warn!(request_log, "request handling cancelled (client disconnected)");
Could you add latency_us to this log message as well? Similar to L840:

let latency_us = start_time.elapsed().as_micros();

and then include that in the log message.
Not nitty at all, makes perfect sense. Done 👍
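For context, a minimal sketch of how the guard might look with both suggestions applied; the function wrapper and variable names here are illustrative, not the exact code in dropshot/src/server.rs:

use scopeguard::{guard, ScopeGuard};
use slog::warn;
use std::time::Instant;

async fn handle_with_disconnect_guard(request_log: slog::Logger, start_time: Instant) {
    // The closure runs only if this future is dropped before the guard is
    // defused, i.e. the client disconnected while the handler was running.
    let on_disconnect = guard((), |_| {
        let latency_us = start_time.elapsed().as_micros() as u64;
        warn!(
            request_log,
            "request handling cancelled (client disconnected)";
            "latency_us" => latency_us
        );
    });

    // ... run the actual handler here ...

    // Handler finished normally: defuse the guard so the warning is not logged.
    let _ = ScopeGuard::into_inner(on_disconnect);
}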
let result = handler.handle_request(rqctx, request).await;

// If this send fails, our spawning task has been cancelled in
// the `rx.await` below; log such a result.
if tx.send(result).is_err() {
I'd like to keep a log message here to help debug cross-system issues. (It can be useful to know when the server finished doing something, even if the client long ago gave up, so you can put together a timeline of events.) How about something like "request completed after handler was already cancelled"? I wonder if we can include the result here, too -- status code and/or error message?
Sounds good, I've kept the log and added the status code and error messages to it.
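A rough sketch of what that could look like; the result and channel types here are assumptions based on the discussion above, not necessarily the shape of the merged code:

use dropshot::HttpError;
use hyper::{Body, Response};
use slog::warn;
use tokio::sync::oneshot;

fn log_result_after_cancellation(
    request_log: &slog::Logger,
    tx: oneshot::Sender<Result<Response<Body>, HttpError>>,
    result: Result<Response<Body>, HttpError>,
) {
    // If the receiver side was dropped, the spawning task (and the client)
    // is gone; still record what the handler produced so a timeline of
    // events can be reconstructed later.
    if let Err(result) = tx.send(result) {
        match result {
            Ok(response) => warn!(
                request_log,
                "request completed after handler was already cancelled";
                "response_code" => response.status().as_u16()
            ),
            Err(error) => warn!(
                request_log,
                "request completed after handler was already cancelled";
                "error" => ?error
            ),
        }
    }
}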
Sorry for the delay on this -- I was out for a while and had a lot to catch up on. This PR looks promising and I think it obviates #958 so let's just do this one?

Okay, this is looking great. Now I'm just wondering about testing. We said it'd be hard to do automated tests for the logs and DTrace probes. Were you able to verify at least the log stuff by hand? Any chance you've been able to try out the DTrace probe?
Heyo! Sorry for the delay, I got busy debugging some of this. Trying to test this programmatically from the tests (if only just to make sure it works) made me realise that the

This is very racy and I'm struggling to write a deterministic test for it. I don't mind trying to resolve this issue, but I believe it might not be important either. Let me know what the preferred way of solving this (if any) is! (A rough sketch of the client-side drop involved follows below.)

Hyper logs:

Successful connection drop before handler completion:

Unsuccessful connection drop before handler completion:
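A minimal sketch of this kind of client-side connection drop (the endpoint, port, and timings are made up; this is not the PR's test code):

use std::time::Duration;

#[tokio::main]
async fn main() {
    // Assumes a dropshot server is listening locally with a handler that
    // takes longer than the timeout below (e.g. the sleep added in the
    // basic.rs diff later in this thread).
    let client = hyper::Client::new();
    let uri: hyper::Uri = "http://127.0.0.1:12345/counter".parse().unwrap();

    // Dropping the in-flight request future closes the connection, which is
    // what should trigger the on_disconnect scopeguard on the server side.
    let _ = tokio::time::timeout(Duration::from_millis(100), client.get(uri)).await;

    // Whether the server notices the drop before the handler finishes is the
    // race described above; give it a moment before exiting.
    tokio::time::sleep(Duration::from_secs(1)).await;
}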
That test is exercising

I was not assuming that we would write automated tests for this, mostly because it's annoying to verify the log contents and USDT probes that get fired, but this wouldn't be a bad thing to do! If you did, I think you'd want to look at
All makes sense, I did not mean to add any automated tests for this use case. I was making some changes locally to test whether the

I was able to confirm that
I did some quick checking by applying this diff:

dap@ivanova dropshot $ git diff
diff --git a/dropshot/examples/basic.rs b/dropshot/examples/basic.rs
index dbd7ea4..28720db 100644
--- a/dropshot/examples/basic.rs
+++ b/dropshot/examples/basic.rs
@@ -86,6 +86,8 @@ async fn example_api_get_counter(
) -> Result<HttpResponseOk<CounterValue>, HttpError> {
let api_context = rqctx.context();
+ tokio::time::sleep(std::time::Duration::from_secs(5)).await;
+
Ok(HttpResponseOk(CounterValue {
counter: api_context.counter.load(Ordering::SeqCst),
}))

I ran it on Helios like this:
I got the pid with
and I did two tests:
I'd expect to see:
Indeed, here's the log output:
and here's the DTrace output:
Now, that example uses the default handler task mode, so I also tried it with this diff:

dap@ivanova dropshot $ git diff
diff --git a/dropshot/examples/basic.rs b/dropshot/examples/basic.rs
index dbd7ea4..38b14e0 100644
--- a/dropshot/examples/basic.rs
+++ b/dropshot/examples/basic.rs
@@ -6,6 +6,7 @@ use dropshot::ApiDescription;
use dropshot::ConfigDropshot;
use dropshot::ConfigLogging;
use dropshot::ConfigLoggingLevel;
+use dropshot::HandlerTaskMode;
use dropshot::HttpError;
use dropshot::HttpResponseOk;
use dropshot::HttpResponseUpdatedNoContent;
@@ -24,7 +25,10 @@ async fn main() -> Result<(), String> {
// since it's available and won't expose this server outside the host. We
// request port 0, which allows the operating system to pick any available
// port.
- let config_dropshot: ConfigDropshot = Default::default();
+ let config_dropshot = ConfigDropshot {
+ default_handler_task_mode: HandlerTaskMode::CancelOnDisconnect,
+ ..Default::default()
+ };
// For simplicity, we'll configure an "info"-level logger that writes to
// stderr assuming that it's a terminal.
@@ -86,6 +90,8 @@ async fn example_api_get_counter(
) -> Result<HttpResponseOk<CounterValue>, HttpError> {
let api_context = rqctx.context();
+ tokio::time::sleep(std::time::Duration::from_secs(5)).await;
+
Ok(HttpResponseOk(CounterValue {
counter: api_context.counter.load(Ordering::SeqCst),
}))

Here we get this output for the same two tests:
I think all this matches what I'd expect, so I'm going to land this.

Thanks again for doing this!

Thanks for taking the time to review this!
This PR should resolve #914 and remove the need for the change in my original PR as well (#958).