Skip to content

Commit

Permalink
litep2p: Improve metric reasons for RequestResponse errors (#5077)
Browse files Browse the repository at this point in the history
This PR improves the metrics reported by litep2p on request-response
errors.

Discovered while investigating:
- #4985


We are experiencing many requests that are `Refused` by litep2p in
comparison with libp2p.
The metric roughly approximates the sum of other reasons from libp2p.
This PR aims to provide more insights.

```
{__name__="substrate_sub_libp2p_requests_out_failure_total", chain="ksmcc3", instance="localhost:9615", job="substrate_node", protocol="/b0a8d493285c2df73290dfb7e61f870f17b41801197a149ca93654499ea3dafe/sync/2", reason="Remote has closed the substream before answering, thereby signaling that it considers the request as valid, but refused to answer it."}

    Last *: 3365
    Min: 3363
    Max: 3365
    Mean: 3365
    
    
{__name__="substrate_sub_libp2p_requests_out_failure_total", chain="ksmcc3", instance="localhost:9615", job="substrate_node", protocol="/b0a8d493285c2df73290dfb7e61f870f17b41801197a149ca93654499ea3dafe/beefy/justifications/1", reason="Remote has closed the substream before answering, thereby signaling that it considers the request as valid, but refused to answer it."}

    Last *: 3461
    Min: 3461
    Max: 3461
    Mean: 3461
```

Part of:
- #4681

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
  • Loading branch information
lexnv authored Jul 22, 2024
1 parent d1979d4 commit 623b68e
Showing 1 changed file with 12 additions and 10 deletions.
22 changes: 12 additions & 10 deletions substrate/client/network/src/litep2p/shim/request_response/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ use crate::{
peer_store::PeerStoreProvider,
request_responses::{IncomingRequest, OutgoingResponse},
service::{metrics::Metrics, traits::RequestResponseConfig as RequestResponseConfigT},
IfDisconnected, ProtocolName, RequestFailure,
IfDisconnected, OutboundFailure, ProtocolName, RequestFailure,
};

use futures::{channel::oneshot, future::BoxFuture, stream::FuturesUnordered, StreamExt};
Expand Down Expand Up @@ -369,10 +369,12 @@ impl RequestResponseProtocol {
return
};

let error = match error {
RequestResponseError::NotConnected => Some(RequestFailure::NotConnected),
RequestResponseError::Rejected | RequestResponseError::Timeout =>
Some(RequestFailure::Refused),
let status = match error {
RequestResponseError::NotConnected =>
Some((RequestFailure::NotConnected, "not-connected")),
RequestResponseError::Rejected => Some((RequestFailure::Refused, "rejected")),
RequestResponseError::Timeout =>
Some((RequestFailure::Network(OutboundFailure::Timeout), "timeout")),
RequestResponseError::Canceled => {
log::debug!(
target: LOG_TARGET,
Expand All @@ -387,7 +389,7 @@ impl RequestResponseProtocol {
"{}: tried to send too large request to {peer:?} ({request_id:?})",
self.protocol,
);
Some(RequestFailure::Refused)
Some((RequestFailure::Refused, "payload-too-large"))
},
RequestResponseError::UnsupportedProtocol => match fallback_request {
Some((request, protocol)) => match self.request_tx.get(&protocol) {
Expand Down Expand Up @@ -426,15 +428,15 @@ impl RequestResponseProtocol {
peer,
);

Some(RequestFailure::Refused)
Some((RequestFailure::Refused, "invalid-fallback-protocol"))
},
},
None => Some(RequestFailure::Refused),
None => Some((RequestFailure::Refused, "unsupported-protocol")),
},
};

if let Some(error) = error {
self.metrics.register_outbound_request_failure(error.to_string().as_ref());
if let Some((error, reason)) = status {
self.metrics.register_outbound_request_failure(reason);
let _ = tx.send(Err(error));
}
}
Expand Down

0 comments on commit 623b68e

Please sign in to comment.