Consider token as good when node doesn't support trace calls #3002

Merged
fleupold merged 5 commits into main from bad_token_detection_non_tracing_node
Sep 20, 2024

Conversation

fleupold
Contributor

Description

Not all of our fallback nodes support the trace_call API. While we can configure the tracing URL separately, in case of an outage the API may be unable to classify any token as good, which effectively prevents trading.

Changes

  • If the trace_many call fails due to a TransportError (which is what happens if you call an unsupported method), consider the token as good.
  • Add tracing spans with token address context for any logs that happen during detection.

How to test

cargo test --package shared --lib -- bad_token::trace_call::tests::mainnet_tokens --exact --show-output --ignored

Run this against both a node that supports the trace call API and one that does not. Observe that without support, all tokens are classified as good.

@fleupold fleupold requested a review from a team as a code owner September 18, 2024 08:48
Contributor

@MartinquaXD MartinquaXD left a comment

Change makes sense to me. I'm mostly worried about the seemingly duplicate .instrument() calls as I think the instrumentation should happen inside the token detector itself and not in its callers.

@@ -434,6 +435,10 @@ impl OrderValidating for OrderValidator {
if let TokenQuality::Bad { reason } = self
.bad_token_detector
.detect(token)
.instrument(tracing::info_span!(
Contributor

I'm a bit surprised by all these instrumented futures and think these could lead to duplicated tracing spans. Wouldn't it be sufficient to instrument only the inner.detect() calls inside the CachingDetector?

Contributor Author

This would assume we always use a CachingDetector in our estimating chain (which is true now but might change, and we don't want the observability to change in that case). I'm fine with this, but I thought the callsite that creates the outmost

Contributor Author

Actually, I found InstrumentedBadTokenDetector which lends itself perfectly for this.
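
To illustrate the idea of instrumenting once in a wrapper instead of at every call site, here is a minimal sketch. It is not the actual InstrumentedBadTokenDetector: the trait, `DenyList`, and `InstrumentedDetector` are hypothetical stand-ins, the code is synchronous, and an in-memory log replaces real tracing spans.

```rust
use std::cell::RefCell;

#[derive(Debug, Clone, PartialEq)]
enum TokenQuality {
    Good,
    Bad { reason: String },
}

/// Simplified, synchronous stand-in for the detector trait (assumption).
trait BadTokenDetecting {
    fn detect(&self, token: &str) -> TokenQuality;
}

/// Hypothetical inner detector that flags tokens on a deny list.
struct DenyList(Vec<String>);

impl BadTokenDetecting for DenyList {
    fn detect(&self, token: &str) -> TokenQuality {
        if self.0.iter().any(|t| t == token) {
            TokenQuality::Bad { reason: "deny listed".to_string() }
        } else {
            TokenQuality::Good
        }
    }
}

/// Wrapper that records token context for every detection, so callers
/// do not have to add it themselves and nothing is recorded twice.
struct InstrumentedDetector<D> {
    inner: D,
    log: RefCell<Vec<String>>,
}

impl<D> InstrumentedDetector<D> {
    fn new(inner: D) -> Self {
        Self { inner, log: RefCell::new(Vec::new()) }
    }
}

impl<D: BadTokenDetecting> BadTokenDetecting for InstrumentedDetector<D> {
    fn detect(&self, token: &str) -> TokenQuality {
        let quality = self.inner.detect(token);
        // Stand-in for entering a span that carries the token address.
        self.log
            .borrow_mut()
            .push(format!("token={token} quality={quality:?}"));
        quality
    }
}
```

Because the wrapper implements the same trait as the detector it wraps, it can sit at the outermost layer regardless of whether a caching layer is present underneath.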

@@ -20,17 +21,15 @@ pub async fn trace_many(requests: Vec<CallRequest>, web3: &Web3) -> Result<Vec<B
serde_json::to_value(vec![TraceType::Trace])?,
])
})
- .collect::<Result<Vec<_>>>()?;
+ .collect::<Result<Vec<_>>>()
+ .map_err(|e| Error::Decoder(e.to_string()))?;
Contributor

Looks like we can only hit this .map_err() if our JSON serialization fails so using the Error::Decoder variant seems odd. OTOH web3 doesn't offer a more suitable error variant anyway.
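
As a rough illustration of why only the per-item conversion can reach that `.map_err()`, here is a small sketch. The local `Error` enum is a stand-in with a single `Decoder` variant, and string parsing stands in for JSON serialization; neither is the real web3 code.

```rust
/// Local stand-in for the error type; only the variant discussed here.
#[derive(Debug, PartialEq)]
enum Error {
    Decoder(String),
}

/// Converts every input, stopping at the first failure. The only way to
/// reach `map_err` is for one of the per-item conversions to fail.
fn convert_all(inputs: &[&str]) -> Result<Vec<u64>, Error> {
    inputs
        .iter()
        .map(|s| s.parse::<u64>())
        // Collecting into `Result` short-circuits on the first `Err`.
        .collect::<Result<Vec<_>, _>>()
        // Re-label the conversion error as the closest available variant.
        .map_err(|e| Error::Decoder(e.to_string()))
}
```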

.context("trace_many")?;
let traces = match trace_many::trace_many(request, &self.web3).await {
Ok(result) => result,
Err(web3::Error::Transport(e)) => {
Contributor

@sunce86 sunce86 Sep 18, 2024

How do we know we want to catch this type of Error? For example, I would rather expect RPCError with ErrorCode::MethodNotFound (-32601)

Contributor

@sunce86 sunce86 Sep 18, 2024

If you received a Transport error in practice instead, this might mean the specific node provider handles this error in an unexpected way, so this code change would be effective only for that node provider and not for the rest.
In that case I would suggest also adding the Rpc variant and handling it the same way as Err(web3::Error::Transport(e))

Contributor Author

Very good point. This seems to be Alchemy-specific behavior (Nodereal returns the response code you expected). Will add this case.
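
A minimal sketch of the combined handling discussed in this thread, assuming both failure shapes should count as "tracing not supported". `NodeError` is a local stand-in for the relevant `web3::Error` variants, and `tracing_unsupported` and `quality_from_trace` are hypothetical helper names, not the merged code.

```rust
/// Local stand-in for the error variants discussed above (assumption).
#[derive(Debug, PartialEq)]
#[allow(dead_code)]
enum NodeError {
    /// Transport-level failure, as observed for the unsupported method.
    Transport(String),
    /// JSON-RPC error response with its numeric code.
    Rpc { code: i64 },
    /// Any other failure.
    Other(String),
}

#[derive(Debug, PartialEq)]
enum TokenQuality {
    Good,
    Bad { reason: String },
}

/// JSON-RPC 2.0 code for "Method not found".
const METHOD_NOT_FOUND: i64 = -32601;

/// Returns true when the error suggests the node cannot serve trace calls.
/// Note that every transport error is treated this way, which is broad.
fn tracing_unsupported(err: &NodeError) -> bool {
    match err {
        NodeError::Transport(_) => true,
        NodeError::Rpc { code } => *code == METHOD_NOT_FOUND,
        NodeError::Other(_) => false,
    }
}

/// Maps a trace outcome to a token quality. `Ok(true)` means the trace
/// looked fine; `Ok(false)` means it did not.
fn quality_from_trace(result: Result<bool, NodeError>) -> Result<TokenQuality, NodeError> {
    match result {
        Ok(true) => Ok(TokenQuality::Good),
        Ok(false) => Ok(TokenQuality::Bad {
            reason: "trace showed unexpected behavior".to_string(),
        }),
        // Fall back to "good" so an unsupported node does not block trading.
        Err(err) if tracing_unsupported(&err) => Ok(TokenQuality::Good),
        Err(err) => Err(err),
    }
}
```

Errors that do not indicate missing trace support are still propagated, so genuine failures remain visible to the caller.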

Contributor

@m-lord-renkse m-lord-renkse left a comment

LGTM. I am also concerned about the duplicate spans that @MartinquaXD mentioned.

@fleupold fleupold merged commit 185586d into main Sep 20, 2024
11 checks passed
@fleupold fleupold deleted the bad_token_detection_non_tracing_node branch September 20, 2024 15:17
@github-actions github-actions bot locked and limited conversation to collaborators Sep 20, 2024
4 participants