Meta Issue: Fixing high impact correctness and performance problems in ETH RPC API for snapshot synced nodes #12293
Comments
For any of these issues that have a way to reproduce, it would be great to coordinate and add the underlying RPC call to the RPC benchmark tool that Fil-B has been maintaining: https://github.com/fil-builders/benchmark-rpc/blob/main/pages/index.js#L21 For example, issue #10940 should be easy to reproduce in a live test. I created a ticket, FIL-Builders/benchmark-rpc#1, to add it to the web app at http://benchmark-rpc.fil.builders/ |
msgindex is @ Line 647 in 718fc03
It at least has a pattern we can follow for others. But it also overlaps with a backfill operation, so we may end up taking care of snapshot import with a general backfill routine if we get that right. |
This list seems pretty complete. IMO, the highest priority is fixing the indexing issues:
IMO, the best way to handle this case is the dance we discussed on the call:
This will miss uncles, but I took a look at how geth handles stuff like this and... they also appear to index asynchronously and handle this case by returning an error if the node is currently indexing a block. That's not a terrible option... but it would be a larger breaking change. |
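A minimal sketch of the "return an error while indexing" option mentioned above, in Go. Everything here is hypothetical: the `EventIndex` type, its methods, and the two sentinel errors are illustrative names, not the Lotus or geth implementation. The point is only that a lookup can distinguish three cases: indexed (possibly with zero events), seen but still being indexed, and never seen.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Hypothetical sentinel errors; not actual Lotus or geth error values.
var (
	ErrStillIndexing = errors.New("block is still being indexed, retry later")
	ErrUnknownBlock  = errors.New("block not known to this node")
)

// EventIndex is a toy in-memory index keyed by a block hash string.
type EventIndex struct {
	mu      sync.RWMutex
	pending map[string]bool     // blocks seen by the node but not yet indexed
	events  map[string][]string // fully indexed blocks (value may be empty)
}

func NewEventIndex() *EventIndex {
	return &EventIndex{
		pending: make(map[string]bool),
		events:  make(map[string][]string),
	}
}

// MarkSeen records that a block arrived and is queued for indexing.
func (ix *EventIndex) MarkSeen(hash string) {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	if _, done := ix.events[hash]; !done {
		ix.pending[hash] = true
	}
}

// Finish stores the events for a block and clears its pending flag.
func (ix *EventIndex) Finish(hash string, evs []string) {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	ix.events[hash] = evs
	delete(ix.pending, hash)
}

// Lookup returns the events for an indexed block, ErrStillIndexing for a
// block that is queued but not yet indexed, and ErrUnknownBlock otherwise.
func (ix *EventIndex) Lookup(hash string) ([]string, error) {
	ix.mu.RLock()
	defer ix.mu.RUnlock()
	if evs, ok := ix.events[hash]; ok {
		return evs, nil
	}
	if ix.pending[hash] {
		return nil, ErrStillIndexing
	}
	return nil, ErrUnknownBlock
}

func main() {
	ix := NewEventIndex()
	ix.MarkSeen("0xabc")
	_, err := ix.Lookup("0xabc")
	fmt.Println("while indexing:", err)

	ix.Finish("0xabc", nil)
	evs, err := ix.Lookup("0xabc")
	fmt.Println("after indexing:", len(evs), err)
}
```

A caller that receives `ErrStillIndexing` can retry, whereas an empty result with a nil error means the block was processed and has no events.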
This is a great overview @aarshkshah1992 - thanks for writing it up. A few questions, some of which are coming from a newbie/ignorant-of-the-code perspective. I'm happy to chat on any of these elsewhere or offline, but figured to ask here so it's public.
|
Let me look into 2. |
That depends on what we do with the API. If F3 is "fast enough", we could just not expose anything after finality. But... that's probably not going to work well.
I agree there's no reason to keep them separate. |
In the first pass of this work, we're not going to work on merging the DBs for these as that is a larger refactor and will need a non-trivial migration for users and we've not estimated it yet. Let's get to it once we've fixed all the other problems here. |
For visibility, it was decided that it would be useful to merge the DBs into a single DB. The work is happening in #12421 |
@rvagg : how much of this is still relevant? Can we close out any of the linked items? |
We're not actively getting complaints or input on performance related to these so I think we can close and open new issues as things arise. |
This issue aims to be a meta-issue to capture and track work that needs to be done to enhance correctness, performance, and stability of the ETH RPC API on snapshot synced nodes. Note that improving performance for ETH RPC API on archival nodes is out of scope for this issue and will be addressed by a future issue.
Our goal is to improve the developer experience (DX) for key partners, including:
Correctness and data availability issues in the chain state Indexes used by the ETH RPC API
Currently, we maintain three primary indices on the chain state, which are essential for both correctness and performance of multiple ETH RPC APIs.
Transaction Index
Message Index
Event Index
All of the above indices suffer from some or all of the following problems that need to be fixed:
The lotus-shed backfilling CLI that users rely on to manually backfill the indices is broken: all of the indices are persisted in SQLite, and SQLite supports only a single writer at a time. In practice, this means that backfilling races with the indexing of new/ongoing state transitions.
Correctness problems in the ETH Events API
Events: TOCTOU Race when subscribing to new events #12111 - Event Filter APIs have raciness that can return incorrect results.
The block hash does not match #10911 - Mismatch between the block hash returned by ETH Get Block API and the block hash returned by the ETH Events API. This one could have been caused by a re-org but a solid itest to verify that this is no longer a problem would be great.
eth_getFilterChanges returns "filter not found" #11589 and Reject Eth subscriptions & filters through the gateway over HTTP #11153 - Event Filter APIs should work with the HTTP Gateway as expected by ETH tooling.
Eth RPC: EthGetLogs should return explicit error when queried for not-existing block hash on all providers #10940 - eth_getLogs should differentiate between "processed the block, but it has no events" and "never seen this block". We already have the required scaffolding and metadata in place, but we need to fix the error handling here and write some solid tests.
In-memory block caching for perf improvements
Multiple ETH RPC APIs frequently need to look up Filecoin tipsets and convert them to the corresponding Ethereum block representations. These lookups are performed on the chainstore, which is expensive. We should cache these tipsets/blocks in an LRU cache. See #10520.
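A rough sketch of what such a cache could look like, in Go using only the standard library. `TipSetKey`, `EthBlock`, and `BlockCache` are hypothetical stand-ins, not the actual Lotus types, and the real change may well use an existing LRU library instead. The sketch keys entries by tipset key rather than by height, on the assumption that this avoids serving a stale block for a height after a re-org.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// TipSetKey and EthBlock are hypothetical stand-ins for the real types.
type TipSetKey string

type EthBlock struct {
	Number uint64
	Hash   string
}

type entry struct {
	key   TipSetKey
	block EthBlock
}

// BlockCache is a fixed-capacity, concurrency-safe LRU cache.
type BlockCache struct {
	mu       sync.Mutex
	capacity int
	order    *list.List // front = most recently used
	items    map[TipSetKey]*list.Element
}

func NewBlockCache(capacity int) *BlockCache {
	if capacity < 1 {
		capacity = 1
	}
	return &BlockCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[TipSetKey]*list.Element),
	}
}

// Get returns the cached block and marks it as most recently used.
func (c *BlockCache) Get(k TipSetKey) (EthBlock, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[k]
	if !ok {
		return EthBlock{}, false
	}
	c.order.MoveToFront(el)
	return el.Value.(entry).block, true
}

// Put inserts or updates a block, evicting the least recently used
// entry when the cache grows past its capacity.
func (c *BlockCache) Put(k TipSetKey, b EthBlock) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[k]; ok {
		el.Value = entry{key: k, block: b}
		c.order.MoveToFront(el)
		return
	}
	c.items[k] = c.order.PushFront(entry{key: k, block: b})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(entry).key)
	}
}

// Len reports the number of cached entries.
func (c *BlockCache) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.order.Len()
}

func main() {
	cache := NewBlockCache(2)
	cache.Put("ts-a", EthBlock{Number: 1, Hash: "0xaa"})
	cache.Put("ts-b", EthBlock{Number: 2, Hash: "0xbb"})
	cache.Get("ts-a")                                    // ts-a becomes most recently used
	cache.Put("ts-c", EthBlock{Number: 3, Hash: "0xcc"}) // evicts ts-b
	_, ok := cache.Get("ts-b")
	fmt.Println("ts-b still cached:", ok)
}
```

On a cache miss, the caller would fall back to the chainstore lookup and conversion, then `Put` the result so that subsequent requests for the same tipset skip that work.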
Miscellaneous correctness bugs from the backlog
getBlock return an Error #10909 - eth_getBlock does not conform to the ETH RPC spec for Filecoin null rounds (null rounds are a quirk in Filecoin and need to be handled correctly here).
Eth API: Trace installed bytecode in the Ethereum JSON-RPC trace_block output #11635 - The ETH Trace API currently fails to include the bytecode of the deployed smart contract in the trace output for transactions that deploy smart contracts. IIRC, Blockscout really needed this to be able to show the contract bytecode on their explorer.
EthGetTransactionCount ignores input #10357 - Correctness bug in eth_getTransactionCount.