[PROPOSAL] Event indexer counters #2965

squadgazzz · 2024-09-10T12:07:42Z

Description

During one of the bugfixes(#2931) it was noticed that some of the event handlers might spend a lot of time and resources on each restart due to the absence of the corresponding items in the DB. In that case, preconfigured start block numbers are used that, in most cases, are too old. As a result, on each restart, a background task might fetch millions of blocks on networks with a short round time(such as arbitrum one).

Changes

As a proposal, store indexer counters in a separate table each time a portion of new blocks are processed. It is currently impossible to use existing append_events function since it operates with pre-filtered events only that can never happen if we don't have a single event for the specified type on the chain.

Driver crate doesn't have access to the DB currently, which will be required in case this proposal is accepted.

github-actions · 2024-09-10T12:09:46Z

Reminder: Please update the DB Readme.

Caused by:

database/sql/V071__create_event_indexer_counters.sql

MartinquaXD · 2024-09-11T10:23:29Z

I think the idea of storing which block was indexed (even if it didn't contain an event we are interested in) makes a lot of sense.
However, instead of solving this problem ourselves we should consider researching if this was not already solved better by other people.
Also the driver requiring access to the DB would not be great. But there are ways around this which we might have to implement sooner rather than later anyway.

squadgazzz · 2024-09-16T15:08:22Z

However, instead of solving this problem ourselves we should consider researching if this was not already solved better by other people.

We can use other indexers, such as The Graph, but introducing a new dependency and integrating it into the system seems like an overkill for the output of the whole activity.

Also the driver requiring access to the DB would not be great. But there are ways around this which we might have to implement sooner rather than later anyway.

The same applies to the persistence layer. Bringing more dependencies, even with lightweight/embedded solutions, seems like a much bigger change than adding a small table. For the driver specifically, either add a URL to the existing PG DB or a completely new embedded DB dependency with k8s volumes within will be storing a few counter, where I would still be more inclined with the former.

We can probably start with the PG table and, if we have time, integrate The Graph or an embedded DB.

Wdyt?

github-actions · 2024-09-24T00:08:00Z

This pull request has been marked as stale because it has been inactive a while. Please update this pull request or it will be automatically closed.

github-actions · 2024-10-02T00:08:14Z

This pull request has been marked as stale because it has been inactive a while. Please update this pull request or it will be automatically closed.

github-actions · 2024-10-10T00:07:57Z

This pull request has been marked as stale because it has been inactive a while. Please update this pull request or it will be automatically closed.

@squadgazzz

# Description Our event indexing logic currently can't handle some cases well during restarts. Let's say a contract we index doesn't emit an event for 1 month our current logic would fetch the last indexed event from that contract and continue indexing from there. That means on restarts we might index ~1 month (theoretically unbounded) of blocks before we can start building new auctions (we recently started to require the indexing logic to have seen the tip of the chain before we can start building auctions). Especially on fast chains like `arbitrum` this is a huge issue. Dusts off @squadgazzz's previous [PR](#2965) # Changes * introduced a new table `last_processed_blocks` which associates a block number for an arbitrary string * populated table with data for `settlements`, `ethflow_refunds`, and `onchain_orders` ## How to test - 1 database specific test - actually not sure how to best test this e2e. We could spawn an autopilot, index an event, mint a ton of empty blocks, and restart the autopilot to trigger the code we are interested in. But it's not super clear to me what would be a good assertion that can be tested with reasonable effort. 🤔 --------- Co-authored-by: ilya <ilya@cow.fi>

Event indexer counters

ce6ea6f

squadgazzz marked this pull request as ready for review September 10, 2024 12:23

squadgazzz requested a review from a team as a code owner September 10, 2024 12:23

github-actions bot added the stale label Sep 24, 2024

squadgazzz removed the stale label Sep 24, 2024

github-actions bot added the stale label Oct 2, 2024

squadgazzz removed the stale label Oct 2, 2024

github-actions bot added the stale label Oct 10, 2024

github-actions bot closed this Oct 17, 2024

MartinquaXD mentioned this pull request Oct 21, 2024

Track event indexing progress #3075

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PROPOSAL] Event indexer counters #2965

[PROPOSAL] Event indexer counters #2965

squadgazzz commented Sep 10, 2024 •

edited

Loading

github-actions bot commented Sep 10, 2024

MartinquaXD commented Sep 11, 2024

squadgazzz commented Sep 16, 2024

github-actions bot commented Sep 24, 2024

github-actions bot commented Oct 2, 2024

github-actions bot commented Oct 10, 2024

[PROPOSAL] Event indexer counters #2965

[PROPOSAL] Event indexer counters #2965

Conversation

squadgazzz commented Sep 10, 2024 • edited Loading

Description

Changes

github-actions bot commented Sep 10, 2024

MartinquaXD commented Sep 11, 2024

squadgazzz commented Sep 16, 2024

github-actions bot commented Sep 24, 2024

github-actions bot commented Oct 2, 2024

github-actions bot commented Oct 10, 2024

squadgazzz commented Sep 10, 2024 •

edited

Loading