
Track event indexing progress #3075

Merged
merged 10 commits into from
Oct 23, 2024
Conversation

MartinquaXD
Contributor

Description

Our event indexing logic currently handles some restart scenarios poorly. Suppose a contract we index hasn't emitted an event for a month: the current logic fetches the last indexed event from that contract and resumes indexing from there. That means on restart we might have to index roughly a month's worth of blocks (theoretically an unbounded number) before we can start building new auctions (we recently started requiring the indexing logic to have seen the tip of the chain before auctions can be built).
On fast chains like Arbitrum this is a huge issue.

Dusts off @squadgazzz's previous PR

Changes

  • introduced a new table, last_processed_blocks, which associates a block number with an arbitrary string key
  • populated the table with data for settlements, ethflow_refunds, and onchain_orders
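A sketch of what the new table might look like (hypothetical schema; the actual definition lives in database/sql/V073__track_event_indexing_progress.sql):

```sql
-- One row per indexer, keyed by an arbitrary string name.
-- bigint is used because Postgres has no unsigned 64-bit integer type.
CREATE TABLE last_processed_blocks (
    index_name   text PRIMARY KEY,
    block_number bigint NOT NULL
);

-- Seed rows for the existing indexers (values here are placeholders; the
-- migration would derive them from the already-indexed events).
INSERT INTO last_processed_blocks (index_name, block_number)
VALUES ('settlements', 0), ('ethflow_refunds', 0), ('onchain_orders', 0);
```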

How to test

  • 1 database-specific test
  • I'm actually not sure how best to test this e2e. We could spawn an autopilot, index an event, mine a ton of empty blocks, and restart the autopilot to trigger the code we are interested in. But it's not clear to me what would be a good assertion that can be tested with reasonable effort. 🤔

@MartinquaXD MartinquaXD requested a review from a team as a code owner October 21, 2024 11:09

Reminder: Please update the DB Readme.


Caused by:

database::last_processed_blocks::update(
    &mut ex,
    index_name,
    i64::try_from(last_block).context("new value of counter is not i64")?,
)
Contributor


Wouldn't it be better to store the last block as a u64 in the database instead of converting it to i64?

Contributor Author


Unfortunately Postgres doesn't support u64 natively. The alternative would be to use a bignumber type or something like that.

Contributor

@sunce86 sunce86 left a comment


Nothing blocking.

Although, personally, I wouldn't introduce the persist_last_processed_block function; I would rather adjust append_events and replace_events to save the last block atomically together with the events.

Mostly because of reorgs that could lower the block height, after which an autopilot restart could happen without last_processed_blocks having been updated.

So, bottom line: at the point in time when the new events are stored (via append or replace), those functions should also have the responsibility of updating last_processed_blocks to keep the db in a consistent state; this responsibility shouldn't be moved to the client.

self.update_events_from_latest_blocks(&event_range.latest_blocks, event_range.is_reorg)
    .await?;
self.store_mut()
    .persist_last_processed_block(last_block.0)
Contributor


nit: I'd move this call inside update_last_handled_blocks, since that function is where we update the local cache, so aligning the local cache and the db there makes sense.

database/sql/V073__track_event_indexing_progress.sql (outdated, resolved)
crates/database/src/lib.rs (outdated, resolved)
@MartinquaXD
Contributor Author

> Ah, I see now that we don't even save anything when we don't have events, too bad. So in order to do it my way, a non-trivial change would be needed in the EventStoring logic. Then better to leave it as is.

Yeah, that is why I continued on that path rather than adjusting append_events.

@sunce86
Contributor

sunce86 commented Oct 22, 2024

> Ah, I see now that we don't even save anything when we don't have events, too bad. So in order to do it my way, a non-trivial change would be needed in the EventStoring logic. Then better to leave it as is.

> Yeah, that is why I continued on that path rather than adjusting append_events.

I double-checked, and it appears we do after all have a list of checked blocks (not just blocks with events): https://github.com/cowprotocol/services/blob/main/crates/shared/src/event_handling.rs#L373-L377

Meaning, past_events_by_block_hashes is expected to return the full list of blocks from the input. So it should be safe to use range.last() as the last processed block.

@MartinquaXD
Contributor Author

MartinquaXD commented Oct 22, 2024

There is a code path here where we wouldn't know which latest_block to pass into append_events(). At that point we know the total range over which we index blocks, but the events come in chunks, so we can't just take the last block of the total range. We could read the block number of the last event of each chunk, but that field is optional, so we run into type issues.

Not sure if it's okay to make the meta field of the event non-optional. I assume it is probably never None, but that would require a change in ethcontract-rs, which probably has far-reaching consequences. :/

Edit: never mind, we wouldn't be able to store the underlying event anyway if we didn't know at which block it happened, so it makes sense to err if meta is None.

Edit 2: there seem to be quite a few places where replace_events is technically implemented incorrectly. Often it deletes all the events starting from the first block of the range. Also, replace_events() often calls append_events(), and then all bets are off in terms of only storing latest_block in append_events() either.
Given that the indexing logic appears quite complicated, I think I would prefer introducing the new function and simply calling it after the entire range of latest blocks has been processed, to avoid these weird issues.

@MartinquaXD MartinquaXD merged commit fad84f1 into main Oct 23, 2024
11 checks passed
@MartinquaXD MartinquaXD deleted the track-event-indexing-progress branch October 23, 2024 12:45
@github-actions github-actions bot locked and limited conversation to collaborators Oct 23, 2024