feat: add event entries count in validation API #12506
@@ -200,14 +206,30 @@ func (si *SqliteIndexer) verifyIndexedData(ctx context.Context, ts *types.TipSet
return xerrors.Errorf("failed to get next tipset for height %d: %w", ts.Height(), err)
}

// if non-reverted events exist which means that tipset `ts` has been executed, there should be 0 reverted events in the DB
@aarshkshah1992 I have moved this part to the top of the function: if the tipset has reverted events, we can return right away without loading all the messages.
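For anyone following the thread, here is a minimal sketch of the ordering described above: run the cheap reverted-events check first and load messages only if it passes. The `indexReader` interface, `fakeIndex` type, and `verifyTipset` function are hypothetical stand-ins for illustration, not the actual `SqliteIndexer` code.

```go
package main

import (
	"errors"
	"fmt"
)

// indexReader is a hypothetical stand-in for the indexer's prepared statements.
type indexReader interface {
	HasRevertedEvents(tipsetKeyCid string) (bool, error)
	LoadMessages(tipsetKeyCid string) ([]string, error)
}

// fakeIndex is an in-memory double that records how often messages are loaded.
type fakeIndex struct {
	reverted bool
	loads    int
}

func (f *fakeIndex) HasRevertedEvents(string) (bool, error) { return f.reverted, nil }

func (f *fakeIndex) LoadMessages(string) ([]string, error) {
	f.loads++
	return []string{"msg1", "msg2"}, nil
}

var errRevertedEvents = errors.New("reverted events found for an executed tipset")

// verifyTipset runs the cheap reverted-events check first and only loads
// messages when that check passes.
func verifyTipset(idx indexReader, tipsetKeyCid string) (int, error) {
	reverted, err := idx.HasRevertedEvents(tipsetKeyCid)
	if err != nil {
		return 0, fmt.Errorf("failed to check reverted events: %w", err)
	}
	if reverted {
		return 0, errRevertedEvents // early return: no messages are loaded
	}
	msgs, err := idx.LoadMessages(tipsetKeyCid)
	if err != nil {
		return 0, fmt.Errorf("failed to load messages: %w", err)
	}
	return len(msgs), nil
}

func main() {
	bad := &fakeIndex{reverted: true}
	_, err := verifyTipset(bad, "example-tipset-key")
	fmt.Println(err, "| message loads:", bad.loads)
}
```

With the check placed first, the failing case never touches the (potentially expensive) message-loading path.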
&ps.getNonRevertedTipsetMessageCountStmt: "SELECT COUNT(*) FROM tipset_message WHERE tipset_key_cid = ? AND reverted = 0 AND message_cid IS NOT NULL",
&ps.getNonRevertedTipsetEventCountStmt: "SELECT COUNT(*) FROM event WHERE reverted = 0 AND message_id IN (SELECT message_id FROM tipset_message WHERE tipset_key_cid = ? AND reverted = 0)",
&ps.hasRevertedEventsInTipsetStmt: "SELECT EXISTS(SELECT 1 FROM event WHERE reverted = 1 AND message_id IN (SELECT message_id FROM tipset_message WHERE tipset_key_cid = ?))",
&ps.getNonRevertedMsgInfoStmt: "SELECT tipset_key_cid, height FROM tipset_message WHERE message_cid = ? AND reverted = 0 LIMIT 1",
@akaladarshi Why is the diff so big here? It makes the change hard to review. Can you please fix this?
The only change we should see is the new `getNonRevertedTipsetEventEntriesCountStmt` query.
@aarshkshah1992 I think that's because the `getNonRevertedTipsetEventEntriesCountStmt` variable name is longer than all the other variable names, so `go fmt` re-aligned everything.
Can you please put the `getNonRevertedTipsetEventEntriesCountStmt` statement at the bottom rather than inserting it before `hasRevertedEventsInTipsetStmt`, and re-fmt? I think that will reduce the diff.
@aarshkshah1992 I don't think that will work either: the formatting is based on the longest variable name, otherwise the block would look odd.
Anyway, I have moved it to the end, but the result is the same.
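The alignment behaviour discussed here can be reproduced with the standard `go/format` package. The sketch below is only an illustration with made-up map keys (`countStmt`, `eventStmt`, `eventEntriesCountStmt` are hypothetical, not the real statement names): it formats the same map literal with and without one longer key and reports the column at which the value on an untouched line starts.

```go
package main

import (
	"fmt"
	"go/format"
	"strings"
)

// valueColumn formats src with gofmt rules and returns the column (relative to
// the start of the key) at which the value of the entry with the given key begins.
func valueColumn(src, key string) (int, error) {
	out, err := format.Source([]byte(src))
	if err != nil {
		return -1, err
	}
	for _, line := range strings.Split(string(out), "\n") {
		trimmed := strings.TrimSpace(line)
		if strings.HasPrefix(trimmed, key+":") {
			rest := strings.TrimPrefix(trimmed, key+":")
			padding := len(rest) - len(strings.TrimLeft(rest, " \t"))
			return len(key) + 1 + padding, nil
		}
	}
	return -1, fmt.Errorf("key %s not found", key)
}

// before holds two keys of equal length.
const before = `package p

var stmts = map[string]string{
	"countStmt": "SELECT 1",
	"eventStmt": "SELECT 2",
}
`

// after appends one longer key at the end of the literal.
const after = `package p

var stmts = map[string]string{
	"countStmt": "SELECT 1",
	"eventStmt": "SELECT 2",
	"eventEntriesCountStmt": "SELECT 3",
}
`

func main() {
	key := `"countStmt"`
	b, err := valueColumn(before, key)
	if err != nil {
		panic(err)
	}
	a, err := valueColumn(after, key)
	if err != nil {
		panic(err)
	}
	fmt.Printf("value column before: %d, after: %d\n", b, a)
}
```

Because the values in a run of consecutive key-value lines are aligned together, the untouched `countStmt` line is re-padded even though the new key was appended at the end, which matches the observation that moving the statement did not shrink the diff.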
chain/index/ddls.go
Outdated
@@ -52,37 +52,40 @@ var ddls = []string{
`CREATE INDEX IF NOT EXISTS idx_height ON tipset_message (height)`,

`CREATE INDEX IF NOT EXISTS event_entry_event_id ON event_entry(event_id)`,

`CREATE INDEX IF NOT EXISTS idx_tipset_key_reverted_message_id ON tipset_message (tipset_key_cid, reverted, message_id)`,
@akaladarshi Let's remove this, as complex composite indices like this have caused read-performance issues in the past. We already have the table indexed by `tipset_key_cid`, `message_id` and that should be enough.
I just added it because in many places we are fetching based on these three columns, especially `tipset_key_cid` and `reverted`.
Anyway, I will remove it.
@akaladarshi What you're saying here makes sense.
But the query plan looks okay without this index, and there will only be so many entries that have different values of `reverted` for the same `tipset_key_cid`:
sqlite3 chainindex.db "EXPLAIN QUERY PLAN SELECT COUNT(ee.event_id) AS entry_count
FROM event_entry ee
JOIN event e ON ee.event_id = e.event_id
JOIN tipset_message tm ON e.message_id = tm.message_id
WHERE tm.tipset_key_cid = ?
AND tm.reverted = 0"
QUERY PLAN
|--SEARCH tm USING INDEX idx_tipset_key_cid (tipset_key_cid=?)
|--SEARCH e USING COVERING INDEX idx_event_message_id (message_id=?)
`--SEARCH ee USING COVERING INDEX event_entry_event_id (event_id=?)
Makes sense. Thanks for clarifying 🙏.
Also, I have removed the index.
chain/index/ddls.go
Outdated
&ps.countTipsetsAtHeightStmt: "SELECT COUNT(CASE WHEN reverted = 1 THEN 1 END) AS reverted_count, COUNT(CASE WHEN reverted = 0 THEN 1 END) AS non_reverted_count FROM (SELECT tipset_key_cid, MAX(reverted) AS reverted FROM tipset_message WHERE height = ? GROUP BY tipset_key_cid) AS unique_tipsets",
&ps.getNonRevertedTipsetMessageCountStmt: "SELECT COUNT(*) FROM tipset_message WHERE tipset_key_cid = ? AND reverted = 0 AND message_cid IS NOT NULL",
&ps.getNonRevertedTipsetEventCountStmt: "SELECT COUNT(*) FROM event WHERE reverted = 0 AND message_id IN (SELECT message_id FROM tipset_message WHERE tipset_key_cid = ? AND reverted = 0)",
&ps.getNonRevertedTipsetEventEntriesCountStmt: "SELECT COUNT(ee.event_id) AS event_entry_count FROM tipset_message AS t INNER JOIN event AS ev ON t.message_id = ev.message_id INNER JOIN event_entry AS ee ON ev.event_id = ee.event_id WHERE t.tipset_key_cid = ? AND t.reverted = 0",
@akaladarshi A simpler query to use is:
SELECT COUNT(ee.event_id) AS entry_count
FROM event_entry ee
JOIN event e ON ee.event_id = e.event_id
JOIN tipset_message tm ON e.message_id = tm.message_id
WHERE tm.tipset_key_cid = ?
AND tm.reverted = 0
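To make explicit what this query counts, here is a small in-memory sketch of the same two joins. The struct types and `countEventEntries` are hypothetical and only mirror the columns used in the query; note that, like the query, it filters on `tm.reverted` only and does not look at any `reverted` flag on the event itself.

```go
package main

import "fmt"

// Minimal, hypothetical row types carrying only the columns the query touches.
type tipsetMessage struct {
	messageID    int64
	tipsetKeyCid string
	reverted     bool
}

type event struct {
	eventID   int64
	messageID int64
}

type eventEntry struct {
	eventID int64
}

// countEventEntries mirrors:
//   event_entry JOIN event ON event_id JOIN tipset_message ON message_id
//   WHERE tipset_key_cid = ? AND reverted = 0
func countEventEntries(tms []tipsetMessage, evs []event, ees []eventEntry, tipsetKeyCid string) uint64 {
	// message_id -> number of matching non-reverted tipset_message rows
	msgRows := map[int64]uint64{}
	for _, tm := range tms {
		if tm.tipsetKeyCid == tipsetKeyCid && !tm.reverted {
			msgRows[tm.messageID]++
		}
	}
	// event_id -> number of joined (event, tipset_message) row pairs
	evRows := map[int64]uint64{}
	for _, e := range evs {
		evRows[e.eventID] += msgRows[e.messageID]
	}
	// every event_entry row contributes once per joined pair
	var total uint64
	for _, ee := range ees {
		total += evRows[ee.eventID]
	}
	return total
}

func main() {
	tms := []tipsetMessage{{1, "A", false}, {2, "A", true}, {3, "B", false}}
	evs := []event{{10, 1}, {11, 1}, {12, 2}, {13, 3}}
	ees := []eventEntry{{10}, {10}, {11}, {12}, {12}, {12}, {13}}
	fmt.Println(countEventEntries(tms, evs, ees, "A"))
}
```

Entries that belong to events of the reverted message (`messageID` 2 in the sample data) are excluded, which is the behaviour the `tm.reverted = 0` predicate is there to guarantee.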
Done.
chain/index/README.MD
Outdated
@@ -107,6 +107,8 @@ type IndexValidation struct {
IndexedMessagesCount uint64
// IndexedEventsCount signifies the number of indexed events for the canonical tipset at this epoch.
IndexedEventsCount uint64
// IndexedEventEntriesCount signifies the number of indexed event entries for the canonical tipset at this epoch.
"is the number of"
Can you please fix it in the comment above as well?
Done
@akaladarshi Please can you run
@akaladarshi: if this PR is ready for @aarshkshah1992 to review again, please re-request review and set the status to "awaiting review" to make it clear it's in the reviewer's court. Thanks!
@BigLep I don't have permission to change the status of the PR, although I have asked Aarsh to review it manually.
@BigLep I've changed the status to
chain/index/README.MD
Outdated
IndexedMessagesCount uint64
// IndexedEventsCount signifies the number of indexed events for the canonical tipset at this epoch.
@akaladarshi This file has changed. Can you please rebase this PR and then make the changes again?
Also, can you please update the docs for the `IndexValidation` struct in `node/api/v1api/api_full.go`? You can see the docs I've added there for the `ChainValidateIndex` RPC API.
Done.
e72e1c8 to 6259731
@@ -1241,10 +1241,12 @@ type IndexValidation struct {
TipSetKey TipSetKey
@aarshkshah1992 I think we should remove the `IndexValidation` struct from here (api-v1-unstable-methods.md); it's not readable at all in preview mode.
Removed.
163c6db to 6c0b88d
@akaladarshi Thanks for your work here!
4c94b28 into filecoin-project:feat/implement-index-validation-api
This PR is part of #12450. It changes:
- `IndexedEventEntriesCount` in `IndexValidation`
- `ChainValidateIndex` API