# horizon: `history_claimable_balances` is not cleared out by the reaper #4396
We have three tables which aren't cleaned up when deleting a range. Those are:

- `history_accounts`
- `history_claimable_balances`
- `history_liquidity_pools`

They all pair an entity (claimable balance, liquidity pool, account) with an internal id.

In theory, every time we clear out a range of data (either for reingesting or for reaping) we should also remove any orphan entries in the tables above (an orphan entry is one whose internal id isn't referenced anymore). However, the queries required to do so may be too expensive. As I see it we could either:
I think both (1) and (2) may be too expensive for large enough tables.
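For reference, this kind of cleanup boils down to an anti-join between the lookup table and every table that references it. A minimal sketch for claimable balances, with table and column names assumed from Horizon's schema rather than taken from any actual implementation:

```go
// Hypothetical sketch: remove lookup rows that no other table references.
// The DELETE has to prove a negative for every row, so it walks the
// referencing tables (or their indexes) in full, which is what makes it
// expensive on a large database.
package main

import (
	"database/sql"

	_ "github.com/lib/pq" // assumed Postgres driver
)

func deleteOrphanedClaimableBalances(db *sql.DB) (int64, error) {
	res, err := db.Exec(`
		DELETE FROM history_claimable_balances hcb
		WHERE NOT EXISTS (
			SELECT 1 FROM history_operation_claimable_balances ocb
			WHERE ocb.history_claimable_balance_id = hcb.id
		)
		AND NOT EXISTS (
			SELECT 1 FROM history_transaction_claimable_balances tcb
			WHERE tcb.history_claimable_balance_id = hcb.id
		)`)
	if err != nil {
		return 0, err
	}
	return res.RowsAffected()
}
```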
Maybe @sydneynotthecity has a better querying suggestion?
Do you know what query cost would be too expensive? Another option is that we could create a trigger that tracked any record deletions in the affected tables. Once the records were cleared out, we could update an indicator in the event tables.
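To make the trigger idea concrete, here is a rough sketch under assumed names: an `AFTER DELETE` trigger on one of the referencing tables records which lookup ids lost a reference, so a later cleanup pass only has to re-check those ids instead of scanning the whole table. None of these objects exist in Horizon; this only illustrates the suggestion above:

```go
// Hypothetical sketch of the deletion-tracking trigger suggested above.
// All table, function and trigger names are made up for illustration.
package main

import (
	"database/sql"

	_ "github.com/lib/pq" // assumed Postgres driver
)

func installDeletionTracking(db *sql.DB) error {
	_, err := db.Exec(`
		CREATE TABLE IF NOT EXISTS claimable_balance_deleted_refs (
			history_claimable_balance_id bigint PRIMARY KEY
		);

		CREATE OR REPLACE FUNCTION track_claimable_balance_deletion()
		RETURNS trigger AS $$
		BEGIN
			-- Remember which lookup id lost a reference; a later cleanup
			-- pass only needs to re-check these ids for orphanhood.
			INSERT INTO claimable_balance_deleted_refs (history_claimable_balance_id)
			VALUES (OLD.history_claimable_balance_id)
			ON CONFLICT DO NOTHING;
			RETURN OLD;
		END;
		$$ LANGUAGE plpgsql;

		CREATE TRIGGER track_claimable_balance_deletion_trigger
		AFTER DELETE ON history_operation_claimable_balances
		FOR EACH ROW EXECUTE FUNCTION track_claimable_balance_deletion();`)
	return err
}
```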
@2opremio the other point is the indices. It appears that the equivalent indices needed for this cleanup haven't been added yet.
Let's wait for the indices to be added and retake this after that.
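For context on why those indices matter: the orphan check has to probe each referencing table for a given lookup id, so without an index on the referencing column every probe degenerates into a sequential scan. A sketch of the kind of index involved, with index and column names assumed rather than taken from the actual migration:

```go
// Hypothetical sketch: index the columns that point at the lookup table so
// the NOT EXISTS probes can use an index scan. Names are assumptions.
package main

import (
	"database/sql"

	_ "github.com/lib/pq" // assumed Postgres driver
)

func createLookupIndexes(db *sql.DB) error {
	for _, stmt := range []string{
		`CREATE INDEX IF NOT EXISTS idx_ocb_on_claimable_balance_id
		     ON history_operation_claimable_balances (history_claimable_balance_id)`,
		`CREATE INDEX IF NOT EXISTS idx_tcb_on_claimable_balance_id
		     ON history_transaction_claimable_balances (history_claimable_balance_id)`,
	} {
		if _, err := db.Exec(stmt); err != nil {
			return err
		}
	}
	return nil
}
```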
(#4518) While Horizon removes history data when the `--history-retention-count` flag is set, it doesn't clear the lookup historical tables. Lookup tables are `[id, key name]` pairs that allow setting pointers to keys in historical tables, thus saving disk space. This data can occupy a vast amount of space on disk and is never used once the old historical data is deleted.

This commit adds code responsible for clearing orphaned rows in lookup historical tables. Orphaned rows can appear when old data is removed by the reaper. The new code is separate from the existing reaper code (see "Alternative solutions" below) and activates after each ledger if there are no more ledgers to ingest in the backend. This has two advantages: it does not slow down catchup, and it runs only when ingestion is idle, so it shouldn't affect ingestion at all. To ensure performance is not affected, the `ReapLookupTables` method is called with a context with a 5-second timeout, which means that if it does not finish the work in the specified time it is simply cancelled. The solution here requires the new indexes added in c2d52f0 (without them, finding the rows to delete is slow).

For each lookup table, we check the number of occurrences of a given lookup ID in all the tables in which the lookup table is used. If no occurrences are found, the row is removed from the lookup table. Rows are removed in batches of 10000 rows (this can be modified in the future). The cursor is updated when a table is processed, so after the next ledger is ingested the next chunk of rows is checked. When the cursor reaches the end of the table it is reset back to 0. This ensures that all the orphaned rows are removed eventually (some rows can be skipped, because new rows are added to lookup tables by ingestion and some are removed by the reaper, so the `offset` does not always skip to the place it should to cover the entire table).

#### Alternative solutions

While working on this I tried to implement @fons's idea from #4396, which was removing the rows that are not present in other ranges before clearing historical data. There is a general problem with this solution: the lookup tables are actively used by ingestion, which means that if rows are deleted while ingestion reads a given row it can create inconsistent data. We could modify the reaper to acquire the ingestion lock, but if there are many ledgers to remove it can affect ingestion. We could also write a query that finds and removes all the orphaned rows, but it's too slow to be executed between the ingestion of two consecutive ledgers.
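A simplified, hypothetical sketch of that mechanism for a single lookup table, using made-up function and column names rather than the actual `ReapLookupTables` code: each call works under a 5-second context timeout, scans one batch of ids starting at the cursor, deletes the unreferenced rows, and returns the cursor for the next call, resetting to 0 at the end of the table:

```go
// Hypothetical sketch of batched lookup-table reaping for claimable balances.
// Identifiers are illustrative; this is not the Horizon implementation.
package main

import (
	"context"
	"database/sql"
	"time"

	_ "github.com/lib/pq" // assumed Postgres driver
)

const reapBatchSize = 10000

// reapClaimableBalanceLookup scans one batch of history_claimable_balances
// starting after cursor, deletes rows that no other table references, and
// returns the cursor for the next call (0 once the end of the table is
// reached, so the scan wraps around).
func reapClaimableBalanceLookup(ctx context.Context, db *sql.DB, cursor int64) (int64, error) {
	// Bound the work so a slow batch cannot delay ingestion of the next
	// ledger: cancel everything after 5 seconds.
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()

	// Upper bound of this batch, computed before deleting anything.
	var upper sql.NullInt64
	err := db.QueryRowContext(ctx, `
		SELECT max(id) FROM (
			SELECT id FROM history_claimable_balances
			WHERE id > $1 ORDER BY id LIMIT $2
		) batch`, cursor, reapBatchSize).Scan(&upper)
	if err != nil {
		return 0, err
	}
	if !upper.Valid {
		return 0, nil // past the end of the table: reset the cursor
	}

	// Delete the rows in (cursor, upper] that have no occurrences in the
	// tables referencing the lookup table (column names are assumptions).
	_, err = db.ExecContext(ctx, `
		DELETE FROM history_claimable_balances hcb
		WHERE hcb.id > $1 AND hcb.id <= $2
		AND NOT EXISTS (
			SELECT 1 FROM history_operation_claimable_balances ocb
			WHERE ocb.history_claimable_balance_id = hcb.id
		)
		AND NOT EXISTS (
			SELECT 1 FROM history_transaction_claimable_balances tcb
			WHERE tcb.history_claimable_balance_id = hcb.id
		)`, cursor, upper.Int64)
	if err != nil {
		return 0, err
	}
	return upper.Int64, nil
}
```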
#4518 clears orphans from the lookup tables. We should:

- make sure `history_claimable_balances` is cleared out as well, and
- make sure all `history_*` tables in the DB are cleared out (in order to prevent this problem from happening to tables added in the future).