feat: Store mapping of eth transaction hashes to message cids #9965

geoff-vball · 2023-01-04T16:44:37Z

Related Issues

Proposed Changes

Adds a database to keep a mapping of Eth Transaction Hash to Filecoin Message CID. Originally we were just using a reversible transformation of the Message CID for the hash, but Eth tooling expects a specific method of generation, so we have changed it to match. This applies only to Ethereum messages; all other messages will use the old reversible transformation.
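
A minimal sketch of the lookup described above, not the code in this PR. It assumes the lookup falls back to the reversible transformation when no stored mapping exists. `Lookup`, `ResolveCid`, `hashToCidBytes`, `cidBytesToHash`, and the prefix bytes are hypothetical names and values chosen for illustration, and an in-memory map stands in for the database.

```go
package main

import (
	"bytes"
	"errors"
)

// EthHash is a 32-byte Ethereum-style transaction hash.
type EthHash [32]byte

// assumedCidPrefix is an illustrative assumption: the leading bytes of a
// CIDv1 with the dag-cbor codec and a 32-byte blake2b-256 multihash.
var assumedCidPrefix = []byte{0x01, 0x71, 0xa0, 0xe4, 0x02, 0x20}

// hashToCidBytes sketches the "old" reversible transformation: the hash is
// treated as the digest of the CID, so the CID can be rebuilt from it.
func hashToCidBytes(h EthHash) []byte {
	out := make([]byte, 0, len(assumedCidPrefix)+len(h))
	out = append(out, assumedCidPrefix...)
	return append(out, h[:]...)
}

// cidBytesToHash is the inverse of hashToCidBytes.
func cidBytesToHash(c []byte) (EthHash, error) {
	var h EthHash
	if len(c) != len(assumedCidPrefix)+len(h) || !bytes.HasPrefix(c, assumedCidPrefix) {
		return h, errors.New("not a reversible-transform CID")
	}
	copy(h[:], c[len(assumedCidPrefix):])
	return h, nil
}

// Lookup is a hypothetical in-memory stand-in for the hash-to-CID database.
type Lookup struct {
	byHash map[EthHash][]byte
}

func NewLookup() *Lookup {
	return &Lookup{byHash: make(map[EthHash][]byte)}
}

// Insert records the mapping for an Ethereum message, whose hash cannot be
// derived from the CID.
func (l *Lookup) Insert(h EthHash, cidBytes []byte) {
	l.byHash[h] = cidBytes
}

// ResolveCid prefers a stored mapping and otherwise assumes the hash came
// from the reversible transformation (a non-Ethereum message).
func (l *Lookup) ResolveCid(h EthHash) []byte {
	if c, ok := l.byHash[h]; ok {
		return c
	}
	return hashToCidBytes(h)
}
```

With this shape, only Ethereum messages need a row in the database; every other hash can be turned back into a CID without any stored state.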

I've created a separate database from the events database just for ease of use. If we really want them to be in the same database, we should refactor the database to be more globally accessible.

@arajasek is worried that saving this mapping for all mpool messages is a possible attack vector to spam the node.

I do want to add options for clearing old entries out of the database. I'm open to suggestions on how this might be implemented.

Additional Info

Checklist

Before you mark the PR ready for review, please make sure that:

Commits have a clear commit message.
PR title is in the form of <PR type>: <area>: <change being made>
- example: fix: mempool: Introduce a cache for valid signatures
- PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
- area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps
New features have usage guidelines and / or documentation updates in
- Lotus Documentation
- Discussion Tutorials
Tests exist for new functionality or change in behavior
CI is green

raulk

Great work here! Feel free to ping me if you have any doubts about these comments.

raulk · 2023-01-13T17:22:41Z

api/api_full.go

@@ -778,6 +778,7 @@ type FullNode interface {
 	EthGetBlockByHash(ctx context.Context, blkHash ethtypes.EthHash, fullTxInfo bool) (ethtypes.EthBlock, error)                               //perm:read
 	EthGetBlockByNumber(ctx context.Context, blkNum string, fullTxInfo bool) (ethtypes.EthBlock, error)                                        //perm:read
 	EthGetTransactionByHash(ctx context.Context, txHash *ethtypes.EthHash) (*ethtypes.EthTx, error)                                            //perm:read
+	EthGetTransactionHashByCid(ctx context.Context, cid cid.Cid) (*ethtypes.EthHash, error)                                                    //perm:read


Interesting, this is a Filecoin specific eth_ method but I don't think it matters. Can we separate such Filecoin specific Eth methods to another interface that we embed here?

Could you explain exactly what you mean? Do you want another API in EthAPI on the same level as EthModuleAPI and EthEventAPI?
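
One way to read the suggestion above is plain Go interface embedding. The following is a sketch under that assumption only: `EthStandardAPI`, `EthFilecoinAPI`, `stubNode`, and `hashForCid` are hypothetical names, and the method signatures are simplified to strings so the example stands alone.

```go
package main

import "context"

// EthHash is a 32-byte Ethereum-style transaction hash.
type EthHash [32]byte

// EthStandardAPI (hypothetical) holds methods that mirror standard eth_ calls.
type EthStandardAPI interface {
	EthGetTransactionByHash(ctx context.Context, txHash *EthHash) (string, error)
}

// EthFilecoinAPI (hypothetical) holds the Filecoin-specific eth_ methods.
type EthFilecoinAPI interface {
	EthGetTransactionHashByCid(ctx context.Context, cid string) (*EthHash, error)
}

// FullNode embeds both, so existing callers still see a single interface.
type FullNode interface {
	EthStandardAPI
	EthFilecoinAPI
}

// stubNode is a placeholder implementation used only to show that a single
// type satisfies the combined interface.
type stubNode struct{}

func (stubNode) EthGetTransactionByHash(ctx context.Context, txHash *EthHash) (string, error) {
	return "stub-tx", nil
}

func (stubNode) EthGetTransactionHashByCid(ctx context.Context, cid string) (*EthHash, error) {
	return &EthHash{}, nil
}

// Compile-time check that stubNode implements the embedded interfaces.
var _ FullNode = stubNode{}

// hashForCid depends only on the narrow Filecoin-specific interface.
func hashForCid(api EthFilecoinAPI, cid string) (*EthHash, error) {
	return api.EthGetTransactionHashByCid(context.Background(), cid)
}
```

The benefit of the split is that code needing only the Filecoin-specific methods can accept the narrow interface, while `FullNode` keeps exposing everything.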

chain/ethhash/eth_transaction_hash_lookup.go

raulk · 2023-01-13T20:40:56Z

node/repo/interface.go

@@ -69,6 +69,9 @@ type LockedRepo interface {
 	// SplitstorePath returns the path for the SplitStore
 	SplitstorePath() (string, error)

+	// SqlitePath returns the path for the Sqlite database
+	SqlitePath() (string, error)


I think this is a bit ambiguous, since there's already a sqlite for events. Either we add that path here too, or we qualify this path to say that it's for the EthTxHash mapping.

I do think we should map the events db to the same path. It gives an easy place to find the databases that we don't need to configure.

It also allows us to keep the database inside the LOTUS_PATH, which I wasn't able to do with the setup from the events db (though that might have been my error with the way our dependency injection works).

raulk · 2023-01-13T20:45:28Z

node/impl/full/eth.go

+func (m EthTxHashManager) Apply(ctx context.Context, from, to *types.TipSet) error {
+	for _, blk := range to.Blocks() {
+		_, smsgs, err := m.StateAPI.Chain.MessagesForBlock(ctx, blk)
+		if err != nil {
+			return err
+		}
+
+		for _, smsg := range smsgs {
+			if smsg.Signature.Type != crypto.SigTypeDelegated {
+				continue
+			}
+
+			hash, err := EthTxHashFromSignedFilecoinMessage(ctx, smsg, m.StateAPI)
+			if err != nil {
+				return err
+			}
+
+			err = m.TransactionHashLookup.InsertTxHash(hash, smsg.Cid(), int64(to.Height()))
+			if err != nil {
+				return err
+			}
+		}
+	}
+
+	return nil
+}
+
+type EthTxHashManager struct {
+	StateAPI              StateAPI
+	TransactionHashLookup *ethhashlookup.TransactionHashLookup
+}
+
+func (m EthTxHashManager) Revert(ctx context.Context, from, to *types.TipSet) error {
+	return nil
+}


The edge case where a reorg happens and messages get included in a future epoch is handled by the ON CONFLICT clause, which is great. However, if a message is added in tipset 1A, then we reorg to tipset 1B which does not include the message, and it only lands in tipset 4A, we would've spent 3 tipsets thinking that the message exists and it landed on epoch 1. Is that correct?

Similarly, there's the edge case where the chain reorgs and completely removes a message such that it never again lands on chain.

I'm not sure how important consistency in such scenarios is -- it might be acceptable to have wrong data in the index if the epoch is only used for garbage collection. But we should never return data over the interface that is not truthful!

I think I want to use just raw timestamps instead of epochs here. The column should just be for optionally periodically clearing out the db so users can manage the size of their state. The problem with the current setup is that we don't know when to clear out mpool mappings if they're removed. We don't need the epoch information in this table, we can look it up once we get the cid.

If we're not storing the epoch, I don't think there are any inconsistency issues.

AFAIK once a transaction is successfully added in the mpool (has passed validation), I believe that four logical outcomes are possible:

1. It can land on chain, and is eventually finalized.
2. It can land on chain and then be reverted due to a reorg, "returning" to the mpool.
3. It can be replaced.
4. It can drop.

Are we handling all of these? Wanna make sure that there's no possible issue here with stale or bad data in the cache?

We don't explicitly remove mappings for messages in scenarios 2-4, so we might end up with a few mappings in the database for transactions that never fully land on chain. I don't think this is an issue. I've removed the epoch which could've led to inconsistencies. There's no harm in returning a cid for a message that is no longer relevant - the caller will look up the cid and will be unable to find it in their store.

@geoff-vball are you saying that we have covered the proper use cases? To your point, stale or useless messages are likely harmless to leave around outside of the space it takes up. When it comes to getting a canonical version of the data, I imagine that comes from a snapshot?

Either way, being able to garbage collect that information seems orthogonal to the criticality of how re-orgs are handled.

Are you saying that we are safe against the 4 considered cases? Is there any other blocker?

Yeah, the only situations where you'd be missing something from your DB are:

- It has been garbage collected by the retention policy.
- You had the DB turned off when you synced.

We can add tooling later to rebuild the index if desired.

Just to summarize how it works:

Adds mappings when:
- Tx added to mpool
- Tx processed in block

Removes mappings when:
- An hourly check finds mappings that have been in the database longer than the retention policy.

What is the default retention policy? Curious.

@scotthconner Default sets the database to not garbage collect at all. If they notice their db getting too large they can change the retention policy.
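
The behaviour summarized in this thread can be sketched as follows. This is an illustration under assumptions, not the merged implementation: `TxHashStore`, its methods, the day-based retention setting, and the in-memory map are all hypothetical, and timestamps are plain Unix seconds so the example needs no imports. A retention of 0 stands for the default of never garbage collecting.

```go
package main

const secondsPerDay = 24 * 60 * 60

// entry pairs a message CID with the time the mapping was written.
type entry struct {
	cid        string
	insertedAt int64 // Unix seconds
}

// TxHashStore is a hypothetical in-memory stand-in for the mapping database.
type TxHashStore struct {
	retentionDays int // 0 means never garbage collect (the stated default)
	entries       map[string]entry
}

func NewTxHashStore(retentionDays int) *TxHashStore {
	return &TxHashStore{retentionDays: retentionDays, entries: make(map[string]entry)}
}

// Insert adds or overwrites a mapping. It would be called both when a
// transaction enters the mpool and when it is processed in a block.
// Overwriting refreshes the timestamp, which is an assumption of this sketch.
func (s *TxHashStore) Insert(hash, cid string, nowUnix int64) {
	s.entries[hash] = entry{cid: cid, insertedAt: nowUnix}
}

// Lookup returns the CID for a hash, if a mapping is still present.
func (s *TxHashStore) Lookup(hash string) (string, bool) {
	e, ok := s.entries[hash]
	return e.cid, ok
}

// GC removes mappings older than the retention period and reports how many
// were removed. A periodic caller (hourly, per the thread) would invoke it.
func (s *TxHashStore) GC(nowUnix int64) int {
	if s.retentionDays <= 0 {
		return 0 // retention disabled: keep everything
	}
	cutoff := nowUnix - int64(s.retentionDays)*secondsPerDay
	removed := 0
	for hash, e := range s.entries {
		if e.insertedAt < cutoff {
			delete(s.entries, hash)
			removed++
		}
	}
	return removed
}
```

Because removal is driven only by insertion time, mappings for replaced or dropped transactions need no special handling; they simply age out once a retention period is configured.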

Thinking about this some more, I think the database should mirror the retention policy of messages in the blockstore. The mappings become (relatively) useless if we prune a message from our store. You'll be able to find the cid, but you won't be able to use the cid to look up the message.

You could theoretically use that cid to look up the transaction on filfox or filscan. Do we know if they're planning on implementing lookups by Transaction Hashes?

node/impl/full/eth.go


raulk

Great work here. Some loose ends we need to tie up in subsequent PRs:

Populating the index on snapshot import.
Mpool index management (see threads above).

mur-me · 2023-01-18T13:40:17Z

Howdy!
Glif is here.

Our RPC endpoints are still having issues with the EthGetBlockByHash operation on the Metamask -> RPC path.

I dug deeper, and it looks like something goes wrong on the Lotus side in how it puts the ETH hash into txhash.db.

Example:
0xbc51be0da51511bcf2f9b7fa4915b7fee936d3cf34e4d6ac811f1aefb283fc5d|bafy2bzaceb7qsy5gvx7kygtol5sw7yypomslher73vbp2ssb6rnzrwo6rykku|2023-01-18 13:24:00 - line from db

Just look at the Glif explorer hashes, e.g. bafy2bzaceb7qsy5gvx7kygtol5sw7yypomslher73vbp2ssb6rnzrwo6rykku - https://explorer.glif.io/tx/bafy2bzaceb7qsy5gvx7kygtol5sw7yypomslher73vbp2ssb6rnzrwo6rykku/?network=hyperspace

You can see that there is no 0xbc51be0da51511bcf2f9b7fa4915b7fee936d3cf34e4d6ac811f1aefb283fc5d as stored in the db, but there is 0x7f0963a6adfeac1a6e5f656fe30f7324b3923fdd42fd4a41f45b98d9de8e14aa

Ignore this ^, right now this should be implemented on the explorer's side.

geoff-vball force-pushed the gstuart/eth-hash branch 7 times, most recently from b686a5e to 7618fa6 Compare January 5, 2023 17:42

geoff-vball marked this pull request as ready for review January 5, 2023 18:47

geoff-vball requested a review from a team as a code owner January 5, 2023 18:47

geoff-vball requested review from raulk and arajasek January 5, 2023 18:50

jennijuju linked an issue Jan 6, 2023 that may be closed by this pull request

Transaction Hash Mismatch when deploying contract using Open Zeppelin Upgrades #9839

Closed

geoff-vball force-pushed the gstuart/eth-hash branch 18 times, most recently from 3140076 to 5c3c863 Compare January 12, 2023 17:13

geoff-vball force-pushed the gstuart/eth-hash branch from a461a06 to 5704779 Compare January 13, 2023 17:35

Base automatically changed from feat/nv18-fevm to release/v1.20.0 January 13, 2023 19:11

raulk force-pushed the gstuart/eth-hash branch from 5704779 to c5c19db Compare January 13, 2023 20:04

raulk requested changes Jan 13, 2023

View reviewed changes

Store mapping from hashes for Ethereum transactions to Filecoin Messa…

a843607

…ge Cids

geoff-vball force-pushed the gstuart/eth-hash branch 2 times, most recently from 108f7cc to 47023ec Compare January 16, 2023 06:38

review fixes

f8dee09

geoff-vball force-pushed the gstuart/eth-hash branch 3 times, most recently from b8a14fe to e92f754 Compare January 16, 2023 11:43

Add gc for eth tx database

f8121c8

geoff-vball force-pushed the gstuart/eth-hash branch from e92f754 to f8121c8 Compare January 16, 2023 12:04

Geoff Stuart added 2 commits January 16, 2023 07:08

Remove maybe unnecessary check

6b0f111

Fix test

3b28368

geoff-vball mentioned this pull request Jan 16, 2023

fix: Non-eth transactions can be queried by hash #10026

Closed

7 tasks

Explain config more clearly

72f4250

arajasek mentioned this pull request Jan 17, 2023

Refactor: Unify EthTx to FilecoinMessage methods #10017

Closed

7 tasks

raulk approved these changes Jan 18, 2023

View reviewed changes

jennijuju mentioned this pull request Jan 18, 2023

Populating the index on snapshot import #10044

Closed

3 tasks

jennijuju merged commit fcefc87 into release/v1.20.0 Jan 18, 2023

jennijuju deleted the gstuart/eth-hash branch January 18, 2023 01:06

jennijuju mentioned this pull request Jan 18, 2023

Lotus Release Handoff: v1.20.0-hyperspace-nv19 filecoin-project/testnet-hyperspace#5

Closed

geoff-vball mentioned this pull request Jan 18, 2023

chore: Integrate new bundle, revert accidental ffi #10053

Merged

7 tasks

jimmylee mentioned this pull request Jan 18, 2023

FVM HeatCheck application-research/outercore-eng-kb#23

Closed

jennijuju mentioned this pull request Jan 20, 2023

Ethereum transactions aren't always being treated as EIP 1559 compliant when they are #9983

Closed

18 tasks

This was referenced Jan 30, 2023

[venus] merge nv18 consensus / NV18 代码合并 filecoin-project/venus#5550

Closed

feat: Store mapping of eth transaction hashes to message cids filecoin-project/venus#5681

Merged

jennijuju added this to the Network v18 milestone Feb 4, 2023
