core: lookup txs by block number instead of block hash #19431

Merged
merged 2 commits into from
Apr 25, 2019

Conversation

Matthalp-zz
Contributor

@Matthalp-zz Matthalp-zz commented Apr 10, 2019

Transaction hashes now store a reference to their corresponding
block number as opposed to their hash. In benchmarks this was
shown to reduce storage by over 12 GB.

The main limitation of this approach is that transactions on
non-canonical blocks could never be looked up; however, that is
currently not supported.

The database version has been upgraded to version 5 and the
transaction lookup process is backwards-compatible with the
prior two transaction lookup formats preexisting in the
database instance. Tests have been added to ensure this.
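
To illustrate where the savings described above come from, here is a minimal, self-contained sketch. It assumes the block number is stored as its shortest big-endian byte string via `math/big`; the helper names `encodeBlockNumber` and `decodeBlockNumber` are hypothetical and are not taken from the PR.

```go
package main

import (
	"fmt"
	"math/big"
)

// hashLength mirrors the 32-byte size of a block hash.
const hashLength = 32

// encodeBlockNumber (hypothetical helper) returns the shortest big-endian
// encoding of number. Leading zero bytes are stripped, so block 0 encodes
// to an empty slice.
func encodeBlockNumber(number uint64) []byte {
	return new(big.Int).SetUint64(number).Bytes()
}

// decodeBlockNumber (hypothetical helper) reverses encodeBlockNumber.
func decodeBlockNumber(data []byte) uint64 {
	return new(big.Int).SetBytes(data).Uint64()
}

func main() {
	for _, n := range []uint64{1, 255, 256, 7_500_000} {
		enc := encodeBlockNumber(n)
		fmt.Printf("block %d: %d byte(s) instead of %d, round-trips to %d\n",
			n, len(enc), hashLength, decodeBlockNumber(enc))
	}
}
```

Under this assumption, a lookup value for a block in the millions takes 3 bytes rather than 32, and that difference is paid once per indexed transaction.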

@karalabe
Member

karalabe commented Apr 15, 2019

I've built an image with your last commit (the hacky one) on top of the latest version of your previous PR (currently pending merge) and deployed them on 2 VMs (06/07 (old-pr/old-pr-and-this) for Martin). Will report on the fast sync results in half a day.

@Matthalp-zz
Contributor Author

Thanks @karalabe ! Excited to see if we get any additional savings here.

@karalabe
Member

Sync is not yet done, but your numbers seem to be fairly accurate. PR is about 12GB smaller than without it.

@karalabe
Member

Fast sync done, 12GB saved!

@Matthalp-zz Matthalp-zz force-pushed the optimize-tx-lookup-storage branch from d90ef2b to 3ab452b Compare April 16, 2019 14:33
@Matthalp-zz Matthalp-zz changed the title from "[BENCHMARK] Only store block number for transaction look ups" to "core: lookup txs by block number instead of block hash" Apr 16, 2019
@Matthalp-zz
Contributor Author

@karalabe That's great! I've updated the PR with the actual code and backwards compatibility with the prior transaction lookup formats.

@Matthalp-zz Matthalp-zz force-pushed the optimize-tx-lookup-storage branch 2 times, most recently from 8ae92dd to 48177b5 Compare April 16, 2019 14:37
@Matthalp-zz Matthalp-zz force-pushed the optimize-tx-lookup-storage branch from 48177b5 to 835abc4 Compare April 17, 2019 13:39
@@ -27,28 +29,36 @@ import (

 // ReadTxLookupEntry retrieves the positional metadata associated with a transaction
 // hash to allow retrieving the transaction or receipt by hash.
-func ReadTxLookupEntry(db ethdb.Reader, hash common.Hash) common.Hash {
+func ReadTxLookupEntry(db ethdb.Reader, hash common.Hash) *uint64 {
Contributor

Would it be worthwhile to have two methods, one that returns a uint64 and one that returns a hash? That way, we would sometimes not need to do lookup tx -> hash, read hash to number -> read canon hash (number).

It would also make the code for the callers a bit simpler

Contributor Author

@holiman Could you point me to the code that you think would benefit from this? How performance critical is that code as well?

Contributor

I was thinking about ReadTransaction and ReadReceipt.
They do
ReadTxLookupEntry, ReadCanonicalHash, ReadBody.
If it's an old tx, it would internally be
read a hash, ReadHeaderNumber, ReadCanonicalHash, ReadBody,
which could be
read a hash -> ReadBody.
Maybe it would just overcomplicate things.
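
The three read chains described in this comment can be compared with a toy model. This is only a sketch: the `store` type, its maps, and the `lookup*` methods are hypothetical in-memory stand-ins for the database accessors named above, and each method simply reports how many reads its chain performs.

```go
package main

import "fmt"

// store is a toy in-memory stand-in for the chain database.
type store struct {
	txToNumber   map[string]uint64   // new format: tx hash -> block number
	txToBlock    map[string]string   // old format: tx hash -> block hash
	headerNumber map[string]uint64   // block hash -> block number
	canonical    map[uint64]string   // block number -> canonical block hash
	bodies       map[string][]string // block hash -> tx hashes in the body
}

func newStore() *store {
	return &store{
		txToNumber:   map[string]uint64{"tx1": 7},
		txToBlock:    map[string]string{"tx1": "blockA"},
		headerNumber: map[string]uint64{"blockA": 7},
		canonical:    map[uint64]string{7: "blockA"},
		bodies:       map[string][]string{"blockA": {"tx1"}},
	}
}

// lookupNew: tx -> number -> canonical hash -> body.
func (s *store) lookupNew(tx string) (body []string, reads int) {
	number := s.txToNumber[tx]       // read 1
	blockHash := s.canonical[number] // read 2
	return s.bodies[blockHash], 3    // read 3
}

// lookupLegacy: tx -> block hash -> number -> canonical hash -> body.
func (s *store) lookupLegacy(tx string) (body []string, reads int) {
	blockHash := s.txToBlock[tx]          // read 1
	number := s.headerNumber[blockHash]   // read 2
	canonicalHash := s.canonical[number]  // read 3
	return s.bodies[canonicalHash], 4     // read 4
}

// lookupLegacyDirect is the suggested shortcut: tx -> block hash -> body.
// Note that it skips the canonical-hash step entirely.
func (s *store) lookupLegacyDirect(tx string) (body []string, reads int) {
	blockHash := s.txToBlock[tx]  // read 1
	return s.bodies[blockHash], 2 // read 2
}

func main() {
	s := newStore()
	_, a := s.lookupNew("tx1")
	_, b := s.lookupLegacy("tx1")
	_, c := s.lookupLegacyDirect("tx1")
	fmt.Println("reads (new, legacy, legacy direct):", a, b, c)
}
```

In this model the second method would save two reads for old entries, at the cost of a second code path for callers to choose between.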

Contributor Author

@Matthalp-zz Matthalp-zz Apr 25, 2019

@holiman I agree that legacy databases pay a penalty here. Users who constantly look up transactions may feel the effects of this slowdown. However, I would expect a serious user to resync. We could also offer a CLI tool to do these various database upgrades. @karalabe let me know if this is something that would be valuable to have. It would be done in a follow-up PR.

@karalabe
Member

Apart from my 2 tiny nits, LGTM

@karalabe karalabe added this to the 1.9.0 milestone Apr 25, 2019
Member

@rjl493456442 rjl493456442 left a comment

Just a small question, otherwise LGTM

		return nil
	}
	// Database v6 tx lookup just stores the block number
	if len(data) < common.HashLength {
Member

I am wondering: is this the final form of txLookup? There is no flag at all now for the v6 format.

Member

Not clear what you mean by a flag.

Member

@rjl493456442 rjl493456442 Apr 25, 2019

I mean the "flag" for the blob. For example, if the blob can be decoded into the v3 structure, that is a kind of "flag". Likewise, a blob length of 32 is a kind of flag for v4 and v5.

So probably in the future if we want to change the format again, it can be painful.

But just thinking.

Member

Well, the flag here is < 32 bytes :) But all in all I think we always want to try to decode into the newest version first. People who upgrade to 1.9 will most likely resync, so we might as well use the happy path and not do 3 decodings to get it right.
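
The "length as flag, newest format first" idea discussed in this thread can be sketched as follows. This is a standalone approximation, not the PR's code: `readLookup`, `resolveNumber`, and the two error values are hypothetical names, and the v3 RLP branch is only signalled with an error instead of being decoded.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

const hashLength = 32

var (
	errNotFound = errors.New("no lookup entry")
	errLegacy   = errors.New("legacy v3 entry: needs RLP decoding (not shown)")
)

// resolveNumber is a hypothetical callback standing in for a
// block hash -> block number lookup in the database.
type resolveNumber func(blockHash [hashLength]byte) (uint64, bool)

// readLookup dispatches on the blob length, trying the newest format first.
func readLookup(data []byte, resolve resolveNumber) (uint64, error) {
	switch {
	case len(data) == 0:
		return 0, errNotFound
	case len(data) < hashLength:
		// Newest format: the blob is the block number itself.
		return new(big.Int).SetBytes(data).Uint64(), nil
	case len(data) == hashLength:
		// Previous format: the blob is a block hash that still has to be
		// resolved to a number.
		var h [hashLength]byte
		copy(h[:], data)
		number, ok := resolve(h)
		if !ok {
			return 0, errNotFound
		}
		return number, nil
	default:
		// Oldest format: a longer, structured entry.
		return 0, errLegacy
	}
}

func main() {
	resolve := func([hashLength]byte) (uint64, bool) { return 42, true }
	for _, blob := range [][]byte{{0x72, 0x70, 0xe0}, make([]byte, hashLength), make([]byte, 40)} {
		number, err := readLookup(blob, resolve)
		fmt.Println(len(blob), number, err)
	}
}
```

The concern raised earlier in the thread shows up in the `default` branch: any future format must stay distinguishable from the existing ones by length or structure alone, since the blob carries no explicit version tag.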

Contributor Author

@rjl493456442 My intention was how @karalabe interpreted the code. If either of you feel like this needs to be made more clear and have a suggestion about how to do it please feel free to adjust it or let me know what to do.

@karalabe karalabe merged commit 9374175 into ethereum:master Apr 25, 2019
rjl493456442 pushed a commit to rjl493456442/go-ethereum that referenced this pull request May 6, 2019
* core: lookup txs by block number instead of block hash

Transaction hashes now store a reference to their corresponding
block number as opposed to their hash. In benchmarks this was
shown to reduce storage by over 12 GB.

The main limitation of this approach is that transactions on
non-canonical blocks could never be looked up; however, that is
currently not supported.

The database version has been upgraded to version 5 and the
transaction lookup process is backwards-compatible with the
prior two transaction lookup formats preexisting in the
database instance. Tests have been added to ensure this.

* core/rawdb: tiny review nit fixes
gzliudan added a commit to gzliudan/XDPoSChain that referenced this pull request Jan 23, 2025
gzliudan added a commit to gzliudan/XDPoSChain that referenced this pull request Jan 23, 2025
gzliudan added a commit to gzliudan/XDPoSChain that referenced this pull request Jan 23, 2025
gzliudan added a commit to gzliudan/XDPoSChain that referenced this pull request Jan 24, 2025
5 participants