-
Notifications
You must be signed in to change notification settings - Fork 671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add database analyse-gas-usage command #10240
Conversation
Add a command which will allow to analyse gas usage on the blockchain. The command can be executed by running: ```bash ./neard database analyse-gas-usage ``` The command will look at gas used in all of the blocks, shards and accounts, and print out an analysis report. This commit just adds the command, the implementation will be added in the following commits.
Add a flag that can be used to specify how many latest blocks should be analysed using the analyse-gas-usage command. Now running: ```bash ./neard database analyse-gas-usage --last-blocks 1000 ``` Will go over the last 1000 blocks in the blockchain and analyse them.
For each block we want to gather information about how much gas was used in every shard and account. Gas is used when transactions and receipts are executed, and every such execution produces an ExecutionOutcome, which is saved in the database. We can read all ExecutionOutcomes originating from a given shard from the database, and get all of the needed information from there. Every ExecutionOutcome contains the AccountId of the executor and the amount of gas that was burned during the execution. It's everything that is needed for analysing gas usage. Usage from all blocks is merged into a single instace of GasUsageStats and the interesting pieces of information are displayed for the user.
Add a flag that can be used to specify a range of blocks to be analysed. Now running: ```bash ./neard database analyse-gas-usage --from-block-height 120 --to-block-height 130 ``` Will analyse 11 blocks with heights in range of [120, 130] The logic of converting command line arguments to block iterators has been extracted to the block_iterators module. In the future it'll be possible to reuse it for another command that wants to analyse some subset of blocks.
For each shard, calculate the optimal account that the shard could be split at, so that gas usage on both sides of the split is similar. An exact split isn't always possible, because it's possible for one account to consume 40% of all gas, but the function tries its best to make it fair. Information about optimal splits is displayed for the user.
Find the accounts that consume the most gas and display them in the analysis. This information is interesting, because it might turn out that one account consumes 50% of all gas.
Notes:
|
Example output:
Old (v2) output
Old (buggy) output$ time ./target/release/neard database analyse-gas-usage --last-blocks 43200
2023-11-23T16:39:01.093282Z INFO neard: version="trunk" build="1.1.0-4597-gb7d470924" latest_protocol=64
2023-11-23T16:39:01.093504Z INFO config: Validating Config, extracted from config.json...
2023-11-23T16:39:01.095173Z WARN genesis: Skipped genesis validation
2023-11-23T16:39:01.095199Z INFO config: Validating Genesis config and records. This could take a few minutes...
2023-11-23T16:39:01.095523Z INFO config: All validations have passed!
2023-11-23T16:39:01.095555Z INFO db_opener: Opening NodeStorage path="/home/ubuntu/.near/data" cold_path="none"
2023-11-23T16:39:01.095589Z INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-23T16:39:01.114325Z INFO db: Closed a RocksDB instance. num_instances=0
2023-11-23T16:39:01.114350Z INFO db_opener: The database exists. path=/home/ubuntu/.near/data
2023-11-23T16:39:01.114361Z INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-23T16:39:01.273929Z INFO db: Closed a RocksDB instance. num_instances=0
2023-11-23T16:39:01.273965Z INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-23T16:39:01.291481Z INFO db: Closed a RocksDB instance. num_instances=0
2023-11-23T16:39:01.291506Z INFO db: Opened a new RocksDB instance. num_instances=1
Analysed 43200 blocks between:
Block: height = 106240517, hash = HaqJ1P71gAgHDdMjju72CkZ43uuAS4W3qRVU1LAta4Cq
Block: height = 106196896, hash = AWVbrPJUgy645ZB6eLh3fgZohQ5dQK66RKsh1TrhWBQB
Total gas used: 22407270193906837544
Shard: s0.v1
Gas usage: 4835423212212028560 (21.6% of total)
Number of accounts: 197450
Optimal split:
split_account: auqurbjawabo.users.kaiching
gas(account < split_account): 2417713906472325694 (50.0% of shard)
gas(account >= split_account): 2417709305739702866 (50.0% of shard)
Shard: s1.v1
Gas usage: 435103165008707214 (1.9% of total)
Number of accounts: 1
No optimal split for this shard
Shard: s2.v1
Gas usage: 6984397849429433430 (31.2% of total)
Number of accounts: 118485
Optimal split:
split_account: kkute9tmsit8.users.kaiching
gas(account < split_account): 3492199439799673004 (50.0% of shard)
gas(account >= split_account): 3492198409629760426 (50.0% of shard)
Shard: s3.v1
Gas usage: 10152345967256668340 (45.3% of total)
Number of accounts: 81093
Optimal split:
split_account: zzzdclzbf6li.users.kaiching
gas(account < split_account): 5076173908397008194 (50.0% of shard)
gas(account >= split_account): 5076172058859660146 (50.0% of shard)
10 biggest accounts by gas usage:
#1: token.sweat
Used gas: 2299587291568145490 (10.3% of total)
#2: earn.kaiching
Used gas: 2130683133139892087 (9.5% of total)
#3: wallet.kaiching
Used gas: 612465086458210767 (2.7% of total)
#4: sweat_welcome.near
Used gas: 470514156151885936 (2.1% of total)
#5: v2.ref-finance.near
Used gas: 244591219782634243 (1.1% of total)
#6: asset-manager.orderly-network.near
Used gas: 226419584256414264 (1.0% of total)
#7: aurora
Used gas: 217558664457482245 (1.0% of total)
#8: app.nearcrowd.near
Used gas: 207878414604109841 (0.9% of total)
#9: spin.sweat
Used gas: 174138779272472700 (0.8% of total)
#10: jars.sweat
Used gas: 128436775874128340 (0.6% of total)
2023-11-23T16:46:26.809965Z INFO db: Closed a RocksDB instance. num_instances=0
real 7m25.741s
user 5m22.621s
sys 1m4.785s |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #10240 +/- ##
==========================================
- Coverage 71.87% 71.77% -0.10%
==========================================
Files 707 711 +4
Lines 141791 142942 +1151
Branches 141791 142942 +1151
==========================================
+ Hits 101907 102596 +689
- Misses 35171 35613 +442
- Partials 4713 4733 +20
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
It'd be nice to test this tool, but for that I would have to mock |
That's fine for tooling in a subcommand. In the usual neard flow we avoid unwraps however and propagate errors up. Personally I also like to log the error at the place where it happens to make it clear what's the place where it occured. |
That's fine, we can run it offline. |
Can you display gas in some nicer format e.g. tera gas? Funny how all the split accounts are belong to kaiching.
That's a bit surprising, it's really close to the end of the range of accounts - tripple 'z' really gets there. It would be good to do a follow up PR to check how well will the current candidate split the shard. Let me review this one first though. |
Don't worry about testing it, we don't usually write tests for the tools unless we intend them to be used in production. This is just a one off analysis. We can add tests if and when we upgrade it to production code. |
Can you make that state per shard? 1% global usage may be way more significant when narrowed down to a shard. Perhaps just print the top 5 per shard. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks, looks good, but can you double check the results please? The 'zzz...' split account in shard 3 is sus.
type BigGas = u128; | ||
|
||
#[derive(Parser)] | ||
pub(crate) struct AnalyseGasUsageCommand { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mini nit: I think there is a neat way to make the command an enum or maybe just have an enum field inside of it. It's fine as is but if you have some spare time you can check it out and see if it would make sense here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked around and I couldn't find a clean way to do this using enums.
There is an open issue in clap
for creating mutually exclusive sets of arguments, but it isn't ready yet: clap-rs/clap#2621
It'd be possible to make two subcommands: last-blocks
and block-range
, but that doesn't feel right to me. I want a command analyse-gas
along with arguments that specify which blocks it should analyse.
I don't see a better way to do this, I think we can leave it as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found an example in our codebase and it is indeed using a subcommand. If you're curious you can have a look at the StateRootSelector. I don't remember how does it end up working in the cli. Not a big deal, feel free to ignore.
let mut near_config = | ||
nearcore::config::load_config(home, near_chain_configs::GenesisValidationMode::Full) | ||
.unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would import those, not reason to suffer the long names
// The outcome of each transaction and receipt executed in this chunk is saved in the database as an ExecutionOutcome. | ||
// Go through all ExecutionOutcomes from this chunk and record the gas usage. | ||
let outcome_ids: Vec<CryptoHash> = | ||
chain_store.get_outcomes_by_block_hash_and_shard_id(block.hash(), shard_id).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems alright but I'll double check that as I'm also not sure
Setting the target to 0% seems reasonable enough to me, but I'm not sure I have codecov permissions to make any changes here. Can you please file an issue for this? |
|
Nice catch! Yeah I screwed something up. I added code that manually counts gas on both sides of the split, and it doesn't look right at all:
Debugging.... |
GasUsageInShard::merge() merges results from two instances of GasUsageInShard into one. The merge is performed by adding information about each account from the other shard. Then used_gas_total is updated as well. However this is the wrong thing to do. Calling `add_used_gas` already increases `used_gas_total`, so doing it again doubles the total gas, which is incorrect. Fix the bug by removing the line that doubles the gas.
The field GasUsageInShard::used_gas_total keeps the total amount of gas used in a shard. The problem is that keeping it in sync with the rest of the struct requires special care and is prone to bugs. A mistake in this logic has already caused one bug in calculating shard splits. It's safer to get rid of the field and just calculate the total amount of used gas when needed. It doesn't take that long to calculate this value from scratch.
The previous implementation of calculate_split() worked okayish, but it had some faults. The first fault was that it split the shard into two halves where the left one consistend of accounts smaller than the boundary account. This doesn't match the logic that NEAR uses for boundary accounts. In NEAR the left half includes the boundary account, i.e it's <= boundary_acc, not < boundary_acc. The new function uses the same division as NEAR: (left <= split_account), (right > split_account). The second fault was that the old function looked for a split where gas_left >= gas_right. It isn't easy to prove that this is the optimal one - why is the left half supposed to be bigger? Let's implement it in a way that is easier to understand. The new calculate_split() looks for a split that minimizes the difference between the two halves. This makes sense, we want the halves to be as similar as possible. The commit also adds unit tests to make sure that calculate_split() works correctly.
The bug was in Fixed in 2919011. Eventually I decided to remove |
v2:
I updated the example output to show how the changes look like: #10240 (comment) |
I won't be able to review today, sorry. |
&
I don't think this is how it works but I'll double check. Looking at the current shard layout here shard 1 should contains the single account "aurora" which means that the boundary is the first account of the shard, not the last. For a new boundary aka the split account we want |
Can you make that % of the original shard - without split? The goal is to split the shard as evenly as possible so we want to measure everything relative to the original shard. |
Thanks, it looks much better now! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM and big thanks!
Just address the final comments before merging please.
Damn, you're right. ShardLayout::V1(ShardLayoutV1 { boundary_accounts, .. }) => {
// Note: As we scale up the number of shards we can consider
// changing this method to do a binary search rather than linear
// scan. For the time being, with only 4 shards, this is perfectly fine.
let mut shard_id: ShardId = 0;
for boundary_account in boundary_accounts {
if boundary_account.cmp(account_id) == Greater {
break;
}
shard_id += 1;
}
shard_id
} It would be much more readable if it was written using the standard let mut shard_id: ShardId = 0;
for boundary_account in boundary_accounts {
if account_id < boundary_account {
break;
}
shard_id += 1;
} Thanks! I will fix it. |
When splitting a shard into two halves the shard is divided into two on the split_account (boundary_account). The split should look like this: left < boundary_account right >= boundary_account Previously I misunderstood the semantics and implemented the split as if it was left <= boundary and right > boundary. Fix it.
Previously gas usage of some accounts was displayed as percentage of shard split half, but this was hard to reason about. Let's change it to a percentage of total gas usage on a shard.
v3:
Updated the example output to show how the latest version looks like. |
v4:
|
The following code is currently used for comparing `boundary_account` and `account_id`: ```rust if boundary_account.cmp(account_id) == Greater { ... } ``` IMO it's a bit confusing - it isn't immediately obvious whether this means `boundary_account` > `account_id` or `account_id` > `boundary_account`. I misread this line and because of that I made a mistake in near#10240. Let's change it to something that is easier to read: ```rust if account_id < boundary_account { ... } ```
The following code is currently used for comparing `boundary_account` and `account_id`: ```rust if boundary_account.cmp(account_id) == Greater { ... } ``` IMO it's a bit confusing - it isn't immediately obvious whether this means `boundary_account` > `account_id` or `account_id` > `boundary_account`. I misread this line and because of that I made a mistake in near#10240. Let's change it to something that is easier to read: ```rust if account_id < boundary_account { ... } ```
I guess VSCode froze and didn't run format on save :C
…#10268) The following code is currently used for comparing `boundary_account` and `account_id`: ```rust if boundary_account.cmp(account_id) == Greater { ... } ``` IMO it's a bit confusing - it isn't immediately obvious whether this means `boundary_account` > `account_id` or `account_id` > `boundary_account`. I misread this line and because of that I made a mistake in #10240. Let's change it to something that is easier to read: ```rust if account_id < boundary_account { ... } ```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks! The numbers are really interesting!
Add a command which allows to analyse gas usage on some part of the blockchain.
The command goes over the specified blocks and gathers information about how much
gas was used by each block, shard and account.
Then it displays an analysis of all the data.
The analysis contains:
Help page:
Example usage: