Skip to content

Commit

Permalink
perf: Amortize clearing unsorted cache entries (Juno genesis fix) (#1…
Browse files Browse the repository at this point in the history
…2885)

This change fixes a bounty by the Juno team.  Juno's invariant checks took 10 hours during their most recent chain halt. This PR cuts that down to 30 seconds.  See https://github.com/CosmosContracts/bounties#improve-speed-of-invariant-checks.

The root problem is deep in the `can-withdraw` invariant check, which calls this repeatedly: https://github.com/cosmos/cosmos-sdk/blob/main/x/distribution/keeper/store.go#L337.  Iterators have a chain of parents and in this case creates an iterator from the `cachekv` store.  For the genesis file, it has a cache of 500,000+ unsorted entries, which are sorted as strings here: https://github.com/cosmos/cosmos-sdk/blob/main/store/cachekv/store.go#L314.  Each delegation from `can-withdraw` uses this cache and many of the cache checks miss or are a very small range.  This means very few entries get removed from the unsorted cache and they have to be re-sorted on the next call.  With a full cache it takes about 180ms on my machine to sort them.

This change introduce a minimum number of entries that will get processed and removed from the unsorted list. It's set at the same value that directs the code to sort them in the first place.  This ensures the unsorted values get removed in a relative short amount of time, and amortizes the cost to ensure an individual check does not have to process the entire cache.

## Benchmarks
On running the benchmarks included in this change produces:
```shell
name                    old time/op    new time/op    delta
LargeUnsortedMisses-32     21.2s ± 9%      0.0s ± 1%   -99.91%  (p=0.000 n=20+17)

name                    old alloc/op   new alloc/op   delta
LargeUnsortedMisses-32    1.64GB ± 0%    0.00GB ± 0%   -99.83%  (p=0.000 n=19+19)

name                    old allocs/op  new allocs/op  delta
LargeUnsortedMisses-32     20.0k ± 0%     41.1k ± 0%  +105.23%  (p=0.000 n=19+20)
```

## Invariant checks results
This is what the invariant checks for Juno look like with this change (on a Hetzner AX101):

```shell
INF starting node with ABCI Tendermint in-process
4:11PM INF Starting multiAppConn service impl=multiAppConn module=proxy
4:11PM INF Starting localClient service connection=query impl=localClient module=abci-client
4:11PM INF Starting localClient service connection=snapshot impl=localClient module=abci-client
4:11PM INF Starting localClient service connection=mempool impl=localClient module=abci-client
4:11PM INF Starting localClient service connection=consensus impl=localClient module=abci-client
4:11PM INF Starting EventBus service impl=EventBus module=events
4:11PM INF Starting PubSub service impl=PubSub module=pubsub
4:11PM INF Starting IndexerService service impl=IndexerService module=txindex
4:11PM INF ABCI Handshake App Info hash= height=0 module=consensus protocol-version=0 software-version=v9.0.0-36-g8fd6f16
4:11PM INF ABCI Replay Blocks appHeight=0 module=consensus stateHeight=0 storeHeight=0
4:12PM INF asserting crisis invariants inv=1/11 module=x/crisis name=gov/module-account
4:12PM INF asserting crisis invariants inv=2/11 module=x/crisis name=distribution/nonnegative-outstanding
4:12PM INF asserting crisis invariants inv=3/11 module=x/crisis name=distribution/can-withdraw
4:12PM INF asserting crisis invariants inv=4/11 module=x/crisis name=distribution/reference-count
4:12PM INF asserting crisis invariants inv=5/11 module=x/crisis name=distribution/module-account
4:12PM INF asserting crisis invariants inv=6/11 module=x/crisis name=bank/nonnegative-outstanding
4:12PM INF asserting crisis invariants inv=7/11 module=x/crisis name=bank/total-supply
4:12PM INF asserting crisis invariants inv=8/11 module=x/crisis name=staking/module-accounts
4:12PM INF asserting crisis invariants inv=9/11 module=x/crisis name=staking/nonnegative-power
4:12PM INF asserting crisis invariants inv=10/11 module=x/crisis name=staking/positive-delegation
4:12PM INF asserting crisis invariants inv=11/11 module=x/crisis name=staking/delegator-shares
4:12PM INF asserted all invariants duration=28383.559601 height=4136532 module=x/crisis
```

## Alternatives
There is another PR which fixes this problem for the Juno genesis file #12886. However, because of its concurrent nature, it happens to hit a large range relatively early, clearing the unsorted entries and allowing the rest of the checks to not sort it.

(cherry picked from commit 4fc1f73)

# Conflicts:
#	CHANGELOG.md
  • Loading branch information
blazeroni authored and mergify[bot] committed Aug 18, 2022
1 parent 09321d7 commit ffd7b32
Show file tree
Hide file tree
Showing 3 changed files with 159 additions and 1 deletion.
101 changes: 101 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,108 @@ Ref: https://keepachangelog.com/en/1.0.0/

## Unreleased

<<<<<<< HEAD
## v0.45.7 - 2022-08-04
=======
### Features

* (x/authz) [#12648](https://github.com/cosmos/cosmos-sdk/pull/12648) Add an allow list, an optional list of addresses allowed to receive bank assests via authz MsgSend grant.
* (cli) [#12028](https://github.com/cosmos/cosmos-sdk/pull/12028) Add the `tendermint key-migrate` to perform Tendermint v0.35 DB key migration.
* (sdk.Coins) [#12627](https://github.com/cosmos/cosmos-sdk/pull/12627) Make a Denoms method on sdk.Coins.

### Improvements

* [#12352](https://github.com/cosmos/cosmos-sdk/pull/12352) Move the `RegisterSwaggerAPI` logic into a separate helper function in the server package.
* [#12876](https://github.com/cosmos/cosmos-sdk/pull/12876) Remove proposer-based rewards.
* [#12892](https://github.com/cosmos/cosmos-sdk/pull/12892) `make format` now runs only gofumpt and golangci-lint run ./... --fix, replacing `goimports` `gofmt` and `misspell`
* [#12846](https://github.com/cosmos/cosmos-sdk/pull/12846) Remove `RandomizedParams` from the `AppModuleSimulation` interface which is no longer needed.
* (events) [#12850](https://github.com/cosmos/cosmos-sdk/pull/12850) Add a new `fee_payer` attribute to the `tx` event that is emitted from the `DeductFeeDecorator` AnteHandler decorator.
* (ci) [#12854](https://github.com/cosmos/cosmos-sdk/pull/12854) Use ghcr.io to host the proto builder image. Update proto builder image to go 1.19
* (x/bank) [#12706](https://github.com/cosmos/cosmos-sdk/pull/12706) Added the `chain-id` flag to the `AddTxFlagsToCmd` API. There is no longer a need to explicitly register this flag on commands whens `AddTxFlagsToCmd` is already called.
* [#12791](https://github.com/cosmos/cosmos-sdk/pull/12791) Bump the math library used in the sdk and replace old usages of sdk.*
* (x/params) [#12615](https://github.com/cosmos/cosmos-sdk/pull/12615) Add `GetParamSetIfExists` function to params `Subspace` to prevent panics on breaking changes.
* [#12717](https://github.com/cosmos/cosmos-sdk/pull/12717) Use injected encoding params in simapp.
* (x/bank) [#12674](https://github.com/cosmos/cosmos-sdk/pull/12674) Add convenience function `CreatePrefixedAccountStoreKey()` to construct key to access account's balance for a given denom.
* [#12702](https://github.com/cosmos/cosmos-sdk/pull/12702) Linting and tidiness, fixed two minor security warnings.
* [#12634](https://github.com/cosmos/cosmos-sdk/pull/12634) Move `sdk.Dec` to math package.
* [#12596](https://github.com/cosmos/cosmos-sdk/pull/12596) Remove all imports of the non-existent gogo/protobuf v1.3.3 to ease downstream use and go workspaces.
* [#12187](https://github.com/cosmos/cosmos-sdk/pull/12187) Add batch operation for x/nft module.
* [#12693](https://github.com/cosmos/cosmos-sdk/pull/12693) Make sure the order of each node is consistent when emitting proto events.
* [#12455](https://github.com/cosmos/cosmos-sdk/pull/12455) Show attempts count in error for signing.
* [#12886](https://github.com/cosmos/cosmos-sdk/pull/12886) Amortize cost of processing cache KV store

### State Machine Breaking

* (x/bank) [#12610](https://github.com/cosmos/cosmos-sdk/pull/12610) `MsgMultiSend` now allows only a single input.
* (x/bank) [#12630](https://github.com/cosmos/cosmos-sdk/pull/12630) Migrate `x/bank` to self-managed parameters and deprecate its usage of `x/params`.
* (x/auth) [#12475](https://github.com/cosmos/cosmos-sdk/pull/12475) Migrate `x/auth` to self-managed parameters and deprecate its usage of `x/params`.
* (x/slashing) [#12399](https://github.com/cosmos/cosmos-sdk/pull/12399) Migrate `x/slashing` to self-managed parameters and deprecate its usage of `x/params`.
* (x/mint) [#12363](https://github.com/cosmos/cosmos-sdk/pull/12363) Migrate `x/mint` to self-managed parameters and deprecate it's usage of `x/params`.
* (x/distribution) [#12434](https://github.com/cosmos/cosmos-sdk/pull/12434) Migrate `x/distribution` to self-managed parameters and deprecate it's usage of `x/params`.
* (x/crisis) [#12445](https://github.com/cosmos/cosmos-sdk/pull/12445) Migrate `x/crisis` to self-managed parameters and deprecate it's usage of `x/params`.
* (x/gov) [#12631](https://github.com/cosmos/cosmos-sdk/pull/12631) Migrate `x/gov` to self-managed parameters and deprecate it's usage of `x/params`.
* (x/staking) [#12409](https://github.com/cosmos/cosmos-sdk/pull/12409) Migrate `x/staking` to self-managed parameters and deprecate it's usage of `x/params`.
* (x/bank) [#11859](https://github.com/cosmos/cosmos-sdk/pull/11859) Move the SendEnabled information out of the Params and into the state store directly.
* (x/gov) [#12771](https://github.com/cosmos/cosmos-sdk/pull/12771) Initial deposit requirement for proposals at submission time.

### API Breaking Changes

* (x/bank) [#12706](https://github.com/cosmos/cosmos-sdk/pull/12706) Removed the `testutil` package from the `x/bank/client` package.
* (simapp) [#12747](https://github.com/cosmos/cosmos-sdk/pull/12747) Remove `simapp.MakeTestEncodingConfig`. Please use `moduletestutil.MakeTestEncodingConfig` (`types/module/testutil`) in tests instead.
* (x/bank) [#12648](https://github.com/cosmos/cosmos-sdk/pull/12648) `NewSendAuthorization` takes a new argument of an optional list of addresses allowed to receive bank assests via authz MsgSend grant. You can pass `nil` for the same behavior as before, i.e. any recipient is allowed.
* (x/bank) [\#12593](https://github.com/cosmos/cosmos-sdk/pull/12593) Add `SpendableCoin` method to `BaseViewKeeper`
* (x/slashing) [#12581](https://github.com/cosmos/cosmos-sdk/pull/12581) Remove `x/slashing` legacy querier.
* (types) [\#12355](https://github.com/cosmos/cosmos-sdk/pull/12355) Remove the compile-time `types.DBbackend` variable. Removes usage of the same in server/util.go
* (x/gov) [#12368](https://github.com/cosmos/cosmos-sdk/pull/12369) Gov keeper is now passed by reference instead of copy to make post-construction mutation of Hooks and Proposal Handlers possible at a framework level.
* (simapp) [#12270](https://github.com/cosmos/cosmos-sdk/pull/12270) Remove `invCheckPeriod uint` attribute from `SimApp` struct as per migration of `x/crisis` to app wiring
* (simapp) [#12334](https://github.com/cosmos/cosmos-sdk/pull/12334) Move `simapp.ConvertAddrsToValAddrs` and `simapp.CreateTestPubKeys ` to respectively `simtestutil.ConvertAddrsToValAddrs` and `simtestutil.CreateTestPubKeys` (`testutil/sims`)
* (simapp) [#12312](https://github.com/cosmos/cosmos-sdk/pull/12312) Move `simapp.EmptyAppOptions` to `simtestutil.EmptyAppOptions` (`testutil/sims`)
* (simapp) [#12312](https://github.com/cosmos/cosmos-sdk/pull/12312) Remove `skipUpgradeHeights map[int64]bool` and `homePath string` from `NewSimApp` constructor as per migration of `x/upgrade` to app-wiring.
* (testutil) [#12278](https://github.com/cosmos/cosmos-sdk/pull/12278) Move all functions from `simapp/helpers` to `testutil/sims`
* (testutil) [#12233](https://github.com/cosmos/cosmos-sdk/pull/12233) Move `simapp.TestAddr` to `simtestutil.TestAddr` (`testutil/sims`)
* (x/staking) [#12102](https://github.com/cosmos/cosmos-sdk/pull/12102) Staking keeper now is passed by reference instead of copy. Keeper's SetHooks no longer returns keeper. It updates the keeper in place instead.
* (linting) [#12141](https://github.com/cosmos/cosmos-sdk/pull/12141) Fix usability related linting for database. This means removing the infix Prefix from `prefix.NewPrefixWriter` and such so that it is `prefix.NewWriter` and making `db.DBConnection` and such into `db.Connection`
* (x/distribution) [#12434](https://github.com/cosmos/cosmos-sdk/pull/12434) `x/distribution` module `SetParams` keeper method definition is now updated to return `error`.
* (x/staking) [#12409](https://github.com/cosmos/cosmos-sdk/pull/12409) `x/staking` module `SetParams` keeper method definition is now updated to return `error`.
* (x/crisis) [#12445](https://github.com/cosmos/cosmos-sdk/pull/12445) `x/crisis` module `SetConstantFee` keeper method definition is now updated to return `error`.
* (x/gov) [#12631](https://github.com/cosmos/cosmos-sdk/pull/12631) `x/gov` module refactored to use `Params` as single struct instead of `DepositParams`, `TallyParams` & `VotingParams`.
* (x/gov) [#12631](https://github.com/cosmos/cosmos-sdk/pull/12631) Migrate `x/gov` to self-managed parameters and deprecate it's usage of `x/params`.
* (x/bank) [#12630](https://github.com/cosmos/cosmos-sdk/pull/12630) `x/bank` module `SetParams` keeper method definition is now updated to return `error`.
* (x/bank) [\#11859](https://github.com/cosmos/cosmos-sdk/pull/11859) Move the SendEnabled information out of the Params and into the state store directly.
The information can now be accessed using the BankKeeper.
Setting can be done using MsgSetSendEnabled as a governance proposal.
A SendEnabled query has been added to both GRPC and CLI.
* (appModule) Remove `Route`, `QuerierRoute` and `LegacyQuerierHandler` from AppModule Interface.
* (x/modules) Remove all LegacyQueries and related code from modules
* (store) [#11825](https://github.com/cosmos/cosmos-sdk/pull/11825) Make extension snapshotter interface safer to use, renamed the util function `WriteExtensionItem` to `WriteExtensionPayload`.

### CLI Breaking Changes

* /

### Bug Fixes

* [#12548](https://github.com/cosmos/cosmos-sdk/pull/12548) Prevent signing from wrong key while using multisig.
* (genutil) [#12140](https://github.com/cosmos/cosmos-sdk/pull/12140) Fix staking's genesis JSON migrate in the `simd migrate v0.46` CLI command.
* (types) [#12154](https://github.com/cosmos/cosmos-sdk/pull/12154) Add `baseAccountGetter` to avoid invalid account error when create vesting account.
* (x/authz) [#12184](https://github.com/cosmos/cosmos-sdk/pull/12184) Fix MsgExec not verifying the validity of nested messages.
* (x/staking) [#12303](https://github.com/cosmos/cosmos-sdk/pull/12303) Use bytes instead of string comparison in delete validator queue
* (x/auth/tx) [#12474](https://github.com/cosmos/cosmos-sdk/pull/12474) Remove condition in GetTxsEvent that disallowed multiple equal signs, which would break event queries with base64 strings (i.e. query by signature).
* (store/rootmulti) [#12487](https://github.com/cosmos/cosmos-sdk/pull/12487) Fix non-deterministic map iteration.
* (x/group) [#12888](https://github.com/cosmos/cosmos-sdk/pull/12888) Fix event propagation to the current context of `x/group` message execution `[]sdk.Result`.
* (sdk/dec_coins) [#12903](https://github.com/cosmos/cosmos-sdk/pull/12903) Fix nil `DecCoin` creation when converting `Coins` to `DecCoins`
* (x/upgrade) [#12906](https://github.com/cosmos/cosmos-sdk/pull/12906) Fix upgrade failure by moving downgrade verification logic after store migration.
* (store) [#12945](https://github.com/cosmos/cosmos-sdk/pull/12945) Fix nil end semantics in store/cachekv/iterator when iterating a dirty cache.

### Deprecated

* (x/bank) [#11859](https://github.com/cosmos/cosmos-sdk/pull/11859) The Params.SendEnabled field is deprecated and unusable.
The information can now be accessed using the BankKeeper.
Setting can be done using MsgSetSendEnabled as a governance proposal.
A SendEnabled query has been added to both GRPC and CLI.

## [v0.46.0](https://github.com/cosmos/cosmos-sdk/releases/tag/v0.46.0) - 2022-07-26
>>>>>>> 4fc1f73c8 (perf: Amortize clearing unsorted cache entries (Juno genesis fix) (#12885))
### Features

Expand Down
43 changes: 43 additions & 0 deletions store/cachekv/search_benchmark_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
package cachekv

import (
db "github.com/tendermint/tm-db"
"strconv"
"testing"
)

func BenchmarkLargeUnsortedMisses(b *testing.B) {
for i := 0; i < b.N; i++ {
b.StopTimer()
store := generateStore()
b.StartTimer()

for k := 0; k < 10000; k++ {
// cache has A + Z values
// these are within range, but match nothing
store.dirtyItems([]byte("B1"), []byte("B2"))
}
}
}

func generateStore() *Store {
cache := map[string]*cValue{}
unsorted := map[string]struct{}{}
for i := 0; i < 5000; i++ {
key := "A" + strconv.Itoa(i)
unsorted[key] = struct{}{}
cache[key] = &cValue{}
}

for i := 0; i < 5000; i++ {
key := "Z" + strconv.Itoa(i)
unsorted[key] = struct{}{}
cache[key] = &cValue{}
}

return &Store{
cache: cache,
unsortedCache: unsorted,
sortedCache: db.NewMemDB(),
}
}
16 changes: 15 additions & 1 deletion store/cachekv/store.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ import (
"github.com/cosmos/cosmos-sdk/store/types"
"github.com/cosmos/cosmos-sdk/telemetry"
"github.com/cosmos/cosmos-sdk/types/kv"
"github.com/tendermint/tendermint/libs/math"
)

// If value is nil but deleted is false, it means the parent doesn't have the
Expand Down Expand Up @@ -275,6 +276,8 @@ const (
stateAlreadySorted
)

const minSortSize = 1024

// Constructs a slice of dirty items, to use w/ memIterator.
func (store *Store) dirtyItems(start, end []byte) {
startStr, endStr := conv.UnsafeBytesToStr(start), conv.UnsafeBytesToStr(end)
Expand All @@ -291,7 +294,7 @@ func (store *Store) dirtyItems(start, end []byte) {
// O(N^2) overhead.
// Even without that, too many range checks eventually becomes more expensive
// than just not having the cache.
if n < 1024 {
if n < minSortSize {
for key := range store.unsortedCache {
if dbm.IsKeyInDomain(conv.UnsafeStrToBytes(key), start, end) {
cacheValue := store.cache[key]
Expand Down Expand Up @@ -322,6 +325,17 @@ func (store *Store) dirtyItems(start, end []byte) {
startIndex = 0
}

// Since we spent cycles to sort the values, we should process and remove a reasonable amount
// ensure start to end is at least minSortSize in size
// if below minSortSize, expand it to cover additional values
// this amortizes the cost of processing elements across multiple calls
if endIndex-startIndex < minSortSize {
endIndex = math.MinInt(startIndex+minSortSize, len(strL)-1)
if endIndex-startIndex < minSortSize {
startIndex = math.MaxInt(endIndex-minSortSize, 0)
}
}

kvL := make([]*kv.Pair, 0)
for i := startIndex; i <= endIndex; i++ {
key := strL[i]
Expand Down

0 comments on commit ffd7b32

Please sign in to comment.