-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inspect and fix a slow pattern in InitGenesis of certain modules #7766
Comments
I've done some investigation of this issue and come closer to isolating the problem. First I added logging and determined virtually the entire start up time is spent initializing the bank module. I'll note that both the auth and bank modules perform O(n) operations on the store where n is number of accounts. Profiling SetAccount and SetBalance though have given inconclusive results. https://github.com/cosmos/cosmos-sdk/blob/master/x/auth/keeper/account.go#L50-L60 SetAccount appears to have an execution time of 3-5 microseconds. The execution time of SetBalance on 0.41.* takes 3-50 microseconds. Now I observe that Robert refactored this code path in #8473 but I don't see anything that I would expect to change performance. This looks firmly like an IAVL performance issue to me. |
I'll add an observation: while this is happening, we're barely touching the CPU. It did not look like a single-thread bottleneck. |
Thank you for this report @robert-zaremba, thank you @zmanian and @faddat for the input! I just encountered this while creating benchmarks with 100K accounts and thanks to @ethanfrey for pointing me to a similar issue. Preliminary investigationGiven this code func BenchmarkOneBankMultiSendTxPerBlock100K(b *testing.B) {
exp, xerr := exporter.New(context.Background(), exporter.WithInsecureAddress(":8444"))
if xerr != nil {
panic(xerr)
}
defer exp.Stop()
startTime := time.Now()
b.ReportAllocs()
db, err := dbm.NewGoLevelDB("onebankmultisend100k", ".")
if err != nil {
b.Fatal(err)
}
defer db.Close()
// 1. Generate n validators.
valSet, _ := tmtypes.RandValidatorSet(100, 1e6)
println("generated validatorset", time.Since(startTime).String())
// 3. Generate balances and accounts.
addrs, _ := makeRandomAddressesAndPublicKeys(100000)
println("generated random addresses and publickeys", time.Since(startTime).String())
genTokens := sdk.TokensFromConsensusPower(67)
genCoin := sdk.NewCoin("benchcoin", genTokens)
var balances []banktypes.Balance
var accounts []types.GenesisAccount
coins := sdk.Coins{genCoin}
for i, addr := range addrs {
addrStr := addr.String()
balances = append(balances, banktypes.Balance{
Address: addrStr,
Coins: coins,
})
accounts = append(accounts, &authtypes.BaseAccount{
Address: addrStr,
AccountNumber: uint64(i + 1),
})
}
println("generated balances and accounts", time.Since(startTime).String())
app := simapp.SetupWithGenesisValSetAndDB(b, db, valSet, accounts, balances...)
println("finished generating the app", time.Since(startTime).String())
benchmarkOneBankMultiSendTxPerBlock(b, app)
}
func makeRandomAddressesAndPublicKeys(n int) (accL []sdk.AccAddress, pkL []*ed25519.PubKey) {
for i := 0; i < n; i++ {
pk := ed25519.GenPrivKey().PubKey().(*ed25519.PubKey)
pkL = append(pkL, pk)
accL = append(accL, sdk.AccAddress(pk.Address()))
}
return accL, pkL
} TL;DR:
Deep diveI have some continuous profiling internally at my company and it collects profiles for 2 minutes periodically, and indeed InitGenesis is taking forever and the culprits so far are: Total: 41.48s
ROUTINE ======================== github.com/cosmos/cosmos-sdk/types/module.(*Manager).InitGenesis in /Users/emmanuelodeke/go/src/github.com/cosmos/cosmos-sdk/types/module/module.go
0 16.32s (flat, cum) 39.34% of Total
. . 305: continue
. . 306: }
. . 307:
. . 308: t0 := time.Now()
. . 309: println("starting initGeneis for ", moduleName, time.Since(startTime).String())
. 16.32s 310: moduleValUpdates := m.Modules[moduleName].InitGenesis(ctx, cdc, genesisData[moduleName])
. . 311: println("finished initGeneis for ", moduleName, time.Since(startTime).String(), "spent", time.Since(t0).String())
. . 312:
. . 313: // use these validator updates if provided, the module manager assumes
. . 314: // only one module will update the validator set
. . 315: if len(moduleValUpdates) > 0 {
ROUTINE ======================== github.com/cosmos/cosmos-sdk/x/auth.AppModule.InitGenesis in /Users/emmanuelodeke/go/src/github.com/cosmos/cosmos-sdk/x/auth/module.go
0 1.96s (flat, cum) 4.73% of Total
. . 133:
. . 134:// InitGenesis performs genesis initialization for the auth module. It returns
. . 135:// no validator updates.
. . 136:func (am AppModule) InitGenesis(ctx sdk.Context, cdc codec.JSONMarshaler, data json.RawMessage) []abci.ValidatorUpdate {
. . 137: var genesisState types.GenesisState
. 1.40s 138: cdc.MustUnmarshalJSON(data, &genesisState)
. 560ms 139: InitGenesis(ctx, am.accountKeeper, genesisState)
. . 140: return []abci.ValidatorUpdate{}
. . 141:}
. . 142:
. . 143:// ExportGenesis returns the exported genesis state as raw bytes for the auth
. . 144:// module.
ROUTINE ======================== github.com/cosmos/cosmos-sdk/x/auth.InitGenesis in /Users/emmanuelodeke/go/src/github.com/cosmos/cosmos-sdk/x/auth/genesis.go
0 560ms (flat, cum) 1.35% of Total
. . 11:// CONTRACT: old coins from the FeeCollectionKeeper need to be transferred through
. . 12:// a genesis port script to the new fee collector account
. . 13:func InitGenesis(ctx sdk.Context, ak keeper.AccountKeeper, data types.GenesisState) {
. . 14: ak.SetParams(ctx, data.Params)
. . 15:
. 10ms 16: accounts, err := types.UnpackAccounts(data.Accounts)
. . 17: if err != nil {
. . 18: panic(err)
. . 19: }
. 20ms 20: accounts = types.SanitizeGenesisAccounts(accounts)
. . 21:
. . 22: for _, a := range accounts {
. 70ms 23: acc := ak.NewAccount(ctx, a)
. 460ms 24: ak.SetAccount(ctx, acc)
. . 25: }
. . 26:
. . 27: ak.GetModuleAccount(ctx, types.FeeCollectorName)
. . 28:}
. . 29:
ROUTINE ======================== github.com/cosmos/cosmos-sdk/x/bank.AppModule.InitGenesis in /Users/emmanuelodeke/go/src/github.com/cosmos/cosmos-sdk/x/bank/module.go
0 14.36s (flat, cum) 34.62% of Total
. . 136:// InitGenesis performs genesis initialization for the bank module. It returns
. . 137:// no validator updates.
. . 138:func (am AppModule) InitGenesis(ctx sdk.Context, cdc codec.JSONMarshaler, data json.RawMessage) []abci.ValidatorUpdate {
. . 139: start := time.Now()
. . 140: var genesisState types.GenesisState
. 1.17s 141: cdc.MustUnmarshalJSON(data, &genesisState)
. . 142: telemetry.MeasureSince(start, "InitGenesis", "crisis", "unmarshal")
. . 143:
. 13.19s 144: am.keeper.InitGenesis(ctx, &genesisState)
. . 145: return []abci.ValidatorUpdate{}
. . 146:}
. . 147:
. . 148:// ExportGenesis returns the exported genesis state as raw bytes for the bank
. . 149:// module.
ROUTINE ======================== github.com/cosmos/cosmos-sdk/x/bank/keeper.BaseKeeper.InitGenesis in /Users/emmanuelodeke/go/src/github.com/cosmos/cosmos-sdk/x/bank/keeper/genesis.go
0 13.19s (flat, cum) 31.80% of Total
. . 11:func (k BaseKeeper) InitGenesis(ctx sdk.Context, genState *types.GenesisState) {
. . 12: k.SetParams(ctx, genState.Params)
. . 13:
. . 14: var totalSupply sdk.Coins
. . 15:
. 6.39s 16: genState.Balances = types.SanitizeGenesisBalances(genState.Balances)
. . 17: for _, balance := range genState.Balances {
. 20ms 18: addr, err := sdk.AccAddressFromBech32(balance.Address)
. . 19: if err != nil {
. . 20: panic(err)
. . 21: }
. . 22:
. 6.78s 23: if err := k.setBalances(ctx, addr, balance.Coins); err != nil {
. . 24: panic(fmt.Errorf("error on setting balances %w", err))
. . 25: }
. . 26:
. . 27: totalSupply = totalSupply.Add(balance.Coins...)
. . 28: } Culprits
and to further zoom in ROUTINE ======================== github.com/cosmos/cosmos-sdk/x/bank/types.SanitizeGenesisBalances.func1 in /Users/emmanuelodeke/go/src/github.com/cosmos/cosmos-sdk/x/bank/types/balance.go
20ms 6.36s (flat, cum) 15.33% of Total
. . 66:}
. . 67:
. . 68:// SanitizeGenesisBalances sorts addresses and coin sets.
. . 69:func SanitizeGenesisBalances(balances []Balance) []Balance {
. . 70: sort.Slice(balances, func(i, j int) bool {
10ms 2.64s 71: addr1, _ := sdk.AccAddressFromBech32(balances[i].Address)
10ms 3.71s 72: addr2, _ := sdk.AccAddressFromBech32(balances[j].Address)
. 10ms 73: return bytes.Compare(addr1.Bytes(), addr2.Bytes()) < 0
. . 74: })
. . 75:
. . 76: for _, balance := range balances {
. . 77: balance.Coins = balance.Coins.Sort()
. . 78: } Keeper.setBalancesROUTINE ======================== github.com/cosmos/cosmos-sdk/x/bank/keeper.BaseKeeper.InitGenesis in /Users/emmanuelodeke/go/src/github.com/cosmos/cosmos-sdk/x/bank/keeper/genesis.go
0 13.19s (flat, cum) 31.80% of Total
. . 11:func (k BaseKeeper) InitGenesis(ctx sdk.Context, genState *types.GenesisState) {
. . 12: k.SetParams(ctx, genState.Params)
. . 13:
. . 14: var totalSupply sdk.Coins
. . 15:
. 6.39s 16: genState.Balances = types.SanitizeGenesisBalances(genState.Balances)
. . 17: for _, balance := range genState.Balances {
. 20ms 18: addr, err := sdk.AccAddressFromBech32(balance.Address)
. . 19: if err != nil {
. . 20: panic(err)
. . 21: }
. . 22:
. 6.78s 23: if err := k.setBalances(ctx, addr, balance.Coins); err != nil {
. . 24: panic(fmt.Errorf("error on setting balances %w", err))
. . 25: }
. . 26:
. . 27: totalSupply = totalSupply.Add(balance.Coins...)
. . 28: } and goes down to Total: 41.48s
ROUTINE ======================== github.com/cosmos/cosmos-sdk/x/bank/keeper.BaseSendKeeper.setBalances in /Users/emmanuelodeke/go/src/github.com/cosmos/cosmos-sdk/x/bank/keeper/send.go
10ms 6.77s (flat, cum) 16.32% of Total
. . 235:
. . 236:// setBalances sets the balance (multiple coins) for an account by address. It will
. . 237:// clear out all balances prior to setting the new coins as to set existing balances
. . 238:// to zero if they don't exist in amt. An error is returned upon failure.
. . 239:func (k BaseSendKeeper) setBalances(ctx sdk.Context, addr sdk.AccAddress, balances sdk.Coins) error {
. 6.75s 240: k.ClearBalances(ctx, addr)
. . 241:
10ms 10ms 242: for _, balance := range balances {
. 10ms 243: err := k.setBalance(ctx, addr, balance)
. . 244: if err != nil {
. . 245: return err
. . 246: }
. . 247: }
. . 248: and into ClearBalances we see Total: 41.48s
ROUTINE ======================== github.com/cosmos/cosmos-sdk/x/bank/keeper.BaseSendKeeper.ClearBalances in /Users/emmanuelodeke/go/src/github.com/cosmos/cosmos-sdk/x/bank/keeper/send.go
0 6.75s (flat, cum) 16.27% of Total
. . 219:}
. . 220:
. . 221:// ClearBalances removes all balances for a given account by address.
. . 222:func (k BaseSendKeeper) ClearBalances(ctx sdk.Context, addr sdk.AccAddress) {
. . 223: keys := [][]byte{}
. 6.75s 224: k.IterateAccountBalances(ctx, addr, func(balance sdk.Coin) bool {
. . 225: keys = append(keys, []byte(balance.Denom))
. . 226: return false
. . 227: })
. . 228:
. . 229: accountStore := k.getAccountStore(ctx, addr) and down to Total: 41.48s
ROUTINE ======================== github.com/cosmos/cosmos-sdk/x/bank/keeper.BaseViewKeeper.IterateAccountBalances in /Users/emmanuelodeke/go/src/github.com/cosmos/cosmos-sdk/x/bank/keeper/view.go
0 6.74s (flat, cum) 16.25% of Total
. . 114:// provides the token balance to a callback. If true is returned from the
. . 115:// callback, iteration is halted.
. . 116:func (k BaseViewKeeper) IterateAccountBalances(ctx sdk.Context, addr sdk.AccAddress, cb func(sdk.Coin) bool) {
. . 117: accountStore := k.getAccountStore(ctx, addr)
. . 118:
. 6.74s 119: iterator := accountStore.Iterator(nil, nil)
. . 120: defer iterator.Close()
. . 121:
. . 122: for ; iterator.Valid(); iterator.Next() {
. . 123: var balance sdk.Coin
. . 124: k.cdc.MustUnmarshalBinaryBare(iterator.Value(), &balance) then to ROUTINE ======================== github.com/cosmos/cosmos-sdk/store/prefix.Store.Iterator in /Users/emmanuelodeke/go/src/github.com/cosmos/cosmos-sdk/store/prefix/store.go
0 6.74s (flat, cum) 16.25% of Total
. . 90: newend = cpIncr(s.prefix)
. . 91: } else {
. . 92: newend = cloneAppend(s.prefix, end)
. . 93: }
. . 94:
. 6.74s 95: iter := s.parent.Iterator(newstart, newend)
. . 96:
. . 97: return newPrefixIterator(s.prefix, start, end, iter)
. . 98:}
. . 99:
. . 100:// ReverseIterator implements KVStore Suggestions
|
@zmanian spot on, iavl is quite slow; along with SanitizeGenesisBalances. |
@zmanian there are about 3 optimizations we can do to heavily alleviate that iavl code in dirtyItems, without having to replace it entirely: If we look at this code cosmos-sdk/store/cachekv/store.go Lines 182 to 192 in 128cb54
1. Avoid unconditional map lookupswe'll notice that even from the CPU profile, there was a heavy map access with runtime.mapaccess1_faststr, which is the result of this unconditionally executed code cosmos-sdk/store/cachekv/store.go Line 185 in 128cb54
We can change it to this for key := range store.unsortedCache {
- cacheValue := store.cache[key]
-
- if dbm.IsKeyInDomain([]byte(key), start, end) {
+ if dbm.IsKeyInDomain(strToByte(key), start, end) {
+ cacheValue := store.cache[key]
+ unsorted = append(unsorted, &kv.Pair{Key: []byte(key), Value: cacheValue.value})
delete(store.unsortedCache, key)
+ }
+ } 2. Let the Go compiler use the map clearing idiom which is super fast -- black magicWe can detect and see if all the elements are to be cleared from so use this pattern, and avoid deleting in the same loop for k := range m {
delete(m, k)
} and to do this run through n := len(store.unsortedCache)
for key := range... {
}
if len(unsorted) == n { // All the elements are to be cleared
for key := range store.unsortedCache {
delete(sotre.unsortedCache, key)
}
} else {
// Delete normally
} 3. Use zero allocations to convert from kv.Pair.Key a []byte slice to stringAfter fix 2, we can do this func strToByte(s string) []byte {
var b []byte
hdr := (*reflect.SliceHeader)(unsafe.Pointer(&b))
hdr.Cap = len(s)
hdr.Len = len(s)
hdr.Data = (*reflect.StringHeader)(unsafe.Pointer(&s)).Data
return b
}
func (store *Store) dirtyItems(start, end []byte) {
....
if len(unsorted) == n {
for key := range store.unsortedCache {
delete(store.unsortedCache, key)
}
} else {
for _, kv := range unsorted {
delete(store.unsortedCache, byteSliceToStr(kv.Key))
}
} which cleans up the CPU profile like this with that, it takes my runtime for bank down to 5m50s instead of like 20+ minutes of which at times it didn't terminate before. |
This is amazing @odeke-em. I'm happy to delay v0.41.4 release if we manage to get these fixes in in a timely fashion. Do we have any PRs fixing this already? |
Sorry for the late reply, I was asleep. Sure, I'll mail the PRs for these fixes. |
I've mailed out PR #8719 and added some benchmarks too. |
…code After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per #7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766
…code (#8719) After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per #7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766 Co-authored-by: Alessio Treglia <alessio@tendermint.com>
…code (#8719) After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per #7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766 Co-authored-by: Alessio Treglia <alessio@tendermint.com>
…code (#8719) After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per #7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766 Co-authored-by: Alessio Treglia <alessio@tendermint.com>
* store/cachekv, x/bank/types: algorithmically fix pathologically slow code (#8719) After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per #7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766 Co-authored-by: Alessio Treglia <alessio@tendermint.com> * Fix issues with merge * Apply suggestions from code review Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com> Co-authored-by: Alessio Treglia <alessio@tendermint.com> Co-authored-by: Federico Kunze <31522760+fedekunze@users.noreply.github.com>
…code (#8719) After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per cosmos/cosmos-sdk#7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766 Co-authored-by: Alessio Treglia <alessio@tendermint.com>
…code (#8719) After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per cosmos/cosmos-sdk#7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766 Co-authored-by: Alessio Treglia <alessio@tendermint.com>
…code (#8719) After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per cosmos/cosmos-sdk#7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766 Co-authored-by: Alessio Treglia <alessio@tendermint.com>
Summary of a Problem
Starting Gaia chain takes lot of time. The issue was reported here: #7682.
We removed unnecessary balance checks and made x/crisis AssertInvariants optional during
InitGenesis
.In the aforementioned issue we profiled the app and found that the bottleneck is the storage layer.
Here the goal is to inspect the storage patterns we are using and optimize it. Hints:
map
operations which is used when data is deserialized.InitGenesis
is usingcachekv
which potentially can be removed (though we are not sure about it).pprof
files are attached in Gaia 3.0 takes a long time to start on migrated hub3 state #7682For Admin Use
The text was updated successfully, but these errors were encountered: