Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support concurrency for IAVL and fix Racing conditions #805

Merged
merged 6 commits into from
Aug 23, 2023

Conversation

mattverse
Copy link
Contributor

cref: #696

This PR fixes the problem stated above by replacing maps that causes racing conditions to sync map to be thread safe.

Tested this on various chains' nodes that's having concurrency issue by applying patch manually, confirmed that after applying this patch concurrency issue is not happening any more. (special shout out to @keplr-wallet for helping out testing!)

Also confirmed and tested on a testnet node with the following procedure:

  • Have node running
  • Spam it with tx bots and query bots
  • Previously node would crash within 1, maximum 2 minutes with the racing condition logs mentioned above

@mattverse mattverse requested a review from a team as a code owner August 7, 2023 09:01
@mattverse mattverse changed the title Support concurrency for IAVL and fix Racing conditions feat: Support concurrency for IAVL and fix Racing conditions Aug 7, 2023
Copy link
Member

@p0mvn p0mvn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Only minor comments.

Do we have a good understanding of the performance changes under normal conditions? Might be useful to run this before and after the change and compare it with benchstat

// If next fast node from disk is to be removed, skip it.
iter.fastIterator.Next()
iter.Next()
return
}

nextUnsavedKey := iter.unsavedFastNodesToSort[iter.nextUnsavedNodeIdx]
nextUnsavedNode := iter.unsavedFastNodeAdditions[string(nextUnsavedKey)]
nextUnsavedNodeVal, _ := iter.unsavedFastNodeAdditions.Load(nextUnsavedKey)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably check whether the second return was set to true before proceeding

@@ -196,7 +203,8 @@ func (iter *UnsavedFastIterator) Next() {
// if only unsaved nodes are left, we can just iterate
if iter.nextUnsavedNodeIdx < len(iter.unsavedFastNodesToSort) {
nextUnsavedKey := iter.unsavedFastNodesToSort[iter.nextUnsavedNodeIdx]
nextUnsavedNode := iter.unsavedFastNodeAdditions[string(nextUnsavedKey)]
nextUnsavedNodeVal, _ := iter.unsavedFastNodeAdditions.Load(nextUnsavedKey)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worthwhile checking second return here as well

Copy link
Member

@tac0turtle tac0turtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

amazing changes! do you know if there were any performance regressions by using sync.Map?

@mattverse
Copy link
Contributor Author

mattverse commented Aug 7, 2023

Oh sweet! I'll get back tmrw with the benchstats @p0mvn suggested

@yihuang
Copy link
Collaborator

yihuang commented Aug 9, 2023

https://medium.com/@nikolay.dimitrov/comparing-go-performance-of-mutex-sync-map-and-channels-for-syncing-a-map-of-counters-61810c201e17

I just run this benchmark script locally, I get results like this:

Start measuring Mutex ...
Finished measuring Mutex, time taken: 928.267951ms

Start measuring sync.Map ...
Finished measuring sync.Map, time taken: 3.284460863s

Start measuring Channels ...
Finished measuring Channels, time taken: 3.524194982s

so it seems sync.Map is slower than a mutex protected normal map, so how about we simply add a good old mutex to the mutable tree?

@mattverse
Copy link
Contributor Author

@yihuang Previously, I have attempted to use good old mutexes instead of sync.Maps(prev attempt here mattverse@34d4565) , but I have changed to sync.Map for couple of reasons.

  1. Using mutexes gives us the overhead of having to manage where we lock and unlock manually. Code becomes very very dirty as well as since we are manually locking & unlocking we might miss parts that actually cause concurrency issues. In my previous attempts to use mutexes, although I did use mutex locks and unlocks in places where I thought concurrency issue was happening, it did not seem to solve problem (although it did relieve some of the concurrency issue happening). I am suspecting this is because I probably have missed locking and unlocking mutexes in {unknown}{unknown} parts of the code

  2. The bench test given does not seem fair for comparing sync maps and mutexes. Sync Maps are especially perf wise better upon lower writes, higher reads. However, given bench test seems to iteratively write, which would indeed result in better bench stats for mutexes

lmk your thoughts on this!

@cool-develope
Copy link
Collaborator

I am still confused about the parallel use-case of iavl, the parallel writing will break the tree structure, so the root hash will not be deterministic. Actually, the IAVL doesn't provide concurrency now.
@yihuang, do you think it is a real issue?

@yihuang
Copy link
Collaborator

yihuang commented Aug 11, 2023

I am still confused about the parallel use-case of iavl, the parallel writing will break the tree structure, so the root hash will not be deterministic. Actually, the IAVL doesn't provide concurrency now.

@yihuang, do you think it is a real issue?

I guess it's concurrent reading and writing, still single writer but multiple readers?

@tac0turtle
Copy link
Member

I am still confused about the parallel use-case of iavl, the parallel writing will break the tree structure, so the root hash will not be deterministic. Actually, the IAVL doesn't provide concurrency now. @yihuang, do you think it is a real issue?

this is due to the fastnode system, there is an issue with reading and writing that causes a concurrent hashmap error. This aims to fix the issue. Its less so about using the iavl tree in parallel.

@yihuang
Copy link
Collaborator

yihuang commented Aug 11, 2023

my concern is consistency, nodedb use mutex to protect some maps operations, but here we use sync.Map, we should use a consistent approach towards the concurrency?

@cool-develope
Copy link
Collaborator

my concern is consistency, nodedb use mutex to protect some maps operations, but here we use sync.Map, we should use a consistent approach towards the concurrency?

yeah, I think RWMutex is more reasonable in this case

@mattverse
Copy link
Contributor Author

mattverse commented Aug 13, 2023

ok, if that's the consensus, I'm down to change it to use sync.RWMutex and re-open PR!

@tac0turtle
Copy link
Member

@mattverse any chance you can make the new pr? otherwise we can do it

@mattverse
Copy link
Contributor Author

@tac0turtle Code is ready, currently testing, give me 1~2 more days!

@mattverse
Copy link
Contributor Author

mattverse commented Aug 21, 2023

I'm half skeptical it would work, applying RW mutex was pain, alot of places caused deadlocks

@mattverse
Copy link
Contributor Author

yep getting error

{"level":"info","module":"consensus","commit_round":0,"height":2277484,"commit":"CC3E91B29F5D0820397C70A0EA459E5DA2AEC9F83B006597BAB2D7FA79B00A29","proposal":{},"time":"2023-08-21T11:08:01Z","message":"commit is for a block we do not know about; set ProposalBlock=nil"}
{"level":"info","module":"consensus","hash":"CC3E91B29F5D0820397C70A0EA459E5DA2AEC9F83B006597BAB2D7FA79B00A29","height":2277484,"time":"2023-08-21T11:08:01Z","message":"received complete proposal block"}

state.go:1563 +0x2ff\ngithub.com/tendermint/tendermint/consensus.(*State).handleCompleteProposal(0xc000f1bc00, 0xc0013cca00?)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:1942 +0x399\ngithub.com/tendermint/tendermint/consensus.(*State).handleMsg(0xc000f1bc00, {{0x2a9ea40, 0xc0099e38a8}, {0xc00316f650, 0x28}})\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:834 +0x1b7\ngithub.com/tendermint/tendermint/consensus.(*State).receiveRoutine(0xc000f1bc00, 0x0)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:760 +0x3f0\ncreated by github.com/tendermint/tendermint/consensus.(*State).OnStart\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:379 +0x12d\n","time":"2023-08-21T11:08:01Z","message":"CONSENSUS FAILURE!!!"}
{"level":"info","module":"consensus","wal":"/root/.osmosisd/data/cs.wal/wal","impl":{"Logger":{}},"msg":{},"time":"2023-08-21T11:08:01Z","message":"service stop"}
{"level":"info","module":"consensus","wal":"/root/.osmosisd/data/cs.wal/wal","impl":{"Logger":{},"ID":"group:U/panic.go:884 +0x213\ngithub.com/tendermint/tendermint/consensus.(*State).finalizeCommit(0xc000f1bc00, 0x22c06c)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:1594 +0x1105\ngithub.com/tendermint/tendermint/consensus.(*State).tryFinalizeCommit(0xc000f1bc00, 0x22c06c)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:1563 +0x2ff\ngithub.com/tendermint/tendermint/consensus.(*State).handleCompleteProposal(0xc000f1bc00, 0xc0013cca00?)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:1942 +0x399\ngithub.com/tendermint/tendermint/consensus.(*State).handleMsg(0xc000f1bc00, {{0x2a9ea40, 0xc0099e38a8}, {0xc00316f650, 0x28}})\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:834 +0x1b7\ngithub.com/tendermint/tendermint/consensus.(*State).receiveRoutine(0xc000f1bc00, 0x0)\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:760 +0x3f0\ncreated by github.com/tendermint/tendermint/consensus.(*State).OnStart\n\tgithub.com/tendermint/tendermint@v0.37.0-rc1/consensus/state.go:379 +0x12d\n","time":"2023-08-21T11:08:01Z","message":"CONSENSUS FAILURE!!!"}
{"level":"info","module":"consensus","wal":"/root/.osmosisd/data/cs.wal/wal","impl":{"Logger":{}},"msg":{},"time":"2023-08-21T11:08:01Z","message":"service stop"}
{"level":"info","module":"consensus","wal":"/root/.osmosisd/data/cs.wal/wal","impl":{"Logger":{},"ID":"group:UL0yDN8TXC0W:/root/.osmosisd/data/cs.wal/wal","Head":{"ID":"UL0yDN8TXC0W:/root/.osmosisd/data/cs.wal/wal","Path":"/root/.osmosisd/data/cs.wal/wal"},"Dir":"/root/.osmosisd/data/cs.wal"},"msg":{},"time":"2023-08-21T11:08:01Z","message":"service stop"}
{"level":"info","module":"p2p","peer":{"id":"0e6f924a3afe15996bc3f2612a9dd3bcf1c64212","ip":"222.106.187.14","port":53400},"impl":"Peer{MConn{222.106.187.14:53400} 0e6f924a3afe15996bc3f2612a9dd3bcf1c64212 out}","msg":{},"time":"2023-08-21T11:08:01Z","message":"service start"}
{"level":"info","module":"p2p","peer":{"id":"0e6f924a3afe15996bc3f2612a9dd3bcf1c64212","ip":"222.106.187.14","port":53400},"impl":"MConn{222.106.187.14:53400}","msg":{},"time"

https://github.com/mattverse/iavl/tree/mattverse/rwmutex-iavl This was the attempt I have made, I would be more than happy if you guys want to take it over from this point! lmk on the updates

@tac0turtle
Copy link
Member

id say for the current iavl this pr is fine as is. In iavl 2.0 we will remove the fastnode system so its only present for a little. @cool-develope @yihuang thoughts?

@cool-develope
Copy link
Collaborator

I don't think the above error is related to RWMutex or sync.Map, it seems like trying parallel writing and breaking deterministic.
But yeah I agree with the temporary solution if this PR resolves them.

@yihuang
Copy link
Collaborator

yihuang commented Aug 22, 2023

id say for the current iavl this pr is fine as is. In iavl 2.0 we will remove the fastnode system so its only present for a little. @cool-develope @yihuang thoughts?

no problem, since it's simpler.

@tac0turtle tac0turtle merged commit ba6beb1 into cosmos:master Aug 23, 2023
@tac0turtle
Copy link
Member

@Mergifyio backport release/v1.x.x

@mergify
Copy link
Contributor

mergify bot commented Aug 23, 2023

backport release/v1.x.x

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Aug 23, 2023
Co-authored-by: Marko <marko@baricevic.me>
(cherry picked from commit ba6beb1)
tac0turtle pushed a commit that referenced this pull request Aug 23, 2023
…#805) (#825)

Co-authored-by: Matt, Park <45252226+mattverse@users.noreply.github.com>
@mattverse
Copy link
Contributor Author

@tac0turtle Can we get this in for sdk v0.50?

@tac0turtle
Copy link
Member

yup already backported, we will test it on existing chains for a bit then cut the final release alongside the sdk eden release

@mattverse
Copy link
Contributor Author

mattverse commented Aug 23, 2023

Sweet, should I also create subsequent PRs for old major versions so that its compatible with old sdk versions as well? It's not state breaking so should be patchable

@tac0turtle
Copy link
Member

we would need to backport it to v0.20 for older versions. We can do this and bump the old ones.

@mattverse
Copy link
Contributor Author

mattverse commented Aug 23, 2023

ok so the step would be first backport to v0.20 and then we'd be able to bump the older ones.
Do you want me to handle merge conflicts for backporting it to v0.20?

Copy link
Member

@tac0turtle tac0turtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattverse seems like this version differs from the original a little. it is causing some dead locks in testing within the sdk.

@mattverse
Copy link
Contributor Author

mattverse commented Aug 24, 2023

@tac0turtle wdym by "original"? The one without changes? If so, which version of sdk are you testing with? My node seemed fine

@tac0turtle
Copy link
Member

sorry different issue. git bisect led to tthis first but it was wrong

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants