Fix potential deadlock in the table manager #5472
}

table = NewTable(tableName, filepath.Join(tm.cfg.CacheDir, tableName), tm.indexStorageClient, tm.boltIndexClient, tm.metrics)
tm.tables[tableName] = table
Don't you need to take the write lock here for this?
I missed that we're using RLock at the top. I've changed it to Lock.
tm.tablesMtx.RLock()
defer tm.tablesMtx.RUnlock()
I think we're fixing one deadlock and introducing a new one. We use RLock first as a lower-cost way to check whether the table exists, but we must RUnlock it afterwards (the previous PR did this). We cannot defer the RUnlock here because we need to Lock later if the table wasn't found. I think we can introduce a one-line fix: defer tm.tablesMtx.Unlock() immediately after the write lock is acquired, which ensures the write lock is released in every case where it is acquired.
Yeah, I missed the RLock vs. Lock distinction. The latest commit just defers tm.tablesMtx.Unlock() right after the write lock is acquired.
Good catch! Thanks for fixing it!
})

t.Run("it doesn't deadlock when table create fails", func(t *testing.T) {
tempDir := os.TempDir()
Just a nit: let's use t.TempDir like other places.
Sorry, I pushed this change and resolved the merge conflict because I wanted to cut a new release with this fix.
…patterson/fix-tablemanager-deadlock'
I was looking at how the table manager worked and noticed that getOrCreateTable had a deadlock. I've included a test that illustrates the issue and fixed it.