Fix race in dnode_check_slots_free() #7388
58dabb4
to
648f599
Compare
This patch seems a little broken; I'm trying to fix it up. Not sure how it worked for you? There is only one os_synced_dnodes multilist, not one per TXG. I get an oops with the patch.
module/zfs/dnode.c
Outdated
for (int i = 0; i < TXG_SIZE; i++) {
	boolean_t link_active;
	multilist_sublist_t *mls;
	multilist_t *synced_list = &dn->dn_objset->os_synced_dnodes[i];
dn->dn_objset->os_synced_dnodes is a single multilist, not an array of TXG_SIZE entries, and it may be NULL.
This change seems to fix it up, though it's a bit ugly with the list_d2l:
diff --git a/usr/src/git/zfs/module/zfs/dnode.c b/dnode.c
index 962027231e8f..51adc49ee68c 100644
--- a/usr/src/git/zfs/module/zfs/dnode.c
+++ b/dnode.c
@@ -1085,6 +1085,7 @@ static boolean_t
dnode_check_evictable(dnode_t *dn)
{
dmu_object_type_t type;
+ multilist_t *synced_list;
mutex_enter(&dn->dn_mtx);
type = dn->dn_type;
@@ -1099,13 +1100,9 @@ dnode_check_evictable(dnode_t *dn)
* We check for this by determining if the dnode has any
* active links into os_synced_dnodes.
*/
- for (int i = 0; i < TXG_SIZE; i++) {
- boolean_t link_active;
- multilist_sublist_t *mls;
- multilist_t *synced_list = &dn->dn_objset->os_synced_dnodes[i];
-
- mls = multilist_sublist_lock_obj(synced_list, dn);
- link_active = multilist_link_active(&dn->dn_dirty_link[i]);
+ if((synced_list = dn->dn_objset->os_synced_dnodes) != NULL) {
+ multilist_sublist_t *mls = multilist_sublist_lock_obj(synced_list, dn);
+ boolean_t link_active = list_link_active(list_d2l(&mls->mls_list, dn));
multilist_sublist_unlock(mls);
if (link_active)
Nope, not quite: [ 817.941427] divide error: 0000 [#1] SMP PTI
Is it possible to do this by checking the dnode's refcount? There should be a hold on it until the user quota update task is done with it.
Unfortunately, it is not possible to do this with refcounts. The refcount is 1 both in the cases that cause the crash and in those that don't. I think I'm just missing a NULL check. The test crashed for me after a couple of hours as well. I'll see about fixing it tonight.
My suggested diff added the NULL check, but then I got a divide by zero after some time, which must mean I managed to race with multilist_destroy somehow?
I've added a rwlock to protect os_synced_dnodes and used it both where the multilist gets destroyed and in the evictable function. So far so good. There should be an easier way of figuring out whether the dnode is involved in a user quota update, though; maybe just add another flag for it? Also, would it not be better to split out the links for synced dnodes from those for dirty dnodes? The code is quite confusing, and the only advantage is saving two pointers. There are also a couple of other places that check whether dn_dirty_link is active to decide whether something is on the dirty list. It's unclear whether they can run concurrently with user quota updates and, if so, whether they expect membership in the synced list to count as dirty.
ad6d51f
to
da380d0
Compare
@nivedita76 I think the latest push looks a lot like what you were suggesting. We have to add 64 bits to
[ 60.446136] perf: interrupt took too long (17218 > 7155), lowering kernel.perf_event_max_sample_rate to 11600
da380d0
to
8ae0656
Compare
@nivedita76 Sorry about that. It looks like I got the boolean logic backwards. I just pushed the corrected logic. Please try it again when you get the chance, and I will watch buildbot to see the results.
include/sys/dnode.h
Outdated
@@ -332,6 +332,7 @@ struct dnode {
	kcondvar_t dn_notxholds;
	enum dnode_dirtycontext dn_dirtyctx;
	uint8_t *dn_dirtyctx_firstset; /* dbg: contents meaningless */
	uint64_t dn_dirty_txg; /* txg dnode was last dirtied */
nit: I'd move this up under dn_assigned_txg.
module/zfs/dnode.c
Outdated
	can_evict = (dn->dn_type == DMU_OT_NONE && !DNODE_IS_DIRTY(dn));
	mutex_exit(&dn->dn_mtx);
	return (can_evict);
}
I'm not convinced it's worth adding a function for this. How about this change in dnode_check_slots_free() instead?
- dmu_object_type_t type = dn->dn_type;
+ boolean_t free = (dn->dn_type == DMU_OT_NONE && !DNODE_IS_DIRTY(dn));
Running OK so far.
8ae0656
to
5d4f3e5
Compare
@behlendorf I addressed your comments and (hopefully) fixed the bug found in testing (dn_dirty_txg was not initialized to 0 upon construction).
Should it be set back to 0 in dnode_destroy (and asserted in dnode_dest)? There's also some code in dnode_move where you would want it, though I guess that is not used on Linux?
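The lifecycle being asked about can be sketched roughly as follows. This is a toy model with hypothetical toy_* functions, not the real dnode_cons/dnode_create/dnode_destroy/dnode_dest; it assumes an object cache in which the constructor and destructor run once per buffer while create and destroy run once per use, so the field has to be back at 0 every time the buffer returns to the cache.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a cached dnode; only the new field is modeled. */
typedef struct toy_dnode {
	uint64_t dn_dirty_txg;	/* txg last dirtied in, 0 when idle */
} toy_dnode_t;

/* Cache constructor: runs once when the buffer is first allocated. */
static void
toy_dnode_cons(toy_dnode_t *dn)
{
	memset(dn, 0, sizeof (*dn));
}

/* Per-use setup: the buffer must arrive from the cache already clean. */
static void
toy_dnode_create(toy_dnode_t *dn)
{
	assert(dn->dn_dirty_txg == 0);
}

static void
toy_dnode_setdirty(toy_dnode_t *dn, uint64_t txg)
{
	dn->dn_dirty_txg = txg;
}

/* Per-use teardown: reset the field before handing the buffer back. */
static void
toy_dnode_destroy(toy_dnode_t *dn)
{
	dn->dn_dirty_txg = 0;
}

/* Cache destructor: asserts that every destroy path did its reset. */
static void
toy_dnode_dest(toy_dnode_t *dn)
{
	assert(dn->dn_dirty_txg == 0);
}
```

With this arrangement a stale txg left behind by a previous user of the buffer trips an assertion instead of silently making a recycled dnode look dirty.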
5d4f3e5
to
61b60b0
Compare
@nivedita76 I addressed your comments (and fixed a few other places) and re-pushed.
61b60b0
to
d8083b1
Compare
This looks good, thanks @tcaputi. @nivedita76, if possible, could you please verify that the latest version of the PR fully resolves the race you were seeing?
Currently, dnode_check_slots_free() works by checking dn->dn_type in the dnode to determine if the dnode is reclaimable. However, there is a small window of time between dnode_free_sync() in the first call to dsl_dataset_sync() and the point where the user accounting code runs, during which the type is set to DMU_OT_NONE but the dnode is not yet evictable, leading to crashes. This patch adds the ability for dnodes to track which txg they were last dirtied in and adds a check for this before performing the reclaim.
This patch also corrects several instances where dn_dirty_link was treated as a list_node_t when it is technically a multilist_node_t.
Fixes: openzfs#7147
Signed-off-by: Tom Caputi <tcaputi@datto.com>
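As an illustration of the approach the commit message describes, here is a simplified, self-contained model. The toy_* types and functions are hypothetical, and the comparison against a "currently syncing" txg is an assumption about how such a dirtiness check could be written; the actual DNODE_IS_DIRTY definition in the patch may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; the real dnode_t and object types carry far more state. */
enum toy_type { TOY_OT_NONE = 0, TOY_OT_PLAIN_FILE = 1 };

typedef struct toy_dnode {
	enum toy_type dn_type;	/* TOY_OT_NONE once freed in sync context */
	uint64_t dn_dirty_txg;	/* txg the dnode was last dirtied in */
} toy_dnode_t;

/* Record the txg in which the dnode was dirtied. */
static void
toy_dnode_setdirty(toy_dnode_t *dn, uint64_t txg)
{
	dn->dn_dirty_txg = txg;
}

/*
 * Assumed rule: a dnode still counts as dirty while the txg it was last
 * dirtied in has not finished syncing.
 */
static bool
toy_dnode_is_dirty(const toy_dnode_t *dn, uint64_t syncing_txg)
{
	return (dn->dn_dirty_txg >= syncing_txg);
}

/*
 * A run of slots is reclaimable only if every slot is both typeless and
 * clean. Checking the type alone is what left the window described above.
 */
static bool
toy_check_slots_free(const toy_dnode_t *slots, int nslots,
    uint64_t syncing_txg)
{
	assert(slots != NULL);
	for (int i = 0; i < nslots; i++) {
		bool dn_free = (slots[i].dn_type == TOY_OT_NONE &&
		    !toy_dnode_is_dirty(&slots[i], syncing_txg));
		if (!dn_free)
			return (false);
	}
	return (true);
}
```

In this model a dnode freed in txg 5 keeps its slot unavailable while txg 5 is syncing, even though its type already reads as none, and becomes reclaimable once the syncing txg advances to 6.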
d8083b1
to
63e17c4
Compare
Yes, the list del errors are gone.
Codecov Report
@@ Coverage Diff @@
## master #7388 +/- ##
==========================================
+ Coverage 76.29% 76.36% +0.07%
==========================================
Files 330 330
Lines 104235 104292 +57
==========================================
+ Hits 79529 79647 +118
+ Misses 24706 24645 -61
Currently, dnode_check_slots_free() works by checking dn->dn_type in the dnode to determine if the dnode is reclaimable. However, there is a small window of time between dnode_free_sync() in the first call to dsl_dataset_sync() and when the useraccounting code is run when the type is set DMU_OT_NONE, but the dnode is not yet evictable, leading to crashes. This patch adds the ability for dnodes to track which txg they were last dirtied in and adds a check for this before performing the reclaim. This patch also corrects several instances when dn_dirty_link was treated as a list_node_t when it is technically a multilist_node_t. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tom Caputi <tcaputi@datto.com> Requires-spl: spl-0.7-release Issue openzfs#7147 Issue openzfs#7388 Issue openzfs#7997
Currently, dnode_check_slots_free() works by checking dn->dn_type in the dnode to determine if the dnode is reclaimable. However, there is a small window of time between dnode_free_sync() in the first call to dsl_dataset_sync() and when the useraccounting code is run when the type is set DMU_OT_NONE, but the dnode is not yet evictable, leading to crashes. This patch adds the ability for dnodes to track which txg they were last dirtied in and adds a check for this before performing the reclaim. This patch also corrects several instances when dn_dirty_link was treated as a list_node_t when it is technically a multilist_node_t. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes openzfs#7147 Closes openzfs#7388
Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and the point where the user
accounting code runs, during which the type is set to DMU_OT_NONE
but the dnode is not yet evictable. This patch adds a check for
whether dn_dirty_link is active to determine if we are in this state.
This patch also corrects several instances where dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.
Signed-off-by: Tom Caputi <tcaputi@datto.com>
How Has This Been Tested?
We are currently running a compile workload which usually triggers the problem within an hour. We will confirm that the problem does not occur overnight.