Skip to content

Commit

Permalink
Handle vdev_lookup_top() failure in dva_get_dsize_sync()
Browse files Browse the repository at this point in the history
The dva_get_dsize_sync() function incorrectly assumes that the call
to vdev_lookup_top() cannot fail.  However, the NULL dereference at
clearly shows that under certain circumstances it is possible.  Note
that offset 0x570 (1376) maps as expected to vd->vdev_deflate_ratio.

  BUG: unable to handle kernel NULL pointer dereference at 00000570

  crash> struct -o vdev
  struct vdev {
       [0] uint64_t vdev_id;
       ... ...
    [1376] uint64_t vdev_deflate_ratio;

Given that this can happen this patch add the required error handling.
In the case where vdev_lookup_top() fails assume that no deflation
will occur for the DVA and use the asize.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Alexey Zhuravlev <alexey.zhuravlev@intel.com>
Closes #1707
Closes #1987
Closes #1891

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
  • Loading branch information
behlendorf committed May 6, 2014
1 parent 962d524 commit 2c33b91
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion module/zfs/spa_misc.c
Original file line number Diff line number Diff line change
Expand Up @@ -1552,7 +1552,9 @@ dva_get_dsize_sync(spa_t *spa, const dva_t *dva)

if (asize != 0 && spa->spa_deflate) {
vdev_t *vd = vdev_lookup_top(spa, DVA_GET_VDEV(dva));
dsize = (asize >> SPA_MINBLOCKSHIFT) * vd->vdev_deflate_ratio;
if (vd != NULL)
dsize = (asize >> SPA_MINBLOCKSHIFT) *
vd->vdev_deflate_ratio;
}

return (dsize);
Expand Down

2 comments on commit 2c33b91

@ahrens
Copy link
Member

@ahrens ahrens commented on 2c33b91 Jun 23, 2014

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not believe that this fix resolves the issues marked as closed. The actual problem is that there is a DVA with an invalid vdev ID. This change may serve to mask the problem, or more likely just move it elsewhere, to the next function that tries to use the invalid vdev ID.

The correct way to address an incorrect pool is to ASSERT that the vdev ID is valid (or panic with a nicer error message, or fail with EIO but that would suppress bug reports when this happens). This should probably happen in vdev_lookup_top(); currently only some callers do so (e.g. metaslab_free_dva()).

@behlendorf
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definitely doesn't resolve the root problem of what caused the DVA to be damaged/incorrect in the first place. I've opened #2426 so we don't loose track of running the root cause to ground for this so we can revert this change.

Please sign in to comment.