
Illumos #3641 compressed block histograms with zdb #2456

Closed
wants to merge 1 commit

Conversation

prakashsurya
Member

This patch is a zdb extension of the '-b' option, producing a histogram
of the physical compressed block sizes per DMU object type on disk. The
'-bbbb' option to zdb will uncover this new feature; here's an example
usage on a new pool and snippet of the output it generates:

# zpool create tank /dev/vd{b,c,d}
# dd bs=1k  if=/dev/urandom of=/tank/1kfile  count=1
# dd bs=3k  if=/dev/urandom of=/tank/3kfile  count=1
# dd bs=64k if=/dev/urandom of=/tank/64kfile count=1
# zdb -bbbb tank
...
     3  68.0K   68.0K   68.0K   22.7K    1.00    34.26  ZFS plain file
psize (in 512-byte sectors): number of blocks
                          2:      1 *
                          3:      0
                          4:      0
                          5:      0
                          6:      1 *
                          7:      0
...
                        127:      0
                        128:      1 *
...

In the histogram above, the three non-zero rows correspond to the three files just created: the 1K file occupies 2 sectors, the 3K file 6 sectors, and the 64K file 128 sectors (the compression ratio of 1.00 shows psize matches lsize here).

The blocks are also broken down by their indirection level. Expanding on
the above example:

# zfs set recordsize=1k tank
# dd bs=1k if=/dev/urandom of=/tank/2x1kfile count=2
# zdb -bbbb tank
...
     1    16K      1K      2K      2K   16.00     1.02      L1 ZFS plain file
psize (in 512-byte sectors): number of blocks
                          2:      1 *
     5  70.0K   70.0K   70.0K   14.0K    1.00    35.71      L0 ZFS plain file
psize (in 512-byte sectors): number of blocks
                          2:      3 ***
                          3:      0
                          4:      0
                          5:      0
                          6:      1 *
                          7:      0
...
                        127:      0
                        128:      1 *
     6  86.0K   71.0K   72.0K   12.0K    1.21    36.73  ZFS plain file
psize (in 512-byte sectors): number of blocks
                          2:      4 ****
                          3:      0
                          4:      0
                          5:      0
                          6:      1 *
                          7:      0
...
                        127:      0
                        128:      1 *
...

There's now a single 1K L1 block, which is the indirect block needed for
the '2x1kfile' file just created, as well as two more 1K L0 blocks from
the same file (hence the L0 row for 2 sectors now counts 3 blocks).

This can be used to get a distribution of the block sizes used within
the pool, on a per-object-type basis.
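
To make the mechanics concrete, here is a minimal, self-contained sketch in plain C of how such a per-object-type, per-indirection-level histogram could be accumulated and printed. This is not the actual zdb implementation; the type names, limits, and the traversal hook are illustrative assumptions.

```c
/*
 * Illustrative sketch only -- not the actual zdb code.  Accumulates a
 * histogram of physical block sizes (in 512-byte sectors) keyed by a
 * hypothetical object type and indirection level.
 */
#include <stdio.h>

#define SECTOR_SHIFT   9      /* 512-byte sectors, as in the zdb output   */
#define MAX_SECTORS    128    /* largest bucket shown above: a 64K block  */
#define MAX_TYPES      8      /* hypothetical number of object types      */
#define MAX_LEVELS     8      /* hypothetical max indirection level       */

/* One histogram per (object type, indirection level) pair. */
static unsigned long histo[MAX_TYPES][MAX_LEVELS][MAX_SECTORS + 1];

/* Would be called once per block visited during pool traversal. */
static void
count_block(int type, int level, size_t psize)
{
	size_t sectors = psize >> SECTOR_SHIFT;  /* physical size in sectors */

	if (type < MAX_TYPES && level < MAX_LEVELS && sectors <= MAX_SECTORS)
		histo[type][level][sectors]++;
}

/* Print the non-empty buckets with a simple '*' bar, like the output above. */
static void
dump_histogram(int type, int level)
{
	printf("psize (in 512-byte sectors): number of blocks\n");
	for (size_t s = 1; s <= MAX_SECTORS; s++) {
		unsigned long n = histo[type][level][s];
		if (n == 0)
			continue;
		printf("%27zu: %6lu ", s, n);
		for (unsigned long i = 0; i < n; i++)
			putchar('*');
		putchar('\n');
	}
}

int
main(void)
{
	/* The three level-0 blocks from the first example: 1K, 3K, and 64K. */
	count_block(0, 0, 1 * 1024);
	count_block(0, 0, 3 * 1024);
	count_block(0, 0, 64 * 1024);
	dump_histogram(0, 0);
	return (0);
}
```

Running the sketch prints rows at 2, 6, and 128 sectors with one block each, the same shape as the first example above (it omits the zero buckets that zdb prints in between).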

References:
https://illumos.org/issues/3641
illumos/illumos-gate@490d05b

Ported by: Tim Chase <tim@chase2k.com>
Signed-off-by: Prakash Surya <surya1@llnl.gov>

@FransUrbo
Contributor

@prakashsurya Does not apply cleanly on master + #2149.

@prakashsurya
Member Author

@FransUrbo This isn't supposed to apply on top of #2149. It's actually the exact same change as the first commit in the #2149 series, and this should only be applied to a clean master tree.

IMO, it's difficult to leave independent reviews for each patch in a multi-patch pull request using the GitHub interface. So, I wanted to split that single commit off from the rest so it can be clearly reviewed and landed without interfering with reviews of the other patches in the pull request.

I'm currently reviewing the following two patches in that series, and I plan to open another pull request breaking those patches out from the rest once I'm finished reviewing them. I don't like having to sift through the long list of messages when reviewing one patch in the stack, trying to determine which comments apply to the review I'm doing and which ones do not.

I hope that clears it up?

@FransUrbo
Contributor

And yet it applied and compiled with very little tweaking... ?

Could you update #2149 (or the branch you used when sending the pull request) and remove the commit you split off so there's no confusion?

@prakashsurya
Member Author

What do you mean it applied? My best guess is you used a git command (e.g. cherry-pick, merge, am) to apply this (#2456) on top of #2149, and it was smart enough to ignore most of this patch because the changes were already present in your tree.

I can't update #2149 because that pull request is owned by @dweeezil, or else I would have.

@dweeezil
Contributor

dweeezil commented Jul 8, 2014

@prakashsurya I understand you're working on this stack in chunks. Do you intend to make a pull request for each commit?

The reason I created this as a stack of patches rather than a bunch of pull requests is that pull requests in zfsonlinux have traditionally been based on master or a predecessor commit within master. I could certainly, for example, peel back the first commit, from which you've created #2456, but then I'd have to base the #2149 branch on your commit. This is no problem, but as I mentioned, it's not the way this project's development has been occurring.

@prakashsurya
Member Author

@dweeezil Well, I'm not really sure how this is going to work out in practice, but here's what I was planning:

1. Move the first patch (a3b5639) into a pull request based on master (which I've done here)
2. Move the next two patches (a4dcbb5 and 6b0d278) into a pull request based on (1)
3. Rebase the rest of the patches on (2). I still need to inspect the rest, and determine the best way to break those up.

My reasoning is that all of the patches are unlikely to land at once. Thus, it makes sense to individually review and land the earlier patches in the series while the later ones wait their turn. Unfortunately, the current workflow doesn't really lend itself to this. Most people review and test all patches in a pull request at the same time, which made me want to try to split the changes up into logical chunks that can each be tested and landed in turn.

The problem with this, as you mentioned, is the later patches have dependencies on the earlier patches. So I'm not quite sure how GitHub is going to work for pull requests that are created based on other pull requests.

I'm open to other ideas. I just want to make the review process as clear as possible, and make sure reviews and testing happen on the patches as they will likely land. For example, @FransUrbo stated that the make debs target is broken on that patch set, but it's unclear exactly which patch introduced that regression.

But maybe I'm the only one that thinks this would be a good idea?

GitHub pull requests are a poor interface for reviews, IMO, and even worse for this specific use case. Other review systems work much better for this sort of thing.

@behlendorf
Contributor

Me too. Merged as:

d586964 Illumos #3641 compressed block histograms with zdb

@behlendorf behlendorf added this to the 0.6.4 milestone Jul 16, 2014
@prakashsurya prakashsurya deleted the illumos-3641 branch July 24, 2014 18:10
ryao pushed a commit to ryao/zfs that referenced this pull request Oct 8, 2014