
Illumos #3641 compressed block histograms with zdb #2456

Closed
wants to merge 1 commit

Conversation

prakashsurya
Member

This patch is a zdb extension of the '-b' option, producing a histogram
of the physical compressed block sizes per DMU object type on disk. The
'-bbbb' option to zdb will uncover this new feature; here's an example
usage on a new pool and snippet of the output it generates:

# zpool create tank /dev/vd{b,c,d}
# dd bs=1k  if=/dev/urandom of=/tank/1kfile  count=1
# dd bs=3k  if=/dev/urandom of=/tank/3kfile  count=1
# dd bs=64k if=/dev/urandom of=/tank/64kfile count=1
# zdb -bbbb tank
...
     3  68.0K   68.0K   68.0K   22.7K    1.00    34.26  ZFS plain file
psize (in 512-byte sectors): number of blocks
                          2:      1 *
                          3:      0
                          4:      0
                          5:      0
                          6:      1 *
                          7:      0
...
                        127:      0
                        128:      1 *
...

In the histogram above, the three non-zero rows correspond to the three files just created: the 1K file occupies 2 sectors, the 3K file 6 sectors, and the 64K file 128 sectors (the compression ratio of 1.00 shows psize matches lsize here).

The blocks are also broken down by their indirection level. Expanding on
the above example:

# zfs set recordsize=1k tank
# dd bs=1k if=/dev/urandom of=/tank/2x1kfile count=2
# zdb -bbbb tank
...
     1    16K      1K      2K      2K   16.00     1.02      L1 ZFS plain file
psize (in 512-byte sectors): number of blocks
                          2:      1 *
     5  70.0K   70.0K   70.0K   14.0K    1.00    35.71      L0 ZFS plain file
psize (in 512-byte sectors): number of blocks
                          2:      3 ***
                          3:      0
                          4:      0
                          5:      0
                          6:      1 *
                          7:      0
...
                        127:      0
                        128:      1 *
     6  86.0K   71.0K   72.0K   12.0K    1.21    36.73  ZFS plain file
psize (in 512-byte sectors): number of blocks
                          2:      4 ****
                          3:      0
                          4:      0
                          5:      0
                          6:      1 *
                          7:      0
...
                        127:      0
                        128:      1 *
...

There's now a single 1K L1 block, which is the indirect block needed for
the '2x1kfile' file just created, as well as two more 1K L0 blocks from
the same file (hence the L0 row for 2 sectors now counts 3 blocks).

This can be used to get a distribution of the block sizes used within
the pool, on a per-object-type basis.
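
To make the mechanics concrete, here is a minimal, self-contained sketch in plain C of how such a per-object-type, per-indirection-level histogram could be accumulated and printed. This is not the actual zdb implementation; the type names, limits, and the traversal hook are illustrative assumptions.

```c
/*
 * Illustrative sketch only -- not the actual zdb code.  Accumulates a
 * histogram of physical block sizes (in 512-byte sectors) keyed by a
 * hypothetical object type and indirection level.
 */
#include <stdio.h>

#define SECTOR_SHIFT   9      /* 512-byte sectors, as in the zdb output   */
#define MAX_SECTORS    128    /* largest bucket shown above: a 64K block  */
#define MAX_TYPES      8      /* hypothetical number of object types      */
#define MAX_LEVELS     8      /* hypothetical max indirection level       */

/* One histogram per (object type, indirection level) pair. */
static unsigned long histo[MAX_TYPES][MAX_LEVELS][MAX_SECTORS + 1];

/* Would be called once per block visited during pool traversal. */
static void
count_block(int type, int level, size_t psize)
{
	size_t sectors = psize >> SECTOR_SHIFT;  /* physical size in sectors */

	if (type < MAX_TYPES && level < MAX_LEVELS && sectors <= MAX_SECTORS)
		histo[type][level][sectors]++;
}

/* Print the non-empty buckets with a simple '*' bar, like the output above. */
static void
dump_histogram(int type, int level)
{
	printf("psize (in 512-byte sectors): number of blocks\n");
	for (size_t s = 1; s <= MAX_SECTORS; s++) {
		unsigned long n = histo[type][level][s];
		if (n == 0)
			continue;
		printf("%27zu: %6lu ", s, n);
		for (unsigned long i = 0; i < n; i++)
			putchar('*');
		putchar('\n');
	}
}

int
main(void)
{
	/* The three level-0 blocks from the first example: 1K, 3K, and 64K. */
	count_block(0, 0, 1 * 1024);
	count_block(0, 0, 3 * 1024);
	count_block(0, 0, 64 * 1024);
	dump_histogram(0, 0);
	return (0);
}
```

Running the sketch prints rows at 2, 6, and 128 sectors with one block each, the same shape as the first example above (it omits the zero buckets that zdb prints in between).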

References:
https://illumos.org/issues/3641
illumos/illumos-gate@490d05b

Ported by: Tim Chase <tim@chase2k.com>
Signed-off-by: Prakash Surya <surya1@llnl.gov>

@FransUrbo
Contributor

@prakashsurya Does not apply cleanly on master + #2149.

@prakashsurya
Member Author

@FransUrbo This isn't supposed to apply on top of #2149. It's actually the exact same change as the first commit in the #2149 series, and this should only be applied to a clean master tree.

IMO, it's difficult to leave independent reviews for each patch in a multi-patch pull request using the GitHub interface. So, I wanted to split that single commit off from the rest so it can be clearly reviewed and landed without interfering with reviews of the other patches in the pull request.

I'm currently reviewing the following two patches in that series, and I plan to open another pull request breaking those patches out from the rest once I'm finished reviewing them. I don't like having to sift through the long list of messages when reviewing one patch in the stack, trying to determine which comments apply to the review I'm doing and which ones do not.

I hope that clears it up?

@FransUrbo
Contributor

And yet it applied and compiled with very little tweaking... ?

Could you update #2149 (or the branch you used when sending the pull request) and remove the commit you split off so there's no confusion?

@prakashsurya
Member Author

What do you mean it applied? My best guess is you used a git command (e.g. cherry-pick, merge, am) to apply this (#2456) on top of #2149, and it was smart enough to ignore most of this patch because the changes were already present in your tree.

I can't update #2149 because that pull request is owned by @dweeezil, or else I would have.

@dweeezil
Contributor

dweeezil commented Jul 8, 2014

@prakashsurya I understand you're working on this stack in chunks. Do you intend to make a pull request for each commit?

The reason I created this as a stack of patches rather than a bunch of pull requests is that pull requests in zfsonlinux have traditionally been based on master or a predecessor commit within master. I could certainly, for example, peel back the first commit, from which you've created #2456, but then I'd have to base the #2149 branch on your commit. This is no problem, but as I mentioned, it's not the way this project's development has been occurring.

@prakashsurya
Member Author

@dweeezil Well, I'm not really sure how this is going to work out in practice, but here's what I was planning:

1. Move the first patch (a3b5639) into a pull request based on master (which I've done here)
2. Move the next two patches (a4dcbb5 and 6b0d278) into a pull request based on (1)
3. Rebase the rest of the patches on (2). I still need to inspect the rest, and determine the best way to break those up.

My reasoning is that all of the patches are unlikely to land at once. Thus, it makes sense to individually review and land the earlier patches in the series while the later ones wait their turn. Unfortunately, the current workflow doesn't really lend itself to this. Most people review and test all patches in a pull request at the same time, which made me want to try to split the changes up into logical chunks that can each be tested and landed in turn.

The problem with this, as you mentioned, is the later patches have dependencies on the earlier patches. So I'm not quite sure how GitHub is going to work for pull requests that are created based on other pull requests.

I'm open to other ideas. I just want to make the review process as clear as possible, and make sure reviews and testing happen on the patches as they will likely land. For example, @FransUrbo stated that the make debs target is broken on that patch set, but it's unclear exactly which patch introduced that regression.

But maybe I'm the only one that thinks this would be a good idea?

GitHub pull requests are a poor interface for reviews, IMO, and even worse for this specific use case. Other review systems work much better for this sort of thing.

@behlendorf
Contributor

Me too. Merged as:

d586964 Illumos #3641 compressed block histograms with zdb

@behlendorf behlendorf added this to the 0.6.4 milestone Jul 16, 2014
@prakashsurya prakashsurya deleted the illumos-3641 branch July 24, 2014 18:10
ryao pushed a commit to ryao/zfs that referenced this pull request Oct 8, 2014