Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send stream should only list included snaps #8533

Merged
merged 1 commit into from
Mar 28, 2019

Conversation

tcaputi
Copy link
Contributor

@tcaputi tcaputi commented Mar 25, 2019

Currently, zfs send streams will include a list of all snapshots
on the source side if the '-p' option is provided. This can cause
performance problems on the receive side, especially if those
snapshots aren't present on the destination. These problems arise
because guid_to_name(), which is used for several receive side
functions, will search the entire receive-side pool if it can't
find a snapshot with a matching guid. This patch corrects the
issue by ensuring only streams that require this list of snapshots
include them.

Signed-off-by: Tom Caputi tcaputi@datto.com

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@tcaputi tcaputi added Status: Work in Progress Not yet ready for general review Status: Design Review Needed Architecture or design is under discussion labels Mar 25, 2019
@tcaputi tcaputi requested review from alek-p and ahrens March 25, 2019 20:07
@tcaputi
Copy link
Contributor Author

tcaputi commented Mar 25, 2019

@ahrens I would really appreciate it if you could review this. My methodology involved a little more whack-a-mole-ing issues that came up when I ran the test suite than I would have liked.

Copy link
Contributor

@alek-p alek-p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach makes sense to me and is in line with what you described previosly

lib/libzfs/libzfs_sendrecv.c Show resolved Hide resolved
lib/libzfs/libzfs_sendrecv.c Show resolved Hide resolved
lib/libzfs/libzfs_sendrecv.c Outdated Show resolved Hide resolved
@tcaputi
Copy link
Contributor Author

tcaputi commented Mar 26, 2019

@alek-p your comments have been addressed.

Currently, zfs send streams will include a list of all snapshots
on the source side if the '-p' option is provided. This can cause
performance problems on the receive side, especially if those
snapshots aren't present on the destination. These problems arise
because guid_to_name(), which is used for several receive side
functions, will serach the entire receive-side pool if it can't
find a snapshot with a matching guid. This patch corrects the
issue by ensuring only streams that require this list of snapshots
include them.

Signed-off-by: Tom Caputi <tcaputi@datto.com>
@codecov
Copy link

codecov bot commented Mar 27, 2019

Codecov Report

Merging #8533 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8533      +/-   ##
==========================================
- Coverage   78.72%   78.71%   -0.01%     
==========================================
  Files         380      380              
  Lines      116440   116464      +24     
==========================================
+ Hits        91665    91679      +14     
- Misses      24775    24785      +10
Flag Coverage Δ
#kernel 79.07% <ø> (-0.14%) ⬇️
#user 67.63% <100%> (+0.36%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fc16b4f...1fc1578. Read the comment docs.

@ahrens
Copy link
Member

ahrens commented Mar 27, 2019

Can't honestly say that I have a great grasp on the overall context of this code, but the change seems OK to me. If this approach seems too brittle, maybe an alternative would be to increase the performance of the receive where we don't have a matching snapshot. For example, by caching the list (or hashtable or avl tree) of snapshot GUIDs that exist on the target, so that we only have to discover them once?

@tcaputi
Copy link
Contributor Author

tcaputi commented Mar 27, 2019

@ahrens The context of this patch comes from @kpande. He was doing a test for me with zfs send -wp to fix another bug, when we noticed that the receive had stopped progressing and strace said the process was spending its time doing LIST_SNAPSHOT ioctls. The issue ended up being that guid_to_name() was iterating over all datasets in the pool. This was happening because the send stream had included a list of all the snapshots that came before the snapshot being sent, but those snapshots didn't exist on the receive side. Hence the reason for this fix.

Copy link
Contributor

@alek-p alek-p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Design Review Needed Architecture or design is under discussion Status: Work in Progress Not yet ready for general review labels Mar 28, 2019
@behlendorf behlendorf merged commit f94b3cb into openzfs:master Mar 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants