
zfs destroy hangs other zfs commands #3104

Closed
stevenburgess opened this issue Feb 13, 2015 · 10 comments
Labels
Status: Inactive (not being actively updated) · Status: Stale (no recent activity for issue) · Type: Performance (performance improvement or performance problem)

Comments

@stevenburgess

We are having a problem where any zfs destroy that takes a while to return blocks all new zfs commands from starting until the destroy has returned.

I am testing on ubuntu 14.04, ZoL 0.6.3, pool version 5000.

My testing shows that a zfs destroy call covering many snapshots takes a long time to return. Destroying an 8GB FS with one snapshot returns almost instantly and causes no downtime. Destroying an 8GB FS with 4000 snapshots can take 10 minutes. If I break the destroy of those 4000 snapshots into batches of 100 (zfs destroy -d fs@1%100), I get 10-second periods of inactivity instead.

I have a few scripts that I used to demonstrate this problem:

1. Create a file system with a ton of snapshots (or use one you have):
   https://gist.github.com/stevenburgess/793f6c8e8c593b6b30f6
   (a rough sketch of an equivalent script is below)
2. Watch new calls to zfs succeed with a little scroll bar:
   while true; do zfs get name tank > /dev/null 2>&1; printf '|'; sleep 0.1; done
3. Destroy the FS in question:
   zfs destroy -r tank/constant
4. Watch the scroll bar stop a few seconds in and not move until the command returns.
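
For anyone who doesn't want to pull the gist, something along these lines produces an equivalent test dataset (a sketch, not the gist verbatim; it assumes a pool named tank mounted at /tank and writes ~2MB of fresh data before each snapshot):

zfs create tank/constant
for i in $(seq 1 4000); do
    # dirty a little data so every snapshot references its own blocks
    dd if=/dev/urandom of=/tank/constant/file bs=1M count=2 2>/dev/null
    zfs snapshot tank/constant@snap$i
done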

I use zfs get to show the slowdown, but it is all zfs commands that halt. This means that if one command destroys an FS with 4000 snapshots, then for a 10-minute period no one can:

- zfs get any stats
- zfs snapshot anything
- zfs clone a new FS
- zfs recv new data
- zfs send to start a send stream

That means downtime for a bunch of processes. Our only workaround currently is to ensure that zfs destroy commands destroy a single snapshot at a time; each of those causes only 0.2 to 0.4 seconds of inactivity.
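
A sketch of that workaround, assuming the snapshots live under tank/constant as in the example above (each destroy covers a single snapshot, so the work is spread across many small TXGs):

zfs list -H -t snapshot -o name -r tank/constant | while read snap; do
    zfs destroy -d "$snap"   # one snapshot per command
done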

I was able to reproduce it on a few machines around here, on 0.6.3 and git master.

@behlendorf behlendorf added the Type: Performance Performance improvement or performance problem label Feb 13, 2015
@stevenburgess
Author

I was able to reproduce this same behavior on FreeBSD, so this may be an upstream problem. I am seeing that while the freeze is occurring, no new TXGs are created, so that explains why things like zfs snapshot and zfs recv freeze. Here is the last TXG that was active for the destroy:

txg      birth            state ndirty       nread        nwritten     reads    writes   otime        qtime        wtime        stime       
8210363  951316291575     C     213843456    3914752      61543936     644      4269     635023385257 2444         25492        8245082139 

otime works out to roughly 10 minutes ((635023385257 / 1000000000) / 60 ≈ 10.6), but that time was not spent in the q, w, or s states.
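
For reference, that table looks like the per-pool TXG history kstat. A sketch of how to watch it live, assuming the pool is named tank and that the zfs_txg_history module parameter is set to something non-zero:

echo 100 > /sys/module/zfs/parameters/zfs_txg_history   # keep the last 100 TXGs
watch -n 1 "tail -n 5 /proc/spl/kstat/zfs/tank/txgs"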

@behlendorf
Contributor

@stevenburgess I meant to comment on this earlier but it slipped my mind. I haven't confirmed this against the source, but I strongly suspect what's happening here is that the recursive destroy needs to be handled in a single TXG. Since destroying 4000 snapshots takes a fair bit of work, this TXG takes a long time to process, which prevents the TXGs from rolling forward every few seconds like they're supposed to. When you destroy the snapshots with multiple commands you spread the work over multiple TXGs, minimizing the impact. This is a problem all the implementations suffer from.

@stevenburgess
Author

That certainly squares with my observations. I find this behavior unexpected: I would probably not think twice about calling zfs destroy -r on an 8GB filesystem, but that call can have some pretty severe consequences, so I should.

What do you think is best in terms of this ticket? I don't want to load the ZoL GitHub with upstream issues. Since it is an OpenZFS issue, should I try to get it posted there? Or is this an acceptable place?

@behlendorf
Contributor

@stevenburgess I think it's definitely appropriate to open an issue in the upstream tracker. We can leave it open here as well, it might make it easier for people to find. I'm a little less optimistic about how best to address this. There are good reasons to want to handle this all in a single TXG.

@AceSlash

@behlendorf has an upstream ticket been opened since?

This issue is still present on ZoL 0.7.5. I have been bitten by it several times now when I forget and naively run a zfs destroy -r. With monitoring software that issues a lot of zfs get calls and an hourly backup script, the number of stuck zfs commands grows very quickly and the consequences can be very bad.

@behlendorf
Contributor

@stevenburgess @AceSlash we expect that ZFS Channel Programs (OpenZFS 7431), #6558, should improve performance for this case. The feature was recently merged to master, and if you're in a position to test the current master source it would be appreciated.
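
For reference, zfs-program(8) includes a recursive-destroy example; a minimal sketch along those lines, which destroys every snapshot of one dataset from inside a single channel program (the dataset and file names here are placeholders, and large destroys may need the program's memory/instruction limits raised, see the man page):

cat > destroy_snaps.zcp <<'EOF'
-- destroy every snapshot of the named dataset
-- (modeled on the recursive destroy example in zfs-program(8))
for snap in zfs.list.snapshots("tank/constant") do
    zfs.sync.destroy(snap)
end
EOF
zfs program tank destroy_snaps.zcp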

@AceSlash

AceSlash commented Apr 5, 2018

@behlendorf I'm sorry but I can't test the latest master branch. I really don't know what to do. Today I did it again: I forgot about this bug and killed all zfs operations on an important server.

No backup (zfs send...)
No monitoring (zfs get...)

Is this merged in 0.7.7 (I'm still on 0.7.6 on this system)? I mean, that's a super serious issue from my point of view, since it breaks servers.

I'm just sad that no solution was implemented and we still have to deal with this.

@tonyhutter
Contributor

I was able to reproduce this while destroying a ~400 snapshot pool and running zfs get name tank at the same time. Here's where zfs get hangs:

     3.643888 ioctl(3, _IOC(0, 0x5a, 0x5, 0), 0x7ffdd3136760) = 0  <<--- ZFS_IOC_POOL_STATS takes 3.6 seconds
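
That line is strace output from the hanging zfs get. Something along these lines reproduces it (the exact strace flags used above are a guess):

strace -r -T zfs get name tank

Here -r prints timestamps relative to the previous syscall and -T appends the time spent in each one.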

I also tried the sample "zfs list" channel program from #7281 (comment) while doing the destroy and saw the delay as well:

$ while [ 1 ] ; do sleep 0.1 && sudo ./cmd/zfs/zfs program tank zfs_rlist.zcp tank name &>/dev/null && date ; done
Wed Apr  4 22:51:55 CDT 2018
Wed Apr  4 22:51:55 CDT 2018
Wed Apr  4 22:51:56 CDT 2018
Wed Apr  4 22:51:56 CDT 2018
Wed Apr  4 22:51:56 CDT 2018
Wed Apr  4 22:51:57 CDT 2018
Wed Apr  4 22:51:57 CDT 2018
Wed Apr  4 22:51:57 CDT 2018
Wed Apr  4 22:52:03 CDT 2018 <<<--- 6 seconds later
Wed Apr  4 22:52:03 CDT 2018

Lastly, I tried running the zfs recursive snapshot destroy channel program from https://www.delphix.com/blog/delphix-engineering/zfs-channel-programs while running zpool get, and still saw the delay.

Note that I did notice that zpool status and zpool get do not hang while doing the destroy, so you may be able to use them for simple heartbeat monitoring. That doesn't solve your problem though...
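
For example, a minimal heartbeat along the lines of the scroll-bar loop above, but built on zpool get instead of zfs get (pool name tank assumed):

while true; do
    zpool get -H health tank > /dev/null 2>&1 && printf '|'
    sleep 1
done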

@stale

stale bot commented Aug 25, 2020

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Aug 25, 2020
@stale stale bot closed this as completed Nov 23, 2020
@tigerblue77

This issue still exists.
