I got a spanking new server last week and thought I'd upgrade my file vault to the ZFS age. The machine is a Core i5-4570 with 8GB of RAM running Ubuntu 14.04 (Trusty, kernel 3.13.0-24-generic #47-Ubuntu SMP), and I'm trying my best to set up a raidz2 with four 1TB drives.
However, I can't move my files there, because all my copying efforts fail with the whole system hanging. My preferred method would be to rsync from the old drives to the new system (well, old drives but new file system, at least ;).
I noticed there have been some issues regarding (apparently) ARC and rsync, so I first set the max ARC size to 4GB, then disabled the primary cache on my target filesystem altogether, and finally switched from rsync to cp, but the result is always the same: after a few seconds, the machine becomes unresponsive.
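For readers who want to reproduce the ARC limit mentioned above, here is a minimal sketch of how the value could be computed. It assumes "4GB" means 4 GiB and that the limit is applied through the `zfs_arc_max` module parameter (which takes bytes), typically placed in a file such as `/etc/modprobe.d/zfs.conf`. The helper name `arc_max_option` is made up for illustration, and the post does not say which method was actually used.

```python
GIB = 1024 ** 3  # bytes per GiB


def arc_max_option(gib):
    """Return a modprobe options line capping the ARC at `gib` GiB.

    Hypothetical helper: zfs_arc_max is expressed in bytes, so the
    size is converted before being formatted.
    """
    return "options zfs zfs_arc_max={}".format(gib * GIB)


# Line that would go into e.g. /etc/modprobe.d/zfs.conf (assumed location).
print(arc_max_option(4))

# Disabling the primary cache is a per-dataset property instead:
#   zfs set primarycache=none <dataset>
```

The module option only takes effect when the zfs module is loaded, so a reload or reboot would be needed after changing it.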
I first ran the system with the latest stable zfs (0.6.2-2trusty), but after hitting the problem I browsed through the issues and noticed quite a few fixes for various deadlocks, so right now I'm running the latest daily, 0.6.2-2.1trusty~13.gbp7d6f55.
According to ps it's txg_sync that hangs, and it had this in /proc/xxx/stack:
The properties for the file system:
PROPERTY VALUE SOURCE
type filesystem -
creation ti touko 6 22:41 2014 -
used 39,1M -
available 1,73T -
referenced 39,1M -
compressratio 1.00x -
mounted yes -
quota none default
reservation none default
recordsize 128K default
mountpoint /pub/Pictures inherited from turva/pub
sharenfs off default
checksum on default
compression off local
atime on default
devices on default
exec on default
setuid on default
readonly off default
zoned off default
snapdir hidden default
aclinherit restricted default
canmount on default
xattr on default
copies 1 default
version 5 -
utf8only off -
normalization none -
casesensitivity sensitive -
vscan off default
nbmand off default
sharesmb off default
refquota none default
refreservation none default
primarycache none local
secondarycache all default
usedbysnapshots 0 -
usedbydataset 39,1M -
usedbychildren 0 -
usedbyrefreservation 0 -
logbias latency default
dedup off default
mlslabel none default
sync standard default
refcompressratio 1.00x -
written 39,1M -
logicalused 29,8M -
logicalreferenced 29,8M -
snapdev hidden default
acltype off default
context none default
fscontext none default
defcontext none default
rootcontext none default
relatime off default
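The listing above is long, and only a couple of properties were actually changed from their defaults. As a rough sketch (not part of the original report), the rows can be filtered by their SOURCE column to pick out the local overrides. The function name `parse_zfs_properties` and the abbreviated `SAMPLE` text are illustrative, and the parser assumes the three-column layout shown here, with the dataset name column already stripped.

```python
def parse_zfs_properties(text):
    """Parse 'PROPERTY VALUE SOURCE' rows into (property, value, source) tuples.

    Values may contain spaces (e.g. the creation date), so the source is
    taken from the end of the line: either the last token, or the last
    three tokens when it reads 'inherited from <dataset>'.
    """
    rows = []
    for line in text.strip().splitlines():
        tokens = line.split()
        if not tokens or tokens[:3] == ["PROPERTY", "VALUE", "SOURCE"]:
            continue  # skip blank lines and the header row
        prop, rest = tokens[0], tokens[1:]
        if len(rest) >= 3 and rest[-3:-1] == ["inherited", "from"]:
            source, value = " ".join(rest[-3:]), " ".join(rest[:-3])
        else:
            source, value = rest[-1], " ".join(rest[:-1])
        rows.append((prop, value, source))
    return rows


# A few rows copied from the listing above.
SAMPLE = """\
PROPERTY VALUE SOURCE
creation ti touko 6 22:41 2014 -
recordsize 128K default
mountpoint /pub/Pictures inherited from turva/pub
compression off local
primarycache none local
"""

rows = parse_zfs_properties(SAMPLE)
local_overrides = [(p, v) for p, v, s in rows if s == "local"]
print(local_overrides)
```

On the sample rows this leaves `compression` and `primarycache`, which matches the workaround described earlier of disabling the primary cache on the target filesystem.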
Just for fun, I also tried copying something from a USB drive: I copied a Linux source package into the raidz2 volume, and it worked just fine. Of course, it's only a single file, whereas I'm trying to copy whole directory trees in the other case, but it's still more than twice the size I've managed to copy so far from the actual source. That source happens to be an LVM volume running on top of a SATA drive (formatted ext4, whereas the USB drive was ext3), which is connected to the same SATA host as some of the drives that are part of the raidz2. Might not be relevant, but thought I'd mention it.
So, to wrap up: when I start a copy, stuff gets copied for a few seconds, then the copying stops. After a few seconds more, everything else freezes as well, and the only option I have is to power off the machine.
Well, this is embarrassing. After going through everything, it appears the hangs have nothing to do with ZFS: a faulty SATA controller started failing when there was too much traffic on both of its connectors. Since my RAIDZ2 happened to be partly connected to that controller, it put enough pressure on the controller for it to die silently.
@ssundell I'm glad you got to the root cause, and thank you for following up in the issue. If you don't mind I'm going to close this issue. It sounds like we could just do a better job logging what happened.