Adding ZVol as log device of separate pool causes deadlock. #6065
Comments
PVE developer hat on: please report (potential) bugs in PVE to the Proxmox bugtracker at https://bugzilla.proxmox.com first. We forward bug reports and fixes to our upstreams as needed. You are running an outdated version of PVE 4 with known issues. While I am not sure whether your proposed setup is even supposed to work (zvols as vdevs without a layer of abstraction like KVM in between have caused problems in the past), an up-to-date PVE 4.4 installation (with ZFS 0.6.5.9 and kernel 4.4.49-1-pve) does not show the issue you described.
I have seen a similar problem with recent performance analysis runs on 0.7.0 RC1 and then 0.7.0 RC3. We run a virtualized environment (under Xen) with our guest VMs running over zvols managed from our driver domain. One workload function is a fileserver, which runs a separate zpool on top of the zvols provided. This fileserver zpool is created within the driver domain directly on top of a couple of zvols: one for the usable disk partition and another for a SLOG device (a performance enhancement). This works fine under 0.6.4.2 but causes ZFS to lock up under 0.7.0. I've tried various builds in the 0.6.5.x series but can't remember if there was an issue with those versions. I'll provide a bit more detail on the 0.7.0 issue when I finish my current performance runs, to understand whether it is constrained to the use of a zvol as a SLOG in an overlaid ZFS zpool, or also occurs with a simple zpool comprising a single zvol and no log or cache device.
@behlendorf have there been any other reports of similar issues, and do you know if it is constrained to the 0.7.0 stream or also appeared in 0.6.5.x? It is quite a big deal for us, as our VM disk provisioning currently operates solely in our domain 0 hosting the zpool, before any VMs are created. Of course it would work fine if the zvol was passed through to the VM and the zpool then created and maintained solely within that context, but that breaks our whole provisioning flow. When I get a chance I'll see if I can pull a stack when this happens to show where the deadlock occurs. I thought I'd read some time back that there was consideration to prevent zpools being created directly over zvols - this is obviously not that, since we see a deadlock rather than an error return, but is there anything in the works around that we should be aware of (hopefully not, as that would break backwards compatibility)?
@koplover layering one zpool on another like this is a difficult thing to automatically detect, so it hasn't been reliably handled in any of the releases. As you observed, opening the devices can, but won't always, result in a deadlock. There was a patch proposed in #5286 from @ryao which added a module option for this. I don't mind dusting off the original patch so we can get it included. I've opened PR #6093 with a refreshed version of the original patch. It would be great if you could verify whether it solves the issue you're observing. With the patch applied you'll need to set the new module option.
@behlendorf I've run into this issue each time I've deployed a 0.7.0 RC ZFS system - only a sample of three, but a deadlock 100% of the time so far. It sounds from the above that this is a race condition which has been around for some time, dependent on the timing of creating the underlying zvols and the zpool on top of them.

As a quick test I've therefore gone to our disk driver domain that hosts our primary zpool (holding the zvols representing the volumes of our guest VMs), and created a couple of zvols directly: one to back the overlying zpool, and another for its log device. A couple of minutes later I run:

zpool create ztank /dev/zvol/diskconvm/test-data log /dev/zvol/diskconvm/test-slog

This hangs, presumably deadlocked, and it holds out any subsequent zpool / zfs admin commands. Dumping the virtual CPUs on this hosting domain does not reveal anything. The logs show only the following, which would seem to be an effect rather than a cause, but may help pinpoint the lock causing this deadlock.

Happy to try the patch, just want to make sure we are on the same page. The issue I'm seeing seems to be completely repeatable, more than a race condition. I can certainly try the patch, thanks for digging it out, but I wonder if there is a different issue here?
Let me summarize the posted stacks so we're all on the same page. The deadlock encountered here is due to a lock inversion. When layering pools the following is possible:

zpool                                        systemd-udevd
---------------------                        ---------------------
do_vfs_ioctl                                 do_sys_open
zfsdev_ioctl                                 do_filp_open
zfs_ioc_pool_create                          path_openat
spa_create <- take spa_namespace_lock        do_last
vdev_create                                  vfs_open
vdev_open                                    do_dentry_open
vdev_root_open                               blkdev_open
vdev_open_children                           blkdev_get
vdev_open                                    __blkdev_get
vdev_disk_open                               zvol_open <- take zvol_state_lock
blkdev_get_by_path                           dmu_objset_own
blkdev_get                                   dsl_pool_hold
__blkdev_get                                 spa_open
zvol_open <- wait zvol_state_lock            spa_open_common <- wait spa_namespace_lock

What needs to happen to properly address this is to break up the zvol_state_lock such that it only protects insertion/removal from the zvol_state_list. A new lock could be added per zvol_state_t to protect its contents.

@tuxoko @ryao @bprotopopov have all been working on the zvol code fairly recently. Do any of you have the time to tackle this? In short, we need to never call dmu_objset_own under the global zvol_state_lock.

@koplover you're right, the proposed patch in #6093 won't fully address this issue. I'll close that PR.

I'll see what I can do :)
But I must ask - why is this a good idea :) to use a zvol as a log device?
Is this a test setup of some sort?
One issue with this approach is that there is a dependency being created between two pools: if the SSD pool is not available, one cannot import the spinning disk pool without the -m option, which might result in data loss. Another issue that comes to mind is what happens if the SSD pool runs out of space. I think these issues need to be considered carefully before going with this type of setup in production. A somewhat similar situation arises if one deploys L2ARC that is shared by several pools - to my knowledge, no one does this, precisely to avoid cross-pool dependencies on device availability.
@bprotopopov This is a production configuration Bear in mind in our setup we have a heavily virtualised environment (Xen) with many different functions realised as separate virtualised guests - around 20-30. All of these guests are supported by zvols provided from the base zpool (vdevs full hard disks in mirror configuration, and log / cache device from SSD). One of these virtualised workloads (VMs) benefits greatly from running a ZFS filesystem, a file server function, where snapshotting provides important features for user restore, backup etc. Essentially, it is 'luck' that we choose ZFS in both places in the architecture (due to the fine features ZFS provides. So, the next question is how to provide this in the most performant and architecturally sound manner. We have tried various configurations, including passing through the zvol in different ways to the overlying VM, and having just a plain vdev comprising the overlying (VM) zpool within the zvol log. However, this too has given rise to performance issues. We now have the 'data' zvol as nosynch, and the 'log' zvol as synch which performs well and does not risk the data. In truth if the underlying provider zvol corrupts in any way we are in a bad way and need to rebuild as all the guest VMs are hit. The restore of the overlying file server zpool is just one instance of restoring this data. Given the above, are you still concerned of the relavance of this scenario? In terms of SSD pool what are you considering here - why would it run out of space specifically as the overlying zvol is just a another zvol from the perspective of the supporting primary zpool - the 'log' zvol just has an optimum block size? |
@koplover
Going back to the issue #6065, there are two sets of concerns:
1) the deadlock that arises from a conceptually valid use of ZFS
2) the practicality of the conceptually valid use of ZFS
The first set of issues should be addressed for correctness; this is not being debated.
The second set of issues relates to the practicality of using a zvol on top of an SSD pool as a log device for other pools. There are several issues to consider. I have mentioned the dependencies created between the pools:
1) should the SSD pool become unavailable (even temporarily), all the pools that use zvols from the SSD pool will be affected, i.e. they will not be importable without potential loss of data.
2) should the SSD pool's capacity utilization approach 85%-90% or above (including fragmentation), all the pools that use zvols from the SSD pool will be affected, i.e. their performance will greatly suffer.
I assume here that there is more than one zvol provisioned in the SSD pool, because if there is only one such zvol, then the SSDs could simply be used (as a log vdev) by the pool that uses this zvol as a log device, 'with no loss of generality'. Having many zvols as log devices for many pools in many VMs gives rise to the capacity utilization concerns.
Another important concern is the system resource and performance penalty one pays for layering ZFS pools over other pools, e.g. logs over zvols. Conceptually, this approach results in two nested levels of intent logging - one in the SSD pool for the 'log' zvol itself, and one on top of that, in the spinning disk pool that uses the zvol as log device.
I could go into details here, but essentially, I/O processing goes through many software layers, re-scheduling many times to many different threads, performing extra I/Os, caching the same data many times (while copying that data between the caches), and paying a penalty in CPU, memory, storage capacity, and latency for the I/Os being processed.
Hope that helps.
P.S. Without knowing details of your stack, I cannot advise, but I wonder if you could simply use files from two filesystems from the same spinning disk pool with SSD-based log device(s), one filesystem with no sync and one with sync, passed to your VMs as virtual devices. That seems to be more in line with conventional virtualization approaches. Barring that, again, you could provision zvols from the same pool and use the sync/no-sync approach. I am sure you have tried this, although I don't know why the performance would be worse than layering logs on top of zvols.
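For illustration, a minimal sketch of that suggested alternative - all pool, dataset, device and file names here are hypothetical:

```sh
# spinning-disk pool with an SSD-based log device
zpool create tank mirror sda sdb log nvme0n1p1

# two filesystems from the same pool, one sync and one no-sync
zfs create -o sync=always   tank/vm-sync
zfs create -o sync=disabled tank/vm-nosync

# raw image files from these filesystems, passed to the VMs as virtual disks
truncate -s 100G /tank/vm-sync/disk0.img
truncate -s 100G /tank/vm-nosync/disk1.img
```

The same idea works with zvols provisioned from the same pool, setting sync per zvol instead of per filesystem.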
@bprotopopov Thanks for getting back to me with such a detailed answer.

In terms of the two overriding concerns at the start:

1. Yes, the SSD pool has many zvols - well over 100 - they are the basis of every VM we run; some have 4 or 5, and we have > 20 VMs. So you can see that if the SSD pool goes down we are in more trouble for the disk subsystem than concerns about this 'log' zvol alone - each root binary disk, configuration etc. lives on the zvols (we need to pass a block device through to the VMs; we could use loop devices, but zvols are made for this job).
2. We ensure that the SSD pool does not go over 90% because of the reported performance issues - we allow provisioning of only up to 90% of the SSD pool's assigned space.

Now, if we consider the function of the file service VM (a single VM; no other VM runs a zpool of its own as its filesystem - they are ext4 for Linux VMs or NTFS for Windows VMs) in isolation: if this were a standalone server we would want ZFS for the various functional integrations it supports. It is in this case just a 'file system' (yes, I know it's more than that) that we chose for this VM for its capabilities.

So, this then leaves us with how best to achieve this. We require that writes are synchronous. We want file writes to be reliable and consistent - no loss. My first attempt was therefore the obvious one: a sync zvol passed through, and a simple pool defined on it which was used to hold files. There is no 'no sync' functionality required for the solution - all file writes should be synchronous.

Performance testing found large copies in this configuration to be slow. This is where the alternative thoughts come from: we want to amalgamate writes on the underlying SSD zpool to make this most efficient. How can this be achieved?

The thinking was: how about passing through a second zvol and making that synchronous to act as a log device in that pool, so that all writes are guaranteed to the log and are most efficiently passed on to the SSD pool log. As the log writes are then guaranteed, we can make the core file zvol asynchronous, using that log, without file loss.

What I really don't understand is why having the zvol acting as the log of the overlying zpool is any different from any other zvol with a file system on top. From the perspective of the underlying pool, there are just block writes occurring from the overlying zvol. Perhaps I am missing something here?

For clarity, this is the only zvol we use as a log device out of the >100 we have on our server. This is to handle a problem we have seen in performance when we have an overlying pool consisting only of a sync zvol defining the single vdev of that pool (no log or cache).
@koplover, if you feel that this is your preferred solution, that's what you have to use.
Even though conceptually, double-virtualizing storage using the zvol-as-vdev approach seems wasteful, I recommend measurements. To assess the performance hit in question, compare your setup with passing a straight SSD to your in-VM pool as a log device. Maybe your workload is such that this is not an issue.
@bprotopopov Just for clarity, is your concern on performance purely about a zvol as a log device, or more generally about a zvol acting as a vdev for zpool data? Our benchmarks have found performance to be OK in this configuration.
I would like to avoid making general statements as far as performance is concerned. I was pointing out the inefficiencies of using zvols as vdevs.
One could still find this approach yielding "acceptable" performance in many settings. Yet to see what price is being paid in terms of resources and performance, one needs to run experiments and take measurements.
Hi, @behlendorf, I am reviewing the code to prototype the proposed change (per-zvol_state_t lock), but in commit 35d3e32, you seem to have added a code path in zvol_set_volsize() that updates the zvol's size even though the zvol has not been opened yet (there is no zvol_state_t struct associated with it). Can you please explain in what use case this is meaningful?
@bprotopopov it's possible that a zvol dataset can exist without a matching zvol_state_t.
Hm, @behlendorf :)
Hi, @behlendorf, if you could refresh my recollection: are zvol_open() and zvol_release() called from add_disk() and del_gendisk() only (which is why the current code checks for ownership of the zvol_state_lock first), or are there other call paths in the zvol code that call zvol_open() and zvol_release() with zvol_state_lock held?
Yes, for any of the special properties in
I have to confess, I sort of saw it coming :) but still wanted to see this in action. @behlendorf, it appears that with zvol_state_lock moved out of the way, we are now deadlocking on bdev->bd_mutex and spa_namespace_lock taken in the opposite order:

PID: 12591  TASK: ffff88007d041520  CPU: 0  COMMAND: "zpool"
where zfs_ioc_vdev_add() calls spa_open() as a first order of business, then opens the device, trying to take bdev->bd_mutex in __blkdev_get() for the zv->zv_disk associated with the log zvol;

PID: 12699  TASK: ffff880068435520  CPU: 2  COMMAND: "blkid"
where sys_open() first takes the bdev->bd_mutex and then tries to get the spa_namespace_lock.

This seems like a pretty fundamental issue - not just the global nature of the zvol_state_lock - that makes it difficult to use zvols as log devices.
I seem to observe the same deadlock when trying to add a zvol as a regular vdev. Is there any evidence that this has ever worked in zfsonlinux?
This has never worked perfectly reliably under Linux, but it used to be better. @bprotopopov so we used to have a little bit of code for this. The idea was that we could preemptively take the spa_namespace_lock. We also ran into various related deadlocks due to how broad the zvol_state_lock is. But with those issues behind us thanks to your work, the original workaround from 1ee159f might be sufficient (although admittedly not pretty). The major downside of course is the serialization through that lock.
@bprotopopov @behlendorf I can confirm that with 0.6.4.2 we never saw this issue; evidently our system preparation code was lucky enough to avoid any race condition in the code. We have an automated deploy / test stack that completely re-images on average 3-4 servers a night, every night, with the latest test builds, as well as deploying production systems as required. So over the last year that is > 1000 deploys, and we have never seen this issue. Any similar deployment against 0.7.0, and indeed manual creation of the overlying zpool, immediately exhibits the issue every time.
Yes, reverting 1ee159f should help.
But I wonder if it would be cleaner to implement a special vdev type based on zvols?
Yeah, seems to work much better :)

[root@centos-6 build]# zpool add test_pool /dev/zvol/log_pool/zvol0
So, generally, things seem to be working OK (did not do much I/O testing though), but:

[root@centos-6 ~]# zpool import
  pool: test_pool

the zvol-based test_pool shows up as UNAVAIL until the underlying zvol_pool is imported:

[root@centos-6 ~]# zpool import zvol_pool

which I think is expected.
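In other words, with layered pools the pool providing the zvols has to be imported first (and exported last). Using the pool names from the output above, the expected ordering is:

```sh
zpool import zvol_pool   # underlying pool; creates the zvol device nodes
zpool import test_pool   # layered pool; its zvol vdev is now available

zpool export test_pool   # reverse order on the way down
zpool export zvol_pool
```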
Great news. @bprotopopov that's the behavior I'd expect. Once you're happy with the patch it would be great to open a PR so we can get you some feedback and additional testing. You might want to consider enabling the following test cases in the ZTS which depend on this layered pool functionality. That will get us some additional testing and ensure we don't regress on this in the future:
@behlendorf, sure, I will look into it.
@behlendorf, I have looked into enabling the tests you mentioned above and found that there are several things to be addressed, unrelated to support for zvol-based vdevs, before they can be enabled. I'll have to deal with this in a separate commit.
@behlendorf, @koplover, I am still thinking about a new vdev type - zvol-based - as an alternative to the trylock() in zvol_first_open(). The custom vdev_open() would not have to go through blkdev_get() and would therefore avoid the deadlock shown above. In fact, we could allow one to specify such vdevs by their dataset name (pool/zvol), so one could use zvols even if the device nodes for them are not available (zvol_inhibit_dev=1). This approach would also allow one to bypass all the complexities of interacting with the block device layer and therefore make zvol-based vdev support more portable (OpenZFS). Plus, we could probably gain some efficiencies by shortening the trip through the block layer.

Still, one fundamentally unpleasant issue with the approach of building pools on top of zvols is that it might be possible to get into a circular dependency situation where pools would use zvols from each other as their vdevs. Needless to say, this would be unfortunate. It could be expensive to perform dependency checks of this kind at the time of vdev add.
Fix lock order inversion with zvol_open() as it did not account for use of zvols as vdevs. The latter use cases resulted in the lock order inversion deadlocks that involved spa_namespace_lock and bdev->bd_mutex. Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #6065 Issue #6134
@bprotopopov that's an interesting idea. There's no reason why we couldn't add a new vdev leaf type which could layer directly on a zvol, bypassing the block layer. You could do a similar thing and layer on files in a ZFS filesystem, bypassing the VFS. It would be an interesting thing to prototype to determine what practical advantages there are - would it really help performance, etc.

Back to this specific issue: after merging 5559ba0, layering directly on zvols is now working well locally for me. However, I have noticed a couple of suspicious buildbot failures where tests
@kpande good thought. I've definitely observed the suspicious failures on systems with 16K stacks so I wouldn't think so, but it's a possibility.
@behlendorf, if there is more info on the hangs, I can take a look.
@bprotopopov thanks, I'll definitely let you know and open an issue with the details if I'm able to get some backtraces.
@bprotopopov I was able to easily reproduce the issue under Amazon Linux in EC2 with the
System information
Describe the problem you're observing
I have one pool that is all SSD. I have another pool that is all spinning disks. I tried making a ZVol on the SSD pool and then using that zvol as a log device for the spinning disk pool. When running the zpool add command, I/O to the spinning disk pool freezes. The SSD pool continues to work. Trying to run any other zpool commands, they will not respond (for example, zpool iostat).
Describe how to reproduce the problem
see above
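A minimal command sequence along the lines described above - the pool and volume names here are hypothetical, not taken from the reporter's system:

```sh
# ssdpool is the all-SSD pool, tank is the spinning-disk pool
zfs create -V 16G ssdpool/slog

# this is the step that hangs and blocks all further zpool/zfs commands
zpool add tank log /dev/zvol/ssdpool/slog
```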
Include any warning/errors/backtraces from the system logs
No errors on screen, just a hang. However, if you want me to run some tests or pull up log files and attach them, I'm happy to if you tell me which ones you want.
EDIT:
As a workaround I made a ZFS filesystem on the SSD pool, then created a raw image file and used that file instead of a zvol. This worked without issue. But I don't see why the ZVol shouldn't work - it's supposed to be like any other raw disk.
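A sketch of that workaround, again with hypothetical names:

```sh
# a filesystem plus a raw image file on the SSD pool, instead of a zvol
zfs create ssdpool/images
truncate -s 16G /ssdpool/images/tank-slog.img

# adding the file as the log device works; the zvol variant deadlocks
zpool add tank log /ssdpool/images/tank-slog.img
```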