Intermittent unreliable zpool mounting on boot #2444
Comments
@dajhorn This sounds like a race in upstart's initialization of the system.
Closing as a duplicate of the above-referenced issues.
Hi Darik,
@montanaviking, I don't remember that code or our conversation, but anything that spins in a while loop is probably inadequate. A general solution for ticket #330 must do at least two things:
You have the right idea, but you're not going to get the desired result unless ZoL is hooked into udev/dbus/mountall/upstart/systemd in all the right places, which is a non-trivial amount of integration work.
Hi Darik, with the exception of putting in an if statement that terminates the loop only when all my RAIDZ2 drives appear (as entries) in /dev/disk/by-id/xxxx, where xxxx is the drive ID of a member of the RAIDZ2 array. Does putting this loop in /etc/init/mountall.conf stall the ZFS automount until the loop terminates? (From the description in the FAQ, I think it does.)

So, I'm thinking that if I set up a loop in /etc/init/mountall.conf as suggested in the FAQ above, and modify it so the loop continues until all my RAIDZ2 drives appear in /dev/disk/by-id, then I should be reasonably assured of avoiding the conditions that made mounting the RAIDZ2 /home directory unreliable?

I'm also thinking that, should one of my drives actually fail on or before boot, the loop would terminate anyway after, say, 60 tries, and the RAIDZ2 could still mount with up to two failed drives.

Right now, I have the lines:

Finally, if I may ask, is the line:

Thanks so much,
Yes.
Yes.
Yes, right.
Dunno, but it might have beneficial side effects like ensuring that the udev event queue is flushed.
This ensures that the system starts a console even if something goes wrong, like a typo in the script. These things run early enough that any mistake can break the system.
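For reference, the udev event-queue flush alluded to above can be requested explicitly with `udevadm settle`, which blocks until the udev queue is empty or a timeout expires. A sketch of how that might look in an Upstart job's pre-start stanza; the placement here is illustrative, not the author's actual configuration:

```
pre-start script
    # Wait (up to 60 seconds) for udev to finish processing queued
    # device events before anything tries to mount the pool.
    udevadm settle --timeout=60
end script
```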
Hi Darik,

Sent from my iPad
Hi Darik,

############################################################
waitavail() {
failsafedbus() {
waitavail >/dev/tty1 2>&1

Please notice the if [ ] && [ ] && [ ] ... statement above, as it is the main tool of the script. Basically, the if statement causes the loop to continue so long as at least one of the drive IDs is missing. When all drives are finally up, running, and ready for mounting, their IDs appear in the /dev/disk/by-id directory; the if statement then evaluates as true, allowing the break statement to execute and the loop to terminate. This then lets ZFS attempt its automount of the array, but not until all the drives are ready, as indicated by the appearance of their ID files in /dev/disk/by-id.

The above script, saved as /root/mountdelay, is executed from /etc/init/mountall.conf, and you need to add the following line to your /etc/init/mountall.conf file:

The line should be added just before the line below:

exec mountall --daemon $force_fsck $fsck_fix

Also, please remember to set the /root/mountdelay script file as executable!

Ultimately, please notice that in the above, one MUST modify the values in the if statement as:

Prior to this patch, I had been relying on an /etc/rc.local file containing:

Please comment if I'm missing something.
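Only fragments of the script survive in the comment above, so here is a hedged reconstruction of the described logic (loop until every by-id entry exists, give up after a bounded number of tries), not the author's original code. `BYID_DIR`, `MAX_TRIES`, and `SLEEP_SECS` are names introduced here for illustration; the drive IDs passed as arguments would be whatever appears for your disks under /dev/disk/by-id.

```shell
#!/bin/sh
# Hedged sketch of the /root/mountdelay idea described above.
# The defaults are overridable so the logic can be exercised off-target.
BYID_DIR="${BYID_DIR:-/dev/disk/by-id}"
MAX_TRIES="${MAX_TRIES:-60}"
SLEEP_SECS="${SLEEP_SECS:-1}"

waitavail() {
    tries=0
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        all_present=1
        for id in "$@"; do
            # The loop keeps going while at least one drive ID is missing.
            [ -e "$BYID_DIR/$id" ] || all_present=0
        done
        if [ "$all_present" -eq 1 ]; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$SLEEP_SECS"
    done
    # Timed out: return nonzero but let boot continue, so a degraded
    # RAIDZ2 (up to two missing drives) can still be imported.
    return 1
}
```

It would then be invoked from the mountall.conf pre-start path as the comment above describes, e.g. `waitavail ata-DISK1-SERIAL ata-DISK2-SERIAL ...` (those ID names are placeholders), just before `exec mountall --daemon $force_fsck $fsck_fix`.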
Right, and it should now be apparent how this kind of solution does not solve the general problem.
The Ubuntu init stack relies on
It seems good at first glance, but I don't have a way to test it. This is it-works-for-me territory until #330 is implemented. HTH.
Hi Darik, |
@montanaviking, yes, feel free to edit the wiki. Anybody with a GitHub account has write access.
Hi Darik, |
Hi,

So, I'm posting this for three main reasons.
Thanks,

Sent from my iPad
Hi Darik,

"Intermittent unreliable zpool mounting on boot #2444"? My patch reported there does seem to be working solidly for my machine. Will this be necessary or prudent after the recent ZFS update?

On 08/30/2014 03:41 PM, Darik Horn wrote:
Yes. Commit f67d709 in ZoL 0.6.4 will provide an easy way to avoid this Solaris behavior, which has several side-effects on Linux.
Add the
You're describing a system configuration that I test and that usually works. However, any configuration that
Yes, the
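Separately from the commit mentioned above, a common way to opt out of the Solaris-style automount on Linux is the `legacy` value of the `mountpoint` property, which hands the dataset to the ordinary fstab machinery at boot. The pool and dataset names below are placeholders, not taken from this thread:

```
# First: zfs set mountpoint=legacy tank/home   (names hypothetical)
# Then add an /etc/fstab entry so the normal boot path mounts it:
tank/home   /home   zfs   defaults   0   0
```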
Hi Darik,
If the upstream engineering team thinks that lack of ECC is an actual problem for ZFS, then they could easily check for it at runtime and squawk a warning into the system log, but they don't. Now, more people know how to scrape data out of a bad EXT4 instance than out of ZFS, but I haven't seen any evidence that using EXT4 on duff equipment mitigates data loss in any way vice the end-to-end protection that ZFS provides. Personally, I would rather wear a bike helmet than live near a good brain surgeon, even if the bike helmet refuses to
Dunno. I don't use it often enough to know how it behaves in this circumstance.
Hi Darik,

Sent from my iPad
Hi,
I've been configuring a new system as specified below:
dual Xeon E5-2630 v2
SuperMicro X9DRI-F-O motherboard with 64GB ECC RAM
OS installed on an Intel 530-series 120GB SSD, on a 60GB partition
the OS is Ubuntu 14.04 64-bit
mdadm was installed
ZFS on Linux was installed (the kernel version, of course)
the /home directory was placed on a ZFS RAIDZ2 pool using four 1TB drives, i.e., 2 Seagate Barracudas and 2 Western Digital Blacks.
The zpool was mounted automatically via zfs automount.
After installation, I noticed that Ubuntu 14.04 would occasionally fail to mount the zpool on boot. When this happened, the pool could still be mounted manually after booting.
Strangely, when the OS (Ubuntu 14.04) root directory was moved from the SSD to a regular hard drive (a 1TB Seagate Barracuda), zpool mounting appeared to be reliable.
I've gone back to an Ubuntu 12.04 install on my SSD with ZFS on Linux (configured as above) and now see no apparent issues.
Of course, I would really like to get Ubuntu 14.04 working.
Thanks,
Phil