zpool status always returns true #4670
Out of curiosity I tried to reproduce the same problem on illumos: its implementation of…
The latest FreeNAS release does exactly that: https://github.com/freenas/freenas/blob/FN-9.10-RELEASE/gui/middleware/notifier.py#L5660 I don't know if we could just implement this and call it a day; maybe it should be discussed with the other OpenZFS members.
This is actually by design. The exit code indicates that the command returned without error, not that the pool is healthy. You're going to need to parse the output to determine the status; this should get easier when the JSON support in #3938 is finalized. We could consider adding another command line option to change this behavior, or a new sub-command. But I'd rather not change the expected, long-standing default behavior.
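As a stopgap, a monitoring script can parse the text output today. A minimal sketch of that approach, assuming the standard `state:` lines that `zpool status` prints for each pool:

```sh
#!/bin/sh
# Sketch only: the kind of text parsing described above. It picks the
# "state:" lines out of `zpool status`, which is fragile by construction
# because the text format carries no stability guarantee.
if zpool status | awk '$1 == "state:" && $2 != "ONLINE" { bad = 1 }
                       END { exit bad }'; then
    echo "OK: all pools report state ONLINE"
else
    echo "WARNING: at least one pool is not ONLINE"
    exit 1
fi
```

Of course, this is exactly the kind of dependency on the text format that the next comment objects to.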
The problem I have with this is that the output format can change without warning. (Sent from my phone - please blame any weird errors on autocorrect)
Then let's add a reliable interface. That could be the JSON output, which is structured and won't change, or something else.
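A structured interface would make the consumer side trivial. A hedged sketch of what such a check could look like, assuming a `-j` JSON mode whose output keeps per-pool state under a `pools` object keyed by pool name (both the flag and the schema are assumptions here, not something this issue defines):

```sh
#!/bin/sh
# Sketch only: consuming hypothetical structured output with jq.
# The -j flag and the {"pools": {"<name>": {"state": ...}}} shape are
# assumptions about the proposed JSON interface, not a documented API.
unhealthy=$(zpool status -j |
    jq -r '.pools | to_entries[] | select(.value.state != "ONLINE") | .key')
if [ -n "$unhealthy" ]; then
    echo "CRITICAL: unhealthy pools: $unhealthy"
    exit 2
fi
echo "OK: all pools ONLINE"
```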
I would like this for the same reason. I also use Nagios to monitor my servers.
Try: zpool get health
If you have the pool name, this is even simpler: zpool list -H -o health pool
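Since Nagios came up: those two commands drop straight into a plugin-style check. A minimal sketch (the pool name is passed as an argument; the OK/WARNING/CRITICAL mapping follows the usual Nagios plugin convention and is my choice, not zpool's):

```sh
#!/bin/sh
# Sketch of a Nagios-style check built on `zpool list -H -o health`.
# Exit codes use the plugin convention: 0=OK, 1=WARNING, 2=CRITICAL.
pool="$1"
state=$(zpool list -H -o health "$pool" 2>/dev/null)
case "$state" in
    ONLINE)   echo "OK: $pool is ONLINE"; exit 0 ;;
    DEGRADED) echo "WARNING: $pool is DEGRADED"; exit 1 ;;
    "")       echo "CRITICAL: pool $pool not found"; exit 2 ;;
    *)        echo "CRITICAL: $pool is $state"; exit 2 ;;
esac
```

The next comment shows the catch, though: the health property can stay ONLINE even when the pool has known data errors, so this reports device state, not everything `zpool status` would tell you.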
does "zpool get health" actually work for this?
but
and
|
Do not try to rely on the zpool status return code to be meaningful; it isn't.
-- richard
This little bash fragment seems to do the trick:

    (( $(zpool status -x | wc -l) < 2 ))

An aside: I note that btrfs has the…
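Why the line count works: when everything is healthy, `zpool status -x` prints the single line "all pools are healthy", so two or more lines of output means at least one pool has errors or is otherwise unhealthy. A hedged expansion of the fragment into a complete check (the exit codes are my choice):

```sh
#!/bin/sh
# Sketch built on the fragment above. Assumes the one-line
# "all pools are healthy" message on the healthy path; anything longer
# means some pool needs attention, including the ONLINE-with-data-errors
# case that the health property misses.
if [ "$(zpool status -x | wc -l)" -lt 2 ]; then
    echo "OK: all pools are healthy"
    exit 0
else
    zpool status -x     # show the detail for the affected pool(s)
    exit 2
fi
```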
zpool status returns true even on a completely faulted pool. This means that automated monitors and alerts are forced to rely on parsing text (which can change without warning) to evaluate pool health. A DEGRADED pool's status check still returns 0, and even a completely UNAVAIL pool still returns 0 from a status check.

I would really, really like to see zpool status return a parseable exit code. An additional option for text output in a stable format designed for machine parsing (as well as predictable exit codes) would be even better.
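Until such an option exists, the closest approximation is a wrapper that derives its own exit code. A sketch with invented semantics (nothing like this ships with zpool itself):

```sh
#!/bin/sh
# Sketch of the interface the report asks for: print the normal status,
# but exit non-zero when any pool is not ONLINE. The exit-code policy
# here is invented for illustration.
zpool status "$@"
rc=0
for state in $(zpool list -H -o health); do
    [ "$state" = "ONLINE" ] || rc=1
done
exit $rc
```

As the thread shows, the health property alone misses pools that are ONLINE but carrying data errors, so a production version would want to fold in the `zpool status -x` check as well.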