-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix a number of small zvol-related udev error handling issues #12302
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to remove this cruft and submitting those changes back. This looks great, the detailed commit messages were very welcome since some of these changes would otherwise be non obvious!
This file is old as dirt. It's entirely possible that commas were optional in udev back at that time. But they're definitely supposed to be there nowadays. Signed-off-by: Justin Gottula <justin@jgottula.com>
The $tempnode substitution is so old that it's not even mentioned in the man page anymore. It is still technically supported by udev, but with plenty of "deprecated" comments surrounding it. The preferred modern equivalent of $tempnode is $devnode (or alternatively, %N). Signed-off-by: Justin Gottula <justin@jgottula.com>
Assignment syntax (=) can be used for the PROGRAM key. But the PROGRAM key is really a match key, not an assign key. The internal logic used by udev to decide whether a PROGRAM key "matched" or not (which determines whether the remainder of the rule is evaluated) depends on whether the operator was OP_MATCH (==) or OP_NOMATCH (!=). [1] The man page claims that '"=", ":=", and "+=" have the same effect as "=="' for PROGRAM keys. And, after a brief perusal, the udev source code does seem to confirm that operators other than OP_MATCH (==) or OP_NOMATCH (!=) are implicitly converted to OP_MATCH (==). [2] But it's not entirely clear that this is definitely the case: anecdotal testing seems to indicate that when OP_ASSIGN (=) is used, the program's exit status is disregarded and the remainder of the rule is processed regardless of whether it was, in fact, a successful exit. The bottom line here is that, if zvol_id hits some snag and returns a nonzero exit status, then we almost certainly do NOT want to continue on with the rule and use whatever the stdout contents may have been to mindlessly create /dev/zvol/* symlinks. Therefore, let's be extra-sure and use the match (==) operator explicitly, to eliminate any possibility that udev might do the wrong thing, and ensure that a nonzero exit status will definitely short-circuit the rest of the rule, bypassing the SYMLINK+= assignments. [1] udev, file src/udev/udev-rules.c, func udev_rule_apply_token_to_event, switch case TK_M_PROGRAM if r != 0 (nonzero exit status): return token->op == OP_NOMATCH; switch case TK_M_PROGRAM if r == 0 (zero exit status): return token->op == OP_MATCH; func retval 0 => key is considered to have matched func retval 1 => key is considered to have NOT matched [2] udev, file src/udev/udev-rules.c, func parse_token, at func start: bool is_match = IN_SET(op, OP_MATCH, OP_NOMATCH); in else-if case streq(key, "PROGRAM"): if (!is_match) op = OP_MATCH; Signed-off-by: Justin Gottula <justin@jgottula.com>
The zvol_id program is invoked by udev, via a PROGRAM key in the 60-zvol.rules.in rule file, to determine the "pretty" /dev/zvol/* symlink paths paths that should be generated for each opaquely named /dev/zd* dev node. The udev rule uses the PROGRAM key, followed by a SYMLINK+= assignment containing the %c substitution, to collect the program's stdout and then "paste" it directly into the name of the symlink(s) to be created. Unfortunately, as currently written, zvol_id outputs both its intended output (a single string representing the symlink path that should be created to refer to the name of the dataset whose /dev/zd* path is given) AND its error messages (if any) to stdout. When processing PROGRAM keys (and others, such as IMPORT{program}), udev uses only the data written to stdout for functional purposes. Any data written to stderr is used solely for the purposes of logging (if udev's log_level is set to debug). The unintended consequence of this is as follows: if zvol_id encounters an error condition; and then udev fails to halt processing of the current rule (either because zvol_id didn't return a nonzero exit status, or because the PROGRAM key in the rule wasn't written properly to result in a "non-match" condition that would stop the current rule on a nonzero exit); then udev will create a space-delimited list of symlink names derived directly from the words of the error message string! I've observed this exact behavior on my own system, in a situation where the open() syscall on /dev/zd* dev nodes was failing sporadically (for reasons that aren't especially relevant here). Because the open() call failed, zvol_id printed "Unable to open device file: /dev/zd736\n" to stdout and then exited. The udev rule finished with SYMLINK+="zvol/%c %c". Assuming a volume name like pool/foo/bar, this would ordinarily expand to SYMLINK+="zvol/pool/foo/bar pool/foo/bar" and would cause symlinks to be created like this: /dev/zvol/pool/foo/bar -> /dev/zd736 /dev/pool/foo/bar -> /dev/zd736 But because of the combination of error messages being printed to stdout, and the udev syntax freely accepting a space-delimited sequence of names in this context, the error message string "Unable to open device file: /dev/zd736\n" in reality expanded to SYMLINK+="zvol/Unable to open device file: /dev/zd736" which caused the following symlinks to actually be created: /dev/zvol/Unable -> /dev/zd736 /dev/to -> /dev/zd736 /dev/open -> /dev/zd736 /dev/device -> /dev/zd736 /dev/file: -> /dev/zd736 /dev//dev/zd736 -> /dev/zd736 (And, because multiple zvols had open() syscall errors, multiple zvols attempted to claim several of those symlink names, resulting in numerous udev errors and timeouts and general chaos.) This commit rectifies all this silliness by simply printing error messages to stderr, as Dennis Ritchie originally intended. Signed-off-by: Justin Gottula <justin@jgottula.com>
Currently, there are several places in zvol_id where the program logic returns particular errno values, or even particular ioctl return values, as the program exit status, rather than a straightforward system of explicit zero on success and explicit nonzero value(s) on failure. This is problematic for multiple reasons. One particularly interesting problem that can arise, is that if any of these values happens to have all 8 least significant bits unset (i.e., it is a positive or negative multiple of 256), then although the C program sees a nonzero int value (presumed to be a failure exit status), the actual exit status as seen by the system is only the bottom 8 bits of that integer: zero. This can happen in practice, and I have encountered it myself. In a particularly weird situation, the zvol_open code in the zfs kernel module was behaving in such a manner that it caused the open() syscall to fail and for errno to be set to a kernel-private value (ERESTARTSYS, which happens to be defined as 512). It turns out that 512 is evenly divisible by 256; or, in other words, its least significant 8 bits are all-zero. So even though zvol_id believed it was returning a nonzero (failure) exit status of 512, the system modulo'd that value by 256, resulting in the actual exit status visible by other programs being 0! This actually-zero (non-failure) exit status caused problems: udev believed that the program was operating successfully, when in fact it was attempting to indicate failure via a nonzero exit status integer. Combined with another problem, this led to the creation of nonsense symlinks for zvol dev nodes by udev. Let's get rid of all this problematic logic, and simply return EXIT_SUCCESS (0) is everything went fine, and EXIT_FAILURE (1) if anything went wrong. Additionally, let's clarify some of the variable names (error is similar to errno, etc) and clean up the overall program flow a bit. Signed-off-by: Justin Gottula <justin@jgottula.com>
a20f220
to
ce45020
Compare
Force-pushed just now:
|
NOTE: this version of the patch is based on the current git master as of 20210630, and omits some other fixes to the exact same line of the udev rules file from PR openzfs#12302 that would otherwise conflict. This is a potentially arguable change, because it removes some compatibility cruft that certain systems or people may have come to rely on (either a very long time ago, or unwisely in recent times). On the other hand, it's been literally over a decade since OpenZFS switched to the strategy of using opaque numbered /dev/zd* device nodes, with the canonical zvol access path being a directory tree of symlinks created by udev rules inside /dev/zvol/*. (See openzfs#102.) Even at the time, the /dev/* scheme was labeled as being for "compatibility". This commit removes the second tree of symlinks located directly at /dev/*, under the assumption that anybody with any sense has been using the intended /dev/zvol/* path for a very very long time now. (The more I think about this, the more I anticipate that some large fraction of people will have been blissfully unaware that the intention has been for them to use the /dev/zvol/* tree all along, and they will have come to rely upon the /dev/* tree simply because it's been there this whole time despite being a compat thing.) Signed-off-by: Justin Gottula <justin@jgottula.com>
NOTE: this version of the patch is based on top of the commits added in PR openzfs#12302, which apply fixes to the exact same line of the udev rules file. That PR is probably best understood as a prerequisite to this one; but it's not merged currently, so that's why this exists. This is a potentially arguable change, because it removes some compatibility cruft that certain systems or people may have come to rely on (either a very long time ago, or unwisely in recent times). On the other hand, it's been literally over a decade since OpenZFS switched to the strategy of using opaque numbered /dev/zd* device nodes, with the canonical zvol access path being a directory tree of symlinks created by udev rules inside /dev/zvol/*. (See openzfs#102.) Even at the time, the /dev/* scheme was labeled as being for "compatibility". This commit removes the second tree of symlinks located directly at /dev/*, under the assumption that anybody with any sense has been using the intended /dev/zvol/* path for a very very long time now. (The more I think about this, the more I anticipate that some large fraction of people will have been blissfully unaware that the intention has been for them to use the /dev/zvol/* tree all along, and they will have come to rely upon the /dev/* tree simply because it's been there this whole time despite being a compat thing.) Signed-off-by: Justin Gottula <justin@jgottula.com>
Great write-up! The commit messages are definitely helpful and your analysis in #12301 was very informative. Thanks for doing this work! Regarding the removal of compatibility "/dev/pool/" links, I think it's a good idea. It might break some user configurations but we need to stop supporting deprecated formats at some point. |
The $tempnode substitution is so old that it's not even mentioned in the man page anymore. It is still technically supported by udev, but with plenty of "deprecated" comments surrounding it. The preferred modern equivalent of $tempnode is $devnode (or alternatively, %N). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes #12302
Assignment syntax (=) can be used for the PROGRAM key. But the PROGRAM key is really a match key, not an assign key. The internal logic used by udev to decide whether a PROGRAM key "matched" or not (which determines whether the remainder of the rule is evaluated) depends on whether the operator was OP_MATCH (==) or OP_NOMATCH (!=). [1] The man page claims that '"=", ":=", and "+=" have the same effect as "=="' for PROGRAM keys. And, after a brief perusal, the udev source code does seem to confirm that operators other than OP_MATCH (==) or OP_NOMATCH (!=) are implicitly converted to OP_MATCH (==). [2] But it's not entirely clear that this is definitely the case: anecdotal testing seems to indicate that when OP_ASSIGN (=) is used, the program's exit status is disregarded and the remainder of the rule is processed regardless of whether it was, in fact, a successful exit. The bottom line here is that, if zvol_id hits some snag and returns a nonzero exit status, then we almost certainly do NOT want to continue on with the rule and use whatever the stdout contents may have been to mindlessly create /dev/zvol/* symlinks. Therefore, let's be extra-sure and use the match (==) operator explicitly, to eliminate any possibility that udev might do the wrong thing, and ensure that a nonzero exit status will definitely short-circuit the rest of the rule, bypassing the SYMLINK+= assignments. [1] udev, file src/udev/udev-rules.c, func udev_rule_apply_token_to_event, switch case TK_M_PROGRAM if r != 0 (nonzero exit status): return token->op == OP_NOMATCH; switch case TK_M_PROGRAM if r == 0 (zero exit status): return token->op == OP_MATCH; func retval 0 => key is considered to have matched func retval 1 => key is considered to have NOT matched [2] udev, file src/udev/udev-rules.c, func parse_token, at func start: bool is_match = IN_SET(op, OP_MATCH, OP_NOMATCH); in else-if case streq(key, "PROGRAM"): if (!is_match) op = OP_MATCH; Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes #12302
The zvol_id program is invoked by udev, via a PROGRAM key in the 60-zvol.rules.in rule file, to determine the "pretty" /dev/zvol/* symlink paths paths that should be generated for each opaquely named /dev/zd* dev node. The udev rule uses the PROGRAM key, followed by a SYMLINK+= assignment containing the %c substitution, to collect the program's stdout and then "paste" it directly into the name of the symlink(s) to be created. Unfortunately, as currently written, zvol_id outputs both its intended output (a single string representing the symlink path that should be created to refer to the name of the dataset whose /dev/zd* path is given) AND its error messages (if any) to stdout. When processing PROGRAM keys (and others, such as IMPORT{program}), udev uses only the data written to stdout for functional purposes. Any data written to stderr is used solely for the purposes of logging (if udev's log_level is set to debug). The unintended consequence of this is as follows: if zvol_id encounters an error condition; and then udev fails to halt processing of the current rule (either because zvol_id didn't return a nonzero exit status, or because the PROGRAM key in the rule wasn't written properly to result in a "non-match" condition that would stop the current rule on a nonzero exit); then udev will create a space-delimited list of symlink names derived directly from the words of the error message string! I've observed this exact behavior on my own system, in a situation where the open() syscall on /dev/zd* dev nodes was failing sporadically (for reasons that aren't especially relevant here). Because the open() call failed, zvol_id printed "Unable to open device file: /dev/zd736\n" to stdout and then exited. The udev rule finished with SYMLINK+="zvol/%c %c". Assuming a volume name like pool/foo/bar, this would ordinarily expand to SYMLINK+="zvol/pool/foo/bar pool/foo/bar" and would cause symlinks to be created like this: /dev/zvol/pool/foo/bar -> /dev/zd736 /dev/pool/foo/bar -> /dev/zd736 But because of the combination of error messages being printed to stdout, and the udev syntax freely accepting a space-delimited sequence of names in this context, the error message string "Unable to open device file: /dev/zd736\n" in reality expanded to SYMLINK+="zvol/Unable to open device file: /dev/zd736" which caused the following symlinks to actually be created: /dev/zvol/Unable -> /dev/zd736 /dev/to -> /dev/zd736 /dev/open -> /dev/zd736 /dev/device -> /dev/zd736 /dev/file: -> /dev/zd736 /dev//dev/zd736 -> /dev/zd736 (And, because multiple zvols had open() syscall errors, multiple zvols attempted to claim several of those symlink names, resulting in numerous udev errors and timeouts and general chaos.) This commit rectifies all this silliness by simply printing error messages to stderr, as Dennis Ritchie originally intended. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes #12302
Currently, there are several places in zvol_id where the program logic returns particular errno values, or even particular ioctl return values, as the program exit status, rather than a straightforward system of explicit zero on success and explicit nonzero value(s) on failure. This is problematic for multiple reasons. One particularly interesting problem that can arise, is that if any of these values happens to have all 8 least significant bits unset (i.e., it is a positive or negative multiple of 256), then although the C program sees a nonzero int value (presumed to be a failure exit status), the actual exit status as seen by the system is only the bottom 8 bits of that integer: zero. This can happen in practice, and I have encountered it myself. In a particularly weird situation, the zvol_open code in the zfs kernel module was behaving in such a manner that it caused the open() syscall to fail and for errno to be set to a kernel-private value (ERESTARTSYS, which happens to be defined as 512). It turns out that 512 is evenly divisible by 256; or, in other words, its least significant 8 bits are all-zero. So even though zvol_id believed it was returning a nonzero (failure) exit status of 512, the system modulo'd that value by 256, resulting in the actual exit status visible by other programs being 0! This actually-zero (non-failure) exit status caused problems: udev believed that the program was operating successfully, when in fact it was attempting to indicate failure via a nonzero exit status integer. Combined with another problem, this led to the creation of nonsense symlinks for zvol dev nodes by udev. Let's get rid of all this problematic logic, and simply return EXIT_SUCCESS (0) is everything went fine, and EXIT_FAILURE (1) if anything went wrong. Additionally, let's clarify some of the variable names (error is similar to errno, etc) and clean up the overall program flow a bit. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes #12302
This file is old as dirt. It's entirely possible that commas were optional in udev back at that time. But they're definitely supposed to be there nowadays. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
The $tempnode substitution is so old that it's not even mentioned in the man page anymore. It is still technically supported by udev, but with plenty of "deprecated" comments surrounding it. The preferred modern equivalent of $tempnode is $devnode (or alternatively, %N). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
Assignment syntax (=) can be used for the PROGRAM key. But the PROGRAM key is really a match key, not an assign key. The internal logic used by udev to decide whether a PROGRAM key "matched" or not (which determines whether the remainder of the rule is evaluated) depends on whether the operator was OP_MATCH (==) or OP_NOMATCH (!=). [1] The man page claims that '"=", ":=", and "+=" have the same effect as "=="' for PROGRAM keys. And, after a brief perusal, the udev source code does seem to confirm that operators other than OP_MATCH (==) or OP_NOMATCH (!=) are implicitly converted to OP_MATCH (==). [2] But it's not entirely clear that this is definitely the case: anecdotal testing seems to indicate that when OP_ASSIGN (=) is used, the program's exit status is disregarded and the remainder of the rule is processed regardless of whether it was, in fact, a successful exit. The bottom line here is that, if zvol_id hits some snag and returns a nonzero exit status, then we almost certainly do NOT want to continue on with the rule and use whatever the stdout contents may have been to mindlessly create /dev/zvol/* symlinks. Therefore, let's be extra-sure and use the match (==) operator explicitly, to eliminate any possibility that udev might do the wrong thing, and ensure that a nonzero exit status will definitely short-circuit the rest of the rule, bypassing the SYMLINK+= assignments. [1] udev, file src/udev/udev-rules.c, func udev_rule_apply_token_to_event, switch case TK_M_PROGRAM if r != 0 (nonzero exit status): return token->op == OP_NOMATCH; switch case TK_M_PROGRAM if r == 0 (zero exit status): return token->op == OP_MATCH; func retval 0 => key is considered to have matched func retval 1 => key is considered to have NOT matched [2] udev, file src/udev/udev-rules.c, func parse_token, at func start: bool is_match = IN_SET(op, OP_MATCH, OP_NOMATCH); in else-if case streq(key, "PROGRAM"): if (!is_match) op = OP_MATCH; Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
The zvol_id program is invoked by udev, via a PROGRAM key in the 60-zvol.rules.in rule file, to determine the "pretty" /dev/zvol/* symlink paths paths that should be generated for each opaquely named /dev/zd* dev node. The udev rule uses the PROGRAM key, followed by a SYMLINK+= assignment containing the %c substitution, to collect the program's stdout and then "paste" it directly into the name of the symlink(s) to be created. Unfortunately, as currently written, zvol_id outputs both its intended output (a single string representing the symlink path that should be created to refer to the name of the dataset whose /dev/zd* path is given) AND its error messages (if any) to stdout. When processing PROGRAM keys (and others, such as IMPORT{program}), udev uses only the data written to stdout for functional purposes. Any data written to stderr is used solely for the purposes of logging (if udev's log_level is set to debug). The unintended consequence of this is as follows: if zvol_id encounters an error condition; and then udev fails to halt processing of the current rule (either because zvol_id didn't return a nonzero exit status, or because the PROGRAM key in the rule wasn't written properly to result in a "non-match" condition that would stop the current rule on a nonzero exit); then udev will create a space-delimited list of symlink names derived directly from the words of the error message string! I've observed this exact behavior on my own system, in a situation where the open() syscall on /dev/zd* dev nodes was failing sporadically (for reasons that aren't especially relevant here). Because the open() call failed, zvol_id printed "Unable to open device file: /dev/zd736\n" to stdout and then exited. The udev rule finished with SYMLINK+="zvol/%c %c". Assuming a volume name like pool/foo/bar, this would ordinarily expand to SYMLINK+="zvol/pool/foo/bar pool/foo/bar" and would cause symlinks to be created like this: /dev/zvol/pool/foo/bar -> /dev/zd736 /dev/pool/foo/bar -> /dev/zd736 But because of the combination of error messages being printed to stdout, and the udev syntax freely accepting a space-delimited sequence of names in this context, the error message string "Unable to open device file: /dev/zd736\n" in reality expanded to SYMLINK+="zvol/Unable to open device file: /dev/zd736" which caused the following symlinks to actually be created: /dev/zvol/Unable -> /dev/zd736 /dev/to -> /dev/zd736 /dev/open -> /dev/zd736 /dev/device -> /dev/zd736 /dev/file: -> /dev/zd736 /dev//dev/zd736 -> /dev/zd736 (And, because multiple zvols had open() syscall errors, multiple zvols attempted to claim several of those symlink names, resulting in numerous udev errors and timeouts and general chaos.) This commit rectifies all this silliness by simply printing error messages to stderr, as Dennis Ritchie originally intended. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
Currently, there are several places in zvol_id where the program logic returns particular errno values, or even particular ioctl return values, as the program exit status, rather than a straightforward system of explicit zero on success and explicit nonzero value(s) on failure. This is problematic for multiple reasons. One particularly interesting problem that can arise, is that if any of these values happens to have all 8 least significant bits unset (i.e., it is a positive or negative multiple of 256), then although the C program sees a nonzero int value (presumed to be a failure exit status), the actual exit status as seen by the system is only the bottom 8 bits of that integer: zero. This can happen in practice, and I have encountered it myself. In a particularly weird situation, the zvol_open code in the zfs kernel module was behaving in such a manner that it caused the open() syscall to fail and for errno to be set to a kernel-private value (ERESTARTSYS, which happens to be defined as 512). It turns out that 512 is evenly divisible by 256; or, in other words, its least significant 8 bits are all-zero. So even though zvol_id believed it was returning a nonzero (failure) exit status of 512, the system modulo'd that value by 256, resulting in the actual exit status visible by other programs being 0! This actually-zero (non-failure) exit status caused problems: udev believed that the program was operating successfully, when in fact it was attempting to indicate failure via a nonzero exit status integer. Combined with another problem, this led to the creation of nonsense symlinks for zvol dev nodes by udev. Let's get rid of all this problematic logic, and simply return EXIT_SUCCESS (0) is everything went fine, and EXIT_FAILURE (1) if anything went wrong. Additionally, let's clarify some of the variable names (error is similar to errno, etc) and clean up the overall program flow a bit. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
This file is old as dirt. It's entirely possible that commas were optional in udev back at that time. But they're definitely supposed to be there nowadays. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
The $tempnode substitution is so old that it's not even mentioned in the man page anymore. It is still technically supported by udev, but with plenty of "deprecated" comments surrounding it. The preferred modern equivalent of $tempnode is $devnode (or alternatively, %N). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
Assignment syntax (=) can be used for the PROGRAM key. But the PROGRAM key is really a match key, not an assign key. The internal logic used by udev to decide whether a PROGRAM key "matched" or not (which determines whether the remainder of the rule is evaluated) depends on whether the operator was OP_MATCH (==) or OP_NOMATCH (!=). [1] The man page claims that '"=", ":=", and "+=" have the same effect as "=="' for PROGRAM keys. And, after a brief perusal, the udev source code does seem to confirm that operators other than OP_MATCH (==) or OP_NOMATCH (!=) are implicitly converted to OP_MATCH (==). [2] But it's not entirely clear that this is definitely the case: anecdotal testing seems to indicate that when OP_ASSIGN (=) is used, the program's exit status is disregarded and the remainder of the rule is processed regardless of whether it was, in fact, a successful exit. The bottom line here is that, if zvol_id hits some snag and returns a nonzero exit status, then we almost certainly do NOT want to continue on with the rule and use whatever the stdout contents may have been to mindlessly create /dev/zvol/* symlinks. Therefore, let's be extra-sure and use the match (==) operator explicitly, to eliminate any possibility that udev might do the wrong thing, and ensure that a nonzero exit status will definitely short-circuit the rest of the rule, bypassing the SYMLINK+= assignments. [1] udev, file src/udev/udev-rules.c, func udev_rule_apply_token_to_event, switch case TK_M_PROGRAM if r != 0 (nonzero exit status): return token->op == OP_NOMATCH; switch case TK_M_PROGRAM if r == 0 (zero exit status): return token->op == OP_MATCH; func retval 0 => key is considered to have matched func retval 1 => key is considered to have NOT matched [2] udev, file src/udev/udev-rules.c, func parse_token, at func start: bool is_match = IN_SET(op, OP_MATCH, OP_NOMATCH); in else-if case streq(key, "PROGRAM"): if (!is_match) op = OP_MATCH; Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
The zvol_id program is invoked by udev, via a PROGRAM key in the 60-zvol.rules.in rule file, to determine the "pretty" /dev/zvol/* symlink paths paths that should be generated for each opaquely named /dev/zd* dev node. The udev rule uses the PROGRAM key, followed by a SYMLINK+= assignment containing the %c substitution, to collect the program's stdout and then "paste" it directly into the name of the symlink(s) to be created. Unfortunately, as currently written, zvol_id outputs both its intended output (a single string representing the symlink path that should be created to refer to the name of the dataset whose /dev/zd* path is given) AND its error messages (if any) to stdout. When processing PROGRAM keys (and others, such as IMPORT{program}), udev uses only the data written to stdout for functional purposes. Any data written to stderr is used solely for the purposes of logging (if udev's log_level is set to debug). The unintended consequence of this is as follows: if zvol_id encounters an error condition; and then udev fails to halt processing of the current rule (either because zvol_id didn't return a nonzero exit status, or because the PROGRAM key in the rule wasn't written properly to result in a "non-match" condition that would stop the current rule on a nonzero exit); then udev will create a space-delimited list of symlink names derived directly from the words of the error message string! I've observed this exact behavior on my own system, in a situation where the open() syscall on /dev/zd* dev nodes was failing sporadically (for reasons that aren't especially relevant here). Because the open() call failed, zvol_id printed "Unable to open device file: /dev/zd736\n" to stdout and then exited. The udev rule finished with SYMLINK+="zvol/%c %c". Assuming a volume name like pool/foo/bar, this would ordinarily expand to SYMLINK+="zvol/pool/foo/bar pool/foo/bar" and would cause symlinks to be created like this: /dev/zvol/pool/foo/bar -> /dev/zd736 /dev/pool/foo/bar -> /dev/zd736 But because of the combination of error messages being printed to stdout, and the udev syntax freely accepting a space-delimited sequence of names in this context, the error message string "Unable to open device file: /dev/zd736\n" in reality expanded to SYMLINK+="zvol/Unable to open device file: /dev/zd736" which caused the following symlinks to actually be created: /dev/zvol/Unable -> /dev/zd736 /dev/to -> /dev/zd736 /dev/open -> /dev/zd736 /dev/device -> /dev/zd736 /dev/file: -> /dev/zd736 /dev//dev/zd736 -> /dev/zd736 (And, because multiple zvols had open() syscall errors, multiple zvols attempted to claim several of those symlink names, resulting in numerous udev errors and timeouts and general chaos.) This commit rectifies all this silliness by simply printing error messages to stderr, as Dennis Ritchie originally intended. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
Currently, there are several places in zvol_id where the program logic returns particular errno values, or even particular ioctl return values, as the program exit status, rather than a straightforward system of explicit zero on success and explicit nonzero value(s) on failure. This is problematic for multiple reasons. One particularly interesting problem that can arise, is that if any of these values happens to have all 8 least significant bits unset (i.e., it is a positive or negative multiple of 256), then although the C program sees a nonzero int value (presumed to be a failure exit status), the actual exit status as seen by the system is only the bottom 8 bits of that integer: zero. This can happen in practice, and I have encountered it myself. In a particularly weird situation, the zvol_open code in the zfs kernel module was behaving in such a manner that it caused the open() syscall to fail and for errno to be set to a kernel-private value (ERESTARTSYS, which happens to be defined as 512). It turns out that 512 is evenly divisible by 256; or, in other words, its least significant 8 bits are all-zero. So even though zvol_id believed it was returning a nonzero (failure) exit status of 512, the system modulo'd that value by 256, resulting in the actual exit status visible by other programs being 0! This actually-zero (non-failure) exit status caused problems: udev believed that the program was operating successfully, when in fact it was attempting to indicate failure via a nonzero exit status integer. Combined with another problem, this led to the creation of nonsense symlinks for zvol dev nodes by udev. Let's get rid of all this problematic logic, and simply return EXIT_SUCCESS (0) is everything went fine, and EXIT_FAILURE (1) if anything went wrong. Additionally, let's clarify some of the variable names (error is similar to errno, etc) and clean up the overall program flow a bit. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
This file is old as dirt. It's entirely possible that commas were optional in udev back at that time. But they're definitely supposed to be there nowadays. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
The $tempnode substitution is so old that it's not even mentioned in the man page anymore. It is still technically supported by udev, but with plenty of "deprecated" comments surrounding it. The preferred modern equivalent of $tempnode is $devnode (or alternatively, %N). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
Assignment syntax (=) can be used for the PROGRAM key. But the PROGRAM key is really a match key, not an assign key. The internal logic used by udev to decide whether a PROGRAM key "matched" or not (which determines whether the remainder of the rule is evaluated) depends on whether the operator was OP_MATCH (==) or OP_NOMATCH (!=). [1] The man page claims that '"=", ":=", and "+=" have the same effect as "=="' for PROGRAM keys. And, after a brief perusal, the udev source code does seem to confirm that operators other than OP_MATCH (==) or OP_NOMATCH (!=) are implicitly converted to OP_MATCH (==). [2] But it's not entirely clear that this is definitely the case: anecdotal testing seems to indicate that when OP_ASSIGN (=) is used, the program's exit status is disregarded and the remainder of the rule is processed regardless of whether it was, in fact, a successful exit. The bottom line here is that, if zvol_id hits some snag and returns a nonzero exit status, then we almost certainly do NOT want to continue on with the rule and use whatever the stdout contents may have been to mindlessly create /dev/zvol/* symlinks. Therefore, let's be extra-sure and use the match (==) operator explicitly, to eliminate any possibility that udev might do the wrong thing, and ensure that a nonzero exit status will definitely short-circuit the rest of the rule, bypassing the SYMLINK+= assignments. [1] udev, file src/udev/udev-rules.c, func udev_rule_apply_token_to_event, switch case TK_M_PROGRAM if r != 0 (nonzero exit status): return token->op == OP_NOMATCH; switch case TK_M_PROGRAM if r == 0 (zero exit status): return token->op == OP_MATCH; func retval 0 => key is considered to have matched func retval 1 => key is considered to have NOT matched [2] udev, file src/udev/udev-rules.c, func parse_token, at func start: bool is_match = IN_SET(op, OP_MATCH, OP_NOMATCH); in else-if case streq(key, "PROGRAM"): if (!is_match) op = OP_MATCH; Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
The zvol_id program is invoked by udev, via a PROGRAM key in the 60-zvol.rules.in rule file, to determine the "pretty" /dev/zvol/* symlink paths paths that should be generated for each opaquely named /dev/zd* dev node. The udev rule uses the PROGRAM key, followed by a SYMLINK+= assignment containing the %c substitution, to collect the program's stdout and then "paste" it directly into the name of the symlink(s) to be created. Unfortunately, as currently written, zvol_id outputs both its intended output (a single string representing the symlink path that should be created to refer to the name of the dataset whose /dev/zd* path is given) AND its error messages (if any) to stdout. When processing PROGRAM keys (and others, such as IMPORT{program}), udev uses only the data written to stdout for functional purposes. Any data written to stderr is used solely for the purposes of logging (if udev's log_level is set to debug). The unintended consequence of this is as follows: if zvol_id encounters an error condition; and then udev fails to halt processing of the current rule (either because zvol_id didn't return a nonzero exit status, or because the PROGRAM key in the rule wasn't written properly to result in a "non-match" condition that would stop the current rule on a nonzero exit); then udev will create a space-delimited list of symlink names derived directly from the words of the error message string! I've observed this exact behavior on my own system, in a situation where the open() syscall on /dev/zd* dev nodes was failing sporadically (for reasons that aren't especially relevant here). Because the open() call failed, zvol_id printed "Unable to open device file: /dev/zd736\n" to stdout and then exited. The udev rule finished with SYMLINK+="zvol/%c %c". Assuming a volume name like pool/foo/bar, this would ordinarily expand to SYMLINK+="zvol/pool/foo/bar pool/foo/bar" and would cause symlinks to be created like this: /dev/zvol/pool/foo/bar -> /dev/zd736 /dev/pool/foo/bar -> /dev/zd736 But because of the combination of error messages being printed to stdout, and the udev syntax freely accepting a space-delimited sequence of names in this context, the error message string "Unable to open device file: /dev/zd736\n" in reality expanded to SYMLINK+="zvol/Unable to open device file: /dev/zd736" which caused the following symlinks to actually be created: /dev/zvol/Unable -> /dev/zd736 /dev/to -> /dev/zd736 /dev/open -> /dev/zd736 /dev/device -> /dev/zd736 /dev/file: -> /dev/zd736 /dev//dev/zd736 -> /dev/zd736 (And, because multiple zvols had open() syscall errors, multiple zvols attempted to claim several of those symlink names, resulting in numerous udev errors and timeouts and general chaos.) This commit rectifies all this silliness by simply printing error messages to stderr, as Dennis Ritchie originally intended. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
Currently, there are several places in zvol_id where the program logic returns particular errno values, or even particular ioctl return values, as the program exit status, rather than a straightforward system of explicit zero on success and explicit nonzero value(s) on failure. This is problematic for multiple reasons. One particularly interesting problem that can arise, is that if any of these values happens to have all 8 least significant bits unset (i.e., it is a positive or negative multiple of 256), then although the C program sees a nonzero int value (presumed to be a failure exit status), the actual exit status as seen by the system is only the bottom 8 bits of that integer: zero. This can happen in practice, and I have encountered it myself. In a particularly weird situation, the zvol_open code in the zfs kernel module was behaving in such a manner that it caused the open() syscall to fail and for errno to be set to a kernel-private value (ERESTARTSYS, which happens to be defined as 512). It turns out that 512 is evenly divisible by 256; or, in other words, its least significant 8 bits are all-zero. So even though zvol_id believed it was returning a nonzero (failure) exit status of 512, the system modulo'd that value by 256, resulting in the actual exit status visible by other programs being 0! This actually-zero (non-failure) exit status caused problems: udev believed that the program was operating successfully, when in fact it was attempting to indicate failure via a nonzero exit status integer. Combined with another problem, this led to the creation of nonsense symlinks for zvol dev nodes by udev. Let's get rid of all this problematic logic, and simply return EXIT_SUCCESS (0) is everything went fine, and EXIT_FAILURE (1) if anything went wrong. Additionally, let's clarify some of the variable names (error is similar to errno, etc) and clean up the overall program flow a bit. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
This file is old as dirt. It's entirely possible that commas were optional in udev back at that time. But they're definitely supposed to be there nowadays. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
The $tempnode substitution is so old that it's not even mentioned in the man page anymore. It is still technically supported by udev, but with plenty of "deprecated" comments surrounding it. The preferred modern equivalent of $tempnode is $devnode (or alternatively, %N). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
Assignment syntax (=) can be used for the PROGRAM key. But the PROGRAM key is really a match key, not an assign key. The internal logic used by udev to decide whether a PROGRAM key "matched" or not (which determines whether the remainder of the rule is evaluated) depends on whether the operator was OP_MATCH (==) or OP_NOMATCH (!=). [1] The man page claims that '"=", ":=", and "+=" have the same effect as "=="' for PROGRAM keys. And, after a brief perusal, the udev source code does seem to confirm that operators other than OP_MATCH (==) or OP_NOMATCH (!=) are implicitly converted to OP_MATCH (==). [2] But it's not entirely clear that this is definitely the case: anecdotal testing seems to indicate that when OP_ASSIGN (=) is used, the program's exit status is disregarded and the remainder of the rule is processed regardless of whether it was, in fact, a successful exit. The bottom line here is that, if zvol_id hits some snag and returns a nonzero exit status, then we almost certainly do NOT want to continue on with the rule and use whatever the stdout contents may have been to mindlessly create /dev/zvol/* symlinks. Therefore, let's be extra-sure and use the match (==) operator explicitly, to eliminate any possibility that udev might do the wrong thing, and ensure that a nonzero exit status will definitely short-circuit the rest of the rule, bypassing the SYMLINK+= assignments. [1] udev, file src/udev/udev-rules.c, func udev_rule_apply_token_to_event, switch case TK_M_PROGRAM if r != 0 (nonzero exit status): return token->op == OP_NOMATCH; switch case TK_M_PROGRAM if r == 0 (zero exit status): return token->op == OP_MATCH; func retval 0 => key is considered to have matched func retval 1 => key is considered to have NOT matched [2] udev, file src/udev/udev-rules.c, func parse_token, at func start: bool is_match = IN_SET(op, OP_MATCH, OP_NOMATCH); in else-if case streq(key, "PROGRAM"): if (!is_match) op = OP_MATCH; Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
The zvol_id program is invoked by udev, via a PROGRAM key in the 60-zvol.rules.in rule file, to determine the "pretty" /dev/zvol/* symlink paths paths that should be generated for each opaquely named /dev/zd* dev node. The udev rule uses the PROGRAM key, followed by a SYMLINK+= assignment containing the %c substitution, to collect the program's stdout and then "paste" it directly into the name of the symlink(s) to be created. Unfortunately, as currently written, zvol_id outputs both its intended output (a single string representing the symlink path that should be created to refer to the name of the dataset whose /dev/zd* path is given) AND its error messages (if any) to stdout. When processing PROGRAM keys (and others, such as IMPORT{program}), udev uses only the data written to stdout for functional purposes. Any data written to stderr is used solely for the purposes of logging (if udev's log_level is set to debug). The unintended consequence of this is as follows: if zvol_id encounters an error condition; and then udev fails to halt processing of the current rule (either because zvol_id didn't return a nonzero exit status, or because the PROGRAM key in the rule wasn't written properly to result in a "non-match" condition that would stop the current rule on a nonzero exit); then udev will create a space-delimited list of symlink names derived directly from the words of the error message string! I've observed this exact behavior on my own system, in a situation where the open() syscall on /dev/zd* dev nodes was failing sporadically (for reasons that aren't especially relevant here). Because the open() call failed, zvol_id printed "Unable to open device file: /dev/zd736\n" to stdout and then exited. The udev rule finished with SYMLINK+="zvol/%c %c". Assuming a volume name like pool/foo/bar, this would ordinarily expand to SYMLINK+="zvol/pool/foo/bar pool/foo/bar" and would cause symlinks to be created like this: /dev/zvol/pool/foo/bar -> /dev/zd736 /dev/pool/foo/bar -> /dev/zd736 But because of the combination of error messages being printed to stdout, and the udev syntax freely accepting a space-delimited sequence of names in this context, the error message string "Unable to open device file: /dev/zd736\n" in reality expanded to SYMLINK+="zvol/Unable to open device file: /dev/zd736" which caused the following symlinks to actually be created: /dev/zvol/Unable -> /dev/zd736 /dev/to -> /dev/zd736 /dev/open -> /dev/zd736 /dev/device -> /dev/zd736 /dev/file: -> /dev/zd736 /dev//dev/zd736 -> /dev/zd736 (And, because multiple zvols had open() syscall errors, multiple zvols attempted to claim several of those symlink names, resulting in numerous udev errors and timeouts and general chaos.) This commit rectifies all this silliness by simply printing error messages to stderr, as Dennis Ritchie originally intended. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
Currently, there are several places in zvol_id where the program logic returns particular errno values, or even particular ioctl return values, as the program exit status, rather than a straightforward system of explicit zero on success and explicit nonzero value(s) on failure. This is problematic for multiple reasons. One particularly interesting problem that can arise, is that if any of these values happens to have all 8 least significant bits unset (i.e., it is a positive or negative multiple of 256), then although the C program sees a nonzero int value (presumed to be a failure exit status), the actual exit status as seen by the system is only the bottom 8 bits of that integer: zero. This can happen in practice, and I have encountered it myself. In a particularly weird situation, the zvol_open code in the zfs kernel module was behaving in such a manner that it caused the open() syscall to fail and for errno to be set to a kernel-private value (ERESTARTSYS, which happens to be defined as 512). It turns out that 512 is evenly divisible by 256; or, in other words, its least significant 8 bits are all-zero. So even though zvol_id believed it was returning a nonzero (failure) exit status of 512, the system modulo'd that value by 256, resulting in the actual exit status visible by other programs being 0! This actually-zero (non-failure) exit status caused problems: udev believed that the program was operating successfully, when in fact it was attempting to indicate failure via a nonzero exit status integer. Combined with another problem, this led to the creation of nonsense symlinks for zvol dev nodes by udev. Let's get rid of all this problematic logic, and simply return EXIT_SUCCESS (0) is everything went fine, and EXIT_FAILURE (1) if anything went wrong. Additionally, let's clarify some of the variable names (error is similar to errno, etc) and clean up the overall program flow a bit. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
This file is old as dirt. It's entirely possible that commas were optional in udev back at that time. But they're definitely supposed to be there nowadays. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes #12302
The $tempnode substitution is so old that it's not even mentioned in the man page anymore. It is still technically supported by udev, but with plenty of "deprecated" comments surrounding it. The preferred modern equivalent of $tempnode is $devnode (or alternatively, %N). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes #12302
Assignment syntax (=) can be used for the PROGRAM key. But the PROGRAM key is really a match key, not an assign key. The internal logic used by udev to decide whether a PROGRAM key "matched" or not (which determines whether the remainder of the rule is evaluated) depends on whether the operator was OP_MATCH (==) or OP_NOMATCH (!=). [1] The man page claims that '"=", ":=", and "+=" have the same effect as "=="' for PROGRAM keys. And, after a brief perusal, the udev source code does seem to confirm that operators other than OP_MATCH (==) or OP_NOMATCH (!=) are implicitly converted to OP_MATCH (==). [2] But it's not entirely clear that this is definitely the case: anecdotal testing seems to indicate that when OP_ASSIGN (=) is used, the program's exit status is disregarded and the remainder of the rule is processed regardless of whether it was, in fact, a successful exit. The bottom line here is that, if zvol_id hits some snag and returns a nonzero exit status, then we almost certainly do NOT want to continue on with the rule and use whatever the stdout contents may have been to mindlessly create /dev/zvol/* symlinks. Therefore, let's be extra-sure and use the match (==) operator explicitly, to eliminate any possibility that udev might do the wrong thing, and ensure that a nonzero exit status will definitely short-circuit the rest of the rule, bypassing the SYMLINK+= assignments. [1] udev, file src/udev/udev-rules.c, func udev_rule_apply_token_to_event, switch case TK_M_PROGRAM if r != 0 (nonzero exit status): return token->op == OP_NOMATCH; switch case TK_M_PROGRAM if r == 0 (zero exit status): return token->op == OP_MATCH; func retval 0 => key is considered to have matched func retval 1 => key is considered to have NOT matched [2] udev, file src/udev/udev-rules.c, func parse_token, at func start: bool is_match = IN_SET(op, OP_MATCH, OP_NOMATCH); in else-if case streq(key, "PROGRAM"): if (!is_match) op = OP_MATCH; Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes #12302
The zvol_id program is invoked by udev, via a PROGRAM key in the 60-zvol.rules.in rule file, to determine the "pretty" /dev/zvol/* symlink paths paths that should be generated for each opaquely named /dev/zd* dev node. The udev rule uses the PROGRAM key, followed by a SYMLINK+= assignment containing the %c substitution, to collect the program's stdout and then "paste" it directly into the name of the symlink(s) to be created. Unfortunately, as currently written, zvol_id outputs both its intended output (a single string representing the symlink path that should be created to refer to the name of the dataset whose /dev/zd* path is given) AND its error messages (if any) to stdout. When processing PROGRAM keys (and others, such as IMPORT{program}), udev uses only the data written to stdout for functional purposes. Any data written to stderr is used solely for the purposes of logging (if udev's log_level is set to debug). The unintended consequence of this is as follows: if zvol_id encounters an error condition; and then udev fails to halt processing of the current rule (either because zvol_id didn't return a nonzero exit status, or because the PROGRAM key in the rule wasn't written properly to result in a "non-match" condition that would stop the current rule on a nonzero exit); then udev will create a space-delimited list of symlink names derived directly from the words of the error message string! I've observed this exact behavior on my own system, in a situation where the open() syscall on /dev/zd* dev nodes was failing sporadically (for reasons that aren't especially relevant here). Because the open() call failed, zvol_id printed "Unable to open device file: /dev/zd736\n" to stdout and then exited. The udev rule finished with SYMLINK+="zvol/%c %c". Assuming a volume name like pool/foo/bar, this would ordinarily expand to SYMLINK+="zvol/pool/foo/bar pool/foo/bar" and would cause symlinks to be created like this: /dev/zvol/pool/foo/bar -> /dev/zd736 /dev/pool/foo/bar -> /dev/zd736 But because of the combination of error messages being printed to stdout, and the udev syntax freely accepting a space-delimited sequence of names in this context, the error message string "Unable to open device file: /dev/zd736\n" in reality expanded to SYMLINK+="zvol/Unable to open device file: /dev/zd736" which caused the following symlinks to actually be created: /dev/zvol/Unable -> /dev/zd736 /dev/to -> /dev/zd736 /dev/open -> /dev/zd736 /dev/device -> /dev/zd736 /dev/file: -> /dev/zd736 /dev//dev/zd736 -> /dev/zd736 (And, because multiple zvols had open() syscall errors, multiple zvols attempted to claim several of those symlink names, resulting in numerous udev errors and timeouts and general chaos.) This commit rectifies all this silliness by simply printing error messages to stderr, as Dennis Ritchie originally intended. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes #12302
Currently, there are several places in zvol_id where the program logic returns particular errno values, or even particular ioctl return values, as the program exit status, rather than a straightforward system of explicit zero on success and explicit nonzero value(s) on failure. This is problematic for multiple reasons. One particularly interesting problem that can arise, is that if any of these values happens to have all 8 least significant bits unset (i.e., it is a positive or negative multiple of 256), then although the C program sees a nonzero int value (presumed to be a failure exit status), the actual exit status as seen by the system is only the bottom 8 bits of that integer: zero. This can happen in practice, and I have encountered it myself. In a particularly weird situation, the zvol_open code in the zfs kernel module was behaving in such a manner that it caused the open() syscall to fail and for errno to be set to a kernel-private value (ERESTARTSYS, which happens to be defined as 512). It turns out that 512 is evenly divisible by 256; or, in other words, its least significant 8 bits are all-zero. So even though zvol_id believed it was returning a nonzero (failure) exit status of 512, the system modulo'd that value by 256, resulting in the actual exit status visible by other programs being 0! This actually-zero (non-failure) exit status caused problems: udev believed that the program was operating successfully, when in fact it was attempting to indicate failure via a nonzero exit status integer. Combined with another problem, this led to the creation of nonsense symlinks for zvol dev nodes by udev. Let's get rid of all this problematic logic, and simply return EXIT_SUCCESS (0) is everything went fine, and EXIT_FAILURE (1) if anything went wrong. Additionally, let's clarify some of the variable names (error is similar to errno, etc) and clean up the overall program flow a bit. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes #12302
This file is old as dirt. It's entirely possible that commas were optional in udev back at that time. But they're definitely supposed to be there nowadays. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
The $tempnode substitution is so old that it's not even mentioned in the man page anymore. It is still technically supported by udev, but with plenty of "deprecated" comments surrounding it. The preferred modern equivalent of $tempnode is $devnode (or alternatively, %N). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
Assignment syntax (=) can be used for the PROGRAM key. But the PROGRAM key is really a match key, not an assign key. The internal logic used by udev to decide whether a PROGRAM key "matched" or not (which determines whether the remainder of the rule is evaluated) depends on whether the operator was OP_MATCH (==) or OP_NOMATCH (!=). [1] The man page claims that '"=", ":=", and "+=" have the same effect as "=="' for PROGRAM keys. And, after a brief perusal, the udev source code does seem to confirm that operators other than OP_MATCH (==) or OP_NOMATCH (!=) are implicitly converted to OP_MATCH (==). [2] But it's not entirely clear that this is definitely the case: anecdotal testing seems to indicate that when OP_ASSIGN (=) is used, the program's exit status is disregarded and the remainder of the rule is processed regardless of whether it was, in fact, a successful exit. The bottom line here is that, if zvol_id hits some snag and returns a nonzero exit status, then we almost certainly do NOT want to continue on with the rule and use whatever the stdout contents may have been to mindlessly create /dev/zvol/* symlinks. Therefore, let's be extra-sure and use the match (==) operator explicitly, to eliminate any possibility that udev might do the wrong thing, and ensure that a nonzero exit status will definitely short-circuit the rest of the rule, bypassing the SYMLINK+= assignments. [1] udev, file src/udev/udev-rules.c, func udev_rule_apply_token_to_event, switch case TK_M_PROGRAM if r != 0 (nonzero exit status): return token->op == OP_NOMATCH; switch case TK_M_PROGRAM if r == 0 (zero exit status): return token->op == OP_MATCH; func retval 0 => key is considered to have matched func retval 1 => key is considered to have NOT matched [2] udev, file src/udev/udev-rules.c, func parse_token, at func start: bool is_match = IN_SET(op, OP_MATCH, OP_NOMATCH); in else-if case streq(key, "PROGRAM"): if (!is_match) op = OP_MATCH; Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
The zvol_id program is invoked by udev, via a PROGRAM key in the 60-zvol.rules.in rule file, to determine the "pretty" /dev/zvol/* symlink paths paths that should be generated for each opaquely named /dev/zd* dev node. The udev rule uses the PROGRAM key, followed by a SYMLINK+= assignment containing the %c substitution, to collect the program's stdout and then "paste" it directly into the name of the symlink(s) to be created. Unfortunately, as currently written, zvol_id outputs both its intended output (a single string representing the symlink path that should be created to refer to the name of the dataset whose /dev/zd* path is given) AND its error messages (if any) to stdout. When processing PROGRAM keys (and others, such as IMPORT{program}), udev uses only the data written to stdout for functional purposes. Any data written to stderr is used solely for the purposes of logging (if udev's log_level is set to debug). The unintended consequence of this is as follows: if zvol_id encounters an error condition; and then udev fails to halt processing of the current rule (either because zvol_id didn't return a nonzero exit status, or because the PROGRAM key in the rule wasn't written properly to result in a "non-match" condition that would stop the current rule on a nonzero exit); then udev will create a space-delimited list of symlink names derived directly from the words of the error message string! I've observed this exact behavior on my own system, in a situation where the open() syscall on /dev/zd* dev nodes was failing sporadically (for reasons that aren't especially relevant here). Because the open() call failed, zvol_id printed "Unable to open device file: /dev/zd736\n" to stdout and then exited. The udev rule finished with SYMLINK+="zvol/%c %c". Assuming a volume name like pool/foo/bar, this would ordinarily expand to SYMLINK+="zvol/pool/foo/bar pool/foo/bar" and would cause symlinks to be created like this: /dev/zvol/pool/foo/bar -> /dev/zd736 /dev/pool/foo/bar -> /dev/zd736 But because of the combination of error messages being printed to stdout, and the udev syntax freely accepting a space-delimited sequence of names in this context, the error message string "Unable to open device file: /dev/zd736\n" in reality expanded to SYMLINK+="zvol/Unable to open device file: /dev/zd736" which caused the following symlinks to actually be created: /dev/zvol/Unable -> /dev/zd736 /dev/to -> /dev/zd736 /dev/open -> /dev/zd736 /dev/device -> /dev/zd736 /dev/file: -> /dev/zd736 /dev//dev/zd736 -> /dev/zd736 (And, because multiple zvols had open() syscall errors, multiple zvols attempted to claim several of those symlink names, resulting in numerous udev errors and timeouts and general chaos.) This commit rectifies all this silliness by simply printing error messages to stderr, as Dennis Ritchie originally intended. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
Currently, there are several places in zvol_id where the program logic returns particular errno values, or even particular ioctl return values, as the program exit status, rather than a straightforward system of explicit zero on success and explicit nonzero value(s) on failure. This is problematic for multiple reasons. One particularly interesting problem that can arise, is that if any of these values happens to have all 8 least significant bits unset (i.e., it is a positive or negative multiple of 256), then although the C program sees a nonzero int value (presumed to be a failure exit status), the actual exit status as seen by the system is only the bottom 8 bits of that integer: zero. This can happen in practice, and I have encountered it myself. In a particularly weird situation, the zvol_open code in the zfs kernel module was behaving in such a manner that it caused the open() syscall to fail and for errno to be set to a kernel-private value (ERESTARTSYS, which happens to be defined as 512). It turns out that 512 is evenly divisible by 256; or, in other words, its least significant 8 bits are all-zero. So even though zvol_id believed it was returning a nonzero (failure) exit status of 512, the system modulo'd that value by 256, resulting in the actual exit status visible by other programs being 0! This actually-zero (non-failure) exit status caused problems: udev believed that the program was operating successfully, when in fact it was attempting to indicate failure via a nonzero exit status integer. Combined with another problem, this led to the creation of nonsense symlinks for zvol dev nodes by udev. Let's get rid of all this problematic logic, and simply return EXIT_SUCCESS (0) is everything went fine, and EXIT_FAILURE (1) if anything went wrong. Additionally, let's clarify some of the variable names (error is similar to errno, etc) and clean up the overall program flow a bit. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes openzfs#12302
I've come across a number of issues of general "crustiness" concerning the udev rules file for zvol dev nodes, as well as the udev helper binary
zvol_id
. This PR consists of a small stack of patches that address some of that crustiness.Some of these are very much "1 character of code changed, with 1 long novel of commit message explanation accompanying it".
Motivation and Context
As mentioned in #12301, I noticed that when
zvol_id
runs into actual problems (which is not a common thing as far as I know), the way that errors are actually handled is... well... not good. The udev rules file for zvols is ancient and uses deprecated syntax, and thezvol_id
helper binary is also ancient and has a number of interesting quicks itself.The core problem in #12301 isn't something I can solve myself, at least right now. But the ancillary stuff, regarding old crusty udev rules files and poor error handling that tends to result in tons of completely spurious symlinks being created... well, that's something I would probably write some local patches for myself anyway; so why not turn it into a day-long writing exercise and contribute it back. 😝
Description
60-zvol.rules.in
$devnode
rather than$tempnode
PROGRAM
key definitely uses the match (==
) operator so that success/failure of the program is actually checkedzvol_id_main.c
stderr
, only to have its existence be completely disregarded by this program 👻How Has This Been Tested?
log_level=debug
etczvol_id
changes in a matter of minutes, most likely, once I give my system a rebootFortunately both the rules file and the helper program are pretty well isolated in their own corner with udev and friends, and they don't really have any super duper strong interactions with other stuff; at least not in the way that changes to e.g. parts of the kernel module might.
Types of changes
/dev/zvol/Unable
et al 😬Checklist:
Signed-off-by
.Not sure if any tests are super-warranted here. I did do a cursory inspection to see if there were any preexisting tests for e.g.
vdev_id
or something similar like that which I could crib from; but didn't seem to find much. 🤷