Portability

Lawrence Velázquez edited this page Mar 26, 2024 · 28 revisions

Since mkvimball-sh is pretty simple, my secondary goal is to make it portable to the point of absurdity. "Could this theoretically work on UNIX System III?" is a practical concern for approximately no one.

These are some notes on portability issues and idioms relevant to mkvimball-sh specifically. They are meant to keep the source code from becoming 99% comments, not to be a comprehensive reference for excessively portable shell scripting. (For that, explore the linked resources.) Error checking is largely omitted, for brevity. Specifics about affected shells are also largely omitted, for brevity and because I don’t know them.

Language

Assignments

The exit status of var=`cmd` is always zero on QNX 4.25.[1] Mitigate this by validating the variable contents to a satisfactory degree. For example:

var=`cmd` && test "x$var" != x

case

The exit status of case varies between shells if no patterns match or if a matched pattern has no commands.[2][3] To ensure consistent behavior, give every pattern a command list (: if nothing else) and finish with a catch-all * pattern:[3]

case $var in
    pat1) : ;;
    pat2) cmd ;;
    *) : ;;
esac

Command substitution

Bourne shells (other than Schily[4]) do not implement the $(…) syntax for command substitution.[3][5][6] Use `…` instead.

Consider assigning command substitutions to variables directly:

var1=`cmd1` && test "x$var1" != x || exit "$?"
var2="prefix${var1}suffix"
cmd2 "$var2"

This permits validating the results before proceeding and sidesteps several issues:

  • Neither "`…"…"…`" nor "`…\"…\"…`" is portable.[6]

  • Upon receiving a fatal or trapped signal during a command substitution, some shells erroneously execute the enclosing command.[6]

  • The MSYS shell may mishandle a command substitution embedded in double quotes unless its closing backquote is immediately followed by the closing double quote (so "`cmd` suffix" is at risk).[6]

Trailing line feeds are removed from the result of command substitution.[6][7] The usual technique for preserving them, var=`cmd; echo .`; var=${var%.},[8] cannot be used in Bourne shells due to the absence of ${var%.}. Instead, quote the result and pass it to eval:[8]

sed_script="s/'/'\\\\''/g;1s/^/'/;\$s/\$/'/"
var=`cmd | sed "$sed_script"`
eval "var=$var"

This assumes that cmd appends a final newline (à la dirname). Otherwise, use { cmd; echo; } to ensure that sed receives valid text input.
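As a worked illustration of the eval-quoting technique above, the following sketch compares a plain command substitution with the quoted one. The { echo "it's"; echo; } group is a hypothetical stand-in for cmd, chosen so that the value contains a single quote and ends in a line feed; the names plain, quoted, and kept are likewise made up for the example.

```shell
sed_script="s/'/'\\\\''/g;1s/^/'/;\$s/\$/'/"

# Stand-in for cmd: emits "it's", one line feed that belongs to the
# value, and a final line feed that acts as the terminator.
plain=`echo "it's"; echo`
quoted=`{ echo "it's"; echo; } | sed "$sed_script"`

# $quoted is now a single-quoted shell word spanning two lines, so
# eval reconstructs the value with its trailing line feed intact.
eval "kept=$quoted"
```

Here plain holds it's with nothing after it, while kept holds it's followed by one line feed. The terminating line feed is consumed either way.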

for

In the Solaris Bourne shell (and possibly others), a for loop that never iterates has the exit status of the previous command.[3] To ensure the POSIX behavior,[7] precede the loop with : or true:[3]

:
for var in $possibly_empty_var; do
    cmd "$var"
done

When looping over the positional parameters, do so implicitly to avoid issues with "$@":[2][9]

for var
do
    cmd "$var"
done

Lists and pipelines

Lists and pipelines should ensure that the commands that determine their exit statuses cannot produce spurious ones. For pipelines in the V7 and System III shells and their derivatives,[4] that includes the first command.[10] For example, this pipeline accounts for issues with case:

case $var in
    pat1) cmd1 ;;
    pat2) cmd2 ;;
    *) : ;;
esac | cmd3

Parameter expansion

Bourne shells lack ${var%word} and its siblings,[6] so use an external tool instead.[11] If oldvar is known not to contain line feeds (sed would edit every line, and command substitution would drop trailing ones), a straightforward command substitution might suffice:

newvar=`sed 's/suffix$//' <<EOF
$oldvar
EOF
`
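For a concrete instance of the above, this sketch strips a .gz suffix from a hypothetical value. Note that sed treats the suffix as a regular expression, not as a literal string or a shell pattern, so the dot is escaped here. The value itself is illustrative only.

```shell
oldvar=archive.tar.gz

# The backslash keeps "." from matching an arbitrary character.
newvar=`sed 's/\.gz$//' <<EOF
$oldvar
EOF
`
```

Afterwards, newvar holds archive.tar.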

To handle arbitrary values of oldvar, quote the result and pass it to eval:[8]

sed_script="s/'/'\\\\''/g;1s/^/'/;\$s/suffix\$/'/"
newvar=`sed "$sed_script" <<EOF
$oldvar
EOF
`
eval "newvar=$newvar"

(This is a loose example, not a recipe. The specifics of how to best integrate eval-quoting depend strongly on how the variable is being modified and must be considered on a case-by-case basis.)

If there are no positional parameters, pre-SVR3 shells and most derivatives[4] expand "$@" to one empty word instead of zero words.[3][6][9] The traditional workaround, ${1+"$@"}, gets word-split by pre-4.3.0 zsh in sh emulation mode.[3][6][9] Instead, use something like:[3][6]

case $# in
    0) cmd ;;
    *) cmd "$@" ;;
esac
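To illustrate the workaround, the sketch below wraps it in a function that forwards its arguments. Both count_args and forward are hypothetical names invented for this example, and shell functions themselves are unavailable in the oldest Bourne shells, so treat this as a demonstration, not as something to paste into an excessively portable script.

```shell
# Hypothetical helper: report how many arguments it received.
count_args() {
    echo "$#"
}

# Forward the positional parameters, never expanding "$@" when
# there are none.
forward() {
    case $# in
        0) count_args ;;
        *) count_args "$@" ;;
    esac
}

none=`forward`
two=`forward 'a b' c`
```

With no arguments, count_args sees zero words instead of one empty word; with two arguments, it sees exactly two, embedded space included.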

In Bourne shells, the expansion of "$@" is also affected by IFS (see IFS under Variables).

Utilities

echo

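echo is portable only if it is invoked without flags and without C-style escape sequences; in practice, that means no argument that begins with a hyphen or contains a backslash.[2][3][12] An arbitrary value of var might do either, so instead of echo "$var", use:[2]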

cat <<EOF
$var
EOF

Instead of echo "$var" | cmd, use:

cmd <<EOF
$var
EOF
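A quick way to see the difference is to pass a value that echo might mangle. In this sketch (the value and the name out are made up for illustration), the here-document reproduces the text exactly, whereas echo "$var" could turn \t and \n into a tab and a line feed, depending on the shell.

```shell
var='C:\temp\new'

# The expansion of $var inside the here-document is not rescanned,
# so the backslashes reach cat unchanged.
out=`cat <<EOF
$var
EOF
`
```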

exit

In some shells, exit with no arguments exits with a status of zero instead of $?. Use exit "$?" to be sure.[2]

getopts

Pre-SVR3 shells and derivatives lack getopts.[4] Process options manually.[13]
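One common shape for manual option processing is a while loop around case. The sketch below assumes two hypothetical options, -v (a flag) and -o (which takes an argument); the option names, variables, and messages are illustrative and not taken from mkvimball-sh. The first two commands merely simulate a command line.

```shell
# Simulate a command line for illustration.
set x -v -o out.txt input.txt
shift

verbose=no
outfile=

while test "$#" -gt 0; do
    case $1 in
        -v)
            verbose=yes
            shift ;;
        -o)
            if test "$#" -lt 2; then
                echo 'missing argument to -o' >&2
                exit 2
            fi
            outfile=$2
            shift
            shift ;;
        --)
            shift
            break ;;
        -*)
            cat >&2 <<EOF
unknown option: $1
EOF
            exit 2 ;;
        *)
            break ;;
    esac
done

# Whatever remains in the positional parameters is an operand.
operand_count=$#
first_operand=$1
```

Two separate shift commands are used instead of shift 2 out of caution, on the assumption that not every old shell accepts a count. Combined flags such as -vo are not handled by this sketch.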

printf

Systems preceding V9 lack printf.[12] On Solaris 2.5.1 through 10, /usr/bin/printf cannot handle large outputs reliably.[2] Instead of printf %s\\n "$var", use:[2]

cat <<EOF
$var
EOF

Instead of printf %s\\n "$var" | cmd, use:

cmd <<EOF
$var
EOF

(I haven’t come across a generically satisfactory alternative for printf %s "$var", but I haven’t needed one.)

read

In most Bourne shells (except the Heirloom, Schily, and SVR4.2MP2 shells[4]), read does not recognize -r.[2] Consider doubling backslashes in the input so that read's own backslash processing restores the original text:

sed 's/\\/\\&/g' | read line

set

In the V7 shell and most derivatives,[4] set does not treat -- as a delimiter between options and operands.[14] Instead of set -- -arg, use:[2][14]

set x -arg
shift

Even in updated Bourne shells, set -- (without operands) does not clear or otherwise modify the positional parameters.[3][14] Instead, use:[2][3][14]

set x
shift

test

V7 (and possibly its derivatives) provides only test, not [.[4] Write test expr instead of [ expr ].

All of test str, test ! str, test -n str, test -z str, test str1 = str2, and test str1 != str2 mishandle certain operands in one shell or another.[2][3][15][16][17] When in doubt, use the "x-hack":[2][15][17]

# test whether $var is empty or not
test "x$var" = x
test "x$var" != x

# test whether $var is "val" or not
test "x$var" = xval
test "x$var" != xval

# test $var1 against $var2
test "x$var1" = "x$var2"
test "x$var1" != "x$var2"
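The sketch below runs the emptiness check over a few operands of the kind that can confuse test when passed bare, since they look like operators. The list of values and the name result are chosen for illustration; which shells actually mishandle which operand is not something this example establishes.

```shell
result=
for var in '' '-n' '=' '!' '('; do
    # The leading x guarantees that neither operand can be
    # mistaken for an operator or be empty.
    if test "x$var" = x; then
        result="$result empty"
    else
        result="$result nonempty"
    fi
done
```

Only the first value is reported as empty; the other four are each reported as nonempty.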

unset

Pre-SVR2 shells and derivatives lack unset.[2][4] For some variables, setting an empty or default value[18] works just as well. For others, it may be feasible to use a proxy variable:

# initialize
var_is_set=no

# cmd1 not executed, even if var is imported
test "$var_is_set" = yes && cmd1 "$var"

var=$possibly_empty_value
var_is_set=yes

# cmd2 executed
test "$var_is_set" = yes && cmd2 "$var"

# fake "unset"
var_is_set=no

# cmd3 not executed
test "$var_is_set" = yes && cmd3 "$var"

Variables

IFS

Most Bourne shells import IFS from the environment.[4][18]

Bourne shells use IFS to word-split all unquoted strings, not just the results of unquoted expansions.[3]

If IFS does not contain a space,[9] Bourne shells expand "$@" just like "$*" — as one word comprising the positional parameters separated by single spaces.[3]

Avoid all of this by explicitly resetting IFS to its default value:

lf='
'
sp=' '
tab='	'

IFS=$sp$tab$lf

(Unsetting it does not work in Bourne shells and some older Bourne-compatible ones.[19])