LinuxSelfhelp.com

Portable Shell Programming

When writing your own checks, there are some shell-script programming techniques you should avoid in order to make your code portable. The Bourne shell and upward-compatible shells like the Korn shell and Bash have evolved over the years, but to prevent trouble, do not take advantage of features that were added after UNIX version 7, circa 1977. You should not use shell functions, aliases, negated character classes, or other features that are not found in all Bourne-compatible shells; restrict yourself to the lowest common denominator. Even unset is not supported by all shells! Also, include a space after the exclamation point in interpreter specifications, like this:

#! /usr/bin/perl

If you omit the space before the path, then 4.2BSD based systems (such as Sequent DYNIX) will ignore the line, because they interpret `#! /' as a 4-byte magic number.

The set of external programs you should run in a configure script is fairly small. See section `Utilities in Makefiles' in GNU Coding Standards, for the list. This restriction allows users to start out with a fairly small set of programs and build the rest, avoiding too many interdependencies between packages.

Some of these external utilities have a portable subset of features; see section Limitations of Usual Tools.

Shellology

There are several families of shells, most prominently the Bourne family and the C shell family which are deeply incompatible. If you want to write portable shell scripts, avoid members of the C shell family.

Below we describe some of the members of the Bourne shell family.

Ash
@command{ash} is often used on GNU/Linux and BSD systems as a light-weight Bourne-compatible shell. Ash 0.2 has some bugs that are fixed in the 0.3.x series, but portable shell scripts should work around them, since version 0.2 is still shipped with many GNU/Linux distributions. To be compatible with Ash 0.2:
  • don't use `$?' after expanding empty or unset variables:
    foo=
    false
    $foo
    echo "Don't use it: $?"
    
  • don't use command substitution within variable expansion:
    cat ${FOO=`bar`}
    
  • beware that single builtin substitutions are not performed by a sub shell, hence their effect applies to the current shell! See section Shell Substitutions, item "Command Substitution".
Bash
To detect whether you are running @command{bash}, test if BASH_VERSION is set. To disable its extensions and require POSIX compatibility, run `set -o posix'. See section `Bash POSIX Mode' in The GNU Bash Reference Manual, for details.
@command{/usr/xpg4/bin/sh} on Solaris
The POSIX-compliant Bourne shell on a Solaris system is @command{/usr/xpg4/bin/sh} and is part of an extra optional package. There is no extra charge for this package, but it is also not part of a minimal OS install and therefore some folks may not have it.
Zsh
To detect whether you are running @command{zsh}, test if ZSH_VERSION is set. By default @command{zsh} is not compatible with the Bourne shell: you have to run `emulate sh' and set NULLCMD to `:'. See section `Compatibility' in The Z Shell Manual, for details. Zsh 3.0.8 is the native @command{/bin/sh} on Mac OS X 10.0.3.
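
As a rough sketch of the detection described above, a script could probe the two version variables and switch the shell into its most Bourne-compatible mode. The variable name `shell_kind' is an assumption made for this illustration only; nothing in Autoconf defines it.

```sh
# Illustrative only: `shell_kind' is a hypothetical name.
shell_kind=other
if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
  # Zsh: ask for Bourne emulation and make `>file' behave like `: >file'.
  emulate sh
  NULLCMD=:
  shell_kind=zsh
elif test -n "${BASH_VERSION+set}" && (set -o posix) >/dev/null 2>&1; then
  # Bash: disable the extensions that conflict with POSIX.
  set -o posix
  shell_kind=bash
fi
echo "running under: $shell_kind"
```

Each feature is first tried in a sub-shell, so a shell that sets the variable but lacks the command does not abort the script.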

The following discussion between Russ Allbery and Robert Lipe is worth reading:

Russ Allbery:

The GNU assumption that @command{/bin/sh} is the one and only shell leads to a permanent deadlock. Vendors don't want to break users' existing shell scripts, and there are some corner cases in the Bourne shell that are not completely compatible with a POSIX shell. Thus, vendors who have taken this route will never (OK..."never say never") replace the Bourne shell (as @command{/bin/sh}) with a POSIX shell.

Robert Lipe:

This is exactly the problem. While most (at least most System V's) do have a Bourne shell that accepts shell functions, most vendor @command{/bin/sh} programs are not the POSIX shell.

So while most modern systems do have a shell _somewhere_ that meets the POSIX standard, the challenge is to find it.

Here-Documents

Don't rely on `\' being preserved just because it has no special meaning together with the next symbol. In the native @command{/bin/sh} on OpenBSD 2.7, `\"' expands to `"' in here-documents with an unquoted delimiter. As a general rule, if `\\' expands to `\', use `\\' to get `\'.

With OpenBSD 2.7's @command{/bin/sh}:

$ cat <<EOF
> \" \\
> EOF
" \

and with Bash:

bash-2.04$ cat <<EOF
> \" \\
> EOF
\" \

Many older shells (including the Bourne shell) implement here-documents inefficiently. Users can generally speed things up by using a faster shell, e.g., by using the command `bash ./configure' rather than plain `./configure'.

Some shells can be extremely inefficient when there are a lot of here-documents inside a single statement. For instance if your `configure.ac' includes something like:

if <cross_compiling>; then
  assume this and that
else
  check this
  check that
  check something else
  ...
  on and on forever
  ...
fi

A shell parses the whole if/fi construct, creating temporary files for each here-document in it. Some shells create links for such here-documents on every fork, so that the clean-up code they had installed correctly removes them. It is creating these links that can take the shell forever.

Moving the tests out of the if/fi, or creating multiple if/fi constructs, would improve the performance significantly. Anyway, this kind of construct is not exactly the typical use of Autoconf. In fact, it is not even recommended, because M4 macros can't look into shell conditionals: we may fail to expand a macro because it was already expanded earlier in a conditional path, and if that condition turns out to be false at run-time, we end up not executing the macro at all.

File Descriptors

Some file descriptors should not be used, since some systems, admittedly arcane, use them for special purposes:

3
some systems may open it to `/dev/tty'.
4
used on the Kubota Titan.

Don't redirect the same file descriptor several times, as you are doomed to failure under Ultrix.

ULTRIX V4.4 (Rev. 69) System #31: Thu Aug 10 19:42:23 GMT 1995
UWS V4.4 (Rev. 11)
$ eval 'echo matter >fullness' >void
illegal io
$ eval '(echo matter >fullness)' >void
illegal io
$ (eval '(echo matter >fullness)') >void
Ambiguous output redirect.

In each case the expected result is of course `fullness' containing `matter' and `void' being empty.

Don't try to redirect the standard error of a command substitution from the outside: the redirection must be done inside the command substitution. When running `: `cd /zorglub` 2>/dev/null', expect the error message to escape, while `: `cd /zorglub 2>/dev/null`' works properly.

It is worth noting that Zsh (but neither Ash nor Bash) makes it possible in assignments though: `foo=`cd /zorglub` 2>/dev/null'.

Most shells, if not all (including Bash, Zsh, Ash), output traces on stderr, even for sub-shells. This might result in undesired content if you meant to capture the standard-error output of the inner command:

$ ash -x -c '(eval "echo foo >&2") 2>stderr'
$ cat stderr
+ eval echo foo >&2
+ echo foo
foo
$ bash -x -c '(eval "echo foo >&2") 2>stderr'
$ cat stderr
+ eval 'echo foo >&2'
++ echo foo
foo
$ zsh -x -c '(eval "echo foo >&2") 2>stderr'
# Traces on startup files deleted here.
$ cat stderr
+zsh:1> eval echo foo >&2
+zsh:1> echo foo
foo

You'll appreciate the various levels of detail...

One workaround is to grep out uninteresting lines, hoping not to remove good ones...
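
A minimal sketch of that workaround follows. It assumes that trace lines begin with `+', which is the default in most shells but not guaranteed (the Zsh traces above start with `+zsh:'). The file names `stderr.raw' and `stderr.clean' and the variable `filtered' are made up for the illustration.

```sh
# Run a traced child shell; its stderr file mixes traces with real output.
sh -x -c '(eval "echo foo >&2") 2>stderr.raw'

# Assumption: every trace line starts with `+'.  Drop those lines.
grep -v '^+' stderr.raw >stderr.clean

filtered=`cat stderr.clean`
rm -f stderr.raw stderr.clean
echo "$filtered"
```

Any genuine message that happens to start with `+' is thrown away as well, which is exactly the risk mentioned above.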

File System Conventions

While @command{autoconf} and friends will usually be run on some Unix variety, they can and will be used on other systems, most notably DOS variants. This impacts several assumptions regarding file and path names.

For example, the following code:

case $foo_dir in
  /*) # Absolute
     ;;
  *)
     foo_dir=$dots$foo_dir ;;
esac

will fail to properly detect absolute paths on those systems, because they can use a drivespec, and will usually use a backslash as directory separator. The canonical way to check for absolute paths is:

case $foo_dir in
  [\\/]* | ?:[\\/]* ) # Absolute
     ;;
  *)
     foo_dir=$dots$foo_dir ;;
esac

Make sure you quote the brackets if appropriate and keep the backslash as first character (see section Limitations of Shell Builtins).

Also, because the colon is used as part of a drivespec, these systems don't use it as path separator. When creating or accessing paths, use $ac_path_separator instead (or the PATH_SEPARATOR output variable). @command{autoconf} sets this to the appropriate value (`:' or `;') when it starts up.
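
For illustration, here is one way a search through PATH could honor the separator. This is only a sketch: the fallback to `:' is there so the fragment runs outside of @command{configure}, and `found_sh' is a hypothetical variable name.

```sh
# Outside configure, $ac_path_separator is unset; assume `:' in that case.
test "${ac_path_separator+set}" = set || ac_path_separator=:

found_sh=no
ac_save_IFS=$IFS
IFS=$ac_path_separator
for ac_dir in $PATH; do
  # Restore IFS at once so the loop body splits words normally.
  IFS=$ac_save_IFS
  # An empty PATH component means the current directory.
  test -z "$ac_dir" && ac_dir=.
  if test -f "$ac_dir/sh"; then
    found_sh=$ac_dir/sh
    break
  fi
done
IFS=$ac_save_IFS
echo "sh found as: $found_sh"
```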

File names need extra care as well. While DOS-based environments that are Unixy enough to run @command{autoconf} (such as DJGPP) will usually be able to handle long file names properly, there are still limitations that can seriously break packages. Several of these issues can be easily detected by the @href{ftp://ftp.gnu.org/gnu/non-gnu/doschk/doschk-1.1.tar.gz, doschk} package.

A short overview follows; problems are marked with SFN/LFN to indicate where they apply: SFN means the issues are only relevant to plain DOS, not to DOS boxes under Windows, while LFN identifies problems that exist even under Windows.

No multiple dots (SFN)
DOS cannot handle multiple dots in filenames. This is an especially important thing to remember when building a portable configure script, as @command{autoconf} uses a .in suffix for template files. This is perfectly OK on Unices:
AC_CONFIG_HEADER(config.h)
AC_CONFIG_FILES([source.c foo.bar])
AC_OUTPUT
but it causes problems on DOS, as it requires `config.h.in', `source.c.in' and `foo.bar.in'. To make your package more portable to DOS-based environments, you should use this instead:
AC_CONFIG_HEADER(config.h:config.hin)
AC_CONFIG_FILES([source.c:source.cin foo.bar:foobar.in])
AC_OUTPUT
No leading dot (SFN)
DOS cannot handle filenames that start with a dot. This is usually not a very important issue for @command{autoconf}.
Case insensitivity (LFN)
DOS is case insensitive, so you cannot, for example, have both a file called `INSTALL' and a directory called `install'. This also affects @command{make}; if there's a file called `INSTALL' in the directory, @command{make install} will do nothing (unless the `install' target is marked as PHONY).
The 8+3 limit (SFN)
Because the DOS file system only stores the first 8 characters of the filename and the first 3 of the extension, those must be unique. That means that `foobar-part1.c', `foobar-part2.c' and `foobar-prettybird.c' all resolve to the same filename (`FOOBAR-P.C'). The same goes for `foo.bar' and `foo.bartender'. Note: This is not usually a problem under Windows, as it uses numeric tails in the short version of filenames to make them unique. However, a registry setting can turn this behaviour off. While this makes it possible to share file trees containing long file names between SFN and LFN environments, it also means the above problem applies there as well.
Invalid characters
Some characters are invalid in DOS filenames, and should therefore be avoided. In an LFN environment, these are `/', `\', `?', `*', `:', `<', `>', `|' and `"'. In an SFN environment, other characters are also invalid. These include `+', `,', `[' and `]'.
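
The two SFN rules about dots are easy to check with a plain @command{case}. The following is a sketch only, not a replacement for doschk: it ignores the 8+3 limit and the invalid characters, and the file list and the variable `dos_problem' are invented for the example.

```sh
# Collect the names that break the "leading dot" or "multiple dots" rules.
dos_problem=
for file in config.h.in .cvsignore source.cin; do
  case $file in
    .*)    dos_problem="$dos_problem $file:leading-dot" ;;
    *.*.*) dos_problem="$dos_problem $file:multiple-dots" ;;
  esac
done
echo "problems:$dos_problem"
```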

Shell Substitutions

Contrary to a persistent urban legend, the Bourne shell does not systematically split variables and backquoted expressions, in particular on the right-hand side of assignments and in the argument of case. For instance, the following code:

case "$given_srcdir" in
.)  top_srcdir="`echo "$dots" | sed 's,/$,,'`" ;;
*)  top_srcdir="$dots$given_srcdir" ;;
esac

is more readable when written as:

case $given_srcdir in
.)  top_srcdir=`echo "$dots" | sed 's,/$,,'` ;;
*)  top_srcdir=$dots$given_srcdir ;;
esac

and in fact it is even more portable: in the first case of the first attempt, the computation of top_srcdir is not portable, since not all shells properly understand "`..."..."...`". Worse yet, not all shells understand "`...\"...\"...`" the same way. There is just no portable way to use double-quoted strings inside double-quoted backquoted expressions (phew!).

$@
One of the most famous shell-portability issues is related to `"$@"': when there are no positional arguments, it is supposed to be equivalent to nothing. But some shells, for instance under Digital Unix 4.0 and 5.0, will then replace it with an empty argument. To be portable, use `${1+"$@"}'.
${var:-value}
Old BSD shells, including the Ultrix sh, don't accept the colon for any shell substitution, and complain and die.
${var=literal}
Be sure to quote:
: ${var='Some words'}
otherwise some shells, such as on Digital Unix V 5.0, will die because of a "bad substitution". Solaris' @command{/bin/sh} has a frightening bug in its interpretation of this. Imagine you need to set a variable to a string containing `}'. This `}' character confuses Solaris' @command{/bin/sh} when the affected variable was already set. This bug can be exercised by running:
$ unset foo
$ foo=${foo='}'}
$ echo $foo
}
$ foo=${foo='}'   # no error; this hints to what the bug is
$ echo $foo
}
$ foo=${foo='}'}
$ echo $foo
}}
 ^ ugh!
It seems that `}' is interpreted as matching `${', even though it is enclosed in single quotes. The problem doesn't happen using double quotes.
${var=expanded-value}
On Ultrix, running
default="yu,yaa"
: ${var="$default"}
will set var to `M-yM-uM-,M-yM-aM-a', i.e., the 8th bit of each char will be set. You won't observe the phenomenon using a simple `echo $var' since apparently the shell resets the 8th bit when it expands $var. Here are two means to make this shell confess its sins:
$ cat -v <<EOF
$var
EOF
and
$ set | grep '^var=' | cat -v
One classic incarnation of this bug is:
default="a b c"
: ${list="$default"}
for c in $list; do
  echo $c
done
You'll get `a b c' on a single line. Why? Because there are no spaces in `$list': there are `M- ', i.e., spaces with the 8th bit set, hence no IFS splitting is performed!!! One piece of good news is that Ultrix works fine with `: ${list=$default}'; i.e., if you don't quote. The bad news is that QNX 4.25 then sets list to the last item of default! The portable way out consists in using a double assignment, to switch the 8th bit twice on Ultrix:
list=${list="$default"}
...but beware of the `}' bug from Solaris (see above). For safety, use:
test "${var+set}" = set || var={value}
`commands`
While in general it makes no sense, do not substitute a single builtin with side effects, since Ash 0.2, trying to optimize, does not fork a sub-shell to perform the command. For instance, if you wanted to check that @command{cd} is silent, do not use `test -z "`cd /`"' because the following can happen:
$ pwd
/tmp
$ test -z "`cd /`" && pwd
/
/
The result of `foo=`exit 1`' is left as an exercise to the reader.
$(commands)
This construct is meant to replace ``commands`'; they can be nested while this is impossible to do portably with back quotes. Unfortunately it is not yet widely supported. Most notably, even recent releases of Solaris don't support it:
$ showrev -c /bin/sh | grep version
Command version: SunOS 5.8 Generic 109324-02 February 2001
$ echo $(echo blah)
syntax error: `(' unexpected
nor does IRIX 6.5's Bourne shell:
$ uname -a
IRIX firebird-image 6.5 07151432 IP22
$ echo $(echo blah)
$(echo blah)
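
To make the `${1+"$@"}' advice from the `$@' item above concrete, here is a small sketch that counts arguments both with and without positional parameters. The variable names `argc', `argc_two' and `argc_none' are invented for the example.

```sh
# Two arguments, one of them containing a space.
set x "a b" c; shift
argc=0
for arg in ${1+"$@"}; do
  argc=`expr $argc + 1`
done
argc_two=$argc

# No arguments at all: the loop body must not run, not even once.
set x; shift
argc=0
for arg in ${1+"$@"}; do
  argc=`expr $argc + 1`
done
argc_none=$argc

echo "with args: $argc_two, without: $argc_none"
```

When `$1' is unset the whole expansion vanishes, so even a shell that turns a bare `"$@"' into an empty argument never sees it.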

Assignments

When setting several variables in a row, be aware that the order of the evaluation is undefined. For instance `foo=1 foo=2; echo $foo' gives `1' with sh on Solaris, but `2' with Bash. You must use `;' to enforce the order: `foo=1; foo=2; echo $foo'.

Don't rely on the exit status of an assignment: Ash 0.2 does not change the status and propagates that of the last statement:

$ false || foo=bar; echo $?
1
$ false || foo=`:`; echo $?
0

and to make things even worse, QNX 4.25 just sets the exit status to 0 in any case:

$ foo=`exit 1`; echo $?
0

To assign default values, follow this algorithm:

  1. If the default value is a literal and does not contain any closing brace, use:
    : ${var='my literal'}
    
  2. If the default value contains no closing brace, has to be expanded, and the variable being initialized will never be IFS-split (i.e., it's not a list), then use:
    : ${var="$default"}
    
  3. If the default value contains no closing brace, has to be expanded, and the variable being initialized will be IFS-split (i.e., it's a list), then use:
    var=${var="$default"}
    
  4. If the default value contains a closing brace, then use:
    test "${var+set}" = set || var='${indirection}'
    

In most cases `var=${var="$default"}' is fine, but in case of doubt, just use the latter. See section Shell Substitutions, items `${var:-value}' and `${var=value}' for the rationale.
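
The four cases of the algorithm can be put side by side as follows. This is only a sketch: all the `demo_' variable names are invented, and they are assumed to be unset when the fragment starts.

```sh
default='a b c'

# 1. Literal default without closing brace.
: ${demo_literal='my literal'}

# 2. Expanded default, variable never IFS-split.
: ${demo_scalar="$default"}

# 3. Expanded default, variable used as a list: double assignment.
demo_list=${demo_list="$default"}

# 4. Default containing a closing brace: test, then assign.
test "${demo_brace+set}" = set || demo_brace='${indirection}'

for item in $demo_list; do
  echo "item: $item"
done
```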

Special Shell Variables

Some shell variables should not be used, since they can have a deep influence on the behavior of the shell. In order to recover a sane behavior from the shell, some variables should be unset, but @command{unset} is not portable (see section Limitations of Shell Builtins) and a fallback value is needed. We list these values below.

CDPATH
When this variable is set, cd is verbose, so idioms such as `abs=`cd $rel && pwd`' break because abs receives the path twice. Setting CDPATH to the empty value is not enough for most shells. A simple colon is enough except for zsh, which prefers a leading dot:
zsh-3.1.6 % mkdir foo && (CDPATH=: cd foo)
/tmp/foo
zsh-3.1.6 % (CDPATH=:. cd foo)
/tmp/foo
zsh-3.1.6 % (CDPATH=.: cd foo)
zsh-3.1.6 %
(of course we could just unset CDPATH, since it also behaves properly if set to the empty string). Life wouldn't be so much fun if @command{bash} and @command{zsh} had the same behavior:
bash-2.02 % (CDPATH=:. cd foo)
bash-2.02 % (CDPATH=.: cd foo)
/tmp/foo
Therefore, a portable solution to neutralize `CDPATH' is
CDPATH=${ZSH_VERSION+.}:
Note that since @command{zsh} supports @command{unset}, you may unset `CDPATH' using `:' as a fallback, see section Limitations of Shell Builtins.
IFS
Don't set the first character of IFS to backslash. Indeed, Bourne shells use the first character (backslash) when joining the components in `"$@"' and some shells then re-interpret (!) the backslash escapes, so you can end up with backspace and other strange characters.
LANG
LC_ALL
LC_TIME
LC_CTYPE
LANGUAGE
LC_COLLATE
LC_NUMERIC
LC_MESSAGES
These must not be set unconditionally because not all systems understand e.g. `LANG=C' (notably SCO). Fixing @env{LC_MESSAGES} prevents Solaris @command{sh} from translating var values in set! Non-C @env{LC_CTYPE} values break the ctype check. Fixing @env{LC_COLLATE} makes scripts more portable in some cases. For example, it causes the regular expression `[a-z]' to match only lower-case letters on ASCII platforms. However, `[a-z]' does not work in general even when @env{LC_COLLATE} is fixed; for example, it does not work for EBCDIC platforms. For maximum portability, you should use regular expressions like `[abcdefghijklmnopqrstuvwxyz]' that list characters explicitly instead of relying on ranges. If one of these variables is set, you should try to unset it, using `C' as a fallback value. See section Limitations of Shell Builtins, builtin @command{unset}, for more details.
NULLCMD
When executing the command `>foo', @command{zsh} executes `$NULLCMD >foo'. The Bourne shell considers NULLCMD to be `:', while @command{zsh}, even in Bourne shell compatibility mode, sets NULLCMD to `cat'. If you forget to set NULLCMD, your script might be suspended waiting for data on its standard input.
status
This variable is an alias to `$?' for zsh (at least 3.1.6), hence read-only. Do not use it.
PATH_SEPARATOR
On DJGPP systems, the PATH_SEPARATOR variable can be set to either `:' or `;' to control the path separator @command{bash} uses to set up certain environment variables (such as PATH). Since this only works inside bash, you want autoconf to detect the regular DOS path separator `;', so it can be safely substituted in files that may not support `;' as path separator. So either unset this variable or set it to `;'.
RANDOM
Many shells provide RANDOM, a variable that returns a different integer when used. Most of the time, its value does not change when it is not used, but on IRIX 6.5 the value changes all the time. This can be observed by using @command{set}.
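
Put together, the start of a script might neutralize some of these variables along the following lines. This is a sketch under the rules given above, not the exact code emitted by @command{autoconf}; only three of the locale variables are shown, and `abs' is just a scratch variable.

```sh
# CDPATH: a colon is enough, except for zsh which wants a leading dot.
CDPATH=${ZSH_VERSION+.}:

# Locale variables: never set them unconditionally, only override the
# ones that are already set, using `C' as the neutral value.
if test "${LANG+set}" = set; then LANG=C; export LANG; fi
if test "${LC_ALL+set}" = set; then LC_ALL=C; export LC_ALL; fi
if test "${LC_MESSAGES+set}" = set; then LC_MESSAGES=C; export LC_MESSAGES; fi

# With CDPATH neutralized, cd stays silent and the idiom is safe again.
abs=`cd / && pwd`
echo "$abs"
```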

Limitations of Shell Builtins

No, no, we are serious: some shells do have limitations! :)

You should always keep in mind that any built-in or command may support options, and therefore have a very different behavior with arguments starting with a dash. For instance, the innocent `echo "$word"' can give unexpected results when word starts with a dash. It is often possible to avoid this problem using `echo "x$word"', taking the `x' into account later in the pipe.
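
For example, the `x' can be removed again with @command{sed} further down the pipe. In this sketch the names `word' and `stripped' are arbitrary.

```sh
# A value that a bare `echo "$word"' could mistake for an option.
word=-n

# Protect it with a leading `x', then strip the `x' later in the pipe.
stripped=`echo "x$word" | sed 's/^x//'`

# The same trick keeps test from seeing a leading dash.
if test "x$stripped" = "x$word"; then
  echo "value survived intact"
fi
```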

@command{!}
You can't use @command{!}; you'll have to rewrite your code.
@command{break}
The use of `break 2', etcetera, is safe.
@command{case}
You don't need to quote the argument; no splitting is performed. You don't need the final `;;', but you should use it. Because of a bug in its fnmatch, @command{bash} fails to properly handle backslashes in character classes:
bash-2.02$ case /tmp in [/\\]*) echo OK;; esac
bash-2.02$
This is extremely unfortunate, since you are likely to use this code to handle UNIX or MS-DOS absolute paths. To work around this bug, always put the backslash first:
bash-2.02$ case '\TMP' in [\\/]*) echo OK;; esac
OK
bash-2.02$ case /tmp in [\\/]*) echo OK;; esac
OK
@command{echo}
The simple echo is probably the most surprising source of portability troubles. It is not possible to use `echo' portably unless both options and escape sequences are omitted. New applications which are not aiming at portability should use `printf' instead of `echo'. Don't expect any option. See section Preset Output Variables, ECHO_N etc. for a means to simulate @option{-c}. Do not use backslashes in the arguments, as there is no consensus on their handling. On `echo '\n' | wc -l', the @command{sh} of Digital Unix 4.0 and MIPS RISC/OS 4.52 answer 2, but Solaris' @command{sh}, Bash and Zsh (in @command{sh} emulation mode) report 1. Please note that the problem is truly @command{echo}: all the shells understand `'\n'' as the string composed of a backslash and an `n'. Because of these problems, do not pass a string containing arbitrary characters to @command{echo}. For example, `echo "$foo"' is safe if you know that foo's value cannot contain backslashes and cannot start with `-', but otherwise you should use a here-document like this:
cat <<EOF
$foo
EOF
@command{exit}
The default value of @command{exit} is supposed to be $?; unfortunately, some shells, such as the DJGPP port of Bash 2.04, just perform `exit 0'.
bash-2.04$ foo=`exit 1` || echo fail
fail
bash-2.04$ foo=`(exit 1)` || echo fail
fail
bash-2.04$ foo=`(exit 1); exit` || echo fail
bash-2.04$
Using `exit $?' restores the expected behavior. Some shell scripts, such as those generated by @command{autoconf}, use a trap to clean up before exiting. If the last shell command exited with nonzero status, the trap also exits with nonzero status so that the invoker can tell that an error occurred. Unfortunately, in some shells, such as Solaris 8 @command{sh}, an exit trap ignores the exit command's status. In these shells, a trap cannot determine whether it was invoked by plain exit or by exit 1. Instead of calling exit directly, use the AC_MSG_ERROR macro that has a workaround for this problem.
@command{export}
The builtin @command{export} dubs a shell variable an environment variable. Each update of exported variables corresponds to an update of the environment variables. Conversely, each environment variable received by the shell when it is launched should be imported as a shell variable marked as exported. Alas, many shells, such as Solaris 2.5, IRIX 6.3, IRIX 5.2, AIX 4.1.5 and DU 4.0, forget to @command{export} the environment variables they receive. As a result, two variables coexist: the environment variable and the shell variable. The following code demonstrates this failure:
#! /bin/sh
echo $FOO
FOO=bar
echo $FOO
exec /bin/sh $0
When run with `FOO=foo' in the environment, these shells will print alternately `foo' and `bar', although they should only print `foo' and then a sequence of `bar's. Therefore you should @command{export} again each environment variable that you update.
@command{false}
Don't expect @command{false} to exit with status 1: in the native Bourne shell of Solaris 8, it exits with status 255.
@command{for}
To loop over positional arguments, use:
for arg
do
  echo "$arg"
done
You may not leave the do on the same line as for, since some shells improperly grok:
for arg; do
  echo "$arg"
done
If you want to explicitly refer to the positional arguments, given the `$@' bug (see section Shell Substitutions), use:
for arg in ${1+"$@"}; do
  echo "$arg"
done
@command{if}
Using `!' is not portable. Instead of:
if ! cmp -s file file.new; then
  mv file.new file
fi
use:
if cmp -s file file.new; then :; else
  mv file.new file
fi
There are shells that do not reset the exit status from an @command{if}:
$ if (exit 42); then true; fi; echo $?
42
whereas a proper shell should have printed `0'. This is especially bad in Makefiles since it produces false failures. This is why properly written Makefiles, such as Automake's, have such hairy constructs:
if test -f "$file"; then
  install "$file" "$dest"
else
  :
fi
@command{set}
This builtin faces the usual problem with arguments starting with a dash. Modern shells such as Bash or Zsh understand @option{--} to specify the end of the options (any argument after @option{--} is a parameter, even `-x' for instance), but most shells simply stop the option processing as soon as a non-option argument is found. Therefore, use `dummy' or simply `x' to end the option processing, and use @command{shift} to pop it out:
set x $my_list; shift
@command{shift}
Not only is @command{shift}ing a bad idea when there is nothing left to shift, but in addition it is not portable: the shell of MIPS RISC/OS 4.52 refuses to do it.
@command{test}
The test program is the way to perform many file and string tests. It is often invoked by the alternate name `[', but using that name in Autoconf code is asking for trouble since it is an M4 quote character. If you need to make multiple checks using test, combine them with the shell operators `&&' and `||' instead of using the test operators @option{-a} and @option{-o}. On System V, the precedence of @option{-a} and @option{-o} is wrong relative to the unary operators; consequently, POSIX does not specify them, so using them is nonportable. If you combine `&&' and `||' in the same statement, keep in mind that they have equal precedence. You may use `!' with @command{test}, but not with @command{if}: `test ! -r foo || exit 1'.
@command{test (files)}
To enable configure scripts to support cross-compilation, they shouldn't do anything that tests features of the build system instead of the host system. But occasionally you may find it necessary to check whether some arbitrary file exists. To do so, use `test -f' or `test -r'. Do not use `test -x', because 4.3BSD does not have it. Do not use `test -e' either, because Solaris 2.5 does not have it.
@command{test (strings)}
Avoid `test "string"', in particular if string might start with a dash, since test might interpret its argument as an option (e.g., `string = "-n"'). Contrary to a common belief, `test -n string' and `test -z string' are portable. Nevertheless, many shells (such as Solaris 2.5, AIX 3.2, UNICOS 10.0.0.6, Digital Unix 4 etc.) have bizarre precedence and may be confused if string looks like an operator:
$ test -n =
test: argument expected
If there are risks, use `test "xstring" = x' or `test "xstring" != x' instead. It is common to find variations of the following idiom:
test -n "`echo $ac_feature | sed 's/[-a-zA-Z0-9_]//g'`" &&
  action
to take an action when a token matches a given pattern. Such constructs should always be avoided by using:
echo "$ac_feature" | grep '[^-a-zA-Z0-9_]' >/dev/null 2>&1 &&
  action
Use case where possible since it is faster, being a shell builtin:
case $ac_feature in
  *[!-a-zA-Z0-9_]*) action;;
esac
Alas, negated character classes are probably not portable, although no shell is known to not support the POSIX.2 syntax `[!...]' (when in interactive mode, @command{zsh} is confused by the `[!...]' syntax and looks for an event in its history because of `!'). Many shells do not support the alternative syntax `[^...]' (Solaris, Digital Unix, etc.). One solution can be:
expr "$ac_feature" : '.*[^-a-zA-Z0-9_]' >/dev/null &&
  action
or better yet
expr "x$ac_feature" : '.*[^-a-zA-Z0-9_]' >/dev/null &&
  action
`expr "Xfoo" : "Xbar"' is more robust than `echo "Xfoo" | grep "^Xbar"', because it avoids problems when `foo' contains backslashes.
@command{trap}
It is safe to trap at least the signals 1, 2, 13 and 15. You can also trap 0, i.e., have the @command{trap} run when the script ends (either via an explicit @command{exit}, or the end of the script). Although POSIX is not absolutely clear on this point, it is widely accepted that when entering the trap `$?' should be set to the exit status of the last command run before the trap. The ambiguity can be summarized as: "when the trap is launched by an @command{exit}, what is the last command run: that before @command{exit}, or @command{exit} itself?" Bash considers @command{exit} to be the last command, while Zsh and Solaris 8 @command{sh} consider that when the trap is run it is still in the @command{exit}, hence it is the previous exit status that the trap receives:
$ cat trap.sh
trap 'echo $?' 0
(exit 42); exit 0
$ zsh trap.sh
42
$ bash trap.sh
0
The portable solution is then simple: when you want to `exit 42', run `(exit 42); exit 42', the first @command{exit} being used to set the exit status to 42 for Zsh, and the second to trigger the trap and pass 42 as exit status for Bash. The shell in FreeBSD 4.0 has the following bug: `$?' is reset to 0 by empty lines if the code is inside @command{trap}.
$ trap 'false

echo $?' 0
$ exit
0
Fortunately, this bug only affects @command{trap}.
@command{true}
Don't worry: as far as we know @command{true} is portable. Nevertheless, it's not always a builtin (e.g., Bash 1.x), and the portable shell community tends to prefer using @command{:}. This has a funny side effect: when asked whether @command{false} is more portable than @command{true} Alexandre Oliva answered:

In a sense, yes, because if it doesn't exist, the shell will produce an exit status of failure, which is correct for @command{false}, but not for @command{true}.

@command{unset}
You cannot assume that @command{unset} is supported. Nevertheless, because it is extremely useful for disabling embarrassing variables such as CDPATH or LANG, you can test for its existence and use it, provided you give a neutralizing value when @command{unset} is not supported:
if (unset FOO) >/dev/null 2>&1; then
  unset=unset
else
  unset=false
fi
$unset CDPATH || CDPATH=:
See section Special Shell Variables, for some neutralizing values. Also, see section Limitations of Shell Builtins, documentation of @command{export}, for the case of environment variables.
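
Going back to the @command{trap} entry above, the `(exit 42); exit 42' idiom can be tried out with a throw-away script. This is a sketch only: the file names `trap-demo.sh' and `trap-demo.out' and the variables are made up, and the child script is run with whatever @command{sh} is found in PATH.

```sh
# Write a small script whose exit trap reports the status it sees.
cat >trap-demo.sh <<'EOF'
trap 'echo "trap saw $?"' 0
(exit 42); exit 42
EOF

# Run it in a child shell and record both its output and its status.
sh trap-demo.sh >trap-demo.out
trap_status=$?
trap_seen=`cat trap-demo.out`
rm -f trap-demo.sh trap-demo.out

echo "$trap_seen (script exited with $trap_status)"
```

Whichever of the two interpretations the shell follows, the trap should see 42, because both the sub-shell and the final @command{exit} carry the same value.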

Limitations of Usual Tools

The small set of tools you can expect to find on any machine can still include some limitations you should be aware of.

@command{awk}
Don't leave white space before the parentheses in user function calls; GNU awk will reject it:
$ gawk 'function die () { print "Aaaaarg!"  }
        BEGIN { die () }'
gawk: cmd. line:2:         BEGIN { die () }
gawk: cmd. line:2:                      ^ parse error
$ gawk 'function die () { print "Aaaaarg!"  }
        BEGIN { die() }'
Aaaaarg!
If you want your program to be deterministic, don't depend on the order in which for traverses arrays:
$ cat for.awk
END {
  arr["foo"] = 1
  arr["bar"] = 1
  for (i in arr)
    print i
}
$ gawk -f for.awk </dev/null
foo
bar
$ nawk -f for.awk </dev/null
bar
foo
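One way to get the same output from every AWK, sketched here under the assumption that the keys contain no newlines, is to leave the traversal order unspecified and sort the result afterwards. The variable name `out' is only for this example.

```shell
# Print the keys in whatever order awk chooses, then impose an order
# with sort so that the result no longer depends on the awk in use.
out=`awk 'END {
  arr["foo"] = 1
  arr["bar"] = 1
  for (i in arr)
    print i
}' </dev/null | sort`
echo "$out"
```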
Some AWK implementations, such as HPUX 11.0's native one, have regex engines that mishandle inner anchors:
$ echo xfoo | $AWK '/foo|^bar/ { print }'
$ echo bar | $AWK '/foo|^bar/ { print }'
bar
$ echo xfoo | $AWK '/^bar|foo/ { print }'
xfoo
$ echo bar | $AWK '/^bar|foo/ { print }'
bar
Either do not depend on such patterns (i.e., use `/^(.*foo|bar)/'), or use a simple test to reject such AWK.
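The "simple test" could look like the following sketch, which replays the first failing case from the transcript above. The variable name `awk_inner_anchors' is hypothetical, and `AWK' is assumed to default to plain `awk' when unset.

```shell
AWK=${AWK-awk}
# A sane awk prints `xfoo' here; the fragile ones print nothing.
if test "`echo xfoo | $AWK '/foo|^bar/ { print }'`" = xfoo; then
  awk_inner_anchors=ok
else
  awk_inner_anchors=broken
fi
echo "inner anchors: $awk_inner_anchors"
```

A configure-style script could then refuse to use `$AWK' when the result is `broken'.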
@command{cat}
Don't rely on any option. The option @option{-v}, which displays non-printing characters, seems portable, though.
@command{cc}
When a compilation such as `cc foo.c -o foo' fails, some compilers (such as CDS on Reliant UNIX) leave a `foo.o'.
@command{cmp}
@command{cmp} performs a raw data comparison of two files, while @command{diff} compares two text files. Therefore, if you might compare DOS files, even if only checking whether two files are different, use @command{diff} to avoid spurious differences due to differences of newline encoding.
@command{cp}
SunOS @command{cp} does not support @option{-f}, although its @command{mv} does. It's possible to deduce why @command{mv} and @command{cp} are different with respect to @option{-f}. @command{mv} prompts by default before overwriting a read-only file. @command{cp} does not. Therefore, @command{mv} requires a @option{-f} option, but @command{cp} does not. @command{mv} and @command{cp} behave differently with respect to read-only files because the simplest form of @command{cp} cannot overwrite a read-only file, but the simplest form of @command{mv} can. This is because @command{cp} opens the target for write access, whereas @command{mv} simply calls link (or, in newer systems, rename).
@command{diff}
Option @option{-u} is nonportable. Some implementations, such as Tru64's, fail when comparing to `/dev/null'. Use an empty file instead.
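A minimal sketch of comparing against an empty file rather than `/dev/null'. The directory and file names are made up for the example.

```shell
tmp=${TMPDIR-/tmp}/dfx$$
mkdir "$tmp" || exit 1
: >"$tmp/empty"               # create an empty file to compare against
echo hello >"$tmp/new"
# Only the exit status is of interest, so discard the output.
if diff "$tmp/empty" "$tmp/new" >/dev/null 2>&1; then
  changed=no
else
  changed=yes
fi
rm -rf "$tmp"
echo "changed: $changed"
```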
@command{dirname}
Not all hosts have @command{dirname}, but it is reasonably easy to emulate, e.g.:
dir=`expr "x$file" : 'x\(.*\)/[^/]*' \|
          '.'      : '\(.\)'`
But there are a few subtleties, e.g., under UN*X, should `//1' give `/'? Paul Eggert answers:

No, under some older flavors of Unix, leading `//' is a special path name: it refers to a "super-root" and is used to access other machines' files. Leading `///', `////', etc. are equivalent to `/'; but leading `//' is special. I think this tradition started with Apollo Domain/OS, an OS that is still in use on some older hosts.

POSIX.2 allows but does not require the special treatment for `//'. It says that the behavior of dirname on path names of the form `//([^/]+/*)?' is implementation defined. In these cases, GNU @command{dirname} returns `/', but it's more portable to return `//' as this works even on those older flavors of Unix.

I have heard rumors that this special treatment of `//' may be dropped in future versions of POSIX, but for now it's still the standard.
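
To see how an @command{expr}-based emulation of @command{dirname} behaves, here is a small sketch applied to two sample names. The fallback branch uses `\(.\)' so that @command{expr} prints `.' rather than a count of matched characters. The variable names and the sample paths are only for illustration, and this simple form does not address the `//' question discussed above, nor files directly under `/'.

```shell
file=/usr/lib/libc.a
dir_abs=`expr "x$file" : 'x\(.*\)/[^/]*' \| '.' : '\(.\)'`

file=libc.a
dir_rel=`expr "x$file" : 'x\(.*\)/[^/]*' \| '.' : '\(.\)'`

echo "$dir_abs $dir_rel"
```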

@command{egrep}
The empty alternative is not portable; use `?' instead. For instance, with Digital Unix v5.0:
> printf "foo\n|foo\n" | egrep '^(|foo|bar)$'
|foo
> printf "bar\nbar|\n" | egrep '^(foo|bar|)$'
bar|
> printf "foo\nfoo|\n|bar\nbar\n" | egrep '^(foo||bar)$'
foo
|bar
@command{egrep} also suffers the limitations of @command{grep}.
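As a sketch of the `?' advice, a pattern meant to accept `foo', `bar' or an empty line can make the whole group optional instead of using an empty alternative. The input lines are made up for the example.

```shell
# `^(foo|bar)?$' replaces the non-portable `^(|foo|bar)$'.
out=`printf 'foo\n\nbaz\nbar\n' | egrep '^(foo|bar)?$'`
echo "$out"
```

Only the `baz' line should be filtered out.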
@command{expr}
No @command{expr} keyword starts with `x', so use @samp{expr x"word" : 'xregex'} to keep @command{expr} from misinterpreting word. Don't use length, substr, match and index.
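A short sketch of the `x' prefix, using a word that some @command{expr} implementations would otherwise take for a keyword. The variable names are only for this example.

```shell
word=length
# Without the prefix, `expr "$word" : ...' could be parsed as the
# `length' keyword on implementations that support it.
rest=`expr "x$word" : 'xl\(.*\)'`
echo "$rest"
```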
@command{expr (`|')}
You can use `|'. Although POSIX does require that `expr ''' return the empty string, it does not specify the result when you `|' together the empty string (or zero) with the empty string. For example:
expr '' \| ''
GNU/Linux and POSIX.2-1992 return the empty string for this case, but traditional Unix returns `0' (Solaris is one such example). In the latest POSIX draft, the specification has been changed to match traditional Unix's behavior (which is bizarre, but it's too late to fix this). Please note that the same problem does arise when the empty string results from a computation, as in:
expr bar : foo \| foo : bar
Avoid this portability problem by avoiding the empty string.
@command{expr (`:')}
Don't use `\?', `\+' and `\|' in patterns, they are not supported on Solaris. The POSIX.2-1992 standard is ambiguous as to whether `expr a : b' (and `expr 'a' : '\(b\)'') output `0' or the empty string. In practice, it outputs the empty string on most platforms, but portable scripts should not assume this. For instance, the QNX 4.25 native @command{expr} returns `0'. You may believe that one means to get a uniform behavior would be to use the empty string as a default value:
expr a : b \| ''
Unfortunately, this behaves exactly as the original expression; see the `@command{expr} (`|')' entry for more information. Older @command{expr} implementations (e.g., SunOS 4 @command{expr} and Solaris 8 @command{/usr/ucb/expr}) have a silly length limit that causes @command{expr} to fail if the matched substring is longer than 120 bytes. In this case, you might want to fall back on `echo|sed' if @command{expr} fails. Don't leave, there is some more! The QNX 4.25 @command{expr}, in addition to preferring `0' to the empty string, has a funny behavior in its exit status: it's always 1 when parentheses are used!
$ val=`expr 'a' : 'a'`; echo "$?: $val"
0: 1
$ val=`expr 'a' : 'b'`; echo "$?: $val"
1: 0

$ val=`expr 'a' : '\(a\)'`; echo "$?: $val"
1: a
$ val=`expr 'a' : '\(b\)'`; echo "$?: $val"
1: 0
In practice this can be a big problem if you are ready to catch failures of @command{expr} programs with some other method (such as using @command{sed}), since you may get the result twice. For instance:
$ expr 'a' : '\(a\)' || echo 'a' | sed 's/^\(a\)$/\1/'
will output `a' on most hosts, but `aa' on QNX 4.25. A simple workaround consists of testing @command{expr} and using a variable set to @command{expr} or to @command{false} according to the result.
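The workaround might be sketched as follows. The variable name `my_expr' and the sample path are hypothetical. The probe uses parentheses on purpose, so that an @command{expr} with the exit status problem described above is replaced by @command{false} and only the @command{sed} branch produces output.

```shell
# Probe once: does expr report success for a parenthesized match?
if expr a : '\(a\)' >/dev/null 2>&1; then
  my_expr=expr
else
  my_expr=false
fi

file=dir/name
# Use expr when it is trustworthy, otherwise fall back on echo|sed.
base=`$my_expr "x$file" : 'x.*/\(.*\)' || echo "$file" | sed 's,^.*/,,'`
echo "$base"
```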
@command{find}
The option @option{-maxdepth} seems to be GNU specific. Tru64 v5.1, NetBSD 1.5 and Solaris 2.5 @command{find} commands do not understand it.
@command{grep}
Don't use `grep -s' to suppress output, because `grep -s' on System V does not suppress output, only error messages. Instead, redirect the standard output and standard error (in case the file doesn't exist) of grep to `/dev/null'. Check the exit status of grep to determine whether it found a match. Don't use multiple regexps with @option{-e}, as some @command{grep} implementations will only honor the last pattern (e.g., IRIX 6.5 and Solaris 2.5.1). Anyway, Stardent Vistra SVR4 grep lacks @option{-e}... Instead, use alternation and egrep.
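Both pieces of advice can be sketched together. The file contents and variable names below are made up for the example.

```shell
tmp=${TMPDIR-/tmp}/grx$$
printf 'CC=gcc\nCFLAGS=-O2\nLIBS=\n' >"$tmp"

# Discard both output streams; only the exit status matters.
if grep '^CC=' "$tmp" >/dev/null 2>&1; then found=yes; else found=no; fi

# One egrep alternation instead of several -e patterns.
matches=`egrep '^(CC|CFLAGS)=' "$tmp"`

rm -f "$tmp"
echo "$found"
echo "$matches"
```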
@command{ln}
Don't rely on @command{ln} having a @option{-f} option. Symbolic links are not available on old systems; use `ln' as a fallback. For versions of DJGPP before 2.04, @command{ln} emulates soft links for executables by generating a stub that in turn calls the real program. This feature also works with nonexistent files like in the Unix spec. So `ln -s file link' will generate `link.exe', which will attempt to call `file.exe' if run. But this feature only works for executables, so `cp -p' is used instead for these systems. DJGPP versions 2.04 and later have full symlink support.
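A script can choose among `ln -s', `ln' and `cp -p' at run time by trying each in turn in a scratch directory. This is only a sketch: the variable name `LN_S' and the file names are chosen for the example.

```shell
tmp=${TMPDIR-/tmp}/lnx$$
mkdir "$tmp" || exit 1
echo data >"$tmp/file"
# Try the most capable command first and fall back step by step.
if (cd "$tmp" && ln -s file link) 2>/dev/null; then
  LN_S='ln -s'
elif (cd "$tmp" && ln file link) 2>/dev/null; then
  LN_S=ln
else
  LN_S='cp -p'
fi
rm -rf "$tmp"
echo "$LN_S"
```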
@command{mv}
The only portable options are @option{-f} and @option{-i}. Moving individual files between file systems is portable (it was in V6), but it is not always atomic: when doing `mv new existing', there's a critical section where neither the old nor the new version of `existing' actually exists. Moving directories across mount points is not portable, use @command{cp} and @command{rm}.
@command{sed}
Patterns should not include the separator (unless escaped), even as part of a character class. In conformance with POSIX, the Cray @command{sed} will reject `s/[^/]*$//': use `s,[^/]*$,,'. Sed scripts should not use branch labels longer than 8 characters and should not contain comments. Don't include extra `;', as some @command{sed}, such as NetBSD 1.4.2's, try to interpret the second as a command:
$ echo a | sed 's/x/x/;;s/x/x/'
sed: 1: "s/x/x/;;s/x/x/": invalid command code ;
Input should have reasonably long lines, since some @command{sed} have an input buffer limited to 4000 bytes. Alternation, `\|', is common but not portable. Anchors (`^' and `$') inside groups are not portable. Nested groups are extremely portable, but there is at least one @command{sed} (System V/68 Base Operating System R3V7.1) that does not support them. Of course the option @option{-e} is portable, but it is not needed. No valid Sed program can start with a dash, so it does not help disambiguate. Its sole use is to help enforce indentation, as in:
sed -e instruction-1 \
    -e instruction-2
as opposed to
sed instruction-1;instruction-2
Contrary to yet another urban legend, you may portably use `&' in the replacement part of the s command to mean "what was matched".
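A small sketch combining three of the points above: a comma as separator so that the character class may contain `/', one @option{-e} per instruction for layout, and `&' in the replacement. The sample path is made up.

```shell
# First strip the last path component, then wrap what is left in
# brackets, using `&' to stand for the matched text.
out=`echo /usr/local/bin/tool |
  sed -e 's,[^/]*$,,' \
      -e 's,.*,[&],'`
echo "$out"
```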
@command{sed (`t')}
Some old systems have @command{sed} that "forget" to reset their `t' flag when starting a new cycle. For instance on MIPS RISC/OS, and on IRIX 5.3, if you run the following @command{sed} script (the letters and numbers in the trailing comments are labels for the explanation, not part of the actual texts):
s/keep me/kept/g  # a
t end             # b
s/.*/deleted/g    # c
: end             # d
on
delete me         # 1
delete me         # 2
keep me           # 3
delete me         # 4
you get
deleted
delete me
kept
deleted
instead of
deleted
deleted
kept
deleted
Why? When processing 1, a fails to match, b does not jump, and c matches, which sets the t flag; the output is then produced. When processing line 2, the t flag is still set (this is the bug). Line a fails to match, but @command{sed} is not supposed to clear the t flag when a substitution fails. Line b sees that the flag is set, therefore it clears it, and jumps to d, hence you get `delete me' instead of `deleted'. When processing 3, t is clear, a matches, so the flag is set, hence b clears the flag and jumps. Finally, since the flag is clear, 4 is processed properly. There are two things to keep in mind about `t' in @command{sed}. Firstly, always remember that `t' jumps if some substitution succeeded, not only the immediately preceding substitution; therefore, always use a fake `t clear; : clear' to reset the t flag where needed. Secondly, you cannot rely on @command{sed} to clear the flag at each new cycle. One portable implementation of the script above is:
t clear
: clear
s/keep me/kept/g
t end
s/.*/deleted/g
: end
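To try the portable script on the four input lines used above, it can be passed one instruction per @option{-e}. This is only a sketch, and the variable name `out' is for the example.

```shell
out=`(echo 'delete me'; echo 'delete me'; echo 'keep me'; echo 'delete me') |
  sed -e 't clear' \
      -e ': clear' \
      -e 's/keep me/kept/g' \
      -e 't end' \
      -e 's/.*/deleted/g' \
      -e ': end'`
echo "$out"
```

Every `delete me' line should come out as `deleted', and the `keep me' line as `kept'.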
@command{touch}
On some old BSD systems, @command{touch} or any command that results in an empty file does not update the timestamps, so use a command like echo as a workaround. GNU @command{touch} 3.16r (and presumably all before that) fails to work on SunOS 4.1.3 when the empty file is on an NFS-mounted 4.2 volume.
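A sketch of the echo workaround for a timestamp file: writing a non-empty line guarantees that the file's contents actually change. The file name and the word written are arbitrary.

```shell
stamp=${TMPDIR-/tmp}/stamp$$
# Write something, rather than creating or truncating an empty file.
echo timestamp >"$stamp"
if test -s "$stamp"; then stamp_ok=yes; else stamp_ok=no; fi
rm -f "$stamp"
echo "$stamp_ok"
```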

Limitations of Make

Make itself suffers from a great number of limitations, only a few of which are listed here. First of all, remember that since commands are executed by the shell, all its weaknesses are inherited...

Leading underscore in macro names
Some Make implementations, such as the one on NEWS-OS 4.2R, don't support leading underscores in macro names.
$ cat Makefile
_am_include = #
_am_quote =
all:; @echo this is test

$ make
Make: Must be a separator on rules line 2.  Stop.

$ cat Makefile2
am_include = #
am_quote =
all:; @echo this is test

$ make -f Makefile2
this is test
VPATH
Don't use it! For instance any assignment to VPATH causes Sun @command{make} to only execute the first set of double-colon rules.

