Skip to content

Latest commit

 

History

History
314 lines (234 loc) · 10.6 KB

shbb.org

File metadata and controls

314 lines (234 loc) · 10.6 KB

This file contains a steadily increasing set of shell script constructs that are useable in real-world applications. In case any of these are not portable to reasonable traditional shell environments those are stated explicitly.

beginning of a shell script

The most reliable way to start shell script is to have string #!/bin/sh as the first line. In that case portability should be taken care very well. Perhaps in these days requirement to support traditional bourne shell constructs could (in most cases) be skipped (and ported whenever necessary (if ever)). Modern shell constructs are just so much convenient to use, and tested.

#!/bin/sh may be dash (linux), bash (linux) and something that works pretty much like dash in bsd systems. I’d guess newer solaris derivatives use something ksh related. Android devices have something based on mksh. macOS uses bash 3.2.

To make the shell script more robust, the next line should be:

set -eu

Every shell script should start with this line. set -e makes script exit in case any command returns nonzero (with exceptions) and set -u makes undefined variable to be an error (which makes script exit at least with -e).

The use of these is somewhat related to warning options in C compilation and use strict; use warnings in perl program.

With set -e the script must take care of the cases when command may return nonzero. The methods are e.g. if command; then ... fi, command [ && ... ] || ... and while command; do ... done and so on…

Some commands, like expr returns nonzero in special ways, which is somewhat unfortunate in this context.

But in some cases the nonzero exit value of the command does not cause the shell to exit. A non-last command in pipeline is obvious case, but also at least in 2 command substitution cases: command arguments (printf %s\\n `false`; echo $?) or when variable is declared with e.g. `export` and `readonly` (readonly rov=`false`) (there are more of these, e.g. local – but these don’t work on all modern shells…).

New to me (2021-05): The following does not cause shell to exit:

false && true (i.e not false && true || true)

Between executed commands. If false && true is last in a function and the caller does not handle nonzero return value of a function, shell will exit (i.e. funcall vs. funcall && : – the difference with funcall || : is that former sets $? to some nonzero value and with || : the $? will be zero (0)).

Ok, let’s sample (before short information of so important set -u)

$ sh -ec 'fn() { false && :; }; fn; echo foo'
zsh: exit 1     sh -ec 'fn() { false; }; fn; echo $?'
$ sh -ec 'fn() { false; }; fn && :; echo $?'
1
$ sh -ec 'fn() { false; }; fn || :; echo $?'
0
$ sh -ec 'fn() { false && :; :; }; fn; echo $?'
0

set -u just means all variables needs to be introduced before referenced; with positional parameters ${1-}fallback” can be used when there is just no other suitable options.

set -f

set -f makes script not do filename expansion using glob patterns. This can make the script more robust – just that is initially (like with -eu developers forget this and the filename expansions they try fails…)

#!/bin/bash --posix

In case script is bash, appending --posix to the command line gives chance that the developer script works more like simpler modern shells. That also prohibits some shell builtins to be overwritten by bash functions defined in environment (like unset () { echo rm -rf /; }), which is at least a bit of a security enchancement.

#!/usr/bin/env command

Systems have /bin/sh for sure, but e.g. bash may be somewhere else in $PATH. The common convention is to use #!/usr/bin/env to run the another shell. With that no further options can be added to the command line (but look ideas in templates/ directory for that, if needed).

new

Latest info: since bash 4.3 after declaring (associative) arrays, some access of those gives ‘unbound variable’ errors. To fix these, add =() to the declare line; as an example

declare -A associative_array=()

Just tested:

$ bash -c 'set -u; declare -a a; echo ${#a[*]}'
bash: a: unbound variable
$ bash -c 'set -u; declare -a a=(); echo ${#a[*]}'
0

fixing script environment

In year 2002 I was writing a shell script on a Solaris box. For convenience it had e.g. GNU coreutils software installed on /opt/gnu/bin (which was included in $PATH). After I got the script ready, I shipped it to another machine, where it – (un)surprisingly – failed to work properly.

As all of you have guessed, the problem was using GNU extensions in the script. Since then I’ve had

PATH='/sbin:/usr/sbin:/bin:/usr/bin'; export PATH

(with some variations) in near beginning of my shell scripts…

Whenever needed in some systems, the PATH= line is outcommented, but in general it is there.

Another thing that is usually there is:

LANG=C LC_ALL=C; export LANG LC_ALL; unset LANGUAGE

Too bad there are no other locales than C there by default. Latest try was C.UTF-8 but that still failed to exist on some systems. Previously I’ve set en_US.UTF-8 but there is some pain points with e.g. script(1) printing dates in 12 hour format (awful!). I’d use en_IE.UTF-8 but that is even more uncommon than C.UTF-8.

With C, python3(1) suffers. With any unknown local, perl(1) shrieks. Based on the needs I have to set the locale variables accordingly, change in mid-script or (sometimes) drop the setting altogether.

Sometimes the environment variables must be set exactly: then the only way may be:

test ${__CLEAR_ENV} = yep ||
    exec /usr/bin/env -i __CLEAR_ENV=yep "USER=$USER" ... /bin/sh "$0" "$@"
unset __CLEAR_ENV

Ulimit and umask may matter. Note that in bash umask can take same symbolic options as chmod() but e.g. dash cannot (dash: 1: umask: Illegal mode: umask)

command substitution

Command substitution allows the output of a command to be substituted in place of the command name itself.

There are 2 formats for command substitution; the legacy “backquoted”:

`command [args]`

and the new:

$(command [args])

I like to use the `legacy` version because (whenever the command line is “simple”):

  • it looks clearer
  • it is portable to older shells
  • it works on Amigashell ;)

These formats works almost the same, but not exactly. e.g.:

$ printf %s\\n `printf %s 's/\\(.\\)/\\1/'`
s/\(.\)/\1/
$ printf %s\\n $(printf %s 's/\\(.\\)/\\1/')
s/\\(.\\)/\\1/

WIP

more or less the rest of this script is work in progress – I decided not to delete those, in case someone finds it useful…

$IFS

IFS – the internal field separator, is a variable containing (multiple ascii characters) that are used to split unquoted $variables to separate arguments to function or command. By default it contains characteds space, tab and newline (in this order) (but to make things more complicated, e.g. zsh appends \0 to this list, making e.g. extraction of single characters from this variable harder…)

By changing contents of $IFS to something else, many things can be done within a shell script, to be able to restore IFS to it’s original value following lines can be added to the beginning of shell script.

saved_IFS=$IFS
readonly saved_IFS

XXX move next content elsewere, and continue with what can be done with changed $IFS

Note that the syntax readonly var=$val was not used; It is not portable to traditional shell; also that format has subtle behavior differences(*), and for example readonly $var=$val will assign val to variable named by expanded value of var (this applies also to export $var=$val and so on).

(*) in bash readonly var=$IFS would not assign full IFS to var, $IFS would need to be quoted there (which is not normally needed).

set --

(use instead of set -)

Traditionally set - works as set +xv (but this is not echoed!) – just that zsh in native mode clears positional parameters.. AARGH!

In zsh – set - "$@” would work, too…

to test:

for shell in zsh bash ksh dash sh
do
        $shell -c 'set a b c; echo $#; set -; echo $#'
        $shell -c 'set a b c; echo $#; set - "$@"; echo $#'
done

Could not find zsh option to “disable” this feature (sh_option_letters did not do it). emulate ksh does it

which…, hash… – command -v

(New info 2023-03: zsh command -v outputs aliases as those are defined, and just the names of functions, so it cannot be used to find path to an executable for sure. That is unfortunate. (++) below.)

Often, one needs to know whether a particular executable is available in the system. In many systems which, hash and command -v could be used to figure this out, but…

which, while often printing path to executable and exiting zero in case command is found, in some systems prints output to stdout even command is not found and in some systems don’t exit nonzero even command if not found.

Which is also not usually shell builtin (only in zsh), requiring shell to do execve(2) in addition to 1 or 2 fork(2) s

The user-visible hash behaviour is exit value, being zero when command is found and most often nonzero when not found – but ksh exits zero even command is not found.

command -v outputs path when found and exits nonzero when not – except on zsh – see (++) above. Therefore dropping suggestion to use command -v to store path of an executable (perhaps use which on zsh, and command -v on other shell - or use iwhich shown below). One note about the -v option of command on /Modern/(*) shell below.

(*) http://pubs.opengroup.org/onlinepubs/009695399/utilities/command.html mentions that the -v option might not be available in all shells that claims to have POSIX compatibility – all Modern shells I’ve tested have this feature, though.

To have which functionality that works with all shells, one could use the following (which may even be fastest as no forks required).

iwhich ()
{
        case $1 in */*)
                test -x "$1" || return 1
                case $# in 3) eval $3=\$1 ;; *) return 1 ;; esac
        esac
        IFS=:
        for _v in $PATH
        do      test -x "$_v/$1" || continue
                _v=$_v/$1
                case $# in 3) eval $3=\$_v ;; *) eval $1=\$_v ;; esac
                IFS=$saved_IFS
                return 0
        done
        IFS=$saved_IFS
        return 1
}

Now, iwhich ls would assign path of ls to variable $ls.

And, iwhich ssh-agent as ssh_agent would assing path of ssh-agent to variable ssh_agent.