Bash scripting for the reluctant
Should I read this?
This is a consolidated reference meant to help those already familiar with bash become more comfortable with it and leverage its strengths. It links to informative references that explain their subject matter better than I can.
Concepts and techniques
Writing robust scripts
There are several options for configuring bash to behave more sanely in the presence of surprises. Some commands also have options for running in a more fail-friendly manner.
- set -u or set -o nounset
- set -e or set -o errexit
- set -o pipefail
- mkdir -p
- rm -f
- quoting variable references, as in "$@"
- find -print0 ... | xargs -0 ...
- trap
- set -o noclobber
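Of these, trap deserves a quick illustration: it registers cleanup code that runs however the script exits. A minimal sketch (the temp-dir workflow is invented for illustration):

```shell
#!/bin/bash
# trap-example: clean up a temporary directory however the script exits.
set -eufo pipefail

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT   # runs on normal exit and on errors under set -e

printf '%s\n' 'intermediate data' > "$workdir/scratch"
# ... work with "$workdir"; no explicit cleanup needed at the end ...
```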
If anything in this list falls outside of your comfort zone, read this before continuing.
Also, unless your script intentionally makes use of pathname expansion (aka globbing), you should disable it via set -f. If you do make use of globbing, you should use shopt -s failglob to produce errors for non-matching patterns.
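A small sketch makes the difference concrete (the non-matching pattern is a placeholder):

```shell
#!/bin/bash
# globbing-example: pathname expansion disabled, then made strict.
set -f                 # globbing disabled: the pattern is passed through literally
printf '%s\n' /tmp/*   # prints the literal string: /tmp/*
set +f                 # re-enable globbing

shopt -s failglob      # a pattern that matches nothing is now an error
if ( printf '%s\n' /definitely/no/such/path-* ) 2>/dev/null; then
  printf '%s\n' 'pattern matched something'
else
  printf '%s\n' 'failglob reported an error'
fi
```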
Use printf instead of echo
There are serious portability concerns with using echo that can lead to nasty surprises. Use the basic feature set of printf instead:
printf '%s\n' "$var"
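The surprises mostly involve arguments that look like options or contain backslashes; a sketch of where implementations disagree:

```shell
#!/bin/bash
# echo-vs-printf: the cases where echo implementations diverge.
var='-n'
echo "$var"            # bash's builtin echo treats -n as an option and prints nothing
printf '%s\n' "$var"   # always prints: -n

msg='a\tb'
echo "$msg"            # prints 'a\tb' or 'a<TAB>b', depending on shell and flags
printf '%s\n' "$msg"   # always prints: a\tb
```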
Special parameters and variables
It's not important to memorize what $-, $_, etc. are for, but make sure this list doesn't contain any surprises. You should also be familiar with $PPID, which holds the process ID of the current shell's parent.
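A quick way to take inventory is to print a few of them; run this as a script to see the values for its own process:

```shell
#!/bin/bash
# special-params-example: print a few special parameters and variables.
printf 'script name:  %s\n' "$0"
printf 'this PID:     %s\n' "$$"
printf 'parent PID:   %s\n' "$PPID"
printf 'arg count:    %s\n' "$#"
printf 'option flags: %s\n' "$-"
```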
Script-relative paths
If your script is bundled with accompanying files, you will want to reference the paths of these files in terms of the current script's location. Doing so allows you to both relocate the bundle and invoke the script from anywhere without breaking any of the relative file references. Read this for more options and detail.
#!/bin/bash
# dir-example
set -eufo pipefail
here="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
# refer to the 'data' file living in the script directory
... "$here/data" ...
Passing arguments by name
Passing arguments by name (or keyword), rather than position, makes them self-documenting. When passing by position, it's easier to mistakenly transpose two arguments.
A typical solution to processing script arguments is to use something like getopts.
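The standard getopts loop looks like this sketch; the option letters and defaults are invented for illustration:

```shell
#!/bin/bash
# getopts-example: parse a flag (-v) and an option with an argument (-o FILE).
set -euf -o pipefail

verbose=0
outfile=out.txt
while getopts 'vo:' opt; do
  case "$opt" in
    v) verbose=1 ;;
    o) outfile="$OPTARG" ;;
    *) exit 2 ;;             # getopts already printed an error message
  esac
done
shift $((OPTIND - 1))        # what remains in "$@" is the positional arguments

printf 'verbose=%s outfile=%s rest=%s\n' "$verbose" "$outfile" "$*"
```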
An alternative way to pass arguments by name is to assign them to variables.
#!/bin/bash
# var-arg-example
set -eufo pipefail
# ARG1 is required.
# ARG2 is optional; it will be assigned to 'default' if unset.
# Use := instead of = to also perform assignment when ARG2 is the empty string.
: "${ARG2=default}"
printf '%s %s\n' "$ARG1" "$ARG2"
# Invoke var-arg-example.
> ./var-arg-example
#./var-arg-example: line 8: ARG1: unbound variable
> export ARG1=foo
> ./var-arg-example
#foo default
> ARG2=bar ./var-arg-example
#foo bar
> ARG1=crow ARG2=bar ./var-arg-example
#crow bar
There is a trade-off here with regard to name clashes. The upside is that using global variables allows arguments either to be passed inline explicitly, or to be exported and reused across multiple invocations with the option of overriding them. The downside is the potential for unintentional clashes with existing global variables.
Input and output
Manipulation of standard I/O in bash is somewhat indirect. Unless using redirections, stdio is implicitly consumed/produced by subcommands. Here are some useful formulas.
Consume part of stdin, assigning to variables, using read.
> read -n 3 threechars # Your input ends on the same line as the next prompt.
abc> printf '%s\n' "$threechars"
abc
> read line # This time, the next prompt starts on its own line. Why?
#this input is read until you hit enter
> printf '%s\n' "$line"
#this input is read until you hit enter
Consume all of stdin using cat.
ALL_OF_STDIN=$(cat) # this also demonstrates command-substitution
... use $ALL_OF_STDIN ...
Consume all of stdin, writing it to a file while also sending it to stdout, using tee.
# Here is another use of tee.
printf '%s\n' 'important config' | sudo tee /etc/normally-cannot-write-here > /dev/null
# Note that the following will *not* normally succeed.
sudo printf '%s\n' 'important config' > /etc/normally-cannot-write-here
Capture the stdout of a command as a string with command substitution.
printf '%s\n' "Today is $(date)"
Treat the I/O of a command like a file (more accurately, like a pipe) with process substitution.
diff data <(uniq data)
Redirect the current script's standard I/O using exec. More generally, use exec to manipulate arbitrary file descriptors.
printf '%s\n' 'this goes to stdout'
exec > some-file
printf '%s\n' 'this goes to some-file'
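Descriptor manipulation with exec goes beyond redirecting stdout. A sketch of opening, reading from, and closing a numbered descriptor (the file contents are invented):

```shell
#!/bin/bash
# fd-example: open, read from, and close an arbitrary file descriptor.
set -eufo pipefail

tmp=$(mktemp)
printf '%s\n' 'first line' > "$tmp"

exec 3< "$tmp"     # open the file for reading on descriptor 3
read -r line <&3   # consume one line from it
exec 3<&-          # close the descriptor
printf '%s\n' "$line"

rm "$tmp"
```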
Flexibly tie the I/O of processes together using named pipes.
> mkfifo to-show
> ls -l
#prw-r--r-- 1 user user 0 ... to-show|
> {
> printf '%s\n' show > to-show
> printf '%s\n' these > to-show
> printf '%s\n' lines > to-show
> } &
#[1] 1234
> jobs
#[1]+ Running { printf '%s\n' show > to-show;
# printf '%s\n' these > to-show;
# printf '%s\n' lines > to-show; } &
> cat < to-show
#show
#these
#lines
#[1]+ Done { printf '%s\n' show > to-show;
# printf '%s\n' these > to-show;
# printf '%s\n' lines > to-show; }
Describe input in-situ using 'here documents'.
cat << EXAMPLEDOC
All of these lines
are treated as input
EXAMPLEDOC
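One detail worth knowing: quoting the here-document delimiter suppresses expansion inside the body. A sketch:

```shell
#!/bin/bash
# heredoc-example: unquoted vs quoted delimiters.
name=world

cat << EXPANDED
Hello, $name
EXPANDED
# prints: Hello, world

cat << 'LITERAL'
Hello, $name
LITERAL
# prints: Hello, $name
```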
Manipulate an entire directory tree as a stream using tar. Note that -C must precede the directory it applies to.
tar cpvf - -C ORIGIN DIR | ssh -T -e none REMOTE-HOST 'tar xvf - -C DESTINATION'
You should also read about relational text processing.
Concurrency
Append a & to a command to run it concurrently with the remainder of a script's execution. Pause the script's execution until the child processes terminate using wait.
# With multiple processors, some of these may be able to run in parallel
for i in {1..10}; do
slow_process < "input$i" > "output$i" &
done
wait
... # use output{1..10}
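wait can also target a single PID, in which case it reports that child's exit status. A sketch, with a trivial child standing in for real work:

```shell
#!/bin/bash
# wait-status-example: collect one child's exit status.
set -uf

(exit 3) &        # stand-in for a real background job that fails with status 3
child=$!

if wait "$child"; then
  printf '%s\n' 'child succeeded'
else
  printf 'child failed with status %s\n' "$?"
fi
```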
Here is an example demonstrating more interaction with a child process.
# Though this example is a bit too simplistic, it's often a good idea to create
# a temporary working directory to store data related to each child process.
> CHILD_DIR=$(mktemp -d child.XXXXXXXXXX)
> printf '%s\n' "$CHILD_DIR"
#child.iu9Ncsshzc
# Set up an output channel to receive messages from the child process.
> CHILD_OUT="$CHILD_DIR/out"
> mkfifo "$CHILD_OUT"
# Launch a process that outputs the result of its "work" every 5 seconds.
> {
> while true; do
> sleep 5
> printf '%s\n' 'work asynchronously' > "$CHILD_OUT"
> done
> } &
#[1] 12345
# Remember its PID.
> CHILD_PID="$!"
> printf '%s\n' "$CHILD_PID"
#12345
> jobs
#[1]+ Running { while true; do
# sleep 5; printf '%s\n' 'work asynchronously' > "$CHILD_OUT";
#done; } &
# Pull some results out of the channel.
> cat < "$CHILD_OUT"
#work asynchronously
# The child process blocks when writing to the channel until we try to read.
# At most one result will be queued up at a time. If we immediately try
# reading a second time, we notice a pause due to the child sleep.
> cat < "$CHILD_OUT"
#(up to a 5-second pause)
#work asynchronously
# Thanks to the while loop, the child process will continue until we are ready
# to stop it.
> kill "$CHILD_PID"
#[1]+ Terminated { while true; do
# sleep 5; printf '%s\n' 'work asynchronously' > "$CHILD_OUT";
#done; }
# Clean up.
> rm -r "$CHILD_DIR"
Read more about these and other job control commands.
Avoid running multiple instances of a command at the same time by using flock.
# /etc/cron.d/special
... typical cron things ...
# Run our special process every 5 minutes.
# Run it with flock to prevent overlap if it runs for longer than 5 minutes.
*/5 * * * * user flock -n /tmp/special-process.lock /path/to/special-process
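flock can also be used from inside a script to serialize itself; the descriptor number and lock path below are arbitrary choices:

```shell
#!/bin/bash
# flock-example: serialize a script against itself from the inside.
set -eufo pipefail

exec 200> /tmp/special-process.lock
if ! flock -n 200; then
  printf '%s\n' 'another instance is already running' >&2
  exit 1
fi

# The lock is held until this process exits and the descriptor closes.
printf '%s\n' 'doing the special work'
```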
Effective use of ssh
The ssh command is more than just a way to interactively log into a remote host. It allows remote command execution, transferring files, and various forms of proxying. This explains more.
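For repeated non-interactive use, a client configuration entry keeps invocations short and lets one connection be reused across commands; the host names and paths below are placeholders:

```
# ~/.ssh/config (host names and paths are illustrative)
Host shortname
    HostName real.example.com
    User deploy
    IdentityFile ~/.ssh/id_ed25519
    # Reuse one TCP connection for multiple ssh/scp invocations.
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h:%p
    ControlPersist 10m
```

With this in place, `ssh shortname 'uptime'` runs a remote command without re-authenticating on every call.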
All of the above links
- Writing Robust Bash Shell Scripts
- Pathname expansion (globbing)
- Why is printf better than echo?
- What are the special dollar sign shell variables?
- Can a Bash script tell what directory it's stored in?
- How do I parse command line arguments in bash?
- Catching user input
- Command Substitution
- Process Substitution
- Using exec
- Named Pipes (FIFOs - First In First Out)
- Here Documents
- Relational shell programming
- Job Control Commands
- How to prevent a script from running simultaneously?
- SSH: More than secure shell