Resilient Bash starter template

By Niels.

Shell scripts are quick, simple and great, until they are not.

Over the years I have worked on serious production-level issues that turned out to be caused by shell scripts, sometimes burning days of engineering time. And they're everywhere, in virtually every codebase, in CI/CD systems and probably all over your computer.

Commonly, in these kinds of issues, a variation of the following factors played a role:

  1. Scripts have a low barrier to entry. They are often written in a hurry, in an offensive, to-the-point way, just executing a couple of commands quickly.

  2. They tend to stick around over time, leading to logic accumulating in .sh-files and complexity increasing exponentially. Especially when multiple people contribute, scripts grow ugly, fast.

  3. Often, scripts fail silently by continuing to execute, because explicit error handling wasn't added. This can lead to all sorts of issues, e.g. empty variables resulting in unintended command execution. Or in the worst case, data loss.

  4. There's no import statement in shell, which almost always results in scripts that never check whether the commands they call exist. Once executed on another computer or another OS, this often leads to silent failure.

  5. The script is executed differently than how it was intended to be run, for example from a different working directory (../../fix-it.sh) or without checking input parameters.

To prevent many of these issues on both production and on my own computer, I've written a template starter script with some best practices applied. It's generic, simple, pure Bash and allows me (and you) to quickly adjust it.

The Template

#!/usr/bin/env bash

# PRE-FLIGHT: establish the path this script is placed in.
CWD="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd -P )"

# PRE-FLIGHT: runtime requirements.
for cmd in "mkdir" "rmdir" "mv"; do
  if [[ -z $(command -v "${cmd}") ]]; then
    echo "ERROR: ${cmd} not in PATH!"; exit 127;
  fi
done

# PRE-FLIGHT: Check if the vendor/ folder exists.
if [ ! -d "$CWD/vendor" ]; then
  echo "ERROR: ${CWD}/vendor/ not found!"; exit 1;
fi

# PRE-FLIGHT: Check if settings.ini exists.
if [ ! -f "$CWD/settings.ini" ]; then
  echo "ERROR: ${CWD}/settings.ini not found!"; exit 1;
fi

###########################################################

# Run a bunch of commands, jump to || immediately upon first failing command:
true \
  && mkdir -p "${CWD}/a/b/c" \
  && rmdir "${CWD}/a/b/c" \
  && mv "${CWD}/a/b" "${CWD}/a/d" \
  && rmdir "${CWD}/a/d" \
  && rmdir "${CWD}/a" \
  || { echo "ERROR: Failed to make+remove some useless folders!"; exit 1; }

# Do the next important thing:
echo "Hello world!"

Let's break the template down:

Shebang

#!/usr/bin/env bash

This shebang may look obvious to many, but quite often one still sees #!/bin/bash or #!/bin/sh. These are problematic because they are not portable: Bash is not guaranteed to live at /bin/bash, and /bin/sh is often not Bash at all. This frequently manifests on other operating systems or inside trimmed-down Docker images without Bash installed.
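To see what the env shebang would actually resolve to on a given machine, you can ask PATH directly (a quick sketch, assuming bash is on your PATH at all):

```shell
# "#!/usr/bin/env bash" simply searches PATH for "bash", so this shows
# which interpreter the shebang would pick on this machine:
resolved="$(command -v bash)"
echo "env would run: ${resolved}"
```

On Linux this typically prints /usr/bin/bash; on macOS with a Homebrew-installed Bash it may be something like /opt/homebrew/bin/bash, a path a hardcoded #!/bin/bash would never find.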

Absolute path to working directory

# PRE-FLIGHT: establish the path this script is placed in.
CWD="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd -P )"

This ugly-looking one-liner creates the ${CWD} variable, short for "current working directory", which points to the absolute path of the directory the script resides in.

Let's say, this script is stored at /home/me/project/template.sh.

And the project folder looks like this:

project/
├── settings.ini
├── template.sh
└── vendor

However, you can run a script in various different ways:

  • $ ./template.sh
  • $ /home/me/project/template.sh
  • $ ../template.sh

And running it from another directory - for example from within a CI/CD pipeline - means that the real "current working directory" can and will differ. So a simple cat settings.ini run from your script will fail whenever the script isn't executed as ./template.sh from inside the project folder.

This ${CWD} variable is guaranteed to be /home/me/project, no matter which directory you run it from.

So instead of writing:

echo "setting = ${value}" >> settings.ini

Do:

echo "setting = ${value}" >> "${CWD}/settings.ini"

It's certainly uglier to read, but robust if applied consistently.
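The one-liner is easiest to understand by running it: dirname strips the filename off the script's own path (${BASH_SOURCE[0]}), cd jumps there, and pwd -P prints the absolute, symlink-resolved result. A minimal sketch:

```shell
# Resolve the directory this script lives in, independent of the caller's
# working directory:
CWD="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd -P )"
echo "script lives in: ${CWD}"

# Files next to the script can now always be addressed absolutely
# (settings.ini is the hypothetical file from the example layout):
settings="${CWD}/settings.ini"
[ -f "${settings}" ] || echo "(no settings.ini next to this script)"
```

Run it from /, from your home directory, or via a relative path; the printed directory stays the same.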

Poor man's import statement

# PRE-FLIGHT: runtime requirements.
for cmd in "mkdir" "rmdir" "mv"; do
  if [[ -z $(command -v "${cmd}") ]]; then
    echo "ERROR: ${cmd} not in PATH!"; exit 127;
  fi
done

It probably looks convoluted to be checking for the mv command. However, the idea here is to always keep this at the top of your scripts and keep it in sync with the commands you call from your script.

Essentially turning it into some sort of import statement, e.g.:

for cmd in "kubectl" "kubectx" "helm"; do

It doesn't really import of course, but exits immediately when said command can't be found. This helps prevent many common issues due to uninstalled binaries and avoids silent failures.

Plus, by exiting with 127, other scripts or systems also know your script failed:

$ ./template.sh
ERROR: helm not in PATH!
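One tweak I sometimes make (my own variation, not part of the template): collect all missing commands first and fail once with the complete list, so the user doesn't have to fix them one at a time:

```shell
# Hypothetical command list; "no-such-tool-xyz" stands in for an
# uninstalled binary.
missing=()
for cmd in "mkdir" "mv" "no-such-tool-xyz"; do
  command -v "${cmd}" >/dev/null 2>&1 || missing+=("${cmd}")
done
if (( ${#missing[@]} > 0 )); then
  echo "ERROR: not in PATH: ${missing[*]}"
  # exit 127   # enable this in a real script; left out so the demo continues
fi
```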

Folder and File Checking

# PRE-FLIGHT: Check if the vendor/ folder exists.
if [ ! -d "$CWD/vendor" ]; then
  echo "ERROR: ${CWD}/vendor/ not found!"; exit 1;
fi

# PRE-FLIGHT: Check if settings.ini exists.
if [ ! -f "$CWD/settings.ini" ]; then
  echo "ERROR: ${CWD}/settings.ini not found!"; exit 1;
fi

These two checks I often remove, as not every shell script depends on the existence of other folders or files. However, such dependencies are very common, and just like checking for commands at the top of your script, it's good to verify up front that required files and folders exist.

Let's say my script lives outside the project/ folder, directly in /home/me/, and I run:

$ ./template.sh && pwd
ERROR: /home/me/vendor/ not found!

As you can see, I appended && pwd to run immediately afterwards, but it never ran: the script exited with a non-zero status.
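A small extension of the same idea (my own, not from the template): existence alone doesn't guarantee the script can actually use the file, so checking readability is a cheap extra safeguard. A sketch using a scratch file:

```shell
tmp="$(mktemp)"                 # scratch file standing in for settings.ini
echo "setting = 1" > "${tmp}"

if [ ! -f "${tmp}" ]; then
  result="missing"
elif [ ! -r "${tmp}" ]; then
  result="unreadable"
else
  result="readable"
fi
echo "settings file is ${result}"

rm -f "${tmp}"
```

In a real script, each of the first two branches would echo an ERROR and exit 1, just like the template's checks.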

Running commands

# Run a bunch of commands, jump to || immediately upon first failing command:
true \
  && mkdir -p "${CWD}/a/b/c" \
  && rmdir "${CWD}/a/b/c" \
  && mv "${CWD}/a/b" "${CWD}/a/d" \
  && rmdir "${CWD}/a/d" \
  && rmdir "${CWD}/a" \
  || { echo "ERROR: Failed to make+remove some useless folders!"; exit 1; }

# Do the next important thing:
echo "Hello world!"

Yikes, that looks ugly.

Those with knowledge of Bash will immediately see that the intent here is to chain command execution and stop at the first failure, exactly what set -e does.

However, the || { echo "ERROR"; exit 1; } at the end also strongly nudges the script's author to handle errors explicitly. This doesn't just return a non-zero status code, it also allows error messages or cleanup commands to be run.

Plus, many people touching shell scripts may not understand set -e.

Reading the above, it becomes obvious that the commands below the chain will not run upon failure:

$ ./template.sh
rmdir: failed to remove '/home/me/a/b/c': Directory not empty
ERROR: Failed to make+remove some useless folders!
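For comparison, here is the same chain written with set -e, the more common approach the template deliberately avoids, with a trap on ERR restoring the explicit error message. This is a sketch using a scratch directory so it's safe to run anywhere:

```shell
set -euo pipefail
trap 'echo "ERROR: a command failed (line ${LINENO})!"' ERR

tmp="$(mktemp -d)"              # scratch area instead of ${CWD}
mkdir -p "${tmp}/a/b/c"
rmdir "${tmp}/a/b/c" "${tmp}/a/b" "${tmp}/a"
rmdir "${tmp}"
echo "all folders made and removed"
```

Note that set -e has well-known pitfalls: it is ignored inside if/while conditions and on the left-hand side of || and &&, which is part of why the template spells the chain out instead.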

Wrap-up

There's many shell script templates out there and this one is neither perfect nor beautiful.

However, the point of copy-pasting and adjusting it is:

  • To write defensive shell scripts.
  • To avoid as many silent failures as possible.
  • To teach others reading your script to maintain best practices.

And before I forget!

Check out the ShellCheck linter, which, with its excellent documentation, has saved me countless times from ancient Bash peculiarities. Set it up in your CI/CD for automated quality control and it will help you avoid plenty of nasty issues.