Advanced Bash-Scripting Guide

From LinuxReviews
Jump to navigationJump to search

An in-depth exploration of the art of shell scripting

Based on the TLDP guide by Mendel Cooper, 2014.

Dedication

For Anita, the source of all the magic

Part 1. Introduction[edit | edit source]

"Script: A writing; a written document. [Obs.]"

--Webster's Dictionary, 1913 ed.

The shell is a command interpreter. More than just the insulating layer between the operating system kernel and the user, it's also a fairly powerful programming language. A shell program, called a script, is an easy-to-use tool for building applications by "gluing together" system calls, tools, utilities, and compiled binaries. Virtually the entire repertoire of UNIX commands, utilities, and tools is available for invocation by a shell script. If that were not enough, internal shell commands, such as testing and loop constructs, lend additional power and flexibility to scripts. Shell scripts are especially well suited for administrative system tasks and other routine repetitive tasks not requiring the bells and whistles of a full-blown tightly structured programming language.

Chapter 1. Shell Programming![edit | edit source]

"No programming language is perfect. There is not even a single best language; there are only languages well suited or perhaps poorly suited for particular purposes."

--Herbert Mayer

A working knowledge of shell scripting is essential to anyone wishing to become reasonably proficient at system administration, even if they do not anticipate ever having to actually write a script. Consider that as a Linux machine boots up, it executes the shell scripts in /etc/rc.d to restore the system configuration and set up services. A detailed understanding of these startup scripts is important for analyzing the behavior of a system, and possibly modifying it.

The craft of scripting is not hard to master, since scripts can be built in bite-sized sections and there is only a fairly small set of shell-specific operators and options [1] to learn. The syntax is simple -- even austere -- similar to that of invoking and chaining together utilities at the command line, and there are only a few "rules" governing their use. Most short scripts work right the first time, and debugging even the longer ones is straightforward.

In the early days of personal computing, the BASIC language enabled
anyone reasonably computer proficient to write programs on an early
generation of microcomputers. Decades later, the Bash scripting
language enables anyone with a rudimentary knowledge of Linux or
UNIX to do the same on modern machines.
We now have miniaturized single-board computers with amazing
capabilities, such as the Raspberry Pi.
Bash scripting provides a way to explore the capabilities of these
fascinating devices.

A shell script is a quick-and-dirty method of prototyping a complex application. Getting even a limited subset of the functionality to work in a script is often a useful first stage in project development. In this way, the structure of the application can be tested and tinkered with, and the major pitfalls found before proceeding to the final coding in C, C++, Java, Perl, or Python.

Shell scripting hearkens back to the classic UNIX philosophy of breaking complex projects into simpler subtasks, of chaining together components and utilities. Many consider this a better, or at least more esthetically pleasing approach to problem solving than using one of the new generation of high-powered all-in-one languages, such as Perl, which attempt to be all things to all people, but at the cost of forcing you to alter your thinking processes to fit the tool.

According to Herbert Mayer, "a useful language needs arrays, pointers, and a generic mechanism for building data structures." By these criteria, shell scripting falls somewhat short of being "useful." Or, perhaps not. . . .

When not to use shell scripts

  • Resource-intensive tasks, especially where speed is a factor (sorting, hashing, recursion [2] ...)
  • Procedures involving heavy-duty math operations, especially floating point arithmetic, arbitrary precision calculations, or complex numbers (use C++ or FORTRAN instead)
  • Cross-platform portability required (use C or Java instead)
  • Complex applications, where structured programming is a necessity (type-checking of variables, function prototypes, etc.)
  • Mission-critical applications upon which you are betting the future of the company
  • Situations where security is important, where you need to guarantee the integrity of your system and protect against intrusion, cracking, and vandalism
  • Project consists of subcomponents with interlocking dependencies
  • Extensive file operations required (Bash is limited to serial file access, and that only in a particularly clumsy and inefficient line-by-line fashion.)
  • Need native support for multi-dimensional arrays
  • Need data structures, such as linked lists or trees
  • Need to generate / manipulate graphics or GUIs
  • Need direct access to system hardware or external peripherals
  • Need port or socket I/O
  • Need to use libraries or interface with legacy code
  • Proprietary, closed-source applications (Shell scripts put the source code right out in the open for all the world to see.)

If any of the above applies, consider a more powerful scripting language -- perhaps Perl, Tcl, Python, Ruby -- or possibly a compiled language such as C, C++, or Java. Even then, prototyping the application as a shell script might still be a useful development step.

We will be using Bash, an acronym [3] for "Bourne-Again shell" and a pun on Stephen Bourne's now classic Bourne shell. Bash has become a de facto standard for shell scripting on most flavors of UNIX. Most of the principles this book covers apply equally well to scripting with other shells, such as the Korn Shell, from which Bash derives some of its features, [4] and the C Shell and its variants. (Note that C Shell programming is not recommended due to certain inherent problems, as pointed out in an October, 1993 Usenet post by Tom Christiansen.)

What follows is a tutorial on shell scripting. It relies heavily on examples to illustrate various features of the shell. The example scripts work -- they've been tested, insofar as possible -- and some of them are even useful in real life. The reader can play with the actual working code of the examples in the source archive (scriptname.sh or scriptname.bash), [5] give them execute permission (chmod u+rx scriptname), then run them to see what happens. Should the source archive not be available, then cut-and-paste from the HTML or pdf rendered versions. Be aware that some of the scripts presented here introduce features before they are explained, and this may require the reader to temporarily skip ahead for enlightenment.

Unless otherwise noted, the author of this book wrote the example scripts that follow.

"His countenance was bold and bashed not."

--Edmund Spenser


Chapter 2. Starting Off With a Sha-Bang[edit | edit source]

"Shell programming is a 1950s juke box . . ."

--Larry Wall

In the simplest case, a script is nothing more than a list of system commands stored in a file. At the very least, this saves the effort of retyping that particular sequence of commands each time it is invoked.

Example 2-1. cleanup: A script to clean up log files in /var/log[edit | edit source]
# Cleanup
# Run as root, of course.

cd /var/log
cat /dev/null > messages
cat /dev/null > wtmp
echo "Log files cleaned up.

There is nothing unusual here, only a set of commands that could just as easily have been invoked one by one from the command-line on the console or in a terminal window. The advantages of placing the commands in a script go far beyond not having to retype them time and again. The script becomes a program -- a tool -- and it can easily be modified or customized for a particular application.

Example 2-2. cleanup: An improved clean-up script[edit | edit source]
#!/bin/bash
# Proper header for a Bash script.

# Cleanup, version 2

# Run as root, of course.
# Insert code here to print error message and exit if not root.

LOG_DIR=/var/log
# Variables are better than hard-coded values.
cd $LOG_DIR

cat /dev/null > messages
cat /dev/null > wtmp


echo "Logs cleaned up."

exit #  The right and proper method of "exiting" from a script.
     #  A bare "exit" (no parameter) returns the exit status
     #+ of the preceding command. 


Now that's beginning to look like a real script. But we can go even farther . . .

Example 2-3. cleanup: An enhanced and generalized version of above scripts.[edit | edit source]
#!/bin/bash
# Cleanup, version 3

#  Warning:
#  -------
#  This script uses quite a number of features that will be explained
#+ later on.
#  By the time you've finished the first half of the book,
#+ there should be nothing mysterious about it.



LOG_DIR=/var/log
ROOT_UID=0     # Only users with $UID 0 have root privileges.
LINES=50       # Default number of lines saved.
E_XCD=86       # Can't change directory?
E_NOTROOT=87   # Non-root exit error.


# Run as root, of course.
if [ "$UID" -ne "$ROOT_UID" ]
then
  echo "Must be root to run this script."
  exit $E_NOTROOT
fi  

if [ -n "$1" ]
# Test whether command-line argument is present (non-empty).
then
  lines=$1
else  
  lines=$LINES # Default, if not specified on command-line.
fi  


#  Stephane Chazelas suggests the following,
#+ as a better way of checking command-line arguments,
#+ but this is still a bit advanced for this stage of the tutorial.
#
#    E_WRONGARGS=85  # Non-numerical argument (bad argument format).
#
#    case "$1" in
#    ""      ) lines=50;;
#    *[!0-9]*) echo "Usage: `basename $0` lines-to-cleanup";
#     exit $E_WRONGARGS;;
#    *       ) lines=$1;;
#    esac
#
#* Skip ahead to "Loops" chapter to decipher all this.


cd $LOG_DIR

if [ `pwd` != "$LOG_DIR" ]  # or   if [ "$PWD" != "$LOG_DIR" ]
                            # Not in /var/log?
then
  echo "Can't change to $LOG_DIR."
  exit $E_XCD
fi  # Doublecheck if in right directory before messing with log file.

# Far more efficient is:
#
# cd /var/log || {
#   echo "Cannot change to necessary directory." >&2
#   exit $E_XCD;
# }




tail -n $lines messages > mesg.temp # Save last section of message log file.
mv mesg.temp messages               # Rename it as system log file.


#  cat /dev/null > messages
#* No longer needed, as the above method is safer.

cat /dev/null > wtmp  #  ': > wtmp' and '> wtmp'  have the same effect.
echo "Log files cleaned up."
#  Note that there are other log files in /var/log not affected
#+ by this script.

exit 0
#  A zero return value from the script upon exit indicates success
#+ to the shell.

Since you may not wish to wipe out the entire system log, this version of the script keeps the last section of the message log intact. You will constantly discover ways of fine-tuning previously written scripts for increased effectiveness.

* * *

The sha-bang ( #!) [6] at the head of a script tells your system that this file is a set of commands to be fed to the command interpreter indicated. The #! is actually a two-byte [5] magic number, a special marker that designates a file type, or in this case an executable shell script (type man magic for more details on this fascinating topic). Immediately following the sha-bang is a path name. This is the path to the program that interprets the commands in the script, whether it be a shell, a programming language, or a utility. This command interpreter then executes the commands in the script, starting at the top (the line following the sha-bang line), and ignoring comments. [6]

#!/bin/sh
#!/bin/bash
#!/usr/bin/perl
#!/usr/bin/tcl
#!/bin/sed -f
#!/bin/awk -f

Each of the above script header lines calls a different command interpreter, be it /bin/sh, the default shell (bash in a Linux system) or otherwise. [7] Using #!/bin/sh, the default Bourne shell in most commercial variants of UNIX, makes the script portable to non-Linux machines, though you sacrifice Bash-specific features. The script will, however, conform to the POSIX [8] sh standard.

Note that the path given at the "sha-bang" must be correct, otherwise an error message -- usually "Command not found." -- will be the only result of running the script.[9]

#! can be omitted if the script consists only of a set of generic system commands, using no internal shell directives. The second example, above, requires the initial #!, since the variable assignment line, lines=50, uses a shell-specific construct. [10] Note again that #!/bin/sh invokes the default shell interpreter, which defaults to /bin/bash on a Linux machine.

Tip: This tutorial encourages a modular approach to constructing a script. Make note of and collect "boilerplate" code snippets that might be useful in future scripts. Eventually you will build quite an extensive library of nifty routines. As an example, the following script prolog tests whether the script has been invoked with the correct number of parameters.
E_WRONG_ARGS=85
script_parameters="-a -h -m -z"
#                  -a = all, -h = help, etc.

if [ $# -ne $Number_of_expected_args ]
then
  echo "Usage: `basename $0` $script_parameters"
  # `basename $0` is the script's filename.
  exit $E_WRONG_ARGS
fi
Many times, you will write a script that carries out one particular task. The first script in this chapter is an example. Later, it might occur to you to generalize the script to do other, similar tasks. Replacing the literal ("hard-wired") constants by variables is a step in that direction, as is replacing repetitive code blocks by functions.

2.1. Invoking the script[edit | edit source]

Having written the script, you can invoke it by sh scriptname,[11] or alternatively bash scriptname. (Not recommended is using sh <scriptname, since this effectively disables reading from stdin within the script.) Much more convenient is to make the script itself directly executable with a chmod.

Either:

chmod 555 scriptname (gives everyone read/execute permission)[12]

or

chmod +rx scriptname (gives everyone read/execute permission)

chmod u+rx scriptname (gives only the script owner read/execute permission)

Having made the script executable, you may now test it by ./scriptname.[13] If it begins with a "sha-bang" line, invoking the script calls the correct command interpreter to run it.


As a final step, after testing and debugging, you would likely want to move it to /usr/local/bin (as root, of course), to make the script available to yourself and all other users as a systemwide executable. The script could then be invoked by simply typing scriptname [ENTER] from the command-line.

2.2. Preliminary Exercises[edit | edit source]

  1. System administrators often write scripts to automate common tasks. Give several instances where such scripts would be useful.
  2. Write a script that upon invocation shows the time and date, lists all logged-in users, and gives the system uptime. The script then saves this information to a logfile.


  1. These are referred to as builtins, features internal to the shell.
  2. Although recursion is possible in a shell script, it tends to be slow and its implementation is often an ugly kludge.
  3. An acronym is an ersatz word formed by pasting together the initial letters of the words into a tongue-tripping phrase. This morally corrupt and pernicious practice deserves appropriately severe punishment. Public flogging suggests itself.
  4. Many of the features of ksh88, and even a few from the updated ksh93 have been merged into Bash.
  5. Some flavors of UNIX (those based on 4.2 BSD) allegedly take a four-byte magic number, requiring a blank after the ! -- #! /bin/sh. According to Sven Mascheck this is probably a myth.
  6. The #! line in a shell script will be the first thing the command interpreter (sh or bash) sees. Since this line begins with a #, it will be correctly interpreted as a comment when the command interpreter finally executes the script. The line has already served its purpose - calling the command interpreter. If, in fact, the script includes an extra #! line, then bash will interpret it as a comment.
  7. This allows some cute tricks.
    #!/bin/rm
    # Self-deleting script.
    
    # Nothing much seems to happen when you run this... except that the file disappears.
    
    WHATEVER=85
    
    echo "This line will never print (betcha!)."
    
    exit $WHATEVER  # Doesn't matter. The script will not exit here.
                    # Try an echo $? after script termination.
                    # You'll get a 0, not a 85.
  8. Portable Operating System Interface, an attempt to standardize UNIX-like OSes. The POSIX specifications are listed on the Open Group site.
  9. To avoid this possibility, a script may begin with a #!/bin/env bash sha-bang line. This may be useful on UNIX machines where bash is not located in /bin
  10. If Bash is your default shell, then the #! isn't necessary at the beginning of a script. However, if launching a script from a different shell, such as tcsh, then you will need the #!.
  11. Caution: invoking a Bash script by sh scriptname turns off Bash-specific extensions, and the script may therefore fail to execute.
  12. A script needs read, as well as execute permission for it to run, since the shell needs to be able to read it.
  13. Why not simply invoke the script with scriptname? If the directory you are in ($PWD) is where scriptname is located, why doesn't this work? This fails because, for security reasons, the current directory (./) is not by default included in a user's $PATH. It is therefore necessary to explicitly invoke the script in the current directory with a ./scriptname.