Shell scripts | Variables and control structures

Once we've learnt how to freely manipulate directories, files and data, it's time to give the command line more power by storing values in variables and making decisions based on their contents.

In this first part we'll take a look at variables inside shell scripting, control structures, if statements and case statements.

Some languages require us to explicitly allocate and free memory for variables, structs and so on. The shell handles this under the hood, and in most cases we don't have to clean anything up before leaving a function.

Exiting a script as soon as possible, or returning early from a function, avoids unnecessary code execution, which can reduce system load and improve performance.

Variables

In every programming language, variables store data and configuration options, and allow us to manage and control actions inside a script. Variables are easy to use, but it is just as easy to get into trouble with them.

— There are three basic rules in naming variables:

Variable names must be composed of alphanumeric characters and underscores.
The first character of a variable name cannot be a number.
Spaces and punctuation symbols are not allowed.

#~./scripts/demo.sh
message="Hi there"
balance=48

If a variable is unset or empty, we can assign it a default value with the := form of parameter expansion:

${var:=defaultValue}
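As a minimal sketch of how this behaves, the snippet below applies := to an unset variable and to one that already holds a value, and contrasts it with :-, which substitutes a fallback without assigning it. The variable names are hypothetical, chosen only for illustration; the leading : is the no-op command, used so the expansion runs without executing its result.

```shell
# Hypothetical variable names, used only for this sketch.
unset editor
: "${editor:=vi}"            # editor was unset, so it is assigned "vi"
first=$editor

editor="nano"
: "${editor:=vi}"            # editor already has a value, so it keeps "nano"

color=""
shown="${color:-none}"       # :- substitutes "none" but leaves color empty

printf '%s\n' "first=$first editor=$editor shown=$shown color=[$color]"
```
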

Two essential features when working in a terminal-based environment are reading input from the user and printing information to the screen.

With the read command we can get keyboard input into the script.

read [options] [variable/s]

where options are the following:

-a assigns the words of the input to an array, starting at index zero.

-e uses readline to handle the input, so editing behaves as it does on the command line.

-n num reads num characters rather than the entire line.

-p displays a prompt or message before the input field.

-r doesn't interpret backslash characters as escapes.

-s doesn't echo characters on the screen. Also called silent mode, it's useful when asking for passwords.

-t n terminates input after n seconds and returns a non-zero exit status if timed out.

and variable/s define where to store the input data. We can set more than one variable in a read command:

read inputA inputB inputC inputD
printf "%s\n" "inputA = $inputA"
printf "%s\n" "inputB = $inputB"
printf "%s\n" "inputC = $inputC"
printf "%s\n" "inputD = $inputD"

# input typed: 1 2 3 4
inputA = 1
inputB = 2
inputC = 3
inputD = 4

If we don't give read any variable names, the whole line is stored in a default shell variable named REPLY:

read
printf "%s\n" "reply = $REPLY"

# input typed: 1 2 3 4
reply = 1 2 3 4
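The splitting rules are easier to see when there are more words than variables: the last variable receives everything that is left. The sketch below feeds read from a here-document instead of the keyboard so it can run unattended; the variable names are made up for the example. It also shows -r keeping backslashes intact.

```shell
# Input comes from a here-document so the sketch runs without a keyboard.
read -r first second rest <<EOF
1 2 3 4
EOF
printf '%s\n' "first=$first second=$second rest=$rest"

# With -r the backslashes reach the variable untouched.
read -r path <<'EOF'
C:\temp\new
EOF
printf '%s\n' "path=$path"
```
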

With printf we can print to the screen in much the same way as in C.

#~./scripts/demo.sh
read -p "Enter your user name: " user_name
printf "%s\n" "Welcome aboard, $user_name"
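To illustrate the C-like format strings, here is a small sketch with padded columns. The values are hypothetical. It also shows one difference from C: when there are more arguments than format specifiers, the shell's printf reuses the format until the arguments run out.

```shell
user_name="ada"      # hypothetical values for the sketch
balance=48

# %-6s left-aligns a string in 6 columns, %4d right-aligns an integer in 4.
line=$(printf '%-6s|%4d|' "$user_name" "$balance")
printf '%s\n' "$line"

# The format is applied again for each extra pair of arguments.
pairs=$(printf '%s=%s;' a 1 b 2)
printf '%s\n' "$pairs"
```
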

— Variables in Shell scripting have some peculiarities:

Variables are case-sensitive.
Variables don't need to be identified by type.
There are no spaces around the = sign. The shell will not understand the line as a variable assignment if we add spaces; it will try to run the variable's name as a command instead.
Variables don't need to be declared. The shell creates a variable the first time a value is assigned to it.
To use the value of a variable, we add the $ sign before the variable's name.
The shell has some built-in internal variables, written in uppercase. We can create or modify them too.
Enclosing the variable's name in braces, as in ${var}, avoids any ambiguity about where the name ends.
Arithmetic calculation with integers is available through shell variables using the following format:

$ (( expression ))

Where expression can take the following operators:

- + / * % ++ -- **

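We've seen before that we don't need to declare variable types. Arithmetic expressions already treat values as integers, but we can also give a variable the integer attribute, so that every value assigned to it is evaluated as an arithmetic expression: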

declare -i x=5

To work with floating-point values we need to delegate our arithmetic operations to external tools like bc or awk, since shell arithmetic (and expr) only handles integers.
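A short sketch of both cases follows. Integer arithmetic is done with arithmetic expansion, and the floating-point division is handed to an external tool. awk is used here only because it is installed almost everywhere, which is an assumption about the environment; bc could be used in a similar way where it is available. The variable names are hypothetical.

```shell
apples=7             # hypothetical values
baskets=2

per_basket=$(( apples / baskets ))   # integer division drops the fraction
left_over=$(( apples % baskets ))    # remainder

# Floating point is delegated to awk (assumed to be installed).
average=$(LC_ALL=C awk -v a="$apples" -v b="$baskets" 'BEGIN { printf "%.2f", a / b }')

apples=$(( apples + 1 ))             # update a variable in place

printf '%s\n' "per_basket=$per_basket left_over=$left_over average=$average apples=$apples"
```
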

Variables can be marked as readonly using the following syntax:

readonly varName=value
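What happens when we try to change such a variable can be sketched as follows. The assignment is attempted inside a subshell, because depending on the shell a failed assignment to a readonly variable may abort the script; the variable name is hypothetical.

```shell
readonly max_retries=3     # hypothetical name

# Attempt the change in a subshell so a failure cannot stop this script.
if ( max_retries=5 ) 2>/dev/null; then
  result="changed"
else
  result="rejected"
fi

printf '%s\n' "assignment $result, max_retries is still $max_retries"
```
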

Variables can be global and local.

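- By default every variable is global within the script, including variables assigned inside functions. It is not visible to the shell that launched the script, and child processes only see it if it is exported.

- To make a variable local to a function (independent from the global scope and only accessible by that function) we can declare it with the local keyword.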

prompt="welcome"

function foo ()
{
  local prompt="here we are"
}
foo              # run the function; its local prompt is gone on return
echo "$prompt"

This example will output "welcome" since the variable inside the function, although named the same, is declared as local.
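The contrast with a non-local assignment may make the rule clearer. In this sketch, which uses hypothetical variable names, the function shadows prompt with local but changes counter without it, so only counter is modified in the global scope.

```shell
prompt="welcome"
counter=0

foo() {
  local prompt="here we are"    # shadows the global only inside foo
  counter=$(( counter + 1 ))    # no local: this changes the global
  inside=$prompt                # remember what foo saw
}

foo
printf '%s\n' "inside foo: $inside"
printf '%s\n' "after foo:  $prompt, counter=$counter"
```
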

Control structures

Just having the ability to create and store variables doesn't give us much power. Comparing and testing data is an essential part of programming.

In order to compare and evaluate our variables' data we have a series of operators:

— File operators

Operators are placed before the value to evaluate:

-N "$file"

where N is the desired operator to evaluate.

-e returns true if the file or directory exists.

-d returns true if the path exists and is a directory.

-f returns true if the path exists and is a regular file.

-s returns true if the file exists and it's not empty.

We can also check whether we have read, write and execute permission on a file or directory (for a directory, execute means permission to enter it).

-r returns true if read permission is granted.

-w returns true if write permission is granted.

-x returns true if executable permission is granted.
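The sketch below tries several of these operators on a scratch directory created with mktemp. The probe function is a hypothetical helper written for this example; it prints yes or no for a given operator and path.

```shell
probe() {   # hypothetical helper: prints yes/no for an operator and a path
  if [ "$1" "$2" ]; then echo yes; else echo no; fi
}

work_dir=$(mktemp -d)                  # scratch directory for the sketch
file="$work_dir/notes.txt"

: > "$file"                            # create an empty file
empty_exists=$(probe -e "$file")       # the file exists
empty_has_data=$(probe -s "$file")     # but it has no content yet

printf 'hello\n' > "$file"
full_has_data=$(probe -s "$file")      # now it is not empty

dir_is_dir=$(probe -d "$work_dir")
dir_is_file=$(probe -f "$work_dir")    # a directory is not a regular file
missing=$(probe -e "$work_dir/nope")

rm -rf "$work_dir"                     # clean up the scratch directory
printf '%s\n' "$empty_exists $empty_has_data $full_has_data $dir_is_dir $dir_is_file $missing"
```
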

— String operators

-z returns true if string is zero-length.

-z "$string"

-n returns true if string's length is non-zero.

-n "$string"

= returns true if string A is equal to string B. The comparison is textual, so it gives misleading results with integers: "01" and "1" are different strings.

"$stringA" = "$stringB"

!= returns true if string A is not equal to string B.

"$stringA" != "$stringB"

— Integer operators

Integer operators are used to compare integers and are placed between the two values to evaluate:

"$intA" -NN "$intB"

where NN is the desired operator to use.

-eq returns true if both integers are equal.

-ne returns true if integers are not equal.

-gt returns true if integer A is greater than integer B.

-ge returns true if integer A is greater than or equal to integer B.

-lt returns true if integer A is less than integer B.

-le returns true if integer A is less than or equal to integer B.
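The difference between the string and the integer operators shows up as soon as two values are numerically equal but written differently. A minimal sketch, with hypothetical variable names:

```shell
a="01"
b="1"

# = compares the text character by character.
if [ "$a" = "$b" ]; then as_text="equal"; else as_text="different"; fi

# -eq compares the numeric values.
if [ "$a" -eq "$b" ]; then as_number="equal"; else as_number="different"; fi

printf '%s\n' "as text: $as_text, as numbers: $as_number"
```
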

Flow control: conditional execution

Conditional execution works based on the exit status of another command. Its main advantage is allowing scripts and functions to "short circuit" or exit early. It is also more compact than an if structure when only one action depends on the result.

Conditional execution operators are && and ||. They have equal precedence and are left-associative.

AND: with && the second command runs only if the first one was successful.

$ cd .scripts/ && pwd

OR: with || the second command runs only if the first one wasn't successful.

$ cd .garbage/ || exit
rm -rf *

A third operator, logical NOT, is also useful here.

NOT: the ! operator inverts the result of the expression that follows it.

$ test ! -f .scripts/demo.sh && echo "File not found."

It's possible to combine multiple statements, always remembering the left-associative property.
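A small sketch of what left-to-right evaluation means in practice. Each chain is captured with command substitution so the result can be inspected; the letters are arbitrary markers.

```shell
# (false && echo A) fails, so the || branch runs.
out1=$(false && echo "A" || echo "B")

# (true || echo A) succeeds without running echo A, then && runs echo B.
# With C-like precedence this chain would print nothing.
out2=$(true || echo "A" && echo "B")

# ! inverts the failure of false, so the && branch runs.
out3=$(! false && echo "C")

printf '%s\n' "out1=$out1 out2=$out2 out3=$out3"
```
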

Flow control: conditional if

Conditional structures like if allow us to perform different actions for different decisions.

A basic if statement looks like this:

if [[ $1 == "$user" ]]; then
  printf "%s\n" "$user you're logged in"
fi

However optional clauses elif and/or else can be added:

if [[ $1 == "$user" ]]; then
  printf "%s\n" "$user you're logged in"
elif [[ ${#1} -gt $max ]]; then
  printf "%s\n" "Your user name can't be longer than $max characters"
else
  printf "%s\n" "You must type your username."
fi
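As a runnable variation on the same idea, the sketch below wraps an if/elif/else chain in a function and calls it with three inputs. The function name check_name, the limit of 8 characters and the messages are assumptions made for the example. It uses the single-bracket [ form so it also runs in shells without [[.

```shell
max=8                       # hypothetical length limit

check_name() {              # hypothetical helper for this sketch
  if [ -z "$1" ]; then
    printf '%s\n' "You must type your username."
  elif [ "${#1}" -gt "$max" ]; then
    printf '%s\n' "Too long: at most $max characters."
  else
    printf '%s\n' "Welcome, $1"
  fi
}

check_name ""
check_name "bartholomew_the_third"
check_name "ada"
```
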

Also nested if statements are allowed:

if [ condition ]; then
  if [ condition ]; then
    #action
  else
    #action  
  fi
else
  #action
fi

The brackets are not part of the if syntax. [ is a command and [[ is a shell keyword that behaves like one; if simply runs it and tests its exit code.

The exit code counts as true if it is 0, and as false if it is anything else.
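Because if only looks at an exit code, any command or function can stand in for the brackets. The sketch below uses a hypothetical helper named check that exits with whatever code it is given, and then uses grep -q directly as a condition.

```shell
check() {              # hypothetical helper: exits with the code it is given
  return "$1"
}

results=""
for code in 0 1 3; do
  if check "$code"; then verdict=true; else verdict=false; fi
  results="$results$code:$verdict "
done
printf '%s\n' "$results"

# Any command can be the condition, not only [ or [[.
if printf '%s\n' "disk warning" | grep -q "warning"; then
  found=yes
else
  found=no
fi
printf '%s\n' "found=$found"
```
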

This way we can also use conditional operators in an if statement:

if [ -r "$1" ] && [ -s "$1" ]; then
  cat "$1"
fi

Arithmetic expressions placed between double parentheses exit with 0 (true) when the result is non-zero and with 1 (false) when the result is zero.

if (( $1 + $2 > 10 )); then
  printf "%s\n" "Those are too many apples."
fi

Flow control: conditional case

Case statements provide a good alternative to multilevel if statements when you have to match multiple values against one variable.

— Case statements execute the case inside the structure that matches the given pattern.

read -p "please, enter a number to select: " pattern

case $pattern in
1)
    printf "%s\n" "First choice. Nice one"
    ;;
2)
    printf "%s\n" "There we go."
    ;;
3)
    printf "%s\n" "Three is always a good choice."
    ;;
*)
    printf "%s\n" "We're sorry, choose between 1-3."
    ;;
esac

— Case statements are enclosed between the word case and the word esac. The ;; operator ends the statement after the first matching pattern, if any.

As you may notice, one of the cases is an asterisk *. It matches any value and behaves like a default case. This covers the situation where the given value doesn't match any other pattern: the catch-all case is executed and our program doesn't exit with errors.

— We can also make our case statement work with multiple patterns, separated by |:

read -p "please, enter a vehicle to inspect: " vehicle

case $vehicle in
car|truck|van)
    printf "%s\n" "Ground vehicle. Has tires and an engine."
    ;;
boat|submarine)
    printf "%s\n" "Water vehicle. Not functional in a desert."
    ;;
plane|helicopter)
    printf "%s\n" "It can fly! It serves multiple purposes."
    ;;
*)
    printf "%s\n" "We're sorry, your vehicle doesn't exist."
    ;;
esac
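Case patterns are shell glob patterns, so they can go beyond fixed words. The sketch below, built around a hypothetical describe function, mixes a character class, alternatives separated by |, a wildcard suffix and an explicit empty string before the catch-all.

```shell
describe() {                 # hypothetical helper for this sketch
  case $1 in
    [0-9]*)      printf '%s\n' "starts with a digit" ;;
    y|Y|yes|YES) printf '%s\n' "affirmative" ;;
    *.sh)        printf '%s\n' "shell script" ;;
    "")          printf '%s\n' "empty" ;;
    *)           printf '%s\n' "unknown" ;;
  esac
}

describe 42
describe yes
describe demo.sh
describe ""
describe boat
```

The order matters: the first matching pattern wins, which is why the catch-all * comes last.
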

Summing up

Although a bit long, this episode covers the basic functionality of working with variables and control structures inside a *nix environment.

In the next episode we'll take a look at flow control with for and while loops, how arrays work inside shell scripting and the use of functions.