Two roads diverged in a wood, and I –
I took the one less traveled by,
And that has made all the difference.
Robert Frost, "The Road Not Taken"
To create a branch in our program is to create an alternative sequence of commands that may be ignored, based on certain conditions at run-time. In English/pseudocode, the control flow might be described like this:
"If this is true, then do this thing. If not, then do something else."
For example, type out the following sequence:
if [[ $num -eq 42 ]]
then # if/then branch
echo 'num is actually equal to 42'
else # else branch
echo 'num is not 42'
fi
If you didn't set the variable num to 42 beforehand, then the condition in the if statement ($num is equal to 42) would evaluate to false. So only the command in the "else branch" is executed, with this result:
num is not 42
However, if you set num to 42 beforehand, then the condition in the if statement is met, as $num evaluates to 42, which is equal to 42. Thus, the code after if/then is executed, with this result:
num is actually equal to 42
Here's a GIF of the process:
Is this confusing the hell out of you? It probably should be, as if/then/else constructs, like for-loops, aren't meant to be typed out at the interactive prompt. Just like for-loops, in which the interpreter waits for you to finish typing in code between do and done, the if/then/else construct isn't executed until you've typed in fi (the closing statement for an if statement).
If/then/else conditional statements, like for loops, represent a fundamental change to the control flow of programs. No longer do commands get executed in sequence, one-at-a-time as you hit Enter. With a for-loop, some command sequences are executed numerous times before the program advances. And with conditional branching, some sequences may be completely ignored.
For the most part, we'll be using conditional branching in shell-scripts, particularly inside loops, to add an extra layer of complexity to our programs. Complexity is not necessarily better, though. As fundamental as conditional branching is, I've waited to introduce it until we've practiced loops and shell-scripts, as if/else statements can add a whole layer of debugging confusion.
As always, take things one step at a time. Try not to use conditional branching until you've convinced yourself that you really need your single-minded program to handle alternative scenarios.
To test the examples in this section, type the code into a shell script, and then execute it from the command-line.
If you're unfamiliar with how arguments are passed into scripts, keep in mind that the variable $1, inside a script, is equal to the first argument passed into the script at execution time.
In other words, when this command is executed:
bash my-script.sh 90210
then my-script.sh has access to the value 90210 by referring to $1 (and if a second argument was passed in, it'd be inside $2).
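Here is a minimal sketch of how $1 and $2 get filled in. A shell function stands in for my-script.sh, since a function's arguments are exposed through $1 and $2 in the same way. The function name show_args and the second value, 94305, are made up for illustration:

```bash
# show_args is a hypothetical stand-in for a script file: inside the
# function, $1 and $2 hold the first and second arguments it was given.
show_args() {
  echo "first argument: $1"
  echo "second argument: $2"
}

show_args 90210 94305
```

Running it prints the two values on separate lines. If the second argument is left off, $2 is simply empty.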
If there is a command-sequence that should optionally run based on whether a conditional expression is true, then the if/then statement can look as simple as this:
if [[ some condition ]]; then
do_something
fi
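Note some key things about the syntax:
Double brackets, [[ ]], are used to enclose the conditional expression.
There must be whitespace between the brackets and the expression: [[$x == $y]] will not work, but [[ $x == $y ]] will.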
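A small sketch of the spacing rule, assuming a made-up helper named compare. The broken form is left as a comment so that the example runs cleanly:

```bash
# compare is a hypothetical helper used only for this illustration.
compare() {
  # Correct: whitespace separates [[ and ]] from the expression.
  # Quoting the right-hand side makes this a plain string comparison.
  if [[ $1 == "$2" ]]; then
    echo 'match'
  fi

  # Incorrect: without the spaces, bash does not recognize the [[ keyword
  # and treats the first word as the name of a command to run.
  # if [[$1 == "$2"]]; then echo 'match'; fi
}

compare apple apple
compare apple pear
```

Only the first call prints anything, because the second comparison is false and there is no else branch.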
Inside a script named just_an_if.sh, write the following code:
echo 'Hello'
if [[ $1 == 'awesome' ]]; then
echo 'You are awesome'
fi
echo 'Bye'
Running that script will look like this:
dun@corn02:~$ bash just_an_if.sh stuff
Hello
Bye
## now with awesome
dun@corn02:~$ bash just_an_if.sh awesome
Hello
You are awesome
Bye
The branching logic looks like this:
'Hello' ________________________________________'Bye'
         \                                    /
          if [[ $1 == 'awesome' ]]           /
            then                            /
              \                            /
               \___'You are awesome'______/
If $1 is not equal to 'awesome', then the program continues along to the final line. If it does equal 'awesome', then the program takes the then branch of code.
For situations that call for an "either this happens, or that happens" choice, we use the else syntax:
if [[ some condition ]]; then
do_this
else
do_that
fi
Inside a script named if_else.sh, write the following code:
echo 'Hello'
if [[ $1 == 'awesome' ]]; then
echo 'You are awesome'
else
echo 'You are...OK'
fi
echo 'Bye'
Running that script will look like this:
dun@corn02:~$ bash if_else.sh stuff
Hello
You are...OK
Bye
## now with awesome
dun@corn02:~$ bash if_else.sh awesome
Hello
You are awesome
Bye
Here's a diagram of that control flow:
'Hello' __                                    __________'Bye'
          \                                  /       /
           if [[ $1 == 'awesome' ]]         /       /
           |  then                         /       /
           |      \                       /       /
           |       \___'You are awesome'_/       /
            \                                   /
             else                              /
                 \____'You are...OK'__________/
Unlike the standalone if-statement, if the program fails to meet the if condition ($1 == 'awesome'), it does not simply continue to the final line, echo 'Bye'. Instead, it branches into its own command sequence, echo 'You are...OK'.
Many situations require more than an "either/or" to adequately deal with. For that, we have elif, which allows us to make as many alternative branches as we'd like:
if [[ some condition ]]; then
do_this
elif [[ another condition ]]; then
do_that_a
elif [[ yet another condition ]]; then
do_that_b
else
do_that_default_thing
fi
In a script named if_elif_else.sh, write the following code:
echo 'Hello'
if [[ $1 == 'awesome' ]]; then
echo 'You are awesome'
elif [[ $1 == 'bad' ]]; then
echo 'Yuck'
else
echo 'You are...OK'
fi
echo 'Bye'
Example output:
dun@corn02:~$ bash if_elif_else.sh awesome
Hello
You are awesome
Bye
dun@corn02:~$ bash if_elif_else.sh bad
Hello
Yuck
Bye
dun@corn02:~$ bash if_elif_else.sh kinda_bad
Hello
You are...OK
Bye
The diagram of the control flow:
'Hello' __                                    __________'Bye'
          \                                  /     /
           if [[ $1 == 'awesome' ]]         /     /
           |  then                         /     /
           |      \                       /     /
           |       \___'You are awesome'_/     /
           |\                                 /
           | elif [[ $1 == 'bad' ]]          /
           |    then                        /
           |        \_______'Yuck'_________/
            \                             /
             else                        /
                 \____'You are...OK'____/
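The branches of an if/elif/else chain are tested from top to bottom, and only the first one whose condition is true gets executed. Here is a short sketch of that behavior using the integer test -lt (less than). The function name describe_temp and its thresholds are invented for illustration:

```bash
# describe_temp is a hypothetical function; the thresholds are arbitrary.
describe_temp() {
  if [[ $1 -lt 10 ]]; then
    echo 'cold'
  elif [[ $1 -lt 25 ]]; then
    # Only reached when $1 is 10 or more, because the first test failed.
    echo 'mild'
  else
    echo 'hot'
  fi
}

describe_temp 3
describe_temp 18
describe_temp 31
```

A value like 3 is also less than 25, but it never reaches the elif branch, because the first branch has already claimed it.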
-a filename - true if filename exists
-f filename - true if filename exists and is a regular file
-d filename - true if filename exists and is a directory
-s filename - true if filename exists and has a size > 0
-z $some_string - true if $some_string has 0 characters (i.e. is empty)
-n $some_string - true if $some_string has more than 0 characters
$string_a == $string_b - true if $string_a is equal to $string_b
$string_a != $string_b - true if $string_a is not equal to $string_b
$x -eq $y - true if integer $x is equal to integer $y
$x -lt $y - true if integer $x is less than integer $y
$x -gt $y - true if integer $x is greater than integer $y
See a full list of expressions in the Bash documentation.
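To get a feel for a few of these tests, here is a sketch that tries them against a throwaway directory made with mktemp. All the names in it (workdir, describe_path, describe_string and the file names) are made up for this example:

```bash
# describe_path and describe_string are hypothetical helpers.
describe_path() {
  if [[ -d $1 ]]; then
    echo 'directory'
  elif [[ -s $1 ]]; then
    echo 'non-empty file'
  elif [[ -f $1 ]]; then
    echo 'empty file'
  else
    echo 'missing'
  fi
}

describe_string() {
  if [[ -z $1 ]]; then
    echo 'empty string'
  else
    echo 'non-empty string'
  fi
}

# Build a scratch directory holding one empty and one non-empty file.
workdir=$(mktemp -d)
touch "$workdir/empty.txt"
echo 'some text' > "$workdir/full.txt"

describe_path "$workdir"
describe_path "$workdir/full.txt"
describe_path "$workdir/empty.txt"
describe_path "$workdir/nope.txt"
describe_string ''
describe_string 'hello'

rm -r "$workdir"
```

The -d test comes first because a directory usually has a size greater than 0 and would otherwise be caught by -s.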
A common problem in long-running web-scraping tasks, or anything involving the Internet, is that you have to worry about the target site, or the entire Internet, going down. Preparing for this scenario is a huge part of professional systems engineering.
What we've done so far hasn't risen to that level of engineering. But we still have need, quaint as it is, for more robust operation. For example, it'd be nice if our web-scraper, when it has to quit and then restart, could continue from where it left off, as opposed to re-downloading the pages it already downloaded.
To implement that kind of unnecessary-download-prevention, we can use the test for file existence:
for url in http://www.example.com http://www.wikipedia.org http://www.cnn.com
do
  # remove all punctuation characters
  fname=$( echo "$url" | tr -d '[:punct:]' )
  if [[ -a $fname ]]; then
    echo "Already exists: $fname"
  else
    echo "Downloading $url into $fname"
    # save the page so that the next run finds the file and skips it
    curl -s "$url" > "$fname"
  fi
done
If you put that code into a shell script named nice-downloader.sh and run it twice (assuming it isn't interrupted the first time), the output will look like this:
user@host:~$ bash nice-downloader.sh
Downloading http://www.example.com into httpwwwexamplecom
Downloading http://www.wikipedia.org into httpwwwwikipediaorg
Downloading http://www.cnn.com into httpwwwcnncom
# second time:
user@host:~$ bash nice-downloader.sh
Already exists: httpwwwexamplecom
Already exists: httpwwwwikipediaorg
Already exists: httpwwwcnncom
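One possible refinement, which is not part of the script above: a download that dies before any data arrives can leave an empty file behind (for instance, when output is redirected with >), and -a would count that file as already downloaded. The sketch below uses -s instead, so that empty files are treated as still needing a download. The function name download_status and the file names are hypothetical, and no actual downloading happens here:

```bash
# download_status is a hypothetical helper: it only reports what a
# downloader would decide, based on whether the file has any content.
download_status() {
  if [[ -s $1 ]]; then
    echo "Already exists: $1"
  else
    echo "Needs download: $1"
  fi
}

demo_dir=$(mktemp -d)
echo '<html></html>' > "$demo_dir/good_page"   # a finished download
touch "$demo_dir/empty_page"                   # a failed, empty download

download_status "$demo_dir/good_page"
download_status "$demo_dir/empty_page"
download_status "$demo_dir/never_fetched"

rm -r "$demo_dir"
```

Only good_page is reported as already existing. The other two fall into the else branch.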
The exclamation mark can be used within the conditional expression when we want a branch to execute if something is not true:
if [[ ! 1 -eq 0 ]]; then
echo 'FYI, one is not equal to zero'
fi
The conditional expression in the above example reads as: if it is not true that 1 is equal to 0, then…
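A more practical use of the exclamation mark is to do some setup only when something is missing. This sketch creates a directory if it does not exist yet. The function name ensure_dir and the directory name pages are made up for illustration:

```bash
# ensure_dir is a hypothetical helper: it creates the directory named
# by $1, but only if it is not true that the directory already exists.
ensure_dir() {
  if [[ ! -d $1 ]]; then
    mkdir -p "$1"
    echo "Created $1"
  fi
}

scratch=$(mktemp -d)
ensure_dir "$scratch/pages"   # directory is missing, so it gets created
ensure_dir "$scratch/pages"   # already there, so nothing happens
rm -r "$scratch"
```

The second call prints nothing, because the condition is false and there is no else branch.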
We can test more than one conditional expression at once, using && to require that two conditions both be true, or using || to require that at least one of the conditions be true.
Use double-ampersands, &&, to join two conditional expressions in a way that reads: condition A and condition B must both be true:
if [[ $a -gt 42 && $a -lt 100 ]]; then
echo "The value $a is greater than 42 but less than 100"
else
echo "The value $a is not between 42 and 100"
fi
In the above example, the if statement evaluates to true only if both the conditional expressions are true:
$a is greater than (-gt) 42
$a is less than (-lt) 100
The following, much more convoluted code achieves the same result. In other words, avoid nested if-blocks unless absolutely necessary:
if [[ $a -gt 42 ]]; then
if [[ $a -lt 100 ]]; then
echo "The value $a is greater than 42 but less than 100"
else
echo "The value $a is not between 42 and 100"
fi
elif [[ $a -lt 100 ]]; then
if [[ $a -gt 42 ]]; then
echo "The value $a is greater than 42 but less than 100"
else
echo "The value $a is not between 42 and 100"
fi
else
echo "The value $a is not between 42 and 100"
fi
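To convince yourself that a flat && test and a nested test really do agree, you can run both over a handful of values. The sketch below uses a simplified two-level nesting rather than the deliberately convoluted version above. The function names flat_check and nested_check are invented for this comparison:

```bash
# flat_check and nested_check are hypothetical helpers that should
# always give the same answer for the same integer.
flat_check() {
  if [[ $1 -gt 42 && $1 -lt 100 ]]; then
    echo 'in range'
  else
    echo 'out of range'
  fi
}

nested_check() {
  if [[ $1 -gt 42 ]]; then
    if [[ $1 -lt 100 ]]; then
      echo 'in range'
    else
      echo 'out of range'
    fi
  else
    echo 'out of range'
  fi
}

for n in 7 42 43 99 100 250; do
  echo "$n: flat says '$(flat_check "$n")', nested says '$(nested_check "$n")'"
done
```

Both versions agree on every value, including the edge cases 42 and 100, which are out of range because -gt and -lt are strict comparisons.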
Sometimes, you need a conditional expression to read as: "if condition A OR condition B is true".
The if-then branch below will execute if either of these conditions is met:
$a is less than 42
$a is greater than 100
if [[ $a -lt 42 || $a -gt 100 ]]; then
echo "The value $a is either: less than 42, or greater than 100"
else
echo "The value $a is between 42 and 100"
fi
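The || operator works with string comparisons too, which is handy when more than one answer should be accepted. A minimal sketch, with a made-up function name, confirm:

```bash
# confirm is a hypothetical helper: it accepts either 'y' or 'yes'.
confirm() {
  if [[ $1 == 'y' || $1 == 'yes' ]]; then
    echo 'confirmed'
  else
    echo 'not confirmed'
  fi
}

confirm y
confirm yes
confirm nope
```

The first two calls take the then branch, since only one of the two conditions needs to be true. The third call matches neither and falls through to else.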
Up to this point, we've been acquainted with the read-while loop, which executes commands for every line in an input file:
while read url
do
curl "$url" >> everywebpage_combined.html
done < list_of_urls.txt
Another form of the while loop involves passing in a conditional statement that, when true, causes the loop to repeat itself.
The following example sets countdown to 5.
For each iteration of the while loop, the condition [[ $countdown -ge 0 ]] is tested. If it is true, then the loop executes again. This loop keeps executing as long as the value of countdown is greater than or equal to 0.
Once countdown has a value of -1, the condition [[ $countdown -ge 0 ]] will be false, and the loop will cease execution.
How or when does countdown reach -1? The code inside the loop subtracts 1 each time the loop runs:
user@host:~$ countdown=5
user@host:~$ while [[ $countdown -ge 0 ]]; do
echo "Liftoff in...$countdown"
countdown=$(( countdown - 1 ))
done
echo 'And we have liftoff'
#
Liftoff in...5
Liftoff in...4
Liftoff in...3
Liftoff in...2
Liftoff in...1
Liftoff in...0
And we have liftoff
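The same pattern, a condition that the loop body eventually makes false, works with other arithmetic as well. As a sketch, the hypothetical function doublings_until below counts how many times 1 has to be doubled before it reaches a given limit:

```bash
# doublings_until is a hypothetical helper. $1 is the limit to reach.
doublings_until() {
  local limit=$1
  local value=1
  local steps=0
  # Keep looping while value is still below the limit. Doubling value
  # on each pass guarantees the condition eventually becomes false.
  while [[ $value -lt $limit ]]; do
    value=$(( value * 2 ))
    steps=$(( steps + 1 ))
  done
  echo "$steps"
}

doublings_until 100   # prints 7, since 1 doubled 7 times is 128
```

If the limit is 1 or less, the condition is false on the very first test and the loop body never runs at all.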
Now what happens if the line countdown=$(( countdown - 1 )) wasn't included? Then [[ $countdown -ge 0 ]] will always be true, and the loop won't stop until the universe, or the computer, dies of heat death.
Try the amended loop, and prepare to hit Ctrl-C:
countdown=1
while [[ $countdown -ge 0 ]]; do
echo "Liftoff in...$countdown"
done
Sometimes infinite loops are useful, for situations in which we want a program to be performing a task in the background for the indefinite future. In that case, you can simply use true for the conditional statement, which, well, is always true. The following program will remind me to be positive, every 12 hours (43,200 seconds), for as long as the computer stays on. Or until I kill it:
words="You're good enough, you're smart enough, and doggone it, people like you"
while true; do
sleep 43200
echo "$words" | mail -s 'Important reminder' me@stanford.edu
done
Note: for the purposes of this class and corn.stanford.edu, you should probably not use an infinite loop. Instead, give the loop a finite bound, so that if you forget which machine you were on when you launched a script and are thus unable to kill it, it will at least die on its own at some point:
for x in $(seq 1 1000); do
echo "something for every 10 minutes"
sleep 600
done
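A bound can also be combined with a real stopping condition, so that the loop ends as soon as its job is done, or after a fixed number of tries, whichever comes first. Here is a sketch that waits for a file to show up. The function name wait_for_file and its three arguments (file name, maximum number of waits, seconds per wait) are assumptions made for this example:

```bash
# wait_for_file is a hypothetical helper.
# $1: file to wait for, $2: maximum number of waits, $3: seconds per wait
wait_for_file() {
  local target=$1
  local max_tries=$2
  local pause=$3
  local tries=0
  # Loop only while the file is missing AND we still have tries left.
  while [[ ! -f $target && $tries -lt $max_tries ]]; do
    sleep "$pause"
    tries=$(( tries + 1 ))
  done
  if [[ -f $target ]]; then
    echo "Found $target after waiting $tries times"
  else
    echo "Gave up on $target after waiting $tries times"
  fi
}

marker=$(mktemp)              # mktemp creates the file, so it is found at once
wait_for_file "$marker" 3 1
rm "$marker"
wait_for_file "$marker" 2 0   # file is gone: two zero-second waits, then give up
```

Because the number of tries is part of the loop condition, the loop cannot run forever even if the file never appears.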