General purpose command-line tools

All the Unix/OSX command-line tools we’ve used and a few of their most common use cases.

Note (2015-06-15): I compiled this list specifically for my Stanford class, COMM 213: Computational Methods in the Civic Sphere, to serve as a shortlist of commands needed to complete the homework. It is by no means a complete list of useful *nix commands and programs. For example, the course did not focus on sysadmin/operating system concepts, so there was no need to cover chmod. And programs like awk deserve their own mini-site. Originally, I did not include wget because I wanted students to be able to do minimal web requests through curl.

This list contains the set of tools used throughout this course. It is not meant to be comprehensive, but to serve as a refresher of what tools exist; it's up to you to look up the documentation and to think of ways to combine the tools for your specific situation.

  • basename Extract just the filename from a filepath
  • bc A calculator that reads from standard input
  • cat Concatenates files together
  • cd Change directory
  • cp Copy files
  • csvfix Parse CSV files
  • curl Transfer a URL
  • cut Cut out selected portions of lines
  • date Print or parse date strings
  • echo Print arguments to standard output
  • grep Print lines matching a pattern
  • head Print only the first few lines of a text stream
  • history Show the last executed commands
  • hostname Print the name of the computer you're currently on
  • iconv Converts between character sets
  • jq A command-line JSON parser
  • kill Send a signal to a running process
  • less Paginate long text streams
  • ls List directory contents
  • man Show documentation for a command
  • mkdir Make a directory
  • mv Move or rename files
  • nano Interactive text editor
  • printf Format and print data
  • ps Show a snapshot of current processes
  • pup Parse HTML from the command line
  • pwd Print the name of your working directory
  • read Read a line from standard input
  • rm Remove files
  • sed Stream editor for complex transformation of text
  • seq Print a sequence of numbers
  • sleep Suspend execution for a period of time
  • sort Sort lines of text
  • tail Print only the last lines of a text stream
  • touch Create an empty file or update its timestamp
  • tr Translate characters in a text stream
  • uniq Print only unique lines of text
  • unzip Extract files from a zip archive
  • wc Print the line, word, and byte counts of a text stream
  • wget Easy web crawling
  • whoami Print your username
  • zip Add files to a compressed archive
basename - Extract just the filename from a filepath

Standard usage

basename ./hello/there/cat.jpg 
Output:
cat.jpg 

Get a filename and remove its suffix with -s

basename -s '.jpg' ./hello/there/cat.jpg 
Output:
cat 

It works on URLs too

url="http://www.compciv.org/files/images/topics/scraping/http-cats.jpg"
fname=$(basename $url)
curl $url > $fname
 
bc - A calculator that reads from standard input

Standard usage

echo '100 / 3' | bc 
Output:
33 

Use the -l, --mathlib option to get floating point results

echo '100 / 3' | bc -l 
Output:
33.33333333333333333333 
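
Set the precision yourself with bc's scale variable

echo 'scale=2; 100 / 3' | bc 
Output:
33.33 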
cat - Concatenates files together

Adding two (or more) files together

cat file1.txt file2.txt 
Output:
line from file 1
line from file 1
line from file 2 

Unnecessary (but fine, if it helps you to read the pipeline from left to right) use of cat

cat onefile.txt | grep 'hi' 

Add a Heredoc-style string into a file

Heredocs are helpful for working with complex multi-line strings, such as raw HTML.

cat > basic.html<<'EOF'
  <html>
  <head>
    <title>My first "Web Page"</title>
  </head>
  <body>
    <h1>A headline</h1>
    <p>Check out the
       <a href="http://www.nytimes.com">New York Times</a>
    </p>
  </body>
</html>
EOF
 
cd - Change directory

Change into a directory

cd some/path 

Change into home directory

cd ~ 

Change to parent directory

cd .. 

Change into the system’s root

cd / 

Change into the system’s /tmp directory

cd /tmp 
cp - Copy files

Standard usage

cp source_file.txt new_file.txt 

Force copy: overwrite files without prompting

cp -f source_file.txt existing_file.txt 

Make a copy of a directory with the -r option

cp -r some_dir/ new_dir/ 

Copy something into your home directory

cp something.txt ~ 

Copy all files with a .txt extension into a sub-directory

cp *.txt some_dir 
csvfix - Parse CSV files

This utility parses text files in which the values/columns are delimited by commas, or by a delimiter of your choice. Because CSV files can contain multi-line data (and, oh, the lack of a standard will foil even the most skilled greppers), it is recommended that you use csvfix when dealing with delimited-text data.
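
For example, a quoted field that contains a comma will be split apart by a naive cut (the sample record is just an illustration):

echo '"Smith, Dan",45' | cut -d ',' -f 1 
Output:
"Smith 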

The list of subcommands is long; if you need to do something specific, check the CSVFix docs and you’ll probably find what you need.

To install on corn.stanford.edu, after having set your PATH to include ~/bin_compciv:

    wget https://bitbucket.org/neilb/csvfix/get/version-1.6.zip
    unzip version-1.6.zip && rm version-1.6.zip
    cd neilb-csvfix-e804a794d175
    make lin
    cp ./csvfix/bin/csvfix ~/bin_compciv/

For the purpose of some of the examples, example.csv contains the following:

      Name,Quantity,Cost
      Apple,35,2.00
      Orange,67,1.95
      Durian,9,12.00

Use the echo subcommand to print the CSV in a standard format to stdout

csvfix echo example.csv 
Output:
"Name","Quantity","Cost"
"Apple","35","2.00"
"Orange","67","1.95"
"Durian","9","12.00" 

Use the -osep operator to change the delimiter of CSV data when printing to stdout

csvfix echo -osep '|' example.csv 
Output:
"Name"|"Quantity"|"Cost"
"Apple"|"35"|"2.00"
"Orange"|"67"|"1.95"
"Durian"|"9"|"12.00" 

Select and rearrange order of the columns with the order subcommand

csvfix order -f 3,2,1 example.csv 
Output:
"Cost","Quantity","Name"
"2.00","35","Apple"
"1.95","67","Orange"
"12.00","9","Durian" 

Select, rearrange order by column name with order -fn

csvfix order -fn Cost,Name example.csv 
Output:
"Cost","Name"
"2.00","Apple"
"1.95","Orange"
"12.00","Durian" 

Sort the data by a column with the sort subcommand, using the -rh option to keep the header row in place

csvfix sort -rh -f 1 example.csv 
Output:
Name,Quantity,Cost
"Apple","35","2.00"
"Durian","9","12.00"
"Orange","67","1.95" 

Force csvfix to only double-quote fields when necessary with the -smq option

csvfix order -smq -f 3,2,1 example.csv 
Output:
Cost,Quantity,Name
2.00,35,Apple
1.95,67,Orange
12.00,9,Durian 

Force csvfix to use a specific delimiter with -osep for the output

csvfix order -osep '@' -f 3,2,1 example.csv 
Output:
"Cost"@"Quantity"@"Name"
"2.00"@"35"@"Apple"
"1.95"@"67"@"Orange"
"12.00"@"9"@"Durian" 

Sort by the 3rd column, in descending numerical order

csvfix sort -rh -f 3:DN example.csv 
Output:
Name,Quantity,Cost
"Durian","9","12.00"
"Apple","35","2.00"
"Orange","67","1.95" 

Use printf to customize the output of the field values

csvfix printf -fmt "There are %s %s %f" example.csv
 
Output:
There are Name Quantity 0.000000
There are Apple 35 2.000000
There are Orange 67 1.950000
There are Durian 9 12.000000 

Switch up the order of columns for printf with the -f option

csvfix printf -f 2,1,3 -fmt "There are %s %ss and they cost %f each" example.csv
 
Output:
There are Quantity Names and they cost 0.000000 each
There are 35 Apples and they cost 2.000000 each
There are 67 Oranges and they cost 1.950000 each
There are 9 Durians and they cost 12.000000 each 

Use the -ifn option to remove the header from the output

csvfix printf -ifn -f 2,1,3 -fmt "There are %s %ss and they cost %f each" example.csv
 
Output:
There are 35 Apples and they cost 2.000000 each
There are 67 Oranges and they cost 1.950000 each
There are 9 Durians and they cost 12.000000 each 
curl - Transfer a URL

This nearly-ubiquitous tool makes it possible to interact with Web sites and APIs. Check out its manual for its many options.

Download and print to standard output

curl http://www.example.com 

Download and save to a specified file name with -o, --output

curl http://www.example.com -o somefile.txt 

Suppress status indicator and error messages with -s, --silent

curl http://www.example.com -s 

Automatically follow redirects with -L

curl http://t.co/d -L 

Fetch only the headers with --head, -I

curl http://t.co/d -I 

Download and save to the basename of a URL with -O

This handy option will create a filename using the basename of a URL, i.e. the last segment of the URL path.

curl http://www.example.com/stuff.zip -O 
cut - Cut out selected portions of lines

Specify a delimiter with -d and which fields to show with -f

echo A,B,C,D,E | cut -d ',' -f 3,4 
Output:
C,D 

Cut out everything except the nth character with -c [n]

echo 'Hello world' | cut -c 7 
Output:
w 

Cut out everything except the range x to y with -c [x-y]

echo 'Hello world' | cut -c 3-7 
Output:
llo w 

Cut out everything before the nth character -c [n-]

echo 'Hello world' | cut -c 7- 
Output:
world 

Cut out everything after the nth character -c [-n]

echo 'Hello world' | cut -c -7 
Output:
Hello w 
date - Print or parse date strings

Standalone usage, display the date now

date 
Output:
Sat Jan 24 10:52:28 PST 2015 

Display the date by parsing a given string with -d, --date=

date -d '2013-01-03' 
Output:
Thu Jan  3 00:00:00 PST 2013 

Parse a relatively human-friendly date string

date -d 'Feb 9 1913' 
Output:
Sun Feb  9 00:00:00 PST 1913 

Format the current date as YYYY-MM-DD

date +%Y-%m-%d 
Output:
2015-02-06 

Format the output as YYYY-MM-DD

date -d 'May 15, 1974' +%Y-%m-%d 
Output:
1974-05-15 

Format the output as YYYY-MM-DD HH:MM:SS

date -d 'May 15, 1974 9:32 PM' '+%Y-%m-%d %H:%M:%S' 
Output:
1974-05-15 21:32:00 

Use -I, --iso-8601 as a shortcut for standard ISO YYYY-MM-DD format

date -d 'Sept 25, 2014 3:52:11 PM' -I 
Output:
2014-09-25 

Specify precision with -I[precision]

date -d 'Sept 25, 2014 3:52:11 PM' -Iseconds  
Output:
2014-09-25T15:52:11-0700 
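
GNU date can also do simple date arithmetic with relative phrases

date -d 'May 15, 1974 + 30 days' +%Y-%m-%d 
Output:
1974-06-14 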
echo - Print arguments to standard output

Print “something” to screen

echo something 
Output:
something 

Print a variable’s value to stdout

something='fun times'
echo $something
 
Output:
fun times 

Print “something” into a pipe

echo something | tr '[:lower:]' '[:upper:]' 
Output:
SOMETHING 

Quickie concatenation of strings

a=apples
b=bongos
echo "$a AND $b"
 
Output:
apples AND bongos 
grep - Print lines matching a pattern

This 40-year-old tool is one of the most famous and ubiquitous Unix programs, and perhaps the most commonly used tool for searching text.

Printing matching lines in a file

grep 'hello' file1.txt 
Output:
hello world
say hello 

Grepping multiple files, showing file names with the match

grep 'hello' file1.txt file2.txt 
Output:
file1.txt:hello world
file1.txt:say hello
file2.txt:just a hello 

Reading from standard input omits the filenames from the output

cat file1.txt file2.txt | grep 'hello' 
Output:
hello world
say hello
just a hello 

Case insensitive search with -i

grep -i 'HELLO' file1.txt 
Output:
hello world
say hello 

Showing non-matching lines with -v

grep -v 'hello' file1.txt 
Output:
bye world
say bye 

Using extended regular expressions with -E

grep -E '[0-9]{5}' file1.txt 
Output:
Beverly Hills 90210 

Printing only the match, not the entire line with -o

echo 'Hello world' | grep -o 'world' 
Output:
world 

Printing just the match made by a regular expression pattern (5 or more alphanumerical characters)

cat file1.txt | grep -oE '[[:alnum:]]{5,}' 
Output:
hello
world
hello
world
Beverly
Hills
90210 

Show the x lines before a match with -B x

grep -B 1 'Beverly' file1.txt 
Output:
say bye
Beverly Hills 90210 

Show the y lines after a match with -A y

grep -A 1 'hello world' file1.txt 
Output:
hello world
say hello 

Grep for a series of strings that are contained in a file with -f

grep -f things.txt file1.txt 
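
If things.txt contained the lines 'hello' and 'Beverly' (a hypothetical example file), the command above would print:

hello world
say hello
Beverly Hills 90210 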

Grep faster when you don’t need regular expressions with -F

grep -F 'word' file.txt 

When grepping a list of files (not stdin), use -l to list all files that match the given term at least once.

grep -l 'word' *.txt 

When grepping a list of files (not stdin), use -L to list all files that don’t contain the given term

grep -L 'word' *.txt 
history - Show the last executed commands

Standard usage

history 

Show past commands that involved `cat’

history | grep cat 

Show just the most recent 5 commands

history | tail -n 5 

Remove leading line numbers (as long as history is under 99,999 commands)

history | cut -c 8- 
hostname - Print the name of the computer you're currently on

Standard usage

hostname 
Output:
corn30.stanford.edu 
iconv - Converts between character sets

For our purposes, iconv can be used to bypass the issues that arise from dealing with text data that contains unexpected characters, such as emoji.

For more information, read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Attempt a translation of non-ASCII characters to ASCII

This is useful for converting accented characters, such as é and ô to their non-accented equivalents.

echo Béyôncæ | iconv -t ASCII//TRANSLIT 
Output:
B'ey^oncae 

Ignore all non-ASCII (i.e. standard American-English) characters

This command will sometimes give you an error message. If so, refer to the iconv usage below.

cat somefile.txt | iconv -t ASCII//IGNORE 

Force the conversion of UTF-8 characters to ASCII

cat somefile.txt | iconv -c -f utf-8 -t ascii 
jq - A command-line JSON parser

This tool is not part of standard Linux distributions, but it is extremely handy for working with JSON data.

jq has its own parsing language and methods, both for extracting data and for outputting new data structures.

The jq manual is the most comprehensive reference for how jq works, but you can refer to this basic tutorial for the basic concepts.

Simply parse and pretty-print

echo '{"name": "Dan"}' | jq '.'
 
Output:
{
  "name": "Dan"
} 

Select an object’s attribute

echo '{"name": "Dan"}' | jq '.name'
 
Output:
"Dan" 

Select multiple attributes

echo '{"name": "Dan", "age": 45}' | jq '.name, .age'
 
Output:
"Dan"
45 

Print raw-output with -r, --raw-output

echo '{"name": "Dan", "age": 45}' | jq -r '.name, .age'
 
Output:
Dan
45 
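
Chain filters together; for example, format values as a CSV row with @csv

echo '{"name": "Dan", "age": 45}' | jq -r '[.name, .age] | @csv'
 
Output:
"Dan",45 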

Select an element from an array

echo '["a", "b", "c"]' | jq '.[1]'
 
Output:
"b" 

Select attributes from an array of objects

echo '[{"name": "Dan", "age": 42}, {"name": "Bob", "age": 55}]' |
  jq '.[] | .name'
 
Output:
"Dan"
"Bob" 
kill - Send a signal to a running process

Terminate a process with a given PID of 1234 (use ps aux to find PID)

kill 1234 
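
Find a runaway process with ps and grep, then terminate it (the process name and PID here are hypothetical)

ps aux | grep 'sleep 1000'
kill 1434 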

Terminate all processes that you are allowed to terminate

kill -9 -1 
less - Paginate long text streams

Show a text stream one page at a time

cat *.txt | less 
ls - List directory contents

Default listing of files

ls 

List all files, including hidden files with -a, --all

ls -a 

Show a long list with file attributes with -l

ls -l 
man - Show documentation for a command

Basic usage

man cat 
mkdir - Make a directory

Make a single directory

mkdir my_sub_dir 

Make multiple directories

mkdir apples oranges pears 

Make a directory and all its parent directories with -p

mkdir -p a/path/to/a/new/subdir 

Make a subdirectory inside your home directory

mkdir ~/new_dir 

Make a subdirectory inside /tmp

mkdir /tmp/new_dir 
mv - Move or rename files

Rename a file

mv old_name.txt new_name.txt 

Rename/move a file even if new name exists with -f

mv -f old_name.txt new_name.txt 

Ask before overwriting an existing file with -i

mv -i old_name.txt new_name.txt 

Move something into your home directory

mv somefile ~ 

Move all files with a .txt extension into a sub-directory

mv *.txt some_dir 
nano - Interactive text editor

Open (or create) a file and enter interactive-editing mode

nano file.txt 
printf - Format and print data

The printf command is like echo, just much more powerful and versatile. The Bash Hackers Wiki has a nice page on it.

With printf, you pass in at least two arguments:

  1. A string containing a sort of template for text, with special syntax for placeholders.
  2. A string (or several strings) that are then inserted into the placeholders of the first argument.

There is a bewildering array of format placeholders. The examples below try to cover the basics.

Basic usage

By default, printf will not print a newline character at the end, causing the output to butt up against the prompt.

Like this: My name is Danuser@host:~$

printf 'My name is %s' 'Dan'
 

Print a new line at the end with ‘\n’

The \n stands for ‘new line’

printf 'My name is %s \n' 'Dan'
 

Work with multiple arguments

printf 'My name is %s %s. \nI am %s.\n' 'Dan' 'Man' 'happy'
 
Output:
My name is Dan Man.
I am happy. 
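
Besides %s, printf has numeric placeholders, such as %d for whole numbers and %f for decimals

printf '%s is %d years old, or %.1f decades\n' 'Dan' 45 4.5
 
Output:
Dan is 45 years old, or 4.5 decades 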

Printing out an HTML string

printf '
  <h1>Hello %s</h1>
  <p>
      <a href="%s">%s</a>
  </p> \n' 'Stranger' 'http://www.thestranger.com/' 'A news site'
 
Output:
<h1>Hello Stranger</h1>
<p>
    <a href="http://www.thestranger.com/">A news site</a>
</p> 

Using a Heredoc-style string in a variable

See the example for the read command for more information on heredocs

Note: if you want to preserve the newlines in some_html, you have to double-quote it, i.e. printf "$some_html"

read -r -d '' some_html <<'EOF'
<h1>Hello %s</h1>
<p>Here is a kitten:</p>
<img src="http://placekitten.com/g/%s/%s">
\n
EOF

printf "$some_html" 'Cat Lover' 500 300
 
Output:
<h1>Hello Cat Lover</h1>
<p>Here is a kitten:</p>
<img src="http://placekitten.com/g/500/300"> 
ps - Show a snapshot of current processes

List all processes belonging to the current user and session

ps 
Output:
PID TTY          TIME CMD
1434 pts/65   00:00:00 sleep
1532 pts/65   00:00:00 ps
25247 pts/65   00:00:00 bash 

List all processes running on the system

ps aux 

List all of your processes by filtering for your user ID (this is what you most frequently want to do)

ps aux | grep $(whoami) 
pup - Parse HTML from the command line

The pup tool is inspired by the jq JSON-parsing tool, but is used for parsing HTML with CSS selectors.

curl www.example.com | pup 'a' 
Output:
<a href="http://www.iana.org/domains/example">
 More information...
</a> 
curl www.example.com | pup 'a attr{href}' 
Output:
http://www.iana.org/domains/example 
curl www.example.com | pup 'a text{}' 
Output:
More information... 
pwd - Print the name of your working directory

(When inside your own home directory)

pwd 
Output:
/afs/.ir/users/y/o/your_home 
read - Read a line from standard input

The read command is often used to handle reading text streams line-by-line – which is not something that some_var=$(cat some.txt) will do by default.

It’s especially helpful in combination with a while loop, and for assigning Heredocs (i.e. multi-line strings that are too complex to delimit with quotation marks) to variables.

For the most part, we want to use the -r option, which prevents backslashes from doing their normal thing of escaping characters.

For the examples below, assume example.txt contains:

    README.txt
    42
    Documents and Settings
    index.html
    Dogs and Cats.html

Read each line from a file and pass it into a while loop

while read -r x; do
  echo "Opening...$x"
done < example.txt
 
Output:
Opening...README.txt
Opening...42
Opening...Documents and Settings
Opening...index.html
Opening...Dogs and Cats.html 

Read each line from a command and pipe into a while loop

curl -s http://www.example.com | while read -r some_line; do
  echo "This is a line:  $some_line"
done
 
Output:
This is a line:  <!doctype html>
This is a line:  <html>
This is a line:  <head>
This is a line:  <title>Example Domain</title> 

Read each line from a command and pass it into a while loop, using process substitution

To read the output from a command, wrap it up between <( and ) (as opposed to $( and ))

while read -r x; do
  echo "Opening...$x"
done < <(cat example.txt | grep 'html')
 
Output:
Opening...index.html
Opening...Dogs and Cats.html 

Save a multi-line Heredoc into a variable and do not interpret special Bash symbols

This will be the most common pattern we follow when creating HTML templates within Bash.

Heredocs make it easy to describe a multi-line string without worrying about whether you’ve used the right number of quote marks.

This particular example is derived from this excellent StackOverflow Q&A.

This example, with the use of 'EOF', prevents things like $ from being interpreted by Bash.

The use of the option -d '' tells read to keep on reading even after the first newline

Basically, see the read -r -d '' as the boilerplate to memorize.

read -r -d '' some_variable <<'EOF'
<html>
  <head>
    <title>My first "Web Page"</title>
  </head>
  <body>
    <h1>A headline</h1>
    <p>Check out the
       <a href="http://www.nytimes.com">New York Times</a>
    </p>
  </body>
</html>
EOF
 
rm - Remove files

Remove a file

rm somefile.txt 

Remove all the files in the current directory

rm * 

Remove all the files in the current directory but ask for confirmation with -i

rm -i * 

Remove a file and do not ask for confirmation or show errors with -f

rm -f somefile.txt 

Remove an empty directory with -d

rm -d somedir 

If the given filename is a directory, remove it and everything inside of it with -r

rm -r somedir 

Wipe out your computer (i.e. making a typo while doing rm -rf is very bad)

rm -rf / 
sed - Stream editor for complex transformation of text

sed is a very powerful program that basically has its own language, and thus has books and websites devoted to it.

For our purposes, we can focus solely on its substitution command (Bruce Barnett describes it as “The essential command”), which allows us to transform text with far more power than we can with just tr.

Basic substitution using the s subcommand

echo 'hello world' | sed s/hello/bye/ 
Output:
bye world 

Repeat the substitution for every match with the g flag

echo 'hello world bye world' | sed s/world/people/g 
Output:
hello people bye people 

Make matches based on extended regular expressions with -E option

echo 'Beverly Hills 90210' | sed -E 's/[0-9]{3}/q/' 
Output:
Beverly Hills q10 

An example of regex capturing groups and backreferences

echo 'Beverly Hills 90210' | sed -E 's/([0-9]+)/I love \1 a lot/'
 
Output:
Beverly Hills I love 90210 a lot 
seq - Print a sequence of numbers

Print numbers 1 to 5

seq 1 5 
Output:
1
2
3
4
5 
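
seq also accepts an increment as the middle argument

seq 0 5 20 
Output:
0
5
10
15
20 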
sleep - Suspend execution for a period of time

Sleep for 10 seconds

sleep 10 

Sleep for 5 days (only in GNU Unix, not OSX)

sleep 5d 
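
Pause between web requests in a loop (the URLs here are hypothetical)

for num in 1 2 3; do
  curl -s "http://www.example.com/page-$num.html" -o "page-$num.html"
  sleep 2
done
 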
sort - Sort lines of text

Sort in ascending alphabetical order

sort lines.txt 
Output:
100
9
A
a
b 

Sort in reverse order with -r

sort -r lines.txt 
Output:
b
a
A
9
100 

Sort numbers based on numerical value with -n

sort -n lines.txt 
Output:
A
a
b
9
100 

Sort lines based on a column q with -k [q] and a delimiter f with -t [f]

sort -k 3 -t ',' lines.csv 
Output:
C,D,Y
A,B,Z 
tail - Print only the last lines of a text stream

Print only the last x lines with -n [x]

cat *.txt | tail -n 5 

Read from a file instead of standard input

tail -n 5 file1.txt 

Skip the first line in a file with -n [+2]

tail -n +2 file1.txt 
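
Skip a CSV header before sorting (using the example.csv from the csvfix section)

tail -n +2 example.csv | sort -t ',' -k 2 -n 
Output:
Durian,9,12.00
Apple,35,2.00
Orange,67,1.95 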
touch - Create an empty file or update its timestamp

Update file’s accessed/modified time, or create it if it doesn’t exist

touch somefile.txt 
tr - Translate characters in a text stream

Replace one character for another

echo Hello world | tr 'o' 'a' 
Output:
Hella warld 

Replace multiple characters

echo Hello world | tr 'lo' 'xa' 
Output:
Hexxa warxd 

Normalize all whitespace characters (including newlines) to spaces

txt="Hello,
world"
echo "$txt" | tr '[:space:]' ' '
 
Output:
Hello, world 

Delete a character, such as a space character, with -d

echo Hello world | tr -d ' ' 
Output:
Helloworld 

Translate lower-case characters to upper-case using character classes

echo Hello world | tr '[:lower:]' '[:upper:]' 
Output:
HELLO WORLD 

Remove all punctuation

echo 'Hello, world!' | tr -d '[:punct:]' 
Output:
Hello world 
uniq - Print only unique lines of text

Print unique lines (note: uniq only removes adjacent duplicates, so unsorted input may still contain repeats)

uniq somefile.txt 
Output:
oranges
apples
oranges
kiwis
apples 

Used in conjunction with sort

sort somefile.txt | uniq 
Output:
apples
kiwis
oranges 

Print unique values and frequency of occurrence with -c option

sort somefile.txt | uniq -c 
Output:
2 apples
1 kiwis
3 oranges 
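
Combine tr, sort, and uniq for a quick word-frequency count (the sample sentence is just an illustration; the spacing of the counts may vary)

echo 'Apples and oranges and APPLES' |
  tr '[:upper:]' '[:lower:]' |
  tr -s '[:space:]' '\n' |
  sort | uniq -c | sort -rn
 
Output:
2 apples
2 and
1 oranges 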
unzip - Extract files from a zip archive

Basic unzipping

unzip some.zip 

Use -o option to overwrite existing files without prompting user

unzip -o some.zip 

Extract files and pipe their contents to stdout with -p option

unzip -p some.zip 

Extract only specific files and pipe their contents into a new file

unzip -p stuff.zip 14.txt 42.txt > file.txt 
wc - Print the line, word, and byte counts of a text stream

Print line, word, and character count

wc somefile.txt 
Output:
6 8 55 somefile.txt 

Print just the line count with -l

wc -l somefile.txt 
Output:
6 somefile.txt 

Print just the word count with -w

wc -w somefile.txt 
Output:
8 somefile.txt 

Print just the character count with -c

wc -c somefile.txt 
Output:
55 somefile.txt 

Count the lines from standard input to avoid showing filename

cat somefile.txt | wc -l 
Output:
6 
wget - Easy web crawling

Like curl, wget can be used to download individual files from the Web. However, it contains a suite of features geared towards batch downloads, i.e. web crawling. wget was recently in the news as a “Low-Cost Tool to Best [the] N.S.A.”

And similar to curl, wget has a mountain of documentation worth reading.

Here are some examples from the official docs. I also like The Geek Stuff’s list of wget examples.

Download a single file and save to a default filename

Unlike curl, wget does not send downloaded content to stdout by default. Instead, it derives a filename from the URL and saves the content to the current working directory.

For example, wget en.wikipedia.org/wiki/Hello will save to a file named Hello. If the target is a directory (i.e. with a trailing slash, e.g. wget en.wikipedia.org/wiki/), it will save to index.html

Also by default: if the default filename already exists, wget will create a new, numbered variation, e.g. index.html.1

wget www.example.com 
Output:
--2015-06-18 05:09:00--  http://www.example.com/
Resolving www.example.com... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
Connecting to www.example.com|93.184.216.34|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1270 (1.2K) [text/html]
Saving to: ‘index.html’
100%[======================================>] 1,270       --.-K/s   in 0.002s
2015-06-18 05:09:00 (743 KB/s) - ‘index.html’ saved [1270/1270] 

Redirect to stdout

wget -O - www.example.com 
Output:
[content of the webpage]

100%[======================================>] 1,270       --.-K/s   in 0.001s
2015-06-13 04:52:45 (1.53 MB/s) - written to stdout [1270/1270] 

Download files only if newer than existing files

With this option, wget will set the downloaded file’s timestamp based on the web server’s Last-Modified header. On subsequent downloads using -N, wget will fetch a file only if it is newer than the existing file. Read the full docs at gnu.org: Time-Stamping Usage

wget -N www.example.com 
Output:
--2015-06-18 05:06:39--  http://www.example.com/
Resolving www.example.com... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
Connecting to www.example.com|93.184.216.34|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1270 (1.2K) [text/html]
Server file no newer than local file ‘index.html’ -- not retrieving. 

Recursively download links

This is where wget starts to get fun – and dangerous. The recursive option will cause wget to download not just the target page, but all URLs linked to from that page. This includes the URLs of things like images and stylesheets.

By default, it will save all of the files into a directory named after the site domain.

It should go without saying that this can be a massive operation if you aren’t careful.

From the documentation on Recursive Download:

Recursive retrieval of HTTP and HTML/CSS content is breadth-first. This means that Wget first downloads the requested document, then the documents linked from that document, then the documents linked by them, and so on. In other words, Wget first downloads the documents at depth 1, then those at depth 2, and so on

wget -r www.stanford.edu 
Output:
[a wall of output showing that every file linked to from the Stanford homepage has been downloaded]

2015-06-18 05:17:43 (4.26 MB/s) - ‘www.stanford.edu/about/history/images/hero-seq.jpg’ saved [520408/520408]

FINISHED --2015-06-18 05:17:43--
Total wall clock time: 12s
Downloaded: 147 files, 9.2M in 4.6s (2.00 MB/s) 

Specify the number of layers (i.e. the depth) for a recursive crawl.

By default, a recursive crawl with wget will go 5 layers deep, i.e. it will download all the links from the first page. Then it will visit each of those links and download their links, and so on, five layers deep.

Setting this value to 1 will only download URLs linked from the target page. Setting it to 0 is shorthand for an infinite number of layers to crawl. Be careful.

wget -r -l 1 www.stanford.edu 
Output:
[long list of files downloaded]

FINISHED --2015-06-18 05:26:43--
Total wall clock time: 2.2s
Downloaded: 42 files, 1.0M in 0.4s (2.73 MB/s) 

Download only files with a specified extension

Use in conjunction with -r. Extremely helpful if a webpage contains links to a set of binary files you want to collect, without collecting everything else, such as links to other webpages.

wget -r -A .jpg www.stanford.edu 
Output:
[a wall of output for every jpg on the homepage]
2015-06-18 05:20:34 (4.49 MB/s) - ‘www.stanford.edu/about/history/images/hero-seq.jpg’ saved [520408/520408]

FINISHED --2015-06-18 05:20:34--
Total wall clock time: 3.4s
Downloaded: 32 files, 4.7M in 2.1s (2.26 MB/s) 

Mirror an entire site

Again, be careful. From the docs:

Turn on options suitable for mirroring. This option turns on recursion and time-stamping, sets infinite recursion depth and keeps FTP directory listings. It is currently equivalent to -r -N -l inf --no-remove-listing.

wget -m www.example.com 

Snapshot a single page

This is a variation I use when I just want to preserve a single page and all of its visual elements, similar to how sites like archive.is work.

See a full description of the flags and options in this gist.

wget -E -H -k -K -nd -N -p -P /tmp/wikipedia https://en.wikipedia.org/wiki/Main_Page 
Output:
[wall of output of downloaded files]
FINISHED --2015-06-18 05:48:04--
Total wall clock time: 2.7s
Downloaded: 20 files, 154K in 0.3s (563 KB/s)
Converting /tmp/wikipedia/Main_Page.html... 28-310
Converted 1 files in 0.003 seconds. 

Mirror a subdirectory

Use of the --no-parent flag prevents wget from going higher than the specified subdirectory

wget -m -e robots=off --no-parent http://www.example.com/whatsup 
whoami - Print your username

Standard usage

whoami 
Output:
your_sunet_id 
zip - Add files to a compressed archive

Add all the .txt files in current directory to a zip archive

zip alltext.zip *.txt