The curl tool lets us fetch a given URL from the command-line. Sometimes we want to save a web file to our own computer. Other times we might pipe it directly into another program. Either way, curl has us covered.
See its documentation here.
This is the basic usage of curl:
curl http://some.url --output some.file
That --output flag denotes the filename (some.file) of the downloaded URL (http://some.url).
Let's try it with a basic website address:
curl http://example.com --output my.file
Besides the display of a progress indicator (which I explain below), you don't have much indication of what curl actually downloaded. So let's confirm that a file named my.file was actually downloaded.
Using the ls command will show the contents of the directory:
ls
Which outputs:
my.file
And if you use cat to output the contents of my.file, like so:
cat my.file
– you will see the HTML that powers http://example.com
Let's back up a bit: when you first ran the curl command, you might have seen a quick blip of a progress indicator:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1270  100  1270    0     0  50573      0 --:--:-- --:--:-- --:--:-- 50573
If you remember the Basics of the Unix Philosophy, one of the tenets is:
Rule of Silence: When a program has nothing surprising to say, it should say nothing.
In the example of curl, the author apparently believes that it's important to tell the user the progress of the download. For a very small file, that status display is not terribly helpful. Let's try it with a bigger file (this is the baby names file from the Social Security Administration) to see how the progress indicator animates:
curl http://stash.compciv.org/ssa_baby_names/names.zip \
--output babynames.zip
Quick note: if you're new to the command-line, you're probably used to commands executing every time you hit Enter. In this case, the command is so long (because of the URL) that I broke it into two lines with a backslash, i.e. \. This is solely to make it easier for you to read. As far as the computer is concerned, it joins the two lines together as if the backslash weren't there and runs them as one command.
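To convince yourself that the backslash is purely cosmetic, here's a quick sketch using echo instead of curl (so nothing is downloaded); the continued command behaves exactly like its one-line form:

```shell
# A backslash at the very end of a line tells the shell
# "this command continues on the next line":
echo one two three
echo one \
  two \
  three
# Both commands print the same thing: "one two three"
```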
The curl progress indicator is a nice affordance, but let's see if we can get curl to act like the rest of our Unix tools. In curl's documentation of options, there is an option for silence:
-s, --silent
Silent or quiet mode. Don't show progress meter or error messages. Makes Curl mute. It will still output the data you ask for, potentially even to the terminal/stdout unless you redirect it.
Try it out:
curl http://example.com --output my.file --silent
So those are the basics for the curl command. There are many, many more options, but for now, we know how to use curl to do something that is actually quite powerful: fetch a file, anywhere on the Internet, from the simple confines of our command-line.
Before we go further, though, let's look at the various ways this simple command can be re-written and, more crucially, screwed up:
As you might have noticed in the --silent documentation, it lists the alternative form of -s. Many options for many tools have a shortened alias. In fact, --output can be shortened to -o:
curl http://example.com -o my.file -s
Now watch out: the number of hyphens is not something you can get wrong; the following commands would cause an error or other unexpected behavior:
curl http://example.com -o my.file -silent
curl http://example.com -output my.file -s
curl http://example.com --o my.file --s
Also, mind the position of my.file, which can be thought of as the argument to the -o option. The argument must come immediately after the -o, because that is where curl expects to find it.
If you instead executed this:
curl http://example.com -o -s my.file
How would curl know that my.file, and not -s, is the argument, i.e. what you want to name the content of the downloaded URL?
In fact, you might see that you've created a file named -s…which is not the end of the world, but not something you want to happen unwittingly.
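If you do end up with a stray file named -s, note that rm -s would itself be parsed as an option. A common workaround (a general shell technique, not specific to curl) is to prefix the name with ./, or to use the -- end-of-options marker:

```shell
# Simulate the accident: create a file literally named "-s"
touch ./-s

# "rm -s" would fail, since rm reads -s as a flag.
# Either of these removes it safely:
rm ./-s        # path form: the leading ./ hides the hyphen
# rm -- -s     # alternative: -- tells rm "no more options follow"
```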
By and large (at least among the commands that come to mind), the order of the options doesn't matter:
curl http://example.com -s -o my.file
In fact, the URL, http://example.com, can be placed anywhere in the mix:
curl -s http://example.com -o my.file
curl -s -o my.file http://example.com
A couple of things to note:
- The fact that the URL, in the case of the curl command, can be placed anywhere after the command is not the way that all commands have been designed. So it always pays to read the documentation with every new command.
- Notice how -s http://example.com doesn't cause a problem. That's because the -s option doesn't take an argument. But try the following:
curl -s -o http://example.com my.file
And you will have a problem.
The last thing to consider is what happens when you just curl for a URL with no options (which, after all, should be optional). Before you try it, think about another part of the Unix philosophy:
This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
If you curl without any options except for the URL, the content of the URL (whether it's a webpage, or a binary file, such as an image or a zip file) will be printed out to screen. Try it:
curl http://example.com
Output:
<!doctype html>
<html>
<head>
<title>Example Domain</title>
<meta charset="utf-8" />
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
... (and so forth)
Even with the small amount of HTML code that makes up the http://example.com webpage, it's too much for human eyes to process (raw HTML wasn't meant to be read by humans).
But what if we wanted to send the contents of a web file to another program? Maybe to wc, which is used to count words and lines? Then we can use the powerful Unix feature of pipes. In this example, I'm using curl's silent option so that only the output of wc (and not the progress indicator) is seen. Also, I'm using the -l option for wc to just get the number of lines in the HTML for example.com:
curl -s http://example.com | wc -l
Number of lines in example.com is: 50
Now, you could've also done the same in two lines:
curl -s http://example.com -o temp.file
wc -l temp.file
But not only is that less elegant, it also requires creating a new file called temp.file. Now, this is a trivial concern, but someday, you may work with systems and data flows in which temporarily saving a file is not an available luxury (think of massive files).
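The same no-temp-file pattern works with any producer of text, not just curl. Here's a sketch that uses printf as a stand-in for curl (so it runs without a network connection): the line count comes straight off the pipe, and no file ever touches the disk:

```shell
# printf plays the role of "curl -s some.url" here,
# emitting four lines of toy HTML into the pipe:
printf '<html>\n<head></head>\n<body></body>\n</html>\n' | wc -l
# Prints the count 4 (wc -l may pad it with leading spaces)
```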