Curriculum

In reverse chronological order, this is the list of things we've learned (or will soon be learning). In addition to the articles/tutorials listed below, here are a couple of overall references:

Homework: Wednesday, March 11 - Collecting Dallas Officer-Involved Shootings

Collect and parse the Dallas Police Department's officer-involved shooting data and make an interactive map.

Extra Credit: Tuesday, March 10 - The Celebrity (Tw)It List

Finding out who the most-followed users follow on Twitter.

Homework: Tuesday, February 24 - Draft proposal of a final project

Use your computational methods to solve a computational problem of your own choosing.

Extra Credit: Friday, February 20 - Build face-grep in Python

Taking the Unix philosophy to Python and computer vision object-detection algorithms.

An overview of a four-part homework assignment on looking up and comparing lobbying and U.S. Congressional activity.

Text formatting, templates, and loops

How HTML works, and some of its terminology

  • What an HTML element is
  • What a tag is
  • What a tag attribute is
  • How to use Heredocs with HTML
  • Some HTML boilerplate
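
As a rough sketch of how a heredoc can hold HTML boilerplate, the snippet below fills a small template from two shell variables. The variable names and page content are made up for illustration.

```shell
# Hypothetical values to drop into the template
page_title="My First Page"
link_url="http://www.example.com"

# An unquoted heredoc delimiter (EOF) lets Bash expand $variables in the text
html=$(cat <<EOF
<!DOCTYPE html>
<html>
  <head>
    <title>$page_title</title>
  </head>
  <body>
    <h1>$page_title</h1>
    <a href="$link_url">A link</a>
  </body>
</html>
EOF
)

echo "$html"
```

In the output, `<a href="...">A link</a>` is an element, `<a ...>` is its opening tag, and `href` is an attribute of that tag.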

A quick introduction to a few of the useful endpoints of the Spotify music service.

Homework: Friday, February 13 - Analyzing Tweets in CSV form

Connect to the Twitter API, download a user's tweets as CSV, and count frequency of hashtags and words.

An exercise in using the Instagram API to get your own media collection.

A walkthrough of basic API usage, JSON parsing, and ImageMagick magic to create some fun static montages.

Extra Credit: Tuesday, March 10 - 404-Finder

Write a program to auto-detect broken links

Some dev-ops steps needed to get Twitter-related tools onto our Farmshare accounts

  • How to install the twurl and t command-line Twitter programs
  • How to authenticate an app for accessing Twitter's API

Ask what you can do for your country, and what your country can pay you.

Using curl to fetch metadata about a webpage's existence

  • How to follow HTTP redirects with curl
  • How to request only HTTP headers with curl
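
A minimal sketch of the two curl techniques listed above, wrapped in shell functions so nothing is fetched until they are called. The function names and the URL in the usage comment are placeholders, not part of the lesson.

```shell
# -s silences the progress meter; -I (--head) requests only the HTTP headers
fetch_headers() {
  curl -s -I "$1"
}

# -L (--location) follows redirects; -o /dev/null discards the body and
# -w '%{http_code}' prints the status code of the final response
final_status() {
  curl -s -L -o /dev/null -w '%{http_code}\n' "$1"
}

# Usage (needs network access; the URL is only a placeholder):
#   fetch_headers http://www.example.com
#   final_status http://www.example.com
```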

Homework: Tuesday, February 3 - Using baby names to classify names by gender

Use the SSA baby name data to make a naive filter for guessing the gender of a name.

Homework: Friday, January 30 - Basic if-else practice

Practice the logic of if-elif-else conditional branching

How to write a program that can branch into more than one path of execution.

  • How to write an if/elif/else statement
  • How to test if a file exists
  • How to compare strings
  • How to compare numbers
  • How a while-loop works
  • How to make an infinite loop
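
The topics above can be sketched in one short script. All variable names and values here are invented for illustration; a temporary file stands in for whatever file you want to test.

```shell
# Hypothetical inputs
filename=$(mktemp)   # a temporary file that is known to exist
name="Dan"
count=3

# Test if a file exists with -f
if [ -f "$filename" ]; then
  file_status="exists"
else
  file_status="missing"
fi

# Compare strings with =
if [ "$name" = "Dan" ]; then
  greeting="Hi, Dan"
else
  greeting="Who are you?"
fi

# Compare numbers with -gt, -lt, -eq; elif adds more branches
if [ "$count" -gt 5 ]; then
  size="big"
elif [ "$count" -gt 1 ]; then
  size="medium"
else
  size="small"
fi

# A while-loop repeats as long as its test succeeds
n=0
while [ "$n" -lt 3 ]; do
  n=$((n + 1))
done

# "while true" never stops by itself, so break is the only way out
ticks=0
while true; do
  ticks=$((ticks + 1))
  if [ "$ticks" -eq 5 ]; then
    break
  fi
done

echo "$file_status / $greeting / $size / $n / $ticks"
```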

Homework: Tuesday, January 27 - Exploring Congressional Twitter data as JSON

A basic JSON parsing exercise using tweets from members of Congress.

JSON is a lightweight format that is nearly ubiquitous for data-exchange. jq is a command-line tool for parsing JSON.

  • How to install jq
  • Why JSON is used as a text data-format
  • How JSON compares to CSV and plaintext
  • Objects and arrays in JSON
  • Using jq to select attribute-value pairs from JSON objects
  • Using jq to select objects and values from JSON arrays
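
Here is a small sketch of selecting values from objects and arrays, assuming jq is already installed. The JSON document is made up and is not real Twitter data.

```shell
# A made-up JSON document: an object that holds an array of objects
json='{"user": "example_rep", "tweets": [{"id": 1, "text": "Hello"}, {"id": 2, "text": "Vote today"}]}'

# Select the value of one attribute; -r prints the raw string without quotes
user=$(echo "$json" | jq -r '.user')

# Select one element of an array by index, then one of its attributes
first_text=$(echo "$json" | jq -r '.tweets[0].text')

# .tweets[] steps through every element of the array
all_ids=$(echo "$json" | jq '.tweets[].id')

echo "$user"
echo "$first_text"
echo "$all_ids"
```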

Extra Credit: Tuesday, February 10 - Firsts in American baby-naming

Even more practice with text filters, this time to find when baby names first became known.

Extra Credit: Tuesday, January 27 - More analysis of trends in American baby-naming

More practice with text filters to find interesting trends in the SSA baby name data.

Homework: Friday, January 30 - Death Row rows parsing

Collect and aggregate data from three different states' death row listings.

A quick exercise in HTML/CSS selectors and dirty data

  • Using pup with pseudo-CSS selectors to refine the parsing of HTML content
  • The limits of working with HTML data from the command line

Extra Credit: Tuesday, February 17 - Listing the BuzzFeed listicles

Practicing web-scraping and regexes on BuzzFeed listicle titles

Using the pup tool to more sanely extract data from HTML files

  • How to use an HTML parser to select content via HTML/CSS selectors
  • How to extract specific attributes from HTML elements, such as the href of a hyperlink
  • How to extract just the text from HTML elements

Homework: Friday, January 16 - Managing baby names and data projects with GitHub

A sampler project that demonstrates how your code and data should be organized for minimal head-smashing.

  • How to install programs to run from your home directory on corn.stanford.edu
  • How to modify your .bashrc and add to your system PATH
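
A minimal sketch of adding a personal program directory to PATH. The directory `$HOME/bin` is an assumption; use whichever directory you installed your programs into.

```shell
# Hypothetical directory for programs installed under your own account
local_bin="$HOME/bin"

# Put it at the front of PATH so the shell searches it first
export PATH="$local_bin:$PATH"

# To make the change permanent, the same export line would go in ~/.bashrc:
#   echo 'export PATH="$HOME/bin:$PATH"' >> ~/.bashrc

echo "$PATH"
```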

How to design and package your code so that it can be re-used in future scenarios.

  • How to pass arguments into a shell script
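
As an illustration of how a script sees its arguments, the sketch below writes a tiny script to a temporary directory and runs it with two arguments. The script name and the argument values are invented.

```shell
workdir=$(mktemp -d)

# The quoted 'EOF' keeps $1, $2 and $# from being expanded while writing the file
cat > "$workdir/greet.sh" <<'EOF'
# $1 and $2 are the first and second arguments; $# is how many were passed
echo "Hello, $1! You passed $# arguments; the second was $2."
EOF

result=$(bash "$workdir/greet.sh" Stanford 2015)
echo "$result"
```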

After collecting the list of WH Briefings, it's time to get each briefing.

Homework: Thursday, January 22 - Parsing the White House Press Briefings as HTML

Data analysis of all the words used in the White House press briefings

How to wrap your code into a file and run it from the command-line

  • How to save a series of commands into a shell script to be run (later) from the command line

How to multitask from the command-line interface

  • Using the ampersand to put a task into the background
  • How to kill a background task
  • How to use nohup and krenew to run a background task even after logging out
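
A short sketch of starting, checking, and killing a background task. `sleep 30` stands in for any long-running command, and the script name in the nohup comment is hypothetical.

```shell
# The trailing & starts the command in the background and returns the prompt
sleep 30 &

# $! holds the process ID of the most recent background task
pid=$!

# kill -0 sends no signal; it only checks whether the process is still there
if kill -0 "$pid" 2>/dev/null; then
  state_before="running"
else
  state_before="stopped"
fi

# kill ends the task; wait collects it so the shell stops tracking it
kill "$pid"
wait "$pid" 2>/dev/null || true

if kill -0 "$pid" 2>/dev/null; then
  state_after="running"
else
  state_after="stopped"
fi

echo "before: $state_before, after: $state_after"

# nohup lets a task keep running after you log out, for example:
#   nohup ./long_task.sh &
```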

How to save programs as files and execute them with bash

  • How to create shell scripts
  • How to run shell scripts
  • How to use the nano text editor
  • How to make a script executable
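
The steps above might look like the following sketch. Interactively you would type the script into nano; here a heredoc writes the file so the example is self-contained. The file name and its contents are made up.

```shell
workdir=$(mktemp -d)
script="$workdir/hello.sh"

# Create the script file (the quoted 'EOF' writes the text exactly as shown)
cat > "$script" <<'EOF'
#!/bin/bash
echo "Hello from a script"
EOF

# Run it by handing the file to bash...
via_bash=$(bash "$script")

# ...or make it executable and run it by its path
chmod +x "$script"
direct=$("$script")

echo "$via_bash"
echo "$direct"
```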

How to loop, aka designing a program to do repetitive work for you

  • How to write a for loop
  • How to write a read-while loop
  • Pipes and loops
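
The three loop patterns above can be sketched as follows. The lists and lines are invented sample data.

```shell
# A for loop runs its body once per item in the list
visited=""
for state in TX CA FL; do
  visited="$visited$state "
done
echo "Visited: $visited"

# A read-while loop runs its body once per line of input;
# here the input is piped in from printf
labeled=$(printf 'alpha\nbeta\n' | while read -r line; do
  echo "Read: $line"
done)
echo "$labeled"

# The output of a whole loop can itself be piped into another program
loop_count=$(for n in 1 2 3 4; do echo "$n"; done | wc -l)
echo "The loop printed $loop_count lines"
```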

A syntax for describing patterns of text, i.e. Steroids for Grep

  • How to use character classes to match types of characters
  • How to match characters at the beginning of the line
  • How to match characters at the end of the line
  • How to match one-or-more of a type of character
  • How to match any character
  • How to use negated character sets to match any character with exceptions
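
Each pattern type above is shown once in this sketch, using `grep -E` on a few made-up lines. `-c` counts matching lines and `-o` prints only the matched text.

```shell
# Made-up sample lines to search
sample=$(printf '%s\n' 'apple pie' 'Banana 42' '2015 report' 'cat' 'end of line.')

# Character class: lines containing any digit
with_digit=$(echo "$sample" | grep -E -c '[0-9]')

# ^ anchors the match to the beginning of the line (here: starts with a vowel)
starts_vowel=$(echo "$sample" | grep -E -c '^[aeiou]')

# $ anchors the match to the end of the line; \. is a literal period
ends_period=$(echo "$sample" | grep -E -c '\.$')

# + means one or more of the preceding item
numbers=$(echo "$sample" | grep -E -o '[0-9]+')

# . matches any single character
c_any_t=$(echo "$sample" | grep -E -c 'c.t')

# [^...] is a negated set: lines whose first character is NOT a digit
not_digit_first=$(echo "$sample" | grep -E -c '^[^0-9]')

echo "$with_digit $starts_vowel $ends_period $c_any_t $not_digit_first"
echo "$numbers"
```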

The fastest way to search text from the command-line

  • How to print lines from files that match a certain text string
  • How to show lines that don't match a certain text string
  • How to use regular expressions to grep for patterns
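
A brief sketch of the three grep uses above, run against a small made-up file.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'Dallas, TX' 'Austin, TX' 'Palo Alto, CA' 'Miami, FL' > "$workdir/cities.txt"

# Print lines that contain a text string
texas=$(grep 'TX' "$workdir/cities.txt")

# -v inverts the match: print lines that do NOT contain the string
not_texas=$(grep -v 'TX' "$workdir/cities.txt")

# -E turns on extended regular expressions: lines ending in CA or FL
ca_or_fl=$(grep -E '(CA|FL)$' "$workdir/cities.txt")

echo "$texas"
echo "$not_texas"
echo "$ca_or_fl"
```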

Homework: Wednesday, January 14 - Collecting the White House Press Briefings

The first step in analyzing web data is to just collect the webpages.

How to download files straight from the command-line interface

  • How to use curl to transfer files from the Web
  • How to save curl's output into a file
  • How to silence curl's progress indicator
  • The different ways curl's arguments and options can be arranged
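
The sketch below shows three equivalent ways to save a download. So that it runs without a network connection, it points curl at a local `file://` URL as a stand-in for a real `http://` address; the file names are made up.

```shell
workdir=$(mktemp -d)
echo '<h1>Press Briefings</h1>' > "$workdir/page.html"
url="file://$workdir/page.html"   # stand-in for a real web address

# With no options, curl writes what it downloads to stdout,
# so a redirect saves it (the progress meter goes to stderr)
curl "$url" > "$workdir/copy1.html"

# -o names the output file; -s silences the progress meter
curl -s -o "$workdir/copy2.html" "$url"

# Options may also come after the URL and have long spellings
curl "$url" --silent --output "$workdir/copy3.html"
```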

Using variables to refer to data, including the results of a command.

  • How to set variables and recall their values
  • How to do command substitution
  • How to do math in the shell
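
A minimal sketch of the three ideas above; the variable names and values are arbitrary.

```shell
# Set a variable (no spaces around =) and recall its value with $
city="Dallas"
greeting="Hello, $city"

# Command substitution: $(...) is replaced by the output of the command inside
year=$(date +%Y)
word_count=$(echo "one two three" | wc -w)

# Arithmetic expansion: $(( ... )) does integer math
total=$(( 7 * 6 ))
quarter=$(( total / 4 ))   # integer division drops the remainder

echo "$greeting in $year"
echo "$word_count words, total $total, quarter $quarter"
```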

Thinking of programs as independent filters working on streams of data

  • More examples of how to pass text from one program to another
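
One way to picture programs as filters is a chain in which each stage does a single small job. This sketch finds the most common name in a made-up list.

```shell
# Made-up list of names, one per line
names=$(printf '%s\n' Emma Noah Emma Liam Noah Emma)

# sort groups identical lines, uniq -c counts each group,
# sort -rn puts the largest count first, head -n 1 keeps the top line
top=$(echo "$names" | sort | uniq -c | sort -rn | head -n 1)

echo "$top"
```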

One of Unix's most fundamental features: how programs can be joined together through the use of pipes and redirections.

  • How to pipe output from one program to another
  • How to redirect output into a file
  • What is stdin and stdout
  • What is a useless use of cat
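
A short sketch of redirection and piping, including the "useless use of cat" and its replacement. The file name is made up.

```shell
workdir=$(mktemp -d)

# > redirects stdout into a file (overwriting it); >> appends instead
echo "first line" > "$workdir/notes.txt"
echo "second line" >> "$workdir/notes.txt"

# | connects the stdout of one program to the stdin of the next
line_count=$(cat "$workdir/notes.txt" | wc -l)

# The cat above is a "useless use of cat": < feeds the file to stdin directly
line_count_direct=$(wc -l < "$workdir/notes.txt")

echo "$line_count $line_count_direct"
```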

An overview of how commands are interpreted and executed by the shell.

  • How commands read arguments
  • How options are used to modify a command's behavior
  • How commands do not share the same interface
  • How to manage the seemingly limitless things to memorize
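
To make the points above concrete, here is a sketch with a made-up file. It shows an argument, an option that changes behavior, and two commands whose options follow different conventions.

```shell
workdir=$(mktemp -d)
printf '%s\n' pear apple fig > "$workdir/fruit.txt"

# sort takes the file name as an argument
sorted=$(sort "$workdir/fruit.txt")

# The -r option modifies the behavior: reverse the order
reversed=$(sort -r "$workdir/fruit.txt")

# Interfaces differ between commands: head counts lines with -n,
# while cut picks character positions with -c
first_line=$(head -n 1 "$workdir/fruit.txt")
first_letters=$(cut -c 1 "$workdir/fruit.txt")

echo "$sorted"
echo "$reversed"
echo "$first_line"
echo "$first_letters"

# Rather than memorizing options, look them up when needed:
#   man sort
```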

Homework: Wednesday, January 7 - Setup Prep

Setting up our programming toolbox and environment.

An overview of how Bash interprets text, both literally and symbolically.

  • What are literal values
  • How Bash interprets space-separated values
  • The difference between single and double quotes
  • How to reference a variable within a string
  • How to spread a single command across multiple lines for easier reading
  • How to have several commands listed in a single line
  • Using Ctrl-C to return to the interactive prompt
  • The limits of reading space- or line-separated data
  • Heredocs
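
Several of the points above fit in one short sketch. All names and values are invented for illustration.

```shell
name="World"

# Single quotes keep every character literal; double quotes expand variables
single='Hello $name'
double="Hello $name"

# Braces mark where a variable name ends inside a longer string
plural="${name}s"

# Without quotes, Bash splits on spaces: this loop sees two separate words
word_total=0
for word in New York; do
  word_total=$((word_total + 1))
done

# A trailing backslash continues one command on the next line
joined=$(echo "one" \
  "two")

# A semicolon separates several commands on a single line
a=1; b=2

# A heredoc passes a multi-line block of text to a command's stdin
heredoc_lines=$(wc -l <<EOF
line one
line two
EOF
)

echo "$single | $double | $plural | $word_total | $joined | $((a + b)) | $heredoc_lines"
```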

A quick overview of the Bash command-line interpreter and the mechanics of its prompt

  • What is Bash and what is "the shell"
  • What is the prompt