Monday - Parsing of raw text
Wednesday - Parsing of structured text, including HTML and JSON
I'll keep using the word "parser" without fully explaining it. Today we'll be looking at HTML parsing, but that is only the first step toward understanding how many other data structures are parsed.
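To make "parser" a little more concrete, here is a minimal sketch using only Python's standard library. It turns a string of HTML and a string of JSON into data you can work with. The sample strings, the LinkCollector class, and the extract_links function are made up for illustration; they are not part of the course materials.

```python
import json
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Parse html_text and return the href values of its <a> tags, in order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Made-up sample inputs, for illustration only
sample_html = '<p>See <a href="http://example.com/a">A</a> and <a href="/b">B</a>.</p>'
sample_json = '{"course": "example", "days": ["Monday", "Wednesday"]}'

links = extract_links(sample_html)  # HTML text -> list of link targets
data = json.loads(sample_json)      # JSON text -> Python dict

print(links)
print(data["days"])
```

In both cases the input is just text. The parser's job is to recover the structure hiding in that text, so that you can ask for "the links" or "the days" instead of searching character by character.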
A nice way to examine the concept of parsing, in general, is to look at how human language is parsed. Check out the Stanford Parser and play around with it.
Here are several relevant guides that I've created for this week:
corn.stanford.edu
so that you can (sanely) complete future web scraping projects. If you want, you can try installing some of the other programs listed, including gifify, a command-line movie-to-GIF maker: