Scraping web pages is a messy, error-prone, and brittle way to get data off the internet, but sometimes it is all you have. I have written a few scrapers and have always wondered what a good scraper setup might look like. In an attempt to scrape as many Gothamist articles as I could while the site was down, I came up with a solution I really liked using Docker, Node, and open source. My traditional scraping approach has been something like: Inspect some HTML in the b...