Scrape Web Pages for Data. Author: Paul DuBois
This page lists a bunch of FREE O’Reilly books:
When such a neat list is available, downloading the books in bulk can sometimes be very easy. Luckily, that is indeed the case for this list. Let’s take a closer look.
The first book on the list is The Secrets Behind Great One-on-One Meetings, and its URL points to:
On that page there is a form to fill in to reach the page with the download links. On the download page, the book is available in PDF, MOBI and EPUB formats, via the links:
Notice the similarities between the link on the first page and the download links. With the complete list of books, a few simple transformations, and a loop, you can download the books in bulk.
.csp. The DOM query to find those links can be written as:
We can extract just the href fields from this result, join them with line breaks, and print them as a plain list on the console:
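A minimal sketch of that step, not necessarily the exact query used here. It assumes the book links are `<a>` elements whose `href` ends in `.csp`. The function name `listBookLinks` is made up for illustration.

```javascript
// Collect the href of every matching link and join them with line breaks.
// `root` is anything that offers querySelectorAll, for example `document`
// in a browser console.
function listBookLinks(root) {
  // Assumption: book links are <a> elements whose href ends with ".csp".
  const anchors = root.querySelectorAll('a[href$=".csp"]');
  // Array.from accepts a NodeList as well as a plain array.
  return Array.from(anchors)
    .map((a) => a.href)
    .join('\n');
}

// In a browser console, on the page that lists the books:
// console.log(listBookLinks(document));
```

Taking `root` as a parameter rather than hard-coding `document` keeps the function easy to try outside a browser with a stand-in object.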
Save this in a file, let’s call it
With a little shell scripting, we can loop over this list and make the necessary minor transformations to download all the books, grouped into folders by category:
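The transformation itself can be sketched in JavaScript, leaving the actual downloading to the shell loop. The URL layout below is hypothetical, since the real URLs are not reproduced in this text. It assumes each page URL looks like `https://example.com/<category>/free/<slug>.csp` and that the files sit in a `files` directory beside the page. The names `planDownloads` and `FORMATS` are invented for this example.

```javascript
// The three formats named in the text.
const FORMATS = ['pdf', 'mobi', 'epub'];

// Turn a newline-separated list of page URLs into a list of downloads,
// each with a target folder (the category), a file name, and a file URL.
function planDownloads(listText) {
  const plan = [];
  for (const line of listText.split('\n')) {
    const pageUrl = line.trim();
    if (pageUrl === '') continue; // skip blank lines
    const url = new URL(pageUrl);
    const parts = url.pathname.split('/').filter(Boolean);
    if (parts.length < 2) continue; // no category directory, skip
    // Assumption: the first path segment is the category.
    const category = parts[0];
    // Assumption: the last segment is "<slug>.csp".
    const slug = parts[parts.length - 1].replace(/\.csp$/, '');
    // Assumption: downloads live in "files" next to the page.
    const dir = parts.slice(0, -1).join('/');
    for (const ext of FORMATS) {
      plan.push({
        folder: category,
        file: `${slug}.${ext}`,
        url: `${url.origin}/${dir}/files/${slug}.${ext}`,
      });
    }
  }
  return plan;
}
```

Each entry in the returned plan maps directly onto one download: create `folder` if needed, fetch `url`, and save it as `file` inside that folder. If the real link pattern differs, only the three lines marked as assumptions need to change.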
That’s basically it. Beware: at the time of this writing, the total size of the collection is around 4.5 gigabytes.