LeanPub
Steps:
- `src/leanpub.py`: Within this script I used BeautifulSoup to scrape the pages and find only the free books. After finding them, I stored them in `BLink.json`.
- `src/leanpub_selenium.py`: Within this script I used Selenium WebDriver to automate adding each book to the shopping cart. You can enable logging in by uncommenting some lines, but the script also works without logging in. The hardest part I solved was changing the scrollbar's value; the hardest part I could not solve was clicking the Continue button to go to the next purchase step. I think this was difficult to handle because the site uses ReactJS. Since I only needed all the items in the shopping cart, I handled the checkout part manually. I then received an email and saved the HTML of the mail page as `leanpub_gmail.html`, because it contains the links to the PDF, EPUB, and MOBI versions of the books.
- `src/leanpub_download.py`: Although the file name contains "download", this script does not download anything. Within this script I used BeautifulSoup to scrape `leanpub_gmail.html` and store all relevant information (the authors, the language of the book, the links to the specific download options, etc.) in `BData.json`.
- `src/leanpub_categorization.py`: Within this script I used BeautifulSoup to scrape the pages of the various categories and update `BData.json` by adding category fields. Don't forget: one book might belong to several categories.
- `src/leanpub_foldering.py`: Within this script I used BeautifulSoup to download the PDF version of every book, using the information previously stored in `BData.json`. I used the categories to build the folders; in other words, after downloading, one book can appear in multiple folders.
Notes:
- This is not a hack: I used some web scraping and web automation techniques in Python to legally download free books from LeanPub. Since everything I downloaded can be downloaded legally, I am not doing anything illegal.
- I am not going to attach `BLink.json`, `BData.json`, or `leanpub_gmail.html`; they are available upon request!