kennethreitz / requests-html

9.3k

Pythonic HTML Parsing for Humans™

html scraping python requests http kennethreitz lxml pyquery css-selectors beautifulsoup

HTML Updated Mar 15, 2019 3 issues need help

gawel / pyquery

1.7k

A jquery-like library for python

python python3 jquery lxml css

Python Updated Nov 16, 2018

AlexMathew / scrapple

431

A framework for creating semi-automatic web content extractors

python css-selector xpath-expression web-scraper web-scraping scrapers scraping scrapy selector extractor crawler selector-expression tutorial lxml beautifulsoup

Python Updated Jan 7, 2019

scrapy / parsel

401

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

python lxml xpath xml selectors css scraping

Python Updated Jan 22, 2019

hchasestevens / xpyth

112

A module for querying the DOM tree and writing XPath expressions using native Python syntax.

xpath-expression python comprehension python-comprehension-syntax lxml dsl metaprogramming

Python Updated Jun 13, 2018

bomquote / transistor

105

Transistor, a Python web scraping framework for intelligent use cases.

python-3 scraping requests lxml beautifulsoup4 headless-browsers mechanicalsoup framework

Python Updated Mar 22, 2019

MilesCranmer / gso

58

🏃 Automatic programming in Vim. Works by copy-pasting Stack Overflow answers.

stackoverflow vim developer-tools google python lxml automation editor-plugin autocomplete

Vim script Updated Jan 15, 2019

ksator / python-training-for-network-engineers

58

Python hands-on training for network engineers. How to automate Junos with Python

python pyez junos-automation jinja2 json yaml rest-api netconf ncclient lxml napalm

Python Updated Oct 18, 2018

codelv / enaml-web

34

Build interactive websites with enaml

enaml lxml python web-components web

Jupyter Notebook Updated Mar 21, 2019

shuizhubocai / crawler

33

requests + lxml crawler, a simple crawler architecture

requests lxml

Python Updated Aug 23, 2018

Harut / chakert

29

Python typography enhancer tool for lxml-based HTML and raw text

typography hyphens lxml double-quotes python html

Python Updated Feb 28, 2017

PhantomInsights / mexican-jobs

29

Reddit bots, web scraper and utility scripts used to perform EDA on thousands of job listings from the official Mexic…

python3 reddit-bot pandas matplotlib lxml praw seaborn

Python Updated Feb 26, 2019

sachin-bisht / Instagram_Stalker_Scraper

26

(UNMAINTAINED) Fetch data of any public Instagram profile, without using api

instagram-scraper instagram python3 requests json lxml matplotlib wget os infinite-scroll

Python Updated Oct 19, 2018

rohitthapliyal2000 / codechef-rank-comparator

21

Web application hosted on the Heroku cloud platform, based on web scraping in Python using the lxml library and XPath (XML Path Language).

python3 web-scraping codechef-crawler lxml xpath flask-application html5 heroku-deployment

Python Updated Jul 16, 2018

5hirish / tweet_scrapper

20

Scrape the Twitter frontend API without any authentication or restrictions.

python3 twitter scrapper lxml

Jupyter Notebook Updated Nov 6, 2018

Boneflame / gpipe43

19

A full-text RSS generator that can be hosted on Google App Engine

rss rss-generator google-appengine python regex google-cloud google-cloud-platform google-cloud-storage python27 webapp2 webapp2-framework urllib2 xpath lxml chardet

Python Updated Nov 25, 2018

iHealth-ecnu / iHealth_crawler

18

Content crawler for the iHealth project (a medical consultation crawler based on Python and MongoDB)

python requests pymongo lxml

Python Updated Jan 2, 2018

jurismarches / chopper

16

Chopper is a tool to extract elements from HTML by preserving ancestors and CSS rules

lxml python beautifulsoup scraping html css

Python Updated Jun 5, 2018

CompileInc / hodor

16

🕷Configuration based html scraper

hodor python scraping pagination lxml cssselect html-scraper

Python Updated Feb 28, 2019

scrapehero / yellowpages-scraper

15

Yellowpages.com Web Scraper written in Python and LXML to extract business details available based on a particular ca…

business-directory yellow-pages scraper lxml web-scraper python yellow-pages-scraper html parsing extract

Python Updated Oct 30, 2018

shadz3rg / ru_address

15

Full conversion of FIAS XML files to MySQL: schema, data, keys.

mysql xml lxml fias converter

Python Updated Oct 18, 2018

rohitthapliyal2000 / Amazon-Mobile-Sentiment-Analysis

13

Opinion mining of mobile reviews on the Amazon platform

python3 xpath lxml nltk-library xml web-crawling infinite-scrolling naive-bayes-classifier sentiment-analysis machine-learning

Python Updated Mar 8, 2018

pushrbx / python3-mal

12

Python interface to MyAnimeList

myanimelist lxml parsing myanimelist-api mal mal-api

Forked from shaldengeki/python-mal Python Updated Aug 11, 2018

jadbin / serlist

12

Search engine results page scraper

search-engine-scraper lxml

Python Updated Dec 19, 2018

krober / bapcs-stock-checker

10

Reddit price and stock checker bot - replies with useful info, and saves data for later analysis

praw reddit python python3 sqlalchemy lxml requests web-scraping

Python Updated Jun 29, 2018

lb2281075105 / LBDuoDian

9

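"Crawling all products of the Duodian (多点) mall". Disclaimer: if this infringes on any company's rights, please tell me promptly and I will immediately delete the crawled site-wide product data. Analyzes Duodian mall product information and crawls the product information of the entire site. 1. Analysis of the Duodian mall's characteristics; 2. The crawling approach used; 3. Parsing the crawled data (the main focus).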

python3 selenium-webdriver urllib request json pymysql lxml jsonpath ssl-certificates mysql ssl python-3-6 python2

PLpgSQL Updated Feb 3, 2018

msdeep14 / stayUpdated

9

Get the latest updates from aitplacements.com through SMS and desktop notifications

python-script lxml sinchsms scheduled-notifications sms way2sms-api python way2sms notifications desktop-notifications notify2

Python Updated Jan 6, 2018

Sarath18 / terrain_generator

8

A wizard that generates terrains for Gazebo using height maps.

terrain generator python gazebo simulation heightmaps heightmap textures lxml xml automaitc auto surface elevation model world

Python Updated Jul 19, 2018

mnixry / Tieba2MD

8

A simple crawler that helps you convert Baidu Tieba posts to Markdown format

baidu-tieba spider python3 lxml

Python Updated Mar 15, 2019

lorien / selection

8

API to extract data from HTML and XML documents

dom xpath xpath-api html xml lxml query-api

Python Updated Aug 7, 2018

lxml

Repositories 137