Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
A Repo For Document AI
Tutorial on how to deskew (straighten) text images
An OCR based document parser to extract information from identity document images
Integrate AI-powered Document Analysis Pipelines
A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GROBID, LangChain, listen as podcast. Customize your own pipelines.
Extract text from your DOCX documents.
A simple library that I use for web scraping. Uses htmlparser2 to parse the DOM.
Resume Parsing app to extract information using AI
The invoice, document, and résumé parser powered by AI.
Ihugure Chatbot Streamlit User Interface
Shubham's REST APIs made at hackNY
Convert documents into quizzes! Built at HackNY (Android + Node.js + Alexa skill)
Who likes lawyers? Me neither; scrub your PII with ShamWow