Here are 30 public repositories matching this topic...
A Gtk/Qt front-end to tesseract-ocr.
OCR engine for all the languages
Updated May 19, 2022 · Python
Document Layout Analysis resources repos for development with PdfPig.
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
Updated Apr 2, 2022 · JavaScript
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
Updated Jan 24, 2021 · XSLT
Conversions between various OCR formats
Text Overlay plugin for Mirador 3
Updated May 20, 2022 · JavaScript
Ergonomic line-by-line transcription of scanned text.
Updated Dec 16, 2020 · JavaScript
✏️ Integration of Tesseract for Python using a shared library
Updated Mar 25, 2016 · Python
Probabilistic Key Value pair extraction using word weights from Invoices - Non Searchable PDF
Updated Jun 12, 2021 · Python
The data for guides to breweries across the United States from 1896 to 1918
Updated Feb 12, 2022 · JavaScript
Fly Space-A Facebook flight schedule photo aggregator and processor back-end server.
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
Updated Mar 13, 2022 · Python
A gem that parses positional text from hOCR output and provides convenience methods to find text.
Updated May 18, 2022 · Ruby
Quick and dirty visualization of hOCR bboxes on a page
Python parser for hOCR files using lxml
Updated Aug 23, 2020 · Python
A visual hOCR file editor
Updated Apr 29, 2022 · TypeScript
Updated Dec 8, 2019 · Python
HDaIP.scanner - Historical Document and Information Processing - Scanner
A simple Tesseract 3.02+ hOCR to djvused format converter written in Qt
Perform OCR on images within Nuxeo with Tesseract and hOCR
Tesseract Open Source OCR Engine (main repository)
The data for two editions of the Wilson New York City Business Directories, 1852-1853, and 1861-1862.
tesseract OCR for Clarion
Updated Dec 19, 2019 · Clarion
A visual editor for .hocr files.
Some code segments used for denoising
Updated Dec 8, 2019 · Python
It looks like the function below returns bytes with the value 1 instead of 255, which produces a near-black PNG. For all other filter types it works fine.
Filter: FlateDecode
ColorSpace: DeviceGray
BitsPerComponent: 1
public static byte[] Convert(ColorSpaceDetails details, IReadOnlyList<byte> decoded, int bitsPerComponent, int imageWidth, int imageHeight);
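The symptom above is consistent with 1-bit DeviceGray samples being unpacked to one byte per pixel but not scaled to the 8-bit range, so a white sample stays at 1 instead of becoming 255. The following is a minimal Python sketch of the expected unpack-and-scale step, not PdfPig's actual code. The helper name `expand_gray_samples` is hypothetical, and the sketch assumes the default Decode array, most-significant-bit-first packing, and rows padded to a byte boundary, as the PDF specification describes for image sample data.

```python
def expand_gray_samples(packed: bytes, bits_per_component: int,
                        width: int, height: int) -> bytes:
    """Unpack DeviceGray samples and scale each one to the 0..255 range.

    Hypothetical helper for illustration; assumes the default Decode array
    ([0 1]), MSB-first packing, and byte-aligned rows.
    """
    if bits_per_component not in (1, 2, 4, 8):
        raise ValueError("unsupported BitsPerComponent for this sketch")

    max_value = (1 << bits_per_component) - 1          # 1 for 1-bit images
    row_bytes = (width * bits_per_component + 7) // 8  # each row is byte-aligned
    if len(packed) < row_bytes * height:
        raise ValueError("not enough image data")

    out = bytearray(width * height)
    for y in range(height):
        row_start = y * row_bytes
        for x in range(width):
            bit_offset = x * bits_per_component
            byte = packed[row_start + bit_offset // 8]
            # Samples are stored starting from the high-order bits.
            shift = 8 - bits_per_component - (bit_offset % 8)
            sample = (byte >> shift) & max_value
            # Scaling step: without it, a 1-bit "white" sample stays 1,
            # which renders as near black in an 8-bit grayscale PNG.
            out[y * width + x] = sample * 255 // max_value
    return bytes(out)


# Usage: a 3x2 image at 1 bit per component, one padded byte per row.
pixels = expand_gray_samples(bytes([0b10100000, 0b01000000]), 1, 3, 2)
```

With 8 bits per component the scaling is a no-op, which would explain why only the 1-bit case looks wrong.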