# tika
Here are 98 public repositories matching this topic...
Elasticsearch File System Crawler (FS Crawler)
Updated Aug 11, 2020 - Java
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
search
search-engine
distributed-systems
information-retrieval
big-data
spark
solr
web-crawler
nutch
tika
sparkles
Updated May 21, 2020 - Java
A cross-platform command line tool for parallelised content extraction and analysis.
Updated Jul 1, 2020 - Java
Use the Java Tika text extraction library on the .NET platform
Updated Jan 28, 2020 - Rich Text Format
Viewers for statistics and dashboarding of Domain Search Engine data
Updated Jan 19, 2016 - Python
ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest tens of millions of files in place (images, but could be extended to other file types), and to extract metadata and OCR information from those files using Tika and Tesseract OCR.
Updated Aug 26, 2018 - Java
Tika-Similarity uses the Tika-Python package (a Python port of Apache Tika) to compute file similarity based on metadata features.
python
machine-learning
information-retrieval
clustering
tika
cosine-similarity
jaccard-similarity
cosine-distance
similarity-score
tika-similarity
metadata-features
tika-python
Updated Mar 2, 2020 - Python
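The metadata-based similarity idea in the Tika-Similarity entry above can be illustrated with a minimal sketch. This is not the project's actual code: it assumes each file is represented by a plain dictionary of metadata (for example, what tika-python's `parser.from_file(path)["metadata"]` would return) and compares only which metadata keys are present. The function names and the sample dictionaries are hypothetical.

```python
import math


def jaccard_similarity(meta_a, meta_b):
    """Jaccard similarity of the metadata key sets of two files."""
    keys_a, keys_b = set(meta_a), set(meta_b)
    union = keys_a | keys_b
    if not union:
        return 1.0  # assumption: two empty key sets count as identical
    return len(keys_a & keys_b) / len(union)


def cosine_similarity(meta_a, meta_b):
    """Cosine similarity of binary key-presence vectors."""
    keys_a, keys_b = set(meta_a), set(meta_b)
    if not keys_a or not keys_b:
        return 0.0  # a file with no metadata keys has a zero vector
    return len(keys_a & keys_b) / math.sqrt(len(keys_a) * len(keys_b))


# Hypothetical metadata dictionaries standing in for Tika output.
pdf_meta = {"Content-Type": "application/pdf", "Author": "A", "xmpTPg:NPages": "3"}
doc_meta = {"Content-Type": "application/msword", "Author": "B", "Last-Modified": "2020-01-01"}

print(jaccard_similarity(pdf_meta, doc_meta))  # 2 shared keys / 4 distinct keys
print(cosine_similarity(pdf_meta, doc_meta))   # 2 shared keys / sqrt(3 * 3)
```

Comparing key sets ignores the metadata values themselves; a value-aware variant would need a separate feature encoding.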
Interactive Image similarity and Visual Search and Retrieval application
python
metadata
machine-learning
computer-vision
deep-learning
tika
image-viewer
image-recognition
alexnet
image-analysis
usc
chris
jpl
kitware
tika-python
image-similarity
imagespace-quickstart
Updated Apr 21, 2020 - Python
Apache Tika bindings for PHP: extract text and metadata from documents, images and other formats
Updated Aug 9, 2020 - PHP
R Interface to Apache Tika
Updated Apr 24, 2020 - R
Extract and Visualize location from any file
docker
django
solr
tika
extract
geospatial-data
gazetteer
geospatial-processing
geospatial-analytics
geospatial-analysis
tika-server
geoparser
visualize-locations
covid-19
cord19
Updated Jun 5, 2020 - JavaScript
Geographic Place, Date/time, and Pattern entity extraction toolkit along with text extraction from unstructured data and GIS outputters.
Updated Jul 21, 2020 - Java
Apache NiFi Custom Processor Extracting Text From Files with Apache Tika
Updated Aug 9, 2019 - Java
Distributed, fault tolerant batch processing for Natural Language Applications and Search, using remote partitioning
Updated Jan 28, 2020 - Java
A small Pandora's box for prototyping your app with a ready-to-use backend
python
redis
golang
elasticsearch
kibana
rabbitmq
docker-compose
tika
nats
minio
baas
celery
caddy
template-project
dgraph
quick-start
Updated Aug 12, 2020 - Go
Code for Machine Learning with TensorFlow: 2nd Edition Published by Manning Publications
python
machine-learning
tensorflow
machine-learning-algorithms
tika
ml
tensorflow-tutorials
manning-publications
ml-with-tensorflow
Updated Aug 12, 2020 - Jupyter Notebook
Extract text from a document with Apache Tika
Updated Dec 17, 2019 - JavaScript
A suite of Machine Learning / Deep Learning Dockerfiles to allow Apache Tika to extract objects and to produce textual captions for images and video
docker
video
computer-vision
deep-learning
tensorflow
detection
tika
apache
image-captioning
usc
apache-tika
computer-vision-tools
tika-python
usc-data-science
Updated Jul 7, 2018
An Elasticsearch engine plugin for Moodle's Global Search
Updated Jun 15, 2020 - PHP
TYPO3 Extension: solr_file_indexer
Updated May 26, 2020 - PHP