# hadoop
Here are 2,178 public repositories matching this topic...
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Updated Jul 15, 2020 - Python
Eclipse Deeplearning4j, ND4J, DataVec and more - deep learning & linear algebra for Java/Scala with GPUs + Spark
Topics: python, java, clojure, scala, spark, hadoop, gpu, intellij, linear-algebra, artificial-intelligence, deeplearning, neural-nets, dl4j, matrix-library, deeplearning4j
Updated Jul 6, 2020 - Java
Open Source Fast Scalable Machine Learning Platform For Smarter Applications: Deep Learning, Gradient Boosting & XGBoost, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Topics: python, java, data-science, machine-learning, multi-threading, opensource, r, big-data, spark, deep-learning, hadoop, random-forest, gpu, naive-bayes, h2o, distributed, pca, gbm, ensemble-learning, automl
Updated Jul 17, 2020 - Jupyter Notebook
Alluxio, data orchestration for analytics and machine learning in the cloud
Topics: spark, presto, hadoop, tensorflow, data-analysis, alluxio, memory-speed, data-orchestration, virtual-distributed-filesystem
Updated Jul 17, 2020 - Java
BigDL: Distributed Deep Learning Framework for Apache Spark
Updated Jul 14, 2020 - Scala
Apache Ignite
Topics: iot, cloud, sql, database, big-data, hadoop, cache, osgi, ignite, network-server, in-memory-database, data-management-platform, network-client, distributed-sql-database, in-memory-computing
Updated Jul 17, 2020 - Java
AI on Hadoop
Updated May 20, 2020 - Java
A large-scale entity and relation database supporting aggregation of properties
Updated Jul 17, 2020 - Java
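A Spark-based movie recommendation system, comprising a web crawler project, a website, a back-office administration system, and a Spark recommendation system.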
Updated Apr 1, 2019 - Java
Home of the community managed version of Presto, the distributed SQL query engine for big data, under the auspices of the Presto Software Foundation.
Topics: java, distributed-systems, data-science, sql, database, big-data, presto, hive, hadoop, jdbc, databases, distributed-database, query-engine, datalake, prestodb, distributed-databases
Updated Jul 17, 2020 - Java
A massive-scale data computing product for production environments. Documentation:
Updated Jul 17, 2020 - Java
450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Mesos, Riak, MongoDB, Memcached, Couchbase, CouchDB, Neo4j, Ambari, Cloudera, Hortonworks, MapR. Products: Attivio, Blue Talon, Datameer, H2O, WanDisco, Zaloni
Topics: docker, kubernetes, redis, jenkins, elasticsearch, kafka, cassandra, hadoop, rabbitmq, consul, travis-ci, solr, cloudera, hbase, zookeeper, nagios-plugins, ambari, hortonworks, hbase-client, datastax
Updated Jul 15, 2020 - Python
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Updated May 12, 2020 - Java
Apache Hadoop docker image
Updated Jun 28, 2020 - Shell
dragon2611 commented Feb 1, 2020
Not sure if it's worth adding to the documentation, but I found I had to add the following to the end of the mount options in fstab; otherwise the system would not boot cleanly.
,x-systemd.mount-timeout=30,_netdev
The line in fstab now looks like
mfsmount /mnt/mymnt fuse mfssubfolder=mydir,allow_other,x-systemd.mount-timeout=30,_netdev
I think it was trying to mount mooseFS before it had fini
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Updated Jul 1, 2020 - Java
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Topics: workflow, airflow, spark, hive, hadoop, etl, kettle, hue, tableau, flink, zeppelin, griffin, azkaban, governance, davinci, visualis, supperset, linkis, scriptis, dataworks
Updated Jul 13, 2020 - Java
50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Mesos, Consul, Riak, OpenTSDB, Jython, Advanced Nagios Plugins & DevOps Tools repos on Alpine, CentOS, Debian, Fedora, Ubuntu, Superset, H2O, Serf, Alluxio / Tachyon, FakeS3
Topics: linux, docker, kubernetes, kafka, spark, cassandra, presto, hadoop, rabbitmq, consul, solr, docker-image, hbase, zookeeper, nagios-plugins, dockerhub, nifi, rabbitmq-cluster, apache-drill, solrcloud
Updated Jun 5, 2020 - Shell
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Topics: data-science, machine-learning, spark, apache-spark, deep-learning, hadoop, tensorflow, keras, keras-models, optimization-algorithms, data-parallelism, distributed-optimizers
Updated Jul 25, 2018 - Python
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Updated Jul 9, 2020 - Java
The AlexNet implementation in TensorFlow has an incomplete architecture: 2 convolutional layers are missing. This issue refers to the Python notebook linked below.
https://github.com/donnemartin/data-science-ipython-notebooks/blob/master/deep-learning/tensor-flow-examples/notebooks/3_neural_networks/alexnet.ipynb