spark
Here are 3,975 public repositories matching this topic...
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
- Updated Sep 24, 2019
- 543 commits
- 12 contributors
- Python
Learn and understand Docker technologies, with real DevOps practice!
- Updated Sep 24, 2019
- 996 commits
- 82 contributors
- Go
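A compilation of knowledge on commonly used technical frameworks in the Java ecosystem, open-source middleware, system architecture, databases, architecture case studies from large companies, commonly used third-party libraries, project management, production troubleshooting, personal growth, reflections, and more.
- Updated Sep 24, 2019
- 248 commits
- 1 contributor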
List of Data Science Cheatsheets to rule the world
- Updated Sep 24, 2019
- 48 commits
- 9 contributors
Kubernetes Chinese Guide / Cloud Native Application Architecture Practice Handbook - https://jimmysong.io/kubernetes-handbook
- Updated Sep 24, 2019
- 1 commit
- 84 contributors
- Shell
A Flexible and Powerful Parameter Server for large-scale machine learning
- Updated Sep 24, 2019
- 2 commits
- 39 contributors
- Java
Hey Cube.Js team!
I get the following yarn warnings when installing dependencies. Can these be fixed? Thank you.
warning @cubejs-backend/server-core > @cubejs-backend/schema-compiler > joi@14.3.1: This module has moved and is now available at @hapi/joi. Please update your dependencies as this version is no longer maintained an may contain bugs and security issues.
warning @cubejs-backe
macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
- Updated Sep 23, 2019
- 356 commits
- 14 contributors
- Python
Alluxio, data orchestration for analytics and machine learning in the cloud
- Updated Sep 24, 2019
- 30 commits
- 1,077 contributors
- Java
Open Source Fast Scalable Machine Learning Platform For Smarter Applications: Deep Learning, Gradient Boosting & XGBoost, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
- Updated Sep 24, 2019
- 25 commits
- 127 contributors
- Java
Open-source IoT Platform - Device management, data collection, processing and visualization.
- Updated Sep 24, 2019
- 2 commits
- 76 contributors
- Java
PipelineAI: Real-Time Enterprise AI Platform
- Updated Sep 24, 2019
- 573 commits
- 2 contributors
- Java
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
- Updated Sep 24, 2019
- 459 commits
- 17 contributors
- Python
BigDL: Distributed Deep Learning Library for Apache Spark
- Updated Sep 24, 2019
- 2 commits
- 64 contributors
- Scala
Currently they use sys.process to drop down to the Linux command line.
There are also manual steps that the user is expected to perform,
e.g. mllib/planes.snb talks about downloading a bz2 file, uncompressing it, and copying it to the correct destination.
All well and good, but this doesn't work too well for Windows users.
It would be improved by keeping it pure JVM instead, e.g. by using Apache Commons IO/Compress calls.
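A minimal sketch of the pure-JVM idea, assuming a gzip archive so that only JDK classes are needed. The class and method names (`PortableUnpack`, `gunzipTo`) and the sample data are hypothetical and not taken from the notebooks. For the bz2 file mentioned above, the same pattern should apply with Apache Commons Compress's `BZip2CompressorInputStream` wrapped around the input stream in place of `GZIPInputStream`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class PortableUnpack {

    /**
     * Decompresses a gzip archive into destDir using only JVM APIs,
     * so no shell command (and no Linux-only tool) is involved.
     */
    static Path gunzipTo(Path archive, Path destDir, String outputName) throws IOException {
        Files.createDirectories(destDir);
        Path target = destDir.resolve(outputName);
        try (InputStream in = new GZIPInputStream(Files.newInputStream(archive))) {
            // Streams the decompressed bytes straight to the destination file.
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Build a small archive locally so the sketch runs without a download.
        Path work = Files.createTempDirectory("portable-unpack");
        Path archive = work.resolve("sample.csv.gz");
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(archive))) {
            out.write("year,carrier\n2008,WN\n".getBytes(StandardCharsets.UTF_8));
        }

        Path unpacked = gunzipTo(archive, work.resolve("data"), "sample.csv");
        System.out.print(new String(Files.readAllBytes(unpacked), StandardCharsets.UTF_8));
    }
}
```

Because paths are built with `java.nio.file.Path` rather than shell strings, the same code should behave identically on Windows, macOS, and Linux.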
Coolplay Spark: Spark source code analysis, Spark libraries, and more
- Updated Sep 24, 2019
- 137 commits
- 11 contributors
- Scala
Hello,
I was able to run Python scripts in dev mode using the steps provided in the documentation, but for production I am not sure which folders to keep or what process to follow. By editing the local conf and local sh files and running the server_deploy script, I was able to generate the server jar. But I still had to manually add the Python context and upload my egg file.
Can someone pleas
Flink learning blog. http://www.54tianzhisheng.cn
- Updated Sep 24, 2019
- 166 commits
- 2 contributors
- Java
A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
- Updated Sep 24, 2019
- 522 commits
- 9 contributors
- R
Problem
Since Java 8 was introduced, there is no need to use Joda, as it has been replaced by the native Date-Time API.
Solution
Ideally, grepping and replacing the text should work (mostly).
Additional context
Need to check if de/serializing will still work.
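As a rough illustration of the de/serialization check, the sketch below builds a `java.time` value, notes the Joda-Time calls it would replace in comments, and verifies that the value survives both Java serialization and an ISO-8601 text round trip. The class name `JavaTimeMigrationSketch` and its helpers are hypothetical, and the sketch assumes the project stores timestamps with an offset; it does not cover whatever serializer the project actually uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class JavaTimeMigrationSketch {

    // Joda-Time equivalent: new DateTime(2019, 9, 24, 12, 0, DateTimeZone.UTC)
    static OffsetDateTime sample() {
        return OffsetDateTime.of(2019, 9, 24, 12, 0, 0, 0, ZoneOffset.UTC);
    }

    // Joda-Time equivalent: ISODateTimeFormat.dateTime().print(dt)
    // Note: the Joda formatter always prints milliseconds, while this one
    // omits a zero fraction, so stored text may not match byte for byte.
    static String toIso(OffsetDateTime dt) {
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(dt);
    }

    // Writes the value with Java serialization and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        OffsetDateTime dt = sample();
        System.out.println("binary round trip equal: " + roundTrip(dt).equals(dt));
        System.out.println("text round trip equal:   " + OffsetDateTime.parse(toIso(dt)).equals(dt));
    }
}
```

A find-and-replace handles the type names, but differences like the millisecond formatting are why data already written by the Joda-based code needs a separate compatibility check.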
I am trying to explain the predictions made by my XGBoost model using MMLSpark's LIME package for Scala. This is my first time using the LIME library. I am able to perform a fit operation on the dataset, but when I try to perform the transform operation, the program stops with an exception, "Caused by: java.lang.ClassCastException: org.apache.spark.ml.linalg.SparseVector cannot be cast to org.apac
Fast, Scientific and Numerical Computing for the JVM (NDArrays)
- Updated Sep 19, 2019
- 7 commits
- 80 contributors
- Java
Analysis of the principles behind Spark ML algorithms and of their concrete source-code implementations
- Updated Sep 24, 2019
- 650 commits
- 1 contributor
Validation should be added to directed fields in schemas. This will be done as part of work for version 2 as adding in validation would cause breaking changes.
This template isn't a strict requirement to open issues, but please try to provide as much information as possible.
Version: 3.4.3
Module: quill-codegen-jdbc
Database: postgres
Expected behavior
It should provide a clear example.
Actual behavior
So confusing and cannot follow
Issue Description
I'm unable to run DL4J with the NVIDIA CUDA back end despite following the instructions here:
https://deeplearning4j.org/docs/latest/deeplearning4j-config-gpu-cpu
Project works fine with native back end. When I debug, I can see the service loader finding the JCublasBackend.java class and then failing on isAvailable().
As far as I can tell I've done everything recommende