bigdata
Here are 1,030 public repositories matching this topic...
A curated list of awesome big data frameworks, resources and other awesomeness.
- Updated Oct 17, 2019 - 469 commits
Hello,
While following the docs (https://docs.vaex.io/en/latest/ml.html), I've encountered the error stated in the title while trying the code in cell [7].
Responsible line:
xgmodel = vaex.ml.xgboost.XGBModel(features=features, num_round=10, param=param)
An easy-to-use BI server built for SQL lovers. Power data analysis in SQL and gain faster business insights.
- Updated Oct 16, 2019 - 196 commits
- Java
Distributed Big Data Orchestration Service
- Updated Oct 17, 2019 - 2 commits
- Java
When a new version of Apache Spark is released, we have to change the code base to add any new APIs, account for protocol changes on the worker side, etc. We should document this process so that anyone can upgrade .NET for Apache Spark to newer versions.
Success Criteria
- Clearly call out code changes with examples
- Capture caveats one might encounter during this upgrade process
The Programming Language Designed For Big Data and AI
- Updated Oct 16, 2019 - 2 commits
- JavaScript
C# and F# language binding and extensions to Apache Spark
- Updated Oct 12, 2019 - 1 commit
- C#
On the home page of the website, https://nlp.johnsnowlabs.com/, I read "Full Python, Scala, and Java support".
Unfortunately, I have been trying to use Spark NLP in Java for 3 days now without any success.
- I cannot find the Java API (JavaDoc) of the framework.
- Not even a single example in Java is available.
- I do not know Scala, and I do not know how to convert things like:
val testData = spark.createDataFrame(
Because some users have had problems configuring these services, it could be helpful to make some examples or videos showing how to properly set up Optimus in these services.
Lightweight real-time big data streaming engine over Akka
- Updated Oct 16, 2019 - 2 commits
- Scala
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
- Updated Oct 15, 2019 - 48 commits
- Jupyter Notebook
Google, Naver multiprocess image web crawler (Selenium)
- Updated Oct 16, 2019 - 54 commits
- Python
Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature
Description:
There are several features in volcano right now: not only scheduling algorithms but also job management. It would be better to manage those features by version, e.g. Alpha, Beta, GA; that would help users evaluate which features should be enabled in their environment.
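One way to picture the request is a small feature-gate table that tags each feature with a maturity stage. The sketch below is only an illustration in Python, not volcano's code: the feature names are hypothetical, and the default policy (Alpha off, Beta and GA on) is an assumption borrowed from common practice rather than anything stated in the issue.

```python
from enum import Enum
from typing import Dict, Optional


class Stage(Enum):
    ALPHA = "alpha"
    BETA = "beta"
    GA = "ga"


# Hypothetical feature names, chosen only for illustration.
FEATURES: Dict[str, Stage] = {
    "experimental-scheduling-algorithm": Stage.ALPHA,
    "job-management": Stage.BETA,
    "basic-scheduling": Stage.GA,
}


def is_enabled(feature: str, overrides: Optional[Dict[str, bool]] = None) -> bool:
    """Return whether a feature is on, honoring explicit user overrides."""
    if overrides and feature in overrides:
        return overrides[feature]
    # Assumed policy: Alpha features are off by default, Beta and GA are on.
    return FEATURES[feature] is not Stage.ALPHA
```

A user evaluating a cluster could then opt in to a single Alpha feature with `is_enabled("experimental-scheduling-algorithm", {"experimental-scheduling-algorithm": True})` while leaving the other defaults untouched.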
A Kubernetes batch scheduler for high-performance workloads, e.g. AI/ML, BigData, HPC
- Updated Oct 16, 2019 - 1 commit
- Go
@JeanPaulShapo, the current installation instructions recommend installing Python packages with sudo pip install, which isn't a good idea. It may be worth replacing it with pip install --user or a virtualenv.
d3 library to build circular graphs
- Updated Oct 11, 2019 - 282 commits
- JavaScript
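Study notes, along with eBooks and video resources I have gone through, and blogs, websites, and tools I have collected and consider worthwhile. Covers the major big data components, Python machine learning and data analysis, Linux, operating systems, algorithms, networking, and more.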
- Updated Oct 17, 2019 - 146 commits
- Python
A book about running Elasticsearch
- Updated Oct 15, 2019 - 73 commits

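Tables created from an STable must be deleted first before the super table itself can be deleted. When there are many child tables this becomes tedious, unless you write code to delete them.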
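The "write code to delete them" workaround could look roughly like the sketch below, which only generates the statements in the required order (child tables first, super table last). It assumes the `DROP TABLE IF EXISTS` form is accepted for both kinds of table, which may differ between TDengine versions, and it assumes the list of child table names is obtained elsewhere. The helper name `drop_statements` and the table names in the usage note are hypothetical.

```python
from typing import Iterable, List


def drop_statements(stable: str, child_tables: Iterable[str]) -> List[str]:
    """Return SQL that drops every child table, then the super table."""
    # Child tables go first: the super table cannot be dropped while they exist.
    statements = [f"DROP TABLE IF EXISTS {name};" for name in child_tables]
    # Assumption: the same statement form works for the super table;
    # check the syntax your TDengine version expects.
    statements.append(f"DROP TABLE IF EXISTS {stable};")
    return statements
```

For example, `drop_statements("meters", ["d1001", "d1002"])` yields two child-table drops followed by the drop for `meters`; the statements can then be run through whichever client connection is in use.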