bigdata
Here are 1,367 public repositories matching this topic...
This issue tracks implementation of the Spark ML features listed at https://spark.apache.org/docs/latest/ml-features.
Bucketizer was implemented in dotnet/spark#378, but more features remain to be implemented.
- Feature Extractors
  - TF-IDF
  - Word2Vec (dotnet/spark#491)
  - CountVectorizer (https://github.com/dotnet/spark/p
/kind feature
What happened:
We'd like to schedule jobs only on certain nodes.
What you expected to happen:
We would expect to be able to do this through Volcano by using node selectors.
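As a hedged sketch of what this could look like (the `nodeSelector` placement follows the standard Kubernetes pod spec; the job name, label, and image below are made up for illustration), a Volcano Job whose task pods are pinned to labeled nodes:

```yaml
# Hypothetical example: schedule a Volcano job only on nodes labeled disktype=ssd.
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: selective-job        # made-up name
spec:
  schedulerName: volcano
  tasks:
    - replicas: 2
      name: worker
      template:
        spec:
          nodeSelector:
            disktype: ssd    # example label; any node label would work
          containers:
            - name: worker
              image: busybox # placeholder image
```

The idea is that Volcano would honor the pod template's `nodeSelector` when placing the task's pods, the same way the default scheduler does.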
Describe the ideal solution
We need a new endpoint that functions like a getIntegrationById endpoint.
Describe your use cases
We are currently fetching all integrations via AppSync (or, more specifically, a sub-category of integrations based on integrationType) and iterating until we find the one that matches the integrationId passed.
How frequently would you use such a feature
Although, we
Right now, these aren't caught until we try to gob-encode. Consider failing faster, at type-checking time, to avoid confusion and lost work when a pipeline succeeds under local execution but fails later.
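A minimal Go sketch of the failure mode being described (the `canGobEncode` helper is hypothetical, not part of any real API): gob only reports an unencodable value when `Encode` is actually called, which is why a cheap pre-flight check at type-checking time could surface the problem earlier.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// canGobEncode reports whether a value survives gob encoding.
// Hypothetical pre-flight check: it simply attempts the encode
// into a throwaway buffer and reports success or failure.
func canGobEncode(v interface{}) bool {
	return gob.NewEncoder(&bytes.Buffer{}).Encode(v) == nil
}

func main() {
	// Plain data encodes fine.
	fmt.Println(canGobEncode("wordcount")) // true
	// Channels (and funcs) cannot be gob-encoded at top level,
	// but the error only appears once Encode runs.
	fmt.Println(canGobEncode(make(chan int))) // false
}
```

Running such a check while the pipeline graph is being constructed, rather than at execution time, would move the error to where the user can still see which step caused it.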
Hello,
Considering your amazing efficiency compared with pandas, numpy, and more, it would seem to make sense for your module to work with even bigger data, such as audio (for example .mp3 and .wav). This is something that would help a lot considering the nature of audio (i.e. one of the lowest and most common sampling rates is still 44,100 samples/sec). For a use case, I would consider vaex.open('Hu