bigdata
Here are 1,367 public repositories matching this topic...
This issue tracks implementation of the Spark ML features listed at https://spark.apache.org/docs/latest/ml-features.
Bucketizer was implemented in dotnet/spark#378, but more features remain to be implemented.
- Feature Extractors
  - TF-IDF
  - Word2Vec (dotnet/spark#491)
  - CountVectorizer (https://github.com/dotnet/spark/p
/kind feature
What happened:
We'd like to schedule jobs only on certain nodes.
What you expected to happen:
We would expect to be able to do this through Volcano by using node selectors.
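As a hedged sketch of what this could look like (the `nodeSelector` placement follows the standard Kubernetes pod spec; the job name, label, and image below are made up for illustration), a Volcano Job whose task pods are pinned to labeled nodes:

```yaml
# Hypothetical example: schedule a Volcano job only on nodes labeled disktype=ssd.
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: selective-job        # made-up name
spec:
  schedulerName: volcano
  tasks:
    - replicas: 2
      name: worker
      template:
        spec:
          nodeSelector:
            disktype: ssd    # example label; any node label would work
          containers:
            - name: worker
              image: busybox # placeholder image
```

The idea is that Volcano would honor the pod template's `nodeSelector` when placing the task's pods, the same way the default scheduler does.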
Describe the ideal solution
We need a new endpoint that functions like a getIntegrationById endpoint.
Describe your use cases
We are currently fetching all integrations via AppSync (or, more specifically, a sub-category of integrations based on integrationType) and iterating until we find the one that matches the integrationId passed.
How frequently would you use such a feature
Although, we
Right now, these aren't caught until we try to gob-encode. Consider failing faster, at type-checking time, to avoid confusion and lost work when a pipeline succeeds under local execution but fails later.
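A minimal Go sketch of the failure mode being described (the `canGobEncode` helper is hypothetical, not part of any real API): gob only reports an unencodable value when `Encode` is actually called, which is why a cheap pre-flight check at type-checking time could surface the problem earlier.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// canGobEncode reports whether a value survives gob encoding.
// Hypothetical pre-flight check: it simply attempts the encode
// into a throwaway buffer and reports success or failure.
func canGobEncode(v interface{}) bool {
	return gob.NewEncoder(&bytes.Buffer{}).Encode(v) == nil
}

func main() {
	// Plain data encodes fine.
	fmt.Println(canGobEncode("wordcount")) // true
	// Channels (and funcs) cannot be gob-encoded at top level,
	// but the error only appears once Encode runs.
	fmt.Println(canGobEncode(make(chan int))) // false
}
```

Running such a check while the pipeline graph is being constructed, rather than at execution time, would move the error to where the user can still see which step caused it.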
Hello,
Considering your amazing efficiency compared with pandas, numpy, and more, it would seem to make sense for your module to work with even bigger data, such as audio (for example .mp3 and .wav). This is something that would help a lot considering the nature of audio (i.e. one of the lowest and most common sampling rates is still 44,100 samples/sec). For a use case, I would consider vaex.open('Hu