bigdata

Hello,
Considering your amazing efficiency with pandas, numpy, and more, it would seem to make sense for your module to work with even bigger data, such as audio (for example, .mp3 and .wav). This would help a lot given the nature of audio (i.e., one of the lowest and most common sampling rates is still 44,100 samples/sec). For a use case, I would consider vaex.open('Hu
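
To give a rough sense of the data volume behind the 44,100 samples/sec figure, here is a minimal sketch of the arithmetic in plain Python. The function names, the three-minute duration, the stereo channel count, and the 16-bit sample width are all assumptions chosen for illustration; none of them come from the request, and nothing here uses or implies a vaex API.

```python
def pcm_sample_count(duration_s: float, sample_rate: int = 44_100, channels: int = 2) -> int:
    """Total number of samples in uncompressed PCM audio.

    sample_rate defaults to the 44,100 samples/sec mentioned above;
    channels=2 (stereo) is an illustrative assumption.
    """
    return int(round(duration_s * sample_rate * channels))


def pcm_size_bytes(duration_s: float, sample_rate: int = 44_100,
                   channels: int = 2, bytes_per_sample: int = 2) -> int:
    """Approximate raw size in bytes; bytes_per_sample=2 assumes 16-bit samples."""
    return pcm_sample_count(duration_s, sample_rate, channels) * bytes_per_sample


if __name__ == "__main__":
    # A hypothetical three-minute stereo track.
    print(pcm_sample_count(180))  # 15876000 samples
    print(pcm_size_bytes(180))    # 31752000 bytes
```

Under these assumptions a single short track already holds millions of samples, which is the scale the request is pointing at.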

This is to track implementation of the ML-Features: https://spark.apache.org/docs/latest/ml-features

Bucketizer has been implemented in dotnet/spark#378 but there are more features that should be implemented.

Feature Extractors
- TF-IDF
- Word2Vec (dotnet/spark#491)
- CountVectorizer (https://github.com/dotnet/spark/p

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:

Automatically set GOMAXPROCS to match Linux container CPU quota, xref https://github.com/uber-go/automaxprocs

Overview:

Developers may run mage commands against multiple different versions of panther.

The mage logs do not specify the version of panther.

If the logs included the git commit, we could be sure of the command context.

Example:

Specifically, I am running integration tests for release testing. I pulled the repo and got successful integration test results. Pull resulted in updates to

Right now, these aren't caught until we try to gob-encode. Consider failing faster in type-checking to avoid too much confusion/loss when it works with local execution.

Here are 1,402 public repositories matching this topic...

taosdata / TDengine

onurakpolat / awesome-bigdata

heibaiying / BigData-Notes

vaexio / vaex

wangzhiwubigdata / God-Of-BigData

douban / dpark

DTStack / flinkx

apache / avro

shzlw / poli

apache / hudi

dotnet / spark

volcano-sh / volcano

Netflix / genie

DTStack / flinkStreamSQL

griddb / griddb

jadianes / spark-py-notebooks

allwefantasy / mlsql

ironmussa / Optimus

YoongiKim / AutoCrawler

microsoft / Mobius

panther-labs / panther

CheckChe0803 / BigData-Interview

kubernetes-sigs / kube-batch

Dr11ft / BigDataGuide

gearpump / gearpump

jadianes / spark-movie-lens

josonle / Coding-Now

fdv / running-elasticsearch-fun-profit

bigartm / bigartm

grailbio / bigslice
