data-processing

So, we have some existing code over at the PyTorch Ignite project that is actually pretty general and might be really handy to have in DALI: pytorch/ignite#766

In core PyTorch, you can chain transformations and FileIO together easily with the Compose() operation: https://pytorch.org/docs/stable/torchvision/transforms.html#torchvision.transforms.Compose
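As a rough illustration of the chaining idea, here is a minimal plain-Python sketch. The `Compose` class below is a hypothetical stand-in written for this example, not torchvision's implementation, and `fake_load` and `normalize` are made-up steps that only mimic a file-reading stage followed by a transformation.

```python
from typing import Any, Callable, List, Sequence


class Compose:
    """Hypothetical stand-in: applies the given callables in order."""

    def __init__(self, transforms: Sequence[Callable[[Any], Any]]) -> None:
        self.transforms = list(transforms)

    def __call__(self, sample: Any) -> Any:
        # Feed the output of each step into the next one.
        for transform in self.transforms:
            sample = transform(sample)
        return sample


def fake_load(path: str) -> List[int]:
    # Assumption: stands in for a file-reading step; returns dummy pixel
    # values instead of touching the filesystem.
    return [0, 128, 255]


def normalize(pixels: List[int]) -> List[float]:
    # Scale 8-bit values into the range [0, 1].
    return [p / 255 for p in pixels]


pipeline = Compose([fake_load, normalize])
result = pipeline("image.png")
print(result)
```

The point of the sketch is only that I/O and transformations share one calling convention, so they can be listed in a single pipeline.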

Something like

Hello.
I've come across what (to me) seems to be a problem with the FILENAME and FILENUM variables.

# mlr --version
Miller v5.6.2

# cat /tmp/csv1
A,B,C
_2GB,255,2
_4GB,120,4
_6GB,50,6
_10GB,10,10

# cat /tmp/csv2
FIRST,SECOND,THIRD,FOURTH
1,2,3,4
5,6,7,8
9,10,11,12
13,14,15,16

# mlr --icsv cat then put 'print FILENAME'   /tmp/csv1 /tmp/csv2
/tmp/csv1
A=_2GB,B=255,C=2
/

Add a CI test for building the documentation (do not ignore warnings, and add a spellcheck).
Fix docstrings with incorrect or inconsistent Sphinx format. Currently, such issues are treated as warnings during the docs build.

As far as I can tell, the guide in the Orphan Nodes / Chains section is incorrect, or at least not consistent with the current library version.

Consider the following base:

def gen():
    yield 'asda'
    yield 'another1'
    
def upper(x):
    return x.upper()

def show(x):
    print(x)
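For reference, the data flow these callables are meant to express can be sketched in plain Python. This deliberately does not use bonobo; `run_chain` is a hypothetical helper invented for this illustration, and it only shows each yielded item passing through the steps in order.

```python
from typing import Any, Callable, Iterable, List


def gen():
    yield 'asda'
    yield 'another1'


def upper(x):
    return x.upper()


def run_chain(source: Callable[[], Iterable[Any]],
              *steps: Callable[[Any], Any]) -> List[Any]:
    """Hypothetical helper: pass every item from source through each step."""
    results = []
    for item in source():
        for step in steps:
            item = step(item)
        results.append(item)
    return results


collected = run_chain(gen, upper)
for value in collected:
    # Plays the role of the show() sink from the base above.
    print(value)
```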

Chain

We should add an extra step that shows how to publish a message. It can be via the command line but, if so, we should also link to the Elixir APIs for this (ExAWS for SQS, AMQP for RabbitMQ, etc.).

Generally speaking, the guides should be thin on details and include references to the docs for any in-depth topic. For example, on the "Create queue" section for RabbitMQ, we can include the CLI example an

Right now BigQueryIO doesn't offer a way to specify that the tables, when created, should be marked as time partitioned.

Documentation: https://cloud.google.com/bigquery/docs/creating-partitioned-tables

What I would like is something like:

...
                .apply(BigQueryIO.Write
                        .setTimePartitioning(TimePartitioning.Type.DAY)
                        .with

TransformerDecoder.forward: where does self.training come from?
https://github.com/asyml/texar-pytorch/blob/d17d502b50da1d95cb70435ed21c6603370ce76d/texar/torch/modules/decoders/transformer_decoders.py#L448-L449
All arguments should say their types explicitly in the docstring. E.g., what is the type of infer_mode? The [method signature](https://texar-pytorch.readthedocs.

Currently, when an incomplete batch is being filled, the strategy specified by step_to_index_fn is not followed when choosing the next batch or the samples taken from it.

Usage examples of VASPy in Jupyter notebooks are welcome 🙂 I've committed a force information example to VASPy/examples/, see https://github.com/PytLab/VASPy/blob/master/examples/force_info.ipynb

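VASPy users are welcome to add usage examples to VASPy in the form of Jupyter notebooks. I have uploaded an example that extracts the force information from OUTCAR; for details, see https://github.com/PytLab/VASPy/blob/master/examples/force_info.ipynb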

Currently, the SeriesSchema object doesn't validate the index of the schema. The purpose of this task is to extend the __init__ signature of SeriesSchema to take an index argument, which would take a pa.Index or pa.MultiIndex. In the validate / __call__ call, the index should be checked.

Run read and write ends of the conduits concurrently.
When reading a Partition created by <> operations, consume the smaller partitions in parallel.

Database
- alterDatabase - currently, a database is mapped to a namespace in Pulsar; we are unable to store or change metadata of the DB in a key/value manner
View
- listViews - we don't support views in Pulsar on top of a topic
Table
- alterTable - key/value metadata of a topic cannot be stored in Pulsar, right?
- renameTable - can we change the name of a topic after creation?
Partition

The current documentation structure on Read the Docs only covers a small portion of the docstrings. We need to update it to provide a better documentation source. In addition to adding documentation for our own processors and infrastructure, some utilities from Texar, e.g. HParam, can be referenced here.

Here are 382 public repositories matching this topic...

lorien / awesome-web-scraping

NVIDIA / DALI

johnkerl / miller

asyml / texar

python-bonobo / bonobo

onceupon / Bash-Oneliner

dashbitco / broadway

GoogleCloudPlatform / DataflowJavaSDK

microsoft / DialoGPT

GoogleCloudPlatform / data-science-on-gcp

asyml / texar-pytorch

infoslack / awesome-kafka

alttch / rapidtables

benibela / xidel

msamogh / nonechucks

svenkreiss / pysparkling

Yord / pxi

maykulkarni / Machine-Learning-Notebooks

PytLab / VASPy

pandera-dev / pandera

utdemir / distributed-dataset

matousc89 / padasip

tollwerk / data-processing-agreements

thu-coai / cotk

classtag / ijcai18-mama-ads-competition

ColasGael / Machine-Learning-for-Solar-Energy-Prediction

streamnative / pulsar-flink

mech-lang / mech

asyml / forte

unidentifieddeveloper / blaze
