tabular-data

Hi there! I wanted to propose adding the following badge to the README to indicate how many TODO comments are in this codebase:

The badge links to tickgit.com which is a free service that indexes and displays TODO comme

Hi,
I am trying to load a CSV with no header using

df = vaex.open('data/star0000-1.csv',sep=",", header=None, error_bad_lines=False)

but I get

could not convert column 0, error: TypeError('getattr(): attribute name must be string'), will try to convert it to string
Giving up column 0, error: TypeError('getattr(): attribute name must be string')
could not convert column
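
One possible reading of the TypeError above is that loading with header=None leaves integer column labels, which a later attribute lookup cannot handle. That is an assumption; the log itself does not confirm it. The sketch below uses only the Python standard library (not vaex) to illustrate the general workaround of giving a headerless CSV explicit string column names. The helper name read_headerless_csv and the col0, col1, ... naming scheme are hypothetical.

```python
import csv
import io


def read_headerless_csv(text, prefix="col"):
    """Parse header-less CSV text into a dict of string-named columns."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return {}
    width = len(rows[0])
    # Generate string names so no column is keyed by a bare integer.
    names = [f"{prefix}{i}" for i in range(width)]
    columns = {name: [] for name in names}
    for row in rows:
        if len(row) != width:
            # Skip malformed lines (a rough stand-in for error_bad_lines=False).
            continue
        for name, value in zip(names, row):
            columns[name].append(value)
    return columns


sample = "1,2.5,a\n2,3.5,b\nbad,line\n3,4.5,c\n"
table = read_headerless_csv(sample)
print(list(table))
print(table["col0"])
```

All values stay strings here; type conversion is left out to keep the sketch short.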

Hello.
I've come across what (to me) seems to be a problem with the FILENAME and FILENUM variables.

# mlr --version
Miller v5.6.2

# cat /tmp/csv1
A,B,C
_2GB,255,2
_4GB,120,4
_6GB,50,6
_10GB,10,10

# cat /tmp/csv2
FIRST,SECOND,THIRD,FOURTH
1,2,3,4
5,6,7,8
9,10,11,12
13,14,15,16

# mlr --icsv cat then put 'print FILENAME'   /tmp/csv1 /tmp/csv2
/tmp/csv1
A=_2GB,B=255,C=2
/

I'm using Tad Version 0.8.5 (0.8.5) on macOS 10.13.4.

Tad doesn't appear to respond to the system shortcut to hide the current window (Cmd-H). I think it should, for parity with other native macOS apps.

If you cannot find documentation or an example, refer here first to resolve your issue.
We plan to build an official documentation site; until that is completed, please refer to the link below.

I'm using tsv-utils from the Arch Linux AUR, trying to format some word frequency data from the New General Service List dataset. tsv-utils makes at least two errors that I can see when I run this command line:

tsv-select -f 1,7 NGSL+1.01+with+SFI.tsv | tsv-pretty | less

Adding -s 5 to tsv-pretty works around this problem. The tsv file was converted from the file NGSL+1.01+with

Hi,

It would be helpful to have shuffle and shuffle! functions in DataFrames. They are used for randomizing machine learning mini-batches.

What do you think?
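
As a rough illustration of the use case, here is a minimal Python sketch (not Julia, and not the DataFrames.jl API) of a non-mutating and a mutating row shuffle feeding mini-batches. The function names shuffled, shuffle_in_place, and mini_batches are hypothetical, and rows are modelled as a plain list of dicts.

```python
import random


def shuffled(rows, seed=None):
    """Return a new list with the rows in random order (non-mutating)."""
    rng = random.Random(seed)
    out = list(rows)
    rng.shuffle(out)
    return out


def shuffle_in_place(rows, seed=None):
    """Reorder the given list in place (the mutating counterpart)."""
    random.Random(seed).shuffle(rows)
    return rows


def mini_batches(rows, batch_size):
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


data = [{"x": i, "y": i % 2} for i in range(10)]
batches = list(mini_batches(shuffled(data, seed=0), batch_size=4))
print([len(b) for b in batches])
```

Passing a seed makes the order reproducible, which tends to help when comparing training runs.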

While using rows for a project, I could not use the import_from_pdf function after adding rows as a dependency. The solution, ultimately, was to also include PyMuPDF and cached_property as dependencies of my project, which enabled the pdf plugin for rows. This requirement, however, is not clear from the docs alone. It would be desirable to list the dependencies for each plugi

This is a suggested code or documentation change, improvement to the code, or feature request

The package is great and works in most conditions (many thanks for this), but it has also made me lazy: I don't want to wrangle misread PDF pages. Hence the request below.

Provide a parameter or method to specify the number of columns and the start and end coordinates of each column, so that the table is extracted along those boundaries.
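
To make the request concrete, the following is a hypothetical Python sketch of the idea, not the package's API: each word on a page line carries an x position, and the caller supplies explicit (start, end) ranges for every column. The function name split_into_columns and the sample coordinates are invented for illustration.

```python
def split_into_columns(words, boundaries):
    """Group (x, text) pairs into columns given (start, end) x-ranges.

    Words whose x position falls outside every range are dropped.
    """
    columns = [[] for _ in boundaries]
    for x, text in words:
        for index, (start, end) in enumerate(boundaries):
            if start <= x < end:
                columns[index].append(text)
                break
    return columns


# Hypothetical word positions from one line of a PDF page.
line = [(12.0, "Alice"), (140.5, "42"), (260.0, "Berlin")]
bounds = [(0, 100), (100, 200), (200, 300)]
print(split_into_columns(line, bounds))
```

The number of columns is implied by the length of the boundaries list, so a separate column-count parameter would be redundant in this sketch.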

The documentation provides no examples on how to do this and I cannot find tests that cover this feature. From the code I understand more or less how it should work but I'm not sure. Is there some example somewhere?

Overview

This issue contains CLI improvements

Tasks

support headers options
support granular check options
support providing validation config instead of options (like stringified json)

My guess would be:

goodtables --checks "{'duplicate-row': True}" datapackage datapackage.json

But it returns:

goodtables.exceptions.GoodtablesE
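
The exception above is truncated, so its cause is not known. One thing that can be checked independently of goodtables is that the guessed value uses Python literal syntax (single quotes, True), which is not valid JSON. Whether the CLI parses --checks as JSON is an assumption here; the sketch below only shows how to produce and validate a stringified JSON config with the standard library. The helper name is_valid_json is hypothetical.

```python
import json

guess = "{'duplicate-row': True}"   # Python-literal style, as in the guess above
config = {"duplicate-row": True}
as_json = json.dumps(config)        # double quotes and lowercase true


def is_valid_json(text):
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(guess))
print(is_valid_json(as_json))
print(as_json)
```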

Hi,

Thanks for this excellent implementation. I am trying it out now.
Your homepage docs and the Jupyter notebook example say max_epoch=100 but in fact, in the latest install, it is set to 5.

Please reconcile the descriptions. The model is computationally heavy, so it is good to have a realistic expectation of how many epochs will run before one starts fitting :-)

Feature request

What is the expected behavior?

What is motivation or use case for adding/changing the behavior?

How should this be implemented in your opinion?

Are you willing to work on this yourself?
yes

Either on/off or maybe a frequency (e.g. every N epochs)

At present sno reset exists, but it actually does something totally unrelated to git reset: it throws away uncommitted changes in your working tree. That's the equivalent of git checkout .

git reset actually doesn't touch your working tree; it modifies the repository head and possibly the index. So it's a totally different command.
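
The distinction can be sketched with a toy model in Python. This is not git or sno code; the ToyRepo class and its method names are hypothetical, and the model covers only the default (mixed) mode of git reset. Other modes differ: git reset --hard, for instance, does overwrite the working tree.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    """Three places where content lives; each maps path -> content."""
    commits: dict            # commit id -> snapshot (dict of path -> content)
    head: str                # id of the current commit
    index: dict = field(default_factory=dict)
    worktree: dict = field(default_factory=dict)

    def reset(self, commit_id):
        """Like a default (mixed) reset: move head and index, leave worktree."""
        self.head = commit_id
        self.index = dict(self.commits[commit_id])

    def checkout_dot(self):
        """Like `git checkout .`: overwrite worktree files from the index."""
        self.worktree.update(self.index)


repo = ToyRepo(
    commits={"c1": {"a.txt": "v1"}, "c2": {"a.txt": "v2"}},
    head="c2",
    index={"a.txt": "v2"},
    worktree={"a.txt": "local edit"},
)
repo.reset("c1")
after_reset = dict(repo.worktree)   # local edit survives the reset
repo.checkout_dot()                 # local edit is discarded here
print(after_reset, repo.worktree)
```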

In testing sno, I find myself needing a more gittish re

These are the planned activities for 0.3.0
These are in progress in nightly
A few of these may be dropped (as all of these are experimental)

Complete heuristic maturation phase 1: complete r-rudra/tidycells#5
Introduce grammar
%<^% kind of symbols for NNE etc.
Add shinytest prototype as de

Here are 121 public repositories matching this topic...

bvaughn / react-virtualized

vaexio / vaex

johnkerl / miller

antonycourtney / tad

nhn / tui.grid

eBay / tsv-utils

JuliaData / DataFrames.jl

turicas / rows

ropensci / tabulizer

reubano / meza

continuum / active_importer

firmai / deltapy

jrzaurin / pytorch-widedeep

frictionlessdata / goodtables-py

scottrhoyt / SwiftyTextTable

pavankataria / SwiftDataTables

mirador / mirador

csvreader / csvreader

adrienjoly / npm-pdfreader

nirum / tableprint

sdv-dev / TGAN

dreamquark-ai / tabnet

sdv-dev / CTGAN

yubowenok / visflow

DataCanvasIO / DeepTables

olehmberg / winter

koordinates / sno

csvreader / csvpack

aerosol / Tabula

r-rudra / tidycells
