Updated May 17, 2020 - Java
dataframe
Here are 301 public repositories matching this topic...
Hi,
I am trying to load a CSV with no header using
df = vaex.open('data/star0000-1.csv',sep=",", header=None, error_bad_lines=False)
but I get
could not convert column 0, error: TypeError('getattr(): attribute name must be string'), will try to convert it to string
Giving up column 0, error: TypeError('getattr(): attribute name must be string')
could not convert column
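One possible explanation, offered as an assumption rather than a diagnosis: reading with header=None in pandas labels the columns with integers (0, 1, ...), and code that later looks columns up by attribute name expects strings. A minimal standard-library sketch of the workaround idea, giving every column a string name at load time, could look like this. The function name and the "col" prefix are hypothetical and not part of vaex.

```python
import csv
import io


def read_headerless_csv(text, prefix="col"):
    """Parse CSV text without a header row and label the columns with strings."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return {}
    width = len(rows[0])
    # String labels such as "col0" instead of the integers 0, 1, 2, ...
    names = [f"{prefix}{i}" for i in range(width)]
    columns = {name: [] for name in names}
    for row in rows:
        if len(row) != width:
            # Skip malformed lines, loosely mirroring error_bad_lines=False.
            continue
        for name, value in zip(names, row):
            columns[name].append(value)
    return columns


sample = "1.0,2.0,3.0\n4.0,5.0,6.0\n7.0,8.0\n"
columns = read_headerless_csv(sample)
```

With pandas itself, the equivalent idea would be to pass explicit string names through the names argument of read_csv alongside header=None.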
Plotly has an out of the box responsive flag: https://plotly.com/javascript/configuration-options/#making-a-responsive-chart
We just need to add a boolean responsive to the Config class.
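As a rough sketch of the idea, assuming the Config class is simply serialised to the JSON object that Plotly.js receives as its config argument: the Config below is a hypothetical stand-in, not the project's actual class, and only the responsive key comes from the Plotly.js documentation linked above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Config:
    """Hypothetical stand-in for the wrapper's Config class."""

    # Maps to the Plotly.js config option `responsive`.
    responsive: bool = False

    def to_json(self) -> str:
        # Serialise the options into the object handed to Plotly.js.
        return json.dumps(asdict(self))


config_json = Config(responsive=True).to_json()
```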
Series.reindex
Implement Series.reindex.
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.reindex.html
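For reference, the core behaviour to reproduce is: conform the values to a new index, keep values whose labels survive, and fill labels that were not present. A minimal pure-Python sketch, assuming unique labels and using a hypothetical list-based signature rather than any real Series class:

```python
def reindex(index, values, new_index, fill_value=None):
    """Conform (index, values) to new_index, filling labels that are missing."""
    # Assumes labels in `index` are unique.
    lookup = dict(zip(index, values))
    return [lookup.get(label, fill_value) for label in new_index]


result = reindex(["a", "b", "c"], [1, 2, 3], ["c", "a", "z"])
filled = reindex(["a", "b", "c"], [1, 2, 3], ["c", "a", "z"], fill_value=0)
```

Note that pandas fills missing labels with NaN by default; None is used here only to keep the sketch dependency-free.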
Support the error functions and Fresnel integrals listed at https://docs.scipy.org/doc/scipy/reference/special.html#error-function-and-fresnel-integrals. Those that are not universal functions may not need to be supported.
The documentation file appears to have been generated with no space between the hashes and the header text. As a result, the headers do not render correctly and the file is difficult to read. See below for an example with and without the space:
##
Mobius API Documentation
###Microsoft.Spark.CSharp.Core.Accumulator</
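If regenerating the file is not an option, a post-processing pass could insert the missing space. This is only a sketch under the assumption that every line starting with one to six hashes is meant to be a header; the function name is made up for illustration.

```python
import re

# One to six leading hashes followed directly by a non-space, non-hash character.
_HEADER = re.compile(r"^(#{1,6})(?=[^#\s])", flags=re.MULTILINE)


def add_header_space(markdown: str) -> str:
    """Insert the space Markdown requires between the hashes and the header text."""
    # Caveat: this also touches lines such as "#include" inside code samples.
    return _HEADER.sub(r"\1 ", markdown)


broken = (
    "##Mobius API Documentation\n"
    "###Microsoft.Spark.CSharp.Core.Accumulator\n"
    "## Already fine\n"
)
fixed = add_header_space(broken)
```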
Updated Jan 6, 2019 - Python
Updated May 13, 2020 - C++
Hi, would it be possible to make the user warnings display only when using pipes that actually depend on these imports? Or at least display them in a way that allows filtering out (with logging package perhaps)?
It's just a minor flaw in an otherwise great package. Awesome work!
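In the meantime, if the warnings are raised through the standard warnings module (an assumption about the package's internals), they can be filtered on the user side. The message text below is hypothetical and stands in for whatever the package actually emits.

```python
import warnings


def optional_import_warning():
    # Hypothetical stand-in for the warning raised at import time.
    warnings.warn(
        "optional dependency missing; some stages are unavailable", UserWarning
    )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Ignore only UserWarnings whose message starts with this text.
    warnings.filterwarnings(
        "ignore", message="optional dependency missing", category=UserWarning
    )
    optional_import_warning()

suppressed = len(caught) == 0
```

For this to work with a real import, the filter has to be installed before the package is imported.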
janitor.biology could do with a to_fasta function, I think. The intent here would be to conveniently export a dataframe of sequences as a FASTA file, using one column as the fasta header.
strawman implementation below:
import pandas_flavor as pf
from Bio.SeqRecord import SeqRecord
from Bio.Seq import Seq
from Bio import SeqIO
@pf.register_dataframe_method
def to_fasta(d
It would be good to have Gradle write the version number (and other build info) to the manifest of the executor jar so that the executor can print its version number on startup.
This applies to both the JVM and Spark executors.
Any plans to get this into DefinitelyTyped?
Originally posted by @danielgwilson in Gmousse/dataframe-js#43 (comment)
improve csv import
* fix column header issues in preview
* handle arbitrary whitespace
Updated May 27, 2019 - Python
Updated May 6, 2020 - Python
Hello,
I haven't tested append() yet, and I was wondering whether duplicates are removed when an append is performed.
I had a look at the collection.py script, and the following pandas function is used:
combined = dd.concat([current.data, new]).drop_duplicates(keep="last")
After looking at the pandas documentation, I understand that duplicate rows are removed and only the last occurrence is kept.
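To make that reading concrete, here is a small pure-Python sketch of what keep="last" means when existing and new rows are concatenated. Rows are plain tuples and the function name is made up; note that pandas compares column values and ignores the index by default, so this only illustrates the value-based behaviour.

```python
def drop_duplicates_keep_last(rows):
    """Drop repeated rows, keeping the last occurrence of each in its position."""
    last_position = {}
    for position, row in enumerate(rows):
        # Later occurrences overwrite earlier ones.
        last_position[row] = position
    return [
        row for position, row in enumerate(rows) if last_position[row] == position
    ]


current = [("2020-01-01", 1.0), ("2020-01-02", 2.0)]
new = [("2020-01-02", 2.0), ("2020-01-03", 3.0)]
combined = drop_duplicates_keep_last(current + new)
```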
Updated Dec 17, 2019 - Go
Updated May 16, 2020 - Go
Updated May 13, 2020 - Python
Update docs
In order to update https://bluenote10.github.io/NimData/nimdata.html I tried running build_docs.sh, but ran into the following Nim doc gen issues:
The following command is somewhat working, besides the missing dochack.js and with the `git.c
Updated May 17, 2020 - Rust
Updated Nov 26, 2018 - Java
To make differences between datasets easier to spot visually (especially when there are many columns), it would be helpful if one could sort the categorical columns by the Jensen–Shannon divergence.
The code below tries to do so, but it seems to distort the labels on the y-axis. Also, if the jsd column contains missing values, those variables are dropped from the graph.
library(in
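The snippet is cut off here. As a rough illustration of the quantity involved, the Jensen–Shannon divergence between the category frequencies of one column in two datasets could be computed as follows, and the columns then sorted by it. This is a Python sketch with hypothetical names and toy data, not the package's implementation, and it assumes both columns are non-empty.

```python
import math
from collections import Counter


def distribution(values, categories):
    """Relative frequency of each category, in the order given."""
    counts = Counter(values)
    total = sum(counts[c] for c in categories)
    return [counts[c] / total for c in categories]


def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(a, b):
    """JSD (base 2, so between 0 and 1) of two categorical samples."""
    categories = sorted(set(a) | set(b))
    p = distribution(a, categories)
    q = distribution(b, categories)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy data: "color" differs completely, "size" is identical.
dataset_a = {"color": ["red", "red", "blue"], "size": ["S", "M", "L"]}
dataset_b = {"color": ["green", "green", "green"], "size": ["S", "M", "L"]}

# Columns ordered from most to least divergent.
ranked = sorted(
    dataset_a,
    key=lambda c: jensen_shannon_divergence(dataset_a[c], dataset_b[c]),
    reverse=True,
)
```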
Describe the problem
We should test on larger datasets that are commonly used in