
Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

Here are 23,433 public repositories matching this topic...

superset

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Nov 4, 2021
  • Python
pytorch-lightning
awaelchli commented Dec 15, 2021

Proposed refactor

The standalone tests produce thousands of lines of unfiltered output because the progress bar is turned on. This output is not needed for 99-100% of tests. Regular tests do not have this problem, since they are batched and their output is not printed in verbose mode.

Motivation

Easier to read and scroll the test output logs

Pitch

Set

`Tra

dash
gensim
danieldeutsch commented Jun 2, 2021

Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file itself and reads lines for the Predictor, which fails when it tries to load data from my compressed files.
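
As an illustration of the kind of behavior being requested, here is a minimal sketch of a line reader that accepts either plain or gzip-compressed text. It uses only the Python standard library; the function name `read_lines` is hypothetical and is not part of AllenNLP, and the sketch makes no claim about how the predict command is actually implemented.

```python
import gzip
from pathlib import Path
from typing import Iterator, Union


def read_lines(path: Union[str, Path]) -> Iterator[str]:
    """Yield non-empty lines from a plain or gzip-compressed UTF-8 text file.

    Hypothetical helper for illustration only; not an AllenNLP API.
    """
    path = Path(path)

    # Detect gzip by its two-byte magic number instead of trusting the extension.
    with path.open("rb") as raw:
        is_gzip = raw.read(2) == b"\x1f\x8b"

    # gzip.open and the built-in open both accept text mode with an encoding.
    opener = gzip.open if is_gzip else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:  # skip blank lines
                yield line
```

Checking the magic bytes rather than the file suffix means a gzipped file without a `.gz` extension is still handled; other compression formats (bz2, xz) would need their own checks.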

nni