cleaning-data

Here are 149 public repositories matching this topic...

thatlittleboy commented Jan 2, 2022

Background

This thread is born out of the discussion in #968, in an effort to make the documentation more beginner-friendly and easier to understand.
One of the subtasks mentioned in that thread was to go through the function docstrings and add a minimal working example to each of the public functions in pyjanitor.
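
As a rough illustration of what a minimal working example in a docstring can look like, here is a sketch using the standard-library doctest module. The function `clean_column_name` is a hypothetical helper invented for this illustration; it is not part of pyjanitor and does not reflect that project's docstring conventions.

```python
import doctest


def clean_column_name(name: str) -> str:
    """Return a snake_case version of a column name.

    Hypothetical helper for illustration only; not a pyjanitor function.

    Example:

        >>> clean_column_name("  First Name ")
        'first_name'
        >>> clean_column_name("Total-Sales 2021")
        'total_sales_2021'
    """
    # Trim surrounding whitespace and lowercase before replacing separators.
    normalized = name.strip().lower()
    for separator in (" ", "-"):
        normalized = normalized.replace(separator, "_")
    return normalized


if __name__ == "__main__":
    # Execute the examples embedded in the docstring as tests.
    doctest.testmod()
```

Because the example lives in the docstring, it is shown to readers by `help()` and can be checked automatically, so it is less likely to drift out of date.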

Criteria reiterated here for the benefit of discussion:

It sh

Simple, automatic data cleaning in one line of code! It performs one-hot encoding, casts date and time columns to datetime dtype, detects binary columns, safely converts non-numeric columns to numeric dtypes, cleans dirty or empty values, normalizes values, and removes unwanted columns. Get your data ready for model training and fitting quickly.
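
To make two of the steps named above concrete, here is a library-free sketch of safe numeric conversion and one-hot encoding. This listing does not show the repository's API, so the function names `to_numeric_if_possible` and `one_hot` are assumptions made for illustration and are not that project's implementation.

```python
from __future__ import annotations

from typing import Any


def to_numeric_if_possible(values: list[str]) -> list[Any]:
    """Convert a column to floats only if every non-empty value parses."""
    converted: list[Any] = []
    for value in values:
        text = value.strip()
        if text == "":
            converted.append(None)  # treat empty strings as missing
            continue
        try:
            converted.append(float(text))
        except ValueError:
            # One unparseable value means the column is not numeric:
            # return it unchanged instead of corrupting it.
            return list(values)
    return converted


def one_hot(values: list[str]) -> dict[str, list[int]]:
    """Build one 0/1 indicator column per distinct category."""
    categories = sorted(set(values))
    return {
        category: [int(value == category) for value in values]
        for category in categories
    }
```

The conversion is "safe" in the sense that a column with any non-numeric entry is returned untouched, so a mixed column is never partially converted.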

  • Updated May 22, 2021
  • Python

Notes from the author for anyone who wants to learn the process a data scientist follows, from data collection through to making predictions with a trained model. The notes are based on what the author has learned and implemented. Enjoy!

  • Updated Sep 29, 2020
  • Jupyter Notebook

Project No. 4 in the Udacity Data Analyst Nanodegree Winter 2019-2020. Using Python, we’ll gather data from a variety of sources, assess its quality and tidiness, then clean it. We’ll document our wrangling efforts in a Jupyter Notebook, plus showcase them through analyses and visualizations using Python and SQL.

  • Updated May 31, 2021
  • Jupyter Notebook

Unsupervised machine learning: cryptocurrency analysis. Several models are applied to cryptocurrency data to discover patterns and groups. The analysis produces a report covering which cryptocurrencies are on the trading market and how they could be grouped to create a classification system for potential new investments in the cryptocurrency market.

  • Updated Nov 1, 2021
  • Jupyter Notebook

This repository collects various small data analysis projects and study materials for learning data visualization and programming with MATLAB. Topics range from crime against women, sentiment analysis, digital signal analysis, and student academic performance data to temperature and humidity analysis in Finland.

  • Updated Jan 15, 2022
  • Jupyter Notebook
