datacleaning

Here are 155 public repositories matching this topic...

gardnerdev commented Jan 16, 2021

Describe the bug
Running the scaffolding (profiling) command fails when column values contain commas.

To Reproduce
Steps to reproduce the behavior:

  1. Run great_expectations suite scaffold scaffold-name on a datasource whose column values contain commas
  2. The command fails with pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 5323 saw 2
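The "Expected 1 fields ... saw 2" message is typical of a delimited file in which a value contains an unquoted comma, so the parser sees an extra field on that line. The sketch below illustrates the effect with Python's standard-library csv module rather than pandas or great_expectations. The sample data and the helper names field_counts and write_quoted are hypothetical, and this is only one plausible cause of the reported error, not a confirmed diagnosis.

```python
import csv
import io


def field_counts(text):
    """Return how many fields csv.reader finds on each line of text."""
    return [len(row) for row in csv.reader(io.StringIO(text))]


def write_quoted(rows):
    """Serialize rows with every field quoted, so embedded commas are safe."""
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_ALL).writerows(rows)
    return buf.getvalue()


# Hypothetical single-column data: the last value contains a comma.
unquoted = "name\nwidget\nSmith, John\n"
quoted = 'name\nwidget\n"Smith, John"\n'

# Unquoted: the last line splits into two fields, which is the kind of
# mismatch a strict parser reports as "Expected 1 fields ... saw 2".
print(field_counts(unquoted))

# Quoted: every line has exactly one field.
print(field_counts(quoted))

# Writing with QUOTE_ALL produces a file that reads back consistently.
print(field_counts(write_quoted([["name"], ["widget"], ["Smith, John"]])))
```

If the source file cannot be regenerated with quoting, another common workaround is to export it with a delimiter that does not occur in the data.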

Expected behavior
D

A natural language processing project that performs sentiment analysis, classifying tweets as positive or negative with machine learning classification models. It covers text mining, text analysis, data analysis, and data visualization.

  • Updated May 14, 2019
  • Jupyter Notebook

This repo contains four projects: machine learning models built for Kaggle competitions, along with exploratory data analysis, data cleaning, data visualization, data munging, feature selection, etc.

  • Updated May 7, 2020
  • Jupyter Notebook

My fictitious firm, GDSMC Global, is a security consultancy that supports governments around the world in understanding, predicting, and stopping terrorist attacks. Our goal is to help individual nation states deploy security resources more effectively to reduce the likelihood of successful attacks, and to estimate the likely future costs of terrorism so that resources can be set aside in advance to rebuild after an attack. Although governments can submit their own internal security data to us for study, our models are built on the Global Terrorism Database (GTD) maintained by the National Consortium for the Study of Terrorism and Responses to Terrorism at the University of Maryland ( http://start.umd.edu/gtd/ ).

  • Updated Feb 28, 2018
  • R
Nelson-Gon commented Feb 6, 2021

Description

I would like to preserve "reorder" row names when sorting in na_summary.

Similar Features

This is related to na_summary when sorted.

Feature Details

Given a data.frame object, running na_summary on this data works as expected except the returned rows are in their original order. Example:

df <- data.frame(A=1:5,B=c(NA,NA,25,24,53), C=c(NA,1,2,3
