datautils
The best toolbox for processing textual data.
Introduction & Rationale
The Data Utilities are a collection of handy text manipulation tools. They are meant to make a data wrangler's life on the command line easier.
Much of this functionality can be achieved with standard command-line tools (awk, sed, cut, sort, uniq, …), but doing so often becomes tedious. Zealots of the Unix philosophy will probably skip these tools and build a set of sophisticated aliases instead.
On the other hand, some of the tools fix actual problems. Because the tools use UTF-8 by default, one does not have to deal with the quirks of sort and uniq when handling non-ASCII input.
Tool Overview
These tools are part of the collection:
count
norm
rows
text
trim
Usage Examples
count
norm
$ echo "¹²³" | norm --nfc
¹²³
$ echo "¹²³" | norm --nfkc
123
rows
text
trim
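For readers curious how a left trim like the one in the example below might be written in Go, here is a minimal sketch. It is an assumption about the behaviour of -l (stripping leading whitespace), not the tool's actual implementation, and the function name trimLeft is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// trimLeft is a hypothetical helper (not part of datautils).
// It removes all leading Unicode whitespace from s.
func trimLeft(s string) string {
	return strings.TrimLeftFunc(s, unicode.IsSpace)
}

func main() {
	fmt.Printf("%q\n", trimLeft(" abc"))
	// prints: "abc"
}
```

Using unicode.IsSpace rather than a fixed set of ASCII characters means non-ASCII whitespace such as a no-break space is stripped as well.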
$ echo " abc" | trim -l
abc
Installation
Debian & Ubuntu
snap
sudo apt-get install snapd
sudo snap install --channel=candidate datautils
sudo snap alias datautils.count count
sudo snap alias datautils.norm norm
sudo snap alias datautils.rows rows
sudo snap alias datautils.text text
sudo snap alias datautils.trim trim
apt
sudo add-apt-repository ppa:sfischer13/datautils
sudo apt-get update
sudo apt-get install datautils
Developers
go get
go get github.com/sfischer13/datautils/...
go dep
go get -u github.com/golang/dep/cmd/dep
git clone https://github.com/sfischer13/datautils.git
cd datautils
dep ensure
go install
Credits
This project is authored and maintained by Stefan Fischer.
The source code is available under the MIT License.
See LICENSE for further details.
