tokenization

Here are 186 public repositories matching this topic...

blondfrogs commented May 23, 2019

The OSX build notes contain the following line:
brew install automake berkeley-db4 libtool boost --c++11 miniupnpc openssl pkg-config protobuf python3 qt libevent

However, boost --c++11 is no longer a valid option, so the build notes need to be updated.

Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and 330 million English tweets from Twitter).

  • Updated Oct 22, 2019
  • Python
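
To make the idea of statistics-based word segmentation concrete, here is a minimal sketch of how a hashtag could be split using unigram counts. It is not Ekphrasis's implementation or API: the COUNTS table, the unknown-word penalty, and the function names word_logprob, segment, and segment_hashtag are all hypothetical and chosen only for illustration. A real tool would load counts from a large corpus instead of a toy dictionary.

```python
import math
from functools import lru_cache

# Toy unigram counts (hypothetical). A real system would load
# word statistics computed from a large corpus.
COUNTS = {
    "make": 50, "america": 30, "great": 40, "again": 35,
    "a": 100, "me": 20, "rica": 1, "gain": 5, "am": 25, "eric": 2,
}
TOTAL = sum(COUNTS.values())


def word_logprob(word):
    """Log-probability of a word under the toy unigram model."""
    if word in COUNTS:
        return math.log(COUNTS[word] / TOTAL)
    # Assumed penalty for unknown strings: longer strings are less likely.
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


@lru_cache(maxsize=None)
def segment(text):
    """Return (score, words) for the most probable split of text."""
    if not text:
        return (0.0, ())
    candidates = []
    for i in range(1, len(text) + 1):
        head, tail = text[:i], text[i:]
        tail_score, tail_words = segment(tail)
        candidates.append((word_logprob(head) + tail_score, (head,) + tail_words))
    return max(candidates)


def segment_hashtag(tag):
    """Strip the leading '#', lowercase, and split into words."""
    return list(segment(tag.lstrip("#").lower())[1])


print(segment_hashtag("#MakeAmericaGreatAgain"))
```

With the toy counts above, the call splits the hashtag into "make", "america", "great", "again". Memoizing segment keeps the search over split points polynomial in the length of the hashtag instead of exponential.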