Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification. It includes Word2Vec, BERT, and GPT-2 language embeddings.
nlp
machine-learning
text-classification
named-entity-recognition
seq2seq
transfer-learning
ner
sequence-labeling
nlp-framework
bert-model
gpt-2
Updated Jul 5, 2020 - Python
Users misspell things. Having spell-check and synonyms helps a lot, but doesn't catch everything.
One solution would be to use the Python metaphone package's implementation of the Double Metaphone algorithm.
At component training time, the component could go through the normal entity lists, compute the Double Metaphone (DM) representation of every synonym, and store those representations.
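The train-time step described above might be sketched roughly as follows. This is only an illustration under assumptions: the names build_phonetic_index, lookup, and the entity-list shape (canonical value mapped to its synonyms) are hypothetical and not taken from any existing component, and the lookup step is an assumed use of the stored keys. The encoder is pluggable; if the metaphone package is not installed, the sketch falls back to a crude stand-in that is not Double Metaphone.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Set, Tuple

try:
    # Double Metaphone from the `metaphone` package (pip install metaphone).
    # It returns a (primary, secondary) pair of phonetic keys.
    from metaphone import doublemetaphone
except ImportError:
    def doublemetaphone(word: str) -> Tuple[str, str]:
        """Crude stand-in, NOT Double Metaphone: keep the first letter
        and drop later vowels, so the sketch still runs without the package."""
        letters = [c for c in word.upper() if c.isalpha()]
        if not letters:
            return ("", "")
        key = letters[0] + "".join(c for c in letters[1:] if c not in "AEIOUY")
        return (key, "")

Encoder = Callable[[str], Tuple[str, ...]]


def build_phonetic_index(
    entities: Dict[str, Iterable[str]],
    encode: Encoder = doublemetaphone,
) -> Dict[str, Set[str]]:
    """Train-time step: map each phonetic key to the canonical entity
    values whose name or synonyms produce that key."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for value, synonyms in entities.items():
        for term in (value, *synonyms):
            for key in encode(term):
                if key:  # skip empty keys (e.g. no secondary encoding)
                    index[key].add(value)
    return dict(index)


def lookup(
    token: str,
    index: Dict[str, Set[str]],
    encode: Encoder = doublemetaphone,
) -> Set[str]:
    """Assumed parse-time step: return every canonical value that shares
    a phonetic key with the (possibly misspelled) token."""
    matches: Set[str] = set()
    for key in encode(token):
        if key:
            matches |= index.get(key, set())
    return matches
```

A component could call build_phonetic_index once during training, persist the resulting dictionary alongside its other artifacts, and call lookup on tokens that spell-check and exact synonym matching fail to resolve. Because several entities can share a phonetic key, lookup returns a set and leaves disambiguation to the caller.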