word-segmentation
Here are 82 public repositories matching this topic...
Almost all models we use now (see the list in #298) are trained privately by different contributors, with code in notebooks or scripts that may be private, or open source but difficult to follow.
To make PyThaiNLP more transparent and more customizable by users, we should try to put training scripts or instructions (pointers are fine) somewhere in the repo.
Known scripts/notebooks and data
For Juman++ to be widely usable, we want a documented and stable C API and the option of a dynamically linked library.
That library should probably use -fvisibility=hidden with explicit visibility attributes on exported symbols on Unix, and __declspec(dllimport/dllexport) on Windows.
The minimal API should cover:
- Loading a model using a config file
- Analyzing a sentence
- Accessing
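One payoff of a stable, dynamically linked C API is that any language with a C FFI can drive Juman++ directly. The sketch below illustrates the consumption side from Python via ctypes. The `jumanpp_*` names are hypothetical (the issue only lists the desired capabilities, not function signatures); as a runnable stand-in, the example calls libc, which exports a stable C ABI in exactly the way the proposed library would.

```python
# Minimal sketch, assuming a hypothetical libjumanpp.so with a C ABI.
# As a runnable stand-in we call libc, which any Unix Python can load.
import ctypes

libc = ctypes.CDLL(None)  # on Unix, exposes the process's libc symbols
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
assert libc.strlen(b"jumanpp") == 7  # calling an exported C symbol works

# With the proposed library, usage would look analogous (all names hypothetical):
# jpp = ctypes.CDLL("libjumanpp.so")
# model = jpp.jumanpp_load(b"jumanpp.conf")       # load a model via a config file
# rc = jpp.jumanpp_analyze(model, "...".encode()) # analyze a sentence
```

Keeping non-exported symbols hidden (via -fvisibility=hidden) matters here: it shrinks the dynamic symbol table and guarantees that FFI consumers can only reach the documented entry points.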
It would be worth providing a tutorial on how to train a simple cross-language classification model using sentencepiece. Given a training set and a chosen model (say, a simple Word2Vec plus softmax, an LSTM, etc.), how do you use the trained sentencepiece model (vocabulary/codes) to feed that model for training and inference?
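The pipeline being asked about can be sketched as: the sentencepiece model turns raw text into subword ids, and those ids (padded to a fixed length) become the input sequence for whatever classifier you chose. The snippet below is a self-contained toy illustration: with the `sentencepiece` package installed, the encoding step would be `sp = spm.SentencePieceProcessor(model_file="xx.model")` followed by `sp.encode(text, out_type=int)`; here a greedy longest-match stand-in over a hand-made vocabulary is used so the sketch runs without the library, and the vocabulary and id values are invented for illustration.

```python
# Toy sketch of text -> subword ids -> fixed-length classifier input.
# toy_encode stands in for sentencepiece's encode(); the vocab is invented.

def toy_encode(text, vocab):
    """Stand-in encoder: greedy longest-match over a subword vocabulary."""
    ids, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            if text[i:j] in vocab:
                ids.append(vocab[text[i:j]])
                i = j
                break
        else:
            ids.append(vocab["<unk>"])  # no piece matched: unknown id
            i += 1
    return ids

def pad(ids, max_len, pad_id=0):
    """Truncate/pad so every example in a batch has the same length."""
    return (ids[:max_len] + [pad_id] * max_len)[:max_len]

vocab = {"<unk>": 1, "seg": 2, "ment": 3, "ation": 4, "word": 5, " ": 6}
ids = toy_encode("word segmentation", vocab)   # -> [5, 6, 2, 3, 4]
batch = [pad(ids, 8)]                          # ready for an embedding layer
```

From here, training and inference both run text through the same sentencepiece model, so the classifier's embedding table is indexed by a shared subword vocabulary; that sharing is what makes the setup work across languages.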