Pinned issues
Feature Request: Support for sentencepiece unigram model
#53
opened Jan 12, 2020 by
leslyarun
Open
6
Why doesn't this library share the same tokenizer api as the transformers library?
#259
opened May 6, 2020 by
sabetAI
Tokenizer stalls / hangs when used in DataLoader (multiprocessing issue)
#258
opened May 5, 2020 by
jiahuei
Advice for training BertWordPieceTokenizer multilingual tokenizer?
#257
opened May 5, 2020 by
shenkev
Convert saved pretrained tokenizers from transformers to tokenizers
#230
opened Apr 11, 2020 by
NonaryR
Is it possible to convert the tokenized tokens back to sentence?
#211
opened Mar 30, 2020 by
Hhhhhhhhhhao
How to use tokenizers library for my own dataset with a mix of existing and new vocabulary
#194
opened Mar 11, 2020 by
nikhilno1