bert
Here are 2,117 public repositories matching this topic...
I wonder if it would be useful to have a sequence object for the decoders too.
For example, if we build a tokenizer with a BPE model that defines an end_of_word_suffix, we will need the BPEDecoder decoder to replace the end_of_word_suffix, and if we also used a ByteLevel pre-tokenization we will need the ByteLevel decoder to realign the codes.
At the moment, i
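The idea of chaining decoders can be sketched in plain Python. This is only an illustration of the composition pattern, not the tokenizers library API: `strip_suffix`, `replace_marker`, and `sequence` are hypothetical helpers, and the `</w>` suffix and `Ġ` marker are assumed example values. The two toy decoders stand in for the BPEDecoder and ByteLevel steps mentioned above and are far simpler than the real ones.

```python
from typing import Callable, List

# A decoder step maps a list of token strings to a list of token strings.
DecoderStep = Callable[[List[str]], List[str]]


def strip_suffix(suffix: str = "</w>") -> DecoderStep:
    """Toy stand-in for a BPE decoder: turn the end-of-word suffix into a space."""
    def step(tokens: List[str]) -> List[str]:
        return [t[: -len(suffix)] + " " if t.endswith(suffix) else t for t in tokens]
    return step


def replace_marker(marker: str = "Ġ") -> DecoderStep:
    """Toy stand-in for a byte-level decoder: map an assumed space marker back to a space."""
    def step(tokens: List[str]) -> List[str]:
        return [t.replace(marker, " ") for t in tokens]
    return step


def sequence(steps: List[DecoderStep]) -> Callable[[List[str]], str]:
    """Apply each decoder step in order, then join the tokens into one string."""
    def decode(tokens: List[str]) -> str:
        for step in steps:
            tokens = step(tokens)
        return "".join(tokens).strip()
    return decode


decode = sequence([strip_suffix(), replace_marker()])
print(decode(["hel", "lo</w>", "wor", "ld</w>"]))
```

Each step only needs to agree on the token-list interface, so the order of the steps is the only thing the caller has to decide.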
chooses 15% of tokens
The paper says:
Instead, the training data generator chooses 15% of tokens at random, e.g., in the sentence my
dog is hairy it chooses hairy.
This means that 15% of the tokens are always chosen.
In https://github.com/codertimo/BERT-pytorch/blob/master/bert_pytorch/dataset/dataset.py#L68, however,
every single token independently has a 15% chance of going through the follow-up procedure, so the number of chosen tokens varies from sentence to sentence.
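The difference between the two readings can be sketched as follows. This is a minimal illustration, not the code from the linked repository or from the paper's authors: `choose_per_token` and `choose_exact_fraction` are hypothetical names, and rounding to at least one token is an assumption made for the example.

```python
import random
from typing import List, Optional, Sequence


def choose_per_token(tokens: Sequence[str], prob: float = 0.15,
                     rng: Optional[random.Random] = None) -> List[int]:
    """Each position is chosen independently with probability `prob`.

    The number of chosen positions is random; it can be 0 or well above 15%.
    """
    rng = rng or random.Random()
    return [i for i in range(len(tokens)) if rng.random() < prob]


def choose_exact_fraction(tokens: Sequence[str], frac: float = 0.15,
                          rng: Optional[random.Random] = None) -> List[int]:
    """Exactly round(frac * len(tokens)) positions are chosen (at least one; assumed)."""
    rng = rng or random.Random()
    k = max(1, round(len(tokens) * frac))
    return sorted(rng.sample(range(len(tokens)), k))


tokens = ["tok"] * 100
print(len(choose_per_token(tokens)))       # varies from run to run
print(len(choose_exact_fraction(tokens)))  # always 15 for 100 tokens
```

On average both strategies select about 15% of the tokens; they differ in that only the second one guarantees the count for every sentence.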
PositionalEmbedding
The current FileClassifier calculates the file type based on the file extensions by using the guess_type method. If this method does not return a match, we should guess the file type based on the content.
Python-magic provides this kind of functionality.
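A minimal sketch of the proposed fallback, assuming nothing about the actual FileClassifier internals: `guess_file_type` and `sniff_signature` are hypothetical names, and the small signature table is an illustrative stand-in for a real content detector. With python-magic installed, a callable wrapping `magic.from_file(path, mime=True)` could be passed as `sniff` instead.

```python
import mimetypes
from pathlib import Path
from typing import Callable, Optional, Union

PathLike = Union[str, Path]

# A few well-known file signatures; purely illustrative, not exhaustive.
_SIGNATURES = {
    b"%PDF-": "application/pdf",
    b"PK\x03\x04": "application/zip",
    b"\x89PNG\r\n\x1a\n": "image/png",
}


def sniff_signature(path: PathLike) -> Optional[str]:
    """Guess a MIME type from the first bytes of the file, or return None."""
    with open(path, "rb") as fh:
        head = fh.read(16)
    for signature, mime in _SIGNATURES.items():
        if head.startswith(signature):
            return mime
    return None


def guess_file_type(path: PathLike,
                    sniff: Callable[[PathLike], Optional[str]] = sniff_signature
                    ) -> Optional[str]:
    """Try the extension first; fall back to content sniffing if that fails."""
    mime, _encoding = mimetypes.guess_type(str(path))
    if mime is not None:
        return mime
    return sniff(path)
```

Keeping the content detector as a parameter means the extension-based path stays dependency-free and the heavier detector is only consulted when the extension gives no answer.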
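Suggestion: add tokenizer categories and examples to the documentation
You are welcome to report problems with using PaddleNLP; thank you very much for your contribution to PaddleNLP!
When filing your issue, please also provide the following information:
- Version and environment information
1) PaddleNLP and PaddlePaddle versions: please provide your PaddleNLP and PaddlePaddle version numbers, e.g. PaddleNLP 2.0.4, PaddlePaddle 2.1.1
2) System environment: please describe the system type, e.g. Linux/Windows/MacOS, and the Python version
- Reproduction information: for an error, please give the reproduction environment and the steps to reproduce
paddle version 2.0.8, paddlenlp version 2.1.0
Suggestion: could the paddlenlp documentation list which category each model's tokenizer is based on (e.g. the bert tokenizer is WordPiece-based and the xlnet tokenizer is SentencePiece-based), together with corresponding input and output examples?
Regarding some specific suggestions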
This issue is part of our Great Code Cleanup 2022. If you're interested in helping out, take a look at this thread, or come join us on Discord and talk with other contributors!
Type hints are used inconsistently in the transformers repo across both TF and PT models, and