bert
Here are 1,267 public repositories matching this topic...
chooses 15% of tokens
The paper says:
> Instead, the training data generator chooses 15% of tokens at random, e.g., in the sentence "my dog is hairy" it chooses "hairy".
That means exactly 15% of the tokens are chosen for sure.
In https://github.com/codertimo/BERT-pytorch/blob/master/bert_pytorch/dataset/dataset.py#L68, however, every single token independently has a 15% chance of going through the follow-up procedure, which is not the same thing.
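The difference can be sketched as follows (helper names are hypothetical, not from the linked repository; the sketch also omits BERT's subsequent 80%/10%/10% mask/random/keep split). The paper's procedure samples a fixed 15% of positions, while the per-token version flips a 15% coin for each token, so the number of masked tokens varies from sentence to sentence:

```python
import random

MASK = "[MASK]"

def mask_fixed_fraction(tokens, frac=0.15, rng=random):
    """Paper-style: choose exactly round(frac * len) positions at random."""
    n = max(1, round(frac * len(tokens)))
    chosen = set(rng.sample(range(len(tokens)), n))
    return [MASK if i in chosen else t for i, t in enumerate(tokens)]

def mask_per_token(tokens, p=0.15, rng=random):
    """BERT-pytorch style: each token is independently masked with probability p."""
    return [MASK if rng.random() < p else t for t in tokens]

tokens = "my dog is hairy".split()
print(mask_fixed_fraction(tokens))  # always exactly 1 of the 4 tokens masked
print(mask_per_token(tokens))       # may mask 0, 1, ... or all 4 tokens
```

In expectation both mask 15% of tokens, but only the first guarantees it per sentence.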
Is your feature request related to a problem? Please describe.
With the new flexible Pipelines introduced in deepset-ai/haystack#596, we can build far more flexible and complex search routes.
One common challenge that we saw in deployments: we need to distinguish between real questions and keyword queries that come in. We only want to route questions to the Reader b
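The routing idea can be illustrated with a minimal heuristic sketch (names are hypothetical and not part of haystack; in practice a trained query classifier would replace the rule):

```python
# Words that typically open a natural-language question.
QUESTION_WORDS = {
    "who", "what", "when", "where", "why", "how",
    "which", "is", "are", "do", "does", "can",
}

def is_question(query: str) -> bool:
    """Crude heuristic: a trailing question mark, or an interrogative first word."""
    words = query.strip().lower().rstrip("?").split()
    return query.strip().endswith("?") or bool(words and words[0] in QUESTION_WORDS)

print(is_question("who invented BERT?"))     # True  -> route to the Reader
print(is_question("bert pytorch tutorial"))  # False -> keyword retrieval only
```

A pipeline node wrapping such a check could send questions down the Retriever-plus-Reader branch and keyword queries down a retrieval-only branch.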
Training dataset question
Hello, the code uses Restaurants_Train.xml.seg as the training data. Where can this file be downloaded, or is it generated from the XML files of SemEval-14 Task 4? If it is generated afterwards, could you share the data-generation code?
To get the full speed-up of FP16 training, every tensor passed through the model should have all of its dimensions be a multiple of 8. In the new PyTorch examples, when using dynamic padding, the tensors are padded to the length of the longest sentence in the batch, but that length is not necessarily a multiple of 8.
The examples should be improved to pass along the option `pad_to_multiple_of=8`.
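The effect of that option can be sketched in plain Python (a hypothetical helper, not the transformers implementation): the batch's max length is rounded up to the next multiple of 8 before padding, so every tensor dimension stays Tensor-Core friendly:

```python
def pad_batch(batch, pad_id=0, multiple=8):
    """Pad token-id sequences to the batch max length, rounded up to
    a multiple of `multiple` (8 for FP16 Tensor Core efficiency)."""
    max_len = max(len(seq) for seq in batch)
    # Round up to the next multiple of 8 instead of stopping at max_len.
    target = ((max_len + multiple - 1) // multiple) * multiple
    return [seq + [pad_id] * (target - len(seq)) for seq in batch]

batch = [[101, 7592, 102], [101, 7592, 2088, 999, 102]]
padded = pad_batch(batch)
print([len(seq) for seq in padded])  # [8, 8]: max length 5 rounded up to 8
```

In the examples themselves, the fix would amount to forwarding `pad_to_multiple_of=8` to the tokenizer/collator rather than reimplementing this.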