bert

Consider this code that downloads models and tokenizers to disk and then uses BertTokenizer.from_pretrained to load the tokenizer from disk.

ISSUE: BertTokenizer.from_pretrained() does not seem to be compatible with Python's native pathlib module.
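
One possible workaround, sketched here under assumptions, is to convert the pathlib object to a plain string before handing it to the tokenizer. The helper name as_str_path and the directory models/bert-base-uncased are hypothetical; the from_pretrained call is left as a comment so the sketch runs without the library installed.

```python
import os
from pathlib import Path


def as_str_path(path):
    """Return a plain str for a str or pathlib.Path argument."""
    # os.fspath() is the standard way to unwrap os.PathLike objects.
    return os.fspath(path)


# Hypothetical local directory holding the downloaded tokenizer files.
model_dir = Path("models") / "bert-base-uncased"
model_dir_str = as_str_path(model_dir)

# Assumed usage, not executed here:
# from transformers import BertTokenizer
# tokenizer = BertTokenizer.from_pretrained(model_dir_str)
```

Whether the installed version of the library accepts Path objects directly is not stated in the report, so passing a string is the conservative choice.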

# -*- coding: utf-8 -*-
"""
Created on: 25-04-2020
Author: MacwanJ

ISSUE:

Prerequisites

Please fill in by replacing [ ] with [x].

Are you running the latest bert-as-service?
Did you follow the installation and the usage instructions in README.md?
Did you check the [FAQ list in README.md](https://github.com/hanxiao/bert-as-se

The position embedding in the BERT is not the same as in the transformer. Why not use the form in bert?
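
For context, the difference the question refers to can be sketched as follows: the original Transformer uses a fixed sinusoidal encoding, while BERT uses a position table that is learned during training. The function names and the 0.02 initialisation scale below are illustrative assumptions, and plain Python lists stand in for tensors.

```python
import math
import random


def sinusoidal_position_encoding(num_positions, dim):
    """Fixed encoding: sin on even indices, cos on odd indices."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(dim):
            # Each sin/cos pair shares one frequency.
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def init_learned_position_table(num_positions, dim, seed=0):
    """Learned embedding: starts random, then updated by the optimizer."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)]
            for _ in range(num_positions)]


fixed = sinusoidal_position_encoding(num_positions=3, dim=4)
learned = init_learned_position_table(num_positions=3, dim=4)
```

The fixed table needs no parameters and is defined for any position, whereas the learned table only covers the positions it was allocated for.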

spaCy has customizable word-level tokenizers with rules for multiple languages. I think porting that to Rust would be a nice addition to this package. Customizable, uniform word-level tokenization across platforms (web client, server) and languages would be beneficial. Currently, I don't know of any clean way to write bindings for spaCy's Cython code, or whether it is even possible.

Spacy Tokenizer Code

https:

While reading through the model recently, I noticed that the code in README.md uses a variable named str, which shadows Python's built-in str, as shown below:

import time
from bert_base.client import BertClient

with BertClient(show_server_config=False, check_version=False, check_length=False, mode='NER') as bc:
   start_t = time.perf_counter()
   str = '1月24日，新华社对外发布了中央对雄安新区的指导意见，洋洋洒洒1.2万多字，17次提到北京，4次提到天津，信息量很大，其实也回答了人们关心的很多问题。'
   rst = bc.encode([str, str])
   pri
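
A minimal sketch of why the variable name matters, independent of the client library: once a local variable is called str, the built-in str can no longer be called in that scope. The function names and the sample sentence are hypothetical.

```python
def encode_with_shadowed_name():
    # Same pattern as the README snippet: the local name `str` now holds text.
    str = "some input sentence"
    try:
        return str(42)  # fails: a string object is not callable
    except TypeError as exc:
        return type(exc).__name__


def encode_with_descriptive_name():
    # Renaming the variable leaves the built-in `str` usable.
    text = "some input sentence"
    return [text, text], str(42)


shadowed_result = encode_with_shadowed_name()
renamed_result = encode_with_descriptive_name()
```

Renaming the variable to something like text or sentence avoids the problem without changing what is sent to the server.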

It looks like the upgrade from spaCy 2.1 to 2.2 changed the way lemmatizer objects are built. See the Stack Overflow answer for details.

I can update the library to account for this migration. I have a fork that I can create a pull request from. Let me know.

Steps to reproduce the behavior:
Run
"fr

Add a CI test for building the documentation (do not ignore warnings, and add a spellcheck).
Fix docstrings with incorrect or inconsistent Sphinx formatting. Currently, such issues are treated as warnings when the docs are built.

Currently, we do not log macro/micro averages to Tensorboard: they displayed strangely in the interface (picture below), so the logging was removed.

Add macro/micro to Tensorboard.
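
As a reminder of what the two averages compute, here is a small sketch using precision as the example metric. The per-class counts and the function names are made up for illustration, and the Tensorboard logging step itself is not shown.

```python
def macro_precision(counts):
    """Average the per-class precisions, weighting every class equally."""
    per_class = [tp / (tp + fp) for tp, fp in counts.values()]
    return sum(per_class) / len(per_class)


def micro_precision(counts):
    """Pool the counts over all classes, then compute one precision."""
    total_tp = sum(tp for tp, _ in counts.values())
    total_fp = sum(fp for _, fp in counts.values())
    return total_tp / (total_tp + total_fp)


# Hypothetical (true positives, false positives) per class.
example_counts = {"class_a": (8, 2), "class_b": (1, 1)}

macro = macro_precision(example_counts)
micro = micro_precision(example_counts)
```

With imbalanced classes the two values diverge, which is why both are worth logging as separate scalars.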

If I want to use both of them, how should I modify the code in aen.py? Thanks a lot.

During evaluation, an error is raised when returning the evaluation result: result = estimator.evaluate(input_fn=eval_input_fn). The error is probably related to the confusion matrix.
It says: TypeError: eval_metric_ops[confusion_matrix] must be Operation or Tensor, given: <tf.Variable 'total_confusion_matrix:0' shape=(12, 12) dtype=float64_ref>
my tensorflo

Here are 854 public repositories matching this topic...

huggingface / transformers

hanxiao / bert-as-service

graykode / nlp-tutorial

brightmart / nlp_chinese_corpus

codertimo / BERT-pytorch

huggingface / tokenizers

PaddlePaddle / ERNIE

ymcui / Chinese-BERT-wwm

macanv / BERT-BiLSTM-CRF-NER

brightmart / albert_zh

NervanaSystems / nlp-architect

bentrevett / pytorch-sentiment-analysis

asyml / texar

jessevig / bertviz

CyberZHG / keras-bert

Separius / awesome-sentence-embedding

Jiakui / awesome-bert

shibing624 / pycorrector

JohnSnowLabs / spark-nlp

kaushaltrivedi / fast-bert

synrc / n2o

ChineseGLUE / ChineseGLUE

brightmart / roberta_zh

github / CodeSearchNet

msgi / nlp-journey

CLUEbenchmark / CLUE

nyu-mll / jiant

songyouwei / ABSA-PyTorch

kyzhouhzau / BERT-NER

deepset-ai / FARM
