-
Updated
Mar 31, 2020 - Python
# cws
Here are 20 public repositories matching this topic...
Jiagu, a deep learning NLP toolkit: knowledge-graph relation extraction, Chinese word segmentation, POS tagging, named entity recognition, sentiment analysis, new word discovery, keyword extraction, text summarization, and text clustering
A deep learning NLP framework based on PyTorch and torchtext, organized into five modules: sequence labeling, text classification, sentence-pair relations, text generation, and structural analysis. Implemented features include named entity recognition, Chinese word segmentation, POS tagging, semantic role labeling, sentiment analysis, relation extraction, language modeling, text similarity, textual entailment, dependency parsing, word vector training, a chatbot, machine translation, and text summarization. The framework is feature-rich, works out of the box, and is easy to pick up. Most components were adapted from others' implementations and merged into the framework without careful hyperparameter tuning, so expect some bugs.
python
nlp
deep-learning
text-classification
word2vec
pytorch
chinese
pos
skip-gram
cbow
language-model
cws
dependency-parsing
srl
relation-extraction
sentence-similarity
hierarchical-softmax
torchtext
negative-sampling
nature-language-process
-
Updated
Jan 10, 2020 - Python
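The sequence-labeling CWS modules listed above typically cast segmentation as character-level tagging with a BMES scheme (Begin/Middle/End of a multi-character word, Single for one-character words). A minimal sketch of decoding predicted BMES tags back into words — a generic illustration, not code from the framework above:

```python
def bmes_to_words(chars, tags):
    """Decode a BMES tag sequence into a list of words.

    B = begin of a multi-char word, M = middle, E = end,
    S = single-char word. Malformed sequences are handled
    leniently: a tag that cannot continue the current word closes it.
    """
    words, buf = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "S":
            if buf:
                words.append(buf)
            words.append(ch)
            buf = ""
        elif tag == "B":
            if buf:
                words.append(buf)
            buf = ch
        else:  # "M" or "E" continue the current word
            buf += ch
            if tag == "E":
                words.append(buf)
                buf = ""
    if buf:  # flush a word left open by a truncated tag sequence
        words.append(buf)
    return words

# "我爱自然语言处理" -> 我 / 爱 / 自然 / 语言 / 处理
print(bmes_to_words("我爱自然语言处理", list("SSBEBEBE")))
```

The lenient handling matters in practice: a neural tagger can emit inconsistent transitions (e.g. `B` directly after `B`), and the decoder should still produce a valid segmentation.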
BERT for Multitask Learning
nlp
text-classification
transformer
named-entity-recognition
pretrained-models
part-of-speech
ner
word-segmentation
bert
cws
encoder-decoder
multi-task-learning
multitask-learning
-
Updated
May 21, 2020 - Python
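Treating CWS as token classification, as BERT-based multitask taggers do, first requires turning gold segmentations into per-character labels. A small sketch of that conversion under a BMES scheme (the label names are a common convention, not necessarily this repository's):

```python
def words_to_bmes(words):
    """Turn a gold segmentation into per-character BMES labels."""
    labels = []
    for w in words:
        if len(w) == 1:
            labels.append("S")          # single-character word
        else:
            labels.append("B")          # word start
            labels.extend("M" * (len(w) - 2))  # interior characters
            labels.append("E")          # word end
    return labels

# 自然/语言/处理/很/有意思 -> B E B E B E S B M E
print(words_to_bmes(["自然", "语言", "处理", "很", "有意思"]))
```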
Macropodus, an NLP toolkit based on an Albert+BiLSTM+CRF architecture: Chinese word segmentation (CWS), POS tagging, named entity recognition (NER), new word discovery, keyword extraction, text summarization, text similarity, a scientific calculator, conversion between Chinese numerals (including Roman numerals) and Arabic numerals, traditional/simplified Chinese conversion, and pinyin conversion.
-
Updated
May 14, 2020 - Python
API of Articut, Chinese word segmentation with semantic POS tagging. Word segmentation (斷詞, also called 分詞) is the foundation of Chinese text processing. Articut uses no machine learning and no data model; with only the grammar rules of modern vernacular Chinese it reaches an F1-measure above 91% and recall above 96% on SIGHAN 2005.
nlp
natural-language-processing
nlu
artificial-intelligence
cws
pos-tagging
part-of-speech-tagger
pos-tagger
natural-language-understanding
part-of-speech-embdding
-
Updated
May 6, 2020 - Python
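The SIGHAN 2005 figures quoted above are word-level precision, recall, and F1, computed by aligning gold and predicted segmentations on character offsets. A hedged sketch of that metric in its common formulation (not Articut's own scorer):

```python
def to_spans(words):
    """Convert a segmentation into a set of (start, end) character spans."""
    spans, pos = set(), 0
    for w in words:
        spans.add((pos, pos + len(w)))
        pos += len(w)
    return spans

def seg_f1(gold, pred):
    """Word-level precision, recall, and F1 over exactly matching spans."""
    g, p = to_spans(gold), to_spans(pred)
    correct = len(g & p)
    precision = correct / len(p)
    recall = correct / len(g)
    f1 = 0.0 if correct == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Predicted split of 语言 into 语/言 costs both precision and recall.
p, r, f = seg_f1(["自然", "语言", "处理"], ["自然", "语", "言", "处理"])
print(round(p, 3), round(r, 3), round(f, 3))
```

Matching on character offsets, rather than word strings, is what makes the metric robust to the same word appearing twice in a sentence.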
Source code for an ACL 2017 paper on Chinese word segmentation
-
Updated
Jan 8, 2019 - Python
Source code for the paper "Neural Networks Incorporating Dictionaries for Chinese Word Segmentation", AAAI 2018
-
Updated
Feb 1, 2018 - Python
Source code for an ACL 2016 paper on Chinese word segmentation
-
Updated
Jan 8, 2019 - Python
Chinese and English CWS, POS tagging, and named entity recognition implemented with a character-embedding CNN, bidirectional LSTM, and CRF model. Includes raw text data, data conversion, training scripts, and pretrained models, and can be used for sequence-labeling research. Note: the only logic you need to implement yourself is converting your data into the sequence format. Segmentation accuracy is about 93%, POS tagging about 90%, and entity tagging (on the provided sample) about 85%.
-
Updated
Aug 11, 2019 - Python
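In BiLSTM+CRF taggers like the one above, the encoder scores each character's labels and the CRF layer picks the best label path with Viterbi decoding over a learned transition matrix. A minimal NumPy sketch of the decoding step; the tiny scores and transitions below are invented for illustration:

```python
import numpy as np

def viterbi(emissions, transitions):
    """Return the highest-scoring tag path.

    emissions:   (T, K) per-position tag scores from the encoder
    transitions: (K, K) score of moving from tag i to tag j
    """
    T, K = emissions.shape
    score = emissions[0].copy()          # best score ending in each tag
    back = np.zeros((T, K), dtype=int)   # backpointers
    for t in range(1, T):
        # candidate[i, j] = score of tag i at t-1 followed by tag j at t
        candidate = score[:, None] + transitions + emissions[t][None, :]
        back[t] = candidate.argmax(axis=0)
        score = candidate.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):        # follow backpointers
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Two tags (0, 1); transitions strongly discourage repeating tag 0,
# so the best path deviates from the per-position argmax.
em = np.array([[3.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
tr = np.array([[-10.0, 0.0], [0.0, -1.0]])
print(viterbi(em, tr))
```

The point of the CRF layer is visible here: with zero transitions the decoder just takes each position's argmax, while the learned transitions can veto label sequences (such as `E` following `S`) that are impossible in the tagging scheme.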
Sub-Character Representation Learning
nlp
natural-language-processing
traditional-chinese
simplified-chinese
representation-learning
cws
chinese-word-segmentation
-
Updated
May 28, 2018 - Python
A script to generate an Atom feed from Chrome Web Store reviews and support feedback
python
atom
chrome-extension
atom-feed
chrome-web-store
chrome-extensions
cws
chrome-webstore
support-feedback
-
Updated
Aug 31, 2018 - Python
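Generating an Atom feed, as the script above does for Chrome Web Store reviews, is mostly a matter of emitting the right XML elements in the Atom namespace. A minimal standard-library sketch; the feed title, id, and entry data are invented placeholders, and a real scraper would supply them:

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

def build_feed(title, feed_id, entries):
    """Build a minimal Atom feed.

    entries: list of (title, entry_id, updated_iso8601, summary_text)
    """
    ET.register_namespace("", ATOM)  # emit Atom as the default namespace
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}title").text = title
    ET.SubElement(feed, f"{{{ATOM}}}id").text = feed_id
    # feed-level <updated> is the newest entry timestamp
    ET.SubElement(feed, f"{{{ATOM}}}updated").text = max(e[2] for e in entries)
    for etitle, eid, updated, text in entries:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}title").text = etitle
        ET.SubElement(entry, f"{{{ATOM}}}id").text = eid
        ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
        ET.SubElement(entry, f"{{{ATOM}}}summary").text = text
    return ET.tostring(feed, encoding="unicode")

xml = build_feed(
    "CWS reviews",              # hypothetical feed title
    "urn:example:cws-reviews",  # hypothetical feed id
    [("Great extension", "urn:example:r1", "2020-08-31T00:00:00Z", "5 stars")],
)
print(xml[:80])
```

Stable `id` and `updated` values per entry are what let feed readers deduplicate items across fetches.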
Graduation project: a high-accuracy word segmentation system designed for rapid porting to new domains
-
Updated
Aug 16, 2018 - Java
gcws is CWS (Chinese Word Segmentation) for Golang - an open-source Chinese word segmentation integration
-
Updated
May 4, 2018 - Go
CWS publish with Golang
-
Updated
Jul 28, 2018 - Go
Hello, I have recently been running BiLSTM-CRF word segmentation experiments, and using the pretrained word embeddings from your project improved my results by two points. Could you tell me where your word embeddings come from, or did you train them yourself?