# asr
Here are 500 public repositories matching this topic...
NeMo: a toolkit for conversational AI
nlp
text-to-speech
deep-learning
neural-network
machine-translation
tts
speech-synthesis
speech-recognition
speech-to-text
nmt
language-model
speaker-recognition
nlp-machine-learning
asr
speaker-diarization
text-normalization
A PyTorch-based Speech Toolkit
audio
deep-learning
transformers
pytorch
voice-recognition
speech-recognition
speech-to-text
language-model
speaker-recognition
speaker-verification
speech-processing
audio-processing
asr
speaker-diarization
speechrecognition
speech-separation
speech-enhancement
spoken-language-understanding
huggingface
speech-toolkit
Updated Jul 9, 2022 - Python
alexa
ai
amazon-echo
muse
tts
google-home
unit
bci
speaker
homeassistant
snowboy
asr
anyq
raspeberry-pi
Updated May 13, 2022 - Python
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
text-to-speech
german
speech
pytorch
tts
speech-synthesis
english
speech-recognition
spanish
colab
speech-to-text
pretrained-models
stt
asr
capitalization
onnx
stt-benchmark
tts-models
torch-hub
repunctuation
Updated Jun 30, 2022 - Jupyter Notebook
Lingvo
nlp
research
translation
tensorflow
machine-translation
speech
distributed
tts
speech-synthesis
mnist
speech-recognition
lm
seq2seq
speech-to-text
gpu-computing
language-model
asr
Updated Jul 8, 2022 - Python
Production First and Production Ready End-to-End Speech Recognition Toolkit
pytorch
transformer
speech-recognition
automatic-speech-recognition
production-ready
asr
conformer
e2e-models
Updated Jul 9, 2022 - C++
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
deep-neural-networks
deep-learning
speech
dnn
pytorch
recurrent-neural-networks
lstm
gru
speech-recognition
rnn
kaldi
rnn-model
asr
lstm-neural-networks
multilayer-perceptron-network
timit
dnn-hmm
Updated Mar 14, 2022 - Python
DELTA is a deep learning based natural language and speech processing platform.
nlp
front-end
ops
deep-learning
text-classification
tensorflow
nlu
speech
inference
text-generation
speech-recognition
seq2seq
sequence-to-sequence
speaker-verification
asr
tensorflow-serving
emotion-recognition
custom-ops
serving
tensorflow-lite
Updated May 26, 2022 - Python
BitBarrel
commented
Sep 19, 2021
Creating CSV files manually is a lot of work. This could be automated by a script if the name of the WAV file is the same as the transcript.
The same could be done for creating a language model input text file. A script could pull the transcript from the WAV file name.
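A minimal sketch of the script proposed in this comment, assuming each WAV file is named after its transcript with words separated by `_` or `-`. The CSV column names (`wav_filename`, `wav_filesize`, `transcript`), the lowercasing step, and the function names are illustrative assumptions, not the project's confirmed format.

```python
import csv
from pathlib import Path


def transcript_from_name(wav_path: Path) -> str:
    """Derive a transcript from the file name.

    Assumed convention: words in the file name are separated by '_' or '-'.
    """
    return wav_path.stem.replace("_", " ").replace("-", " ").strip().lower()


def build_manifest(wav_dir, csv_path, lm_text_path) -> int:
    """Write a training CSV and a language-model text file for all WAVs in wav_dir.

    Returns the number of WAV files processed.
    """
    wav_files = sorted(Path(wav_dir).glob("*.wav"))
    with open(csv_path, "w", newline="", encoding="utf-8") as csv_file, \
            open(lm_text_path, "w", encoding="utf-8") as lm_file:
        writer = csv.writer(csv_file)
        # Hypothetical header; adjust to whatever the training tool expects.
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for wav in wav_files:
            text = transcript_from_name(wav)
            writer.writerow([wav.name, wav.stat().st_size, text])
            # One transcript per line as language-model input text.
            lm_file.write(text + "\n")
    return len(wav_files)
```

Calling `build_manifest("clips", "train.csv", "lm_input.txt")` would cover both files mentioned in the comment in one pass; the paths here are placeholders.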
SincNet is a neural architecture for efficiently processing raw audio samples.
audio
python
deep-learning
signal-processing
waveform
cnn
pytorch
artificial-intelligence
speech-recognition
neural-networks
convolutional-neural-networks
digital-signal-processing
filtering
speaker-recognition
speaker-verification
speech-processing
audio-processing
asr
timit
speaker-identification
Updated Apr 28, 2021 - Python
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
Updated Jun 27, 2022 - Python
A Python wrapper for Kaldi
python
wrapper
numpy
speech
feature-extraction
speech-recognition
kaldi
language-model
asr
openfst
clif
Updated May 29, 2022 - Python
The official repository of the Eesen project
Updated May 23, 2019 - C++
An open-source implementation of a sequence-to-sequence based speech processing engine
deployment
tensorflow
tts
speech-synthesis
transformer
speech-recognition
sequence-to-sequence
unsupervised-learning
speaker-recognition
asr
ctc
wfst
Updated Mar 20, 2022 - Python
Open STT
Updated Mar 11, 2022 - Python
Open issue: Design a Logo (iceychris commented Nov 16, 2020; labeled good first issue)
Design a logo for LibreASR and share it here.
A cool open source project should have a logo.
Open issue: Raspberry Pi Support
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
Updated May 7, 2020 - Python
End-to-end ASR/LM implementation with PyTorch
streaming
speech
language-modeling
pytorch
transformer
speech-recognition
seq2seq
attention
automatic-speech-recognition
sequence-to-sequence
language-model
attention-mechanism
asr
ctc
rnn-transducer
transformer-xl
Updated Aug 30, 2021 - Python
Open issue: Create REST server (nshmyrev commented Sep 26, 2021; labeled good first issue)
On-device streaming speech-to-text engine powered by deep learning
android
python
c
raspberry-pi
iot
ios
machine-learning
arm
deep-learning
offline
webassembly
voice-recognition
speech-recognition
speech-to-text
stt
asr
Updated Jul 5, 2022 - TypeScript
Open tools and data for cloudless automatic speech recognition
Updated Mar 30, 2021 - Python
Chinese text normalization for speech processing
Updated May 1, 2022 - Python
Sequence-to-Sequence Framework in PyTorch
deep-learning
cnn
pytorch
speech-recognition
seq2seq
neural-machine-translation
nmt
multimodality
asr
Updated Jul 13, 2021 - Jupyter Notebook
python
pypi
speech-recognition
nlp-library
asr
nlp-tool
arabic-numbers
arabic-numerals
chinese-numerals
cn2an
Updated Apr 23, 2022 - Python

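Polyphonic characters are currently handled with pypinyin or g2pM, whose accuracy is limited. I would like to build a polyphone prediction model based on BERT (or ERNIE). Put simply, suppose a language has 100 polyphonic characters, each with at most 3 pronunciations. We can then attach 100 three-way classifiers after BERT (simple fc layers are enough) and, at prediction time, look up the classifier for the character in question and classify with it.
Reference paper:
tencent_polyphone.pdf
For data, the dataset provided by https://github.com/kakaobrain/g2pM can be used.
Advanced: a multi-task BERT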
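A rough, dependency-free sketch of the head-routing idea described above, under stated assumptions: the contextual vector that BERT would produce for the target character is replaced by a plain list of floats, the heads are untrained random linear layers, and the three-character inventory, `HIDDEN_SIZE`, and all function names are hypothetical. Masking out unused classes for characters with fewer than 3 pronunciations is an added assumption, not part of the proposal.

```python
import random

HIDDEN_SIZE = 8          # stand-in for the encoder's hidden size
MAX_PRONUNCIATIONS = 3   # each head is a 3-way classifier

# Hypothetical inventory: polyphonic character -> candidate pronunciations.
POLYPHONES = {
    "行": ["xing2", "hang2"],
    "重": ["zhong4", "chong2"],
    "乐": ["le4", "yue4"],
}


def make_head(rng):
    """One fully connected layer: MAX_PRONUNCIATIONS x HIDDEN_SIZE weights plus bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN_SIZE)]
               for _ in range(MAX_PRONUNCIATIONS)]
    bias = [0.0] * MAX_PRONUNCIATIONS
    return weights, bias


def build_heads(seed=0):
    """Create one classifier head per polyphonic character."""
    rng = random.Random(seed)
    return {char: make_head(rng) for char in POLYPHONES}


def predict(char, hidden, heads):
    """Route to the head for `char` and return its highest-scoring pronunciation."""
    weights, bias = heads[char]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(weights, bias)]
    # Assumption: only score the classes this character actually uses.
    valid = len(POLYPHONES[char])
    best = max(range(valid), key=lambda i: logits[i])
    return POLYPHONES[char][best]
```

In a real model the `hidden` vector would come from the encoder's output at the character's position, and only the selected head would receive a loss during training.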
![image](https://user-images.githubusercontent.com/24568452