A PyTorch-based Speech Toolkit
Updated Mar 18, 2023 - Python
Reading list for research topics in multimodal machine learning
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
WaveNet vocoder
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
Transformers at any scale
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
SincNet is a neural architecture for efficiently processing raw audio samples.
Open source audio annotation tool for humans
A neural network for end-to-end speech denoising
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
Speech recognition toolkit for the Arduino
Problem Agnostic Speech Encoder
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.
General Speech Restoration
Novoic's audio feature extraction library
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Implementation of "Neural Voice Cloning With Few Samples"