A TensorFlow implementation of Baidu's DeepSpeech architecture

lissyx Merge pull request #2929 from lissyx/ci-py37-py38
Fix #2928: Add Python 3.7, 3.8 CI coverage
Latest commit de5bfed Apr 20, 2020

Type Name Latest commit message Commit time
.github Add lock bot config Dec 28, 2018
bin M-AILAB importer: Ensure all samples are 16 kHz Apr 20, 2020
data Refactor generate_package.py (#2903) Apr 17, 2020
doc rebased docs on master Apr 17, 2020
examples Remove example code Dec 10, 2019
images Updating Geometry Dec 2, 2019
native_client Force ds-swig first in PATH to avoid messing if system-wide exists Apr 20, 2020
taskcluster Fix #2928: Add Python 3.7, 3.8 CI coverage Apr 20, 2020
tests Package training code to avoid sys.path hacks Mar 25, 2020
training/deepspeech_training Split --load into two to avoid unexpected behavior at evaluation time Apr 7, 2020
util Package training code to avoid sys.path hacks Mar 25, 2020
.cardboardlint.yml Update cardboardlint configuration Oct 4, 2019
.compute Fix .compute for packaged training code Apr 1, 2020
.gitattributes Address review comments and update docs Feb 11, 2020
.gitignore Package and expose TypeScript for JS interface Apr 6, 2020
.gitmodules Use submodule for building contrib examples into docs Dec 10, 2019
.isort.cfg Sort importer imports with isort Mar 31, 2020
.pylintrc Fix linter errors Feb 11, 2020
.readthedocs.yml Re-enable readthedocs.io Sep 24, 2019
.taskcluster.yml Use KVM for Android emulator Feb 26, 2020
.travis.yml Package training code to avoid sys.path hacks Mar 25, 2020
BIBLIOGRAPHY.md Update BIBLIOGRAPHY.md Feb 21, 2020
CODE_OF_CONDUCT.md Add Mozilla Code of Conduct file Mar 29, 2019
CONTRIBUTING.rst Move from Markdown to reStructuredText Oct 4, 2019
DeepSpeech.py Package training code to avoid sys.path hacks Mar 25, 2020
Dockerfile Ensure docker build pip really install locally built package Apr 8, 2020
GRAPH_VERSION Bump graph version Jan 24, 2020
ISSUE_TEMPLATE.md Create an issue template Nov 27, 2017
LICENSE Added LICENSE Sep 20, 2016
README.rst Make readthedocs link more obvious Mar 12, 2020
RELEASE.rst Move from Markdown to reStructuredText Oct 4, 2019
SUPPORT.rst Point people to Matrix room instead of IRC Feb 11, 2020
VERSION Bump VERSION to 0.7.0-alpha.3 Mar 25, 2020
bazel.patch Proper re-use of Bazel cache Jan 31, 2018
build-python-wheel.yml-DISABLED_ENABLE_ME_TO_REBUILD_DURING_PR Move to ARMbian Buster Aug 21, 2019
evaluate.py Package training code to avoid sys.path hacks Mar 25, 2020
evaluate_tflite.py Package training code to avoid sys.path hacks Mar 25, 2020
lm_optimizer.py Merge pull request #2826 from TeHikuMedia/add_trial_pruning Apr 1, 2020
requirements_eval_tflite.txt Update evaluate_tflite requirements Jan 12, 2020
requirements_tests.txt Converting importers from multiprocessing.dummy to multiprocessing Mar 18, 2020
requirements_transcribe.txt Make webrtcvad really optional Feb 24, 2020
setup.py Do not use m/mu ABI for Py3.8+ Apr 20, 2020
stats.py Package training code to avoid sys.path hacks Mar 25, 2020
transcribe.py Split --load into two to avoid unexpected behavior at evaluation time Apr 7, 2020

README.rst

Project DeepSpeech


DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier.

NOTE: This documentation applies to the MASTER version of DeepSpeech only. Documentation for the latest stable version is published on deepspeech.readthedocs.io.

To install and use deepspeech all you have to do is:

# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate

# Install DeepSpeech
pip3 install deepspeech

# Download pre-trained English model and extract
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.1/deepspeech-0.6.1-models.tar.gz
tar xvf deepspeech-0.6.1-models.tar.gz

# Download example audio files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.1/audio-0.6.1.tar.gz
tar xvf audio-0.6.1.tar.gz

# Transcribe an audio file
deepspeech --model deepspeech-0.6.1-models/output_graph.pbmm --scorer deepspeech-0.6.1-models/kenlm.scorer --audio audio/2830-3980-0043.wav

A pre-trained English model is available for use and can be downloaded using the commands above. A package with some example audio files is available for download in our release notes.
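To run the downloaded model over the whole example audio package rather than a single file, a small shell function along these lines may help. It is a minimal sketch, not part of DeepSpeech: the function name `transcribe_dir` and the `DEEPSPEECH_BIN` override are hypothetical names introduced here for illustration, and only the `--model`, `--scorer` and `--audio` options shown above are used.

```shell
# Transcribe every .wav file in a directory, one file at a time.
# Arguments: <audio directory> <model file> <scorer file>
# DEEPSPEECH_BIN is a hypothetical override for this sketch; it defaults
# to the `deepspeech` command installed by pip.
transcribe_dir() {
    dir=$1
    model=$2
    scorer=$3
    for wav in "$dir"/*.wav; do
        # Skip the unexpanded pattern when the directory has no .wav files
        [ -e "$wav" ] || continue
        # Prefix each result with the file it came from
        printf '%s: ' "$wav"
        "${DEEPSPEECH_BIN:-deepspeech}" --model "$model" --scorer "$scorer" --audio "$wav"
    done
}
```

With the archives extracted as above, it could be called as `transcribe_dir audio deepspeech-0.6.1-models/output_graph.pbmm deepspeech-0.6.1-models/kenlm.scorer`.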

Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the release notes to find which GPUs are supported. To run deepspeech on a GPU, install the GPU-specific package:

# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
source $HOME/tmp/deepspeech-gpu-venv/bin/activate

# Install DeepSpeech CUDA enabled package
pip3 install deepspeech-gpu

# Transcribe an audio file.
deepspeech --model deepspeech-0.6.1-models/output_graph.pbmm --scorer deepspeech-0.6.1-models/kenlm.scorer --audio audio/2830-3980-0043.wav

Please ensure you have the required CUDA dependencies.
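When one setup script has to serve both GPU and CPU machines, the choice between the two packages can be sketched as below. This is an illustration under a loud assumption: finding `nvidia-smi` on the PATH is only a rough hint that an NVIDIA driver is present and does not verify the required CUDA dependencies. The function name `pick_deepspeech_package` is hypothetical.

```shell
# Print the pip package name to install.
# Assumption: a probe command on PATH (nvidia-smi by default) is treated as
# a hint that an NVIDIA GPU driver is installed. It does NOT check that the
# CUDA dependencies needed by deepspeech-gpu are present.
pick_deepspeech_package() {
    probe=${1:-nvidia-smi}
    if command -v "$probe" >/dev/null 2>&1; then
        echo deepspeech-gpu
    else
        echo deepspeech
    fi
}
```

Inside an activated virtualenv this could be used as `pip3 install "$(pick_deepspeech_package)"`.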

See the output of deepspeech -h for more information on the use of deepspeech. If you experience problems running deepspeech, please check the required runtime dependencies.


