
Pre-trained models V2

@SeanNaren SeanNaren released this 12 Jan 18:42
· 254 commits to master since this release
e2c2d83

Supplied here are a set of pre-trained networks that can be used for evaluation. Do not expect these models to perform well on your own data: they are heavily tuned to the datasets they were trained on.

Results are given using greedy decoding. Expect a well-trained language model to reduce WER/CER substantially.
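The greedy (best-path) decoding used for these results can be sketched as below: take the most likely label at each timestep, collapse repeats, and drop the CTC blank. This is a minimal illustration of the technique, not deepspeech.pytorch's own decoder; the label set and blank index are assumptions for the example.

```python
def greedy_ctc_decode(probs, labels, blank=0):
    """Greedy CTC decode: probs is a list of per-timestep probability
    vectors over `labels`; returns the collapsed transcript string."""
    # Most likely label index at each timestep (the "best path")
    best_path = [max(range(len(p)), key=p.__getitem__) for p in probs]
    out = []
    prev = None
    for idx in best_path:
        # Collapse consecutive repeats and skip the blank symbol
        if idx != prev and idx != blank:
            out.append(labels[idx])
        prev = idx
    return "".join(out)

# Toy example: labels[0] is the CTC blank, four timesteps over 3 symbols
labels = "_ab"
probs = [
    [0.1, 0.8, 0.1],  # 'a'
    [0.1, 0.7, 0.2],  # 'a' again (repeat, collapsed)
    [0.8, 0.1, 0.1],  # blank
    [0.1, 0.2, 0.7],  # 'b'
]
print(greedy_ctc_decode(probs, labels))  # "ab"
```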

These models should work with later versions of deepspeech.pytorch. Note that parameter names have since changed from underscores to dashes (e.g. `--rnn_type` is now `--rnn-type`).

AN4

Commit hash: e2c2d832357a992f36e68b5f378c117dd270d6ff

Training command:

```sh
python train.py --rnn_type gru --hidden_size 800 --hidden_layers 5 --checkpoint --train_manifest data/an4_train_manifest.csv --val_manifest data/an4_val_manifest.csv --epochs 100 --num_workers $(nproc) --cuda --batch_size 32 --learning_anneal 1.01 --augment
```

| Dataset  | WER   | CER  |
|----------|-------|------|
| AN4 test | 10.58 | 4.88 |
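The WER/CER figures reported here are edit-distance metrics: Levenshtein distance between reference and hypothesis, normalised by reference length (over words for WER, characters for CER). The sketch below shows the standard definition; it is not the project's own evaluation code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution
        prev = cur
    return prev[-1]

def wer(ref, hyp):
    # Word error rate: word-level edit distance / reference word count
    ref_words, hyp_words = ref.split(), hyp.split()
    return edit_distance(ref_words, hyp_words) / len(ref_words)

def cer(ref, hyp):
    # Character error rate: character-level edit distance / reference length
    return edit_distance(ref, hyp) / len(ref)

print(wer("the cat sat", "the cat sat down"))  # one insertion over 3 words
print(cer("hello", "helo"))                    # one deletion over 5 chars
```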

Download here.

Librispeech

Commit hash: e2c2d832357a992f36e68b5f378c117dd270d6ff

Training command:

```sh
python train.py --rnn_type gru --hidden_size 800 --hidden_layers 5 --checkpoint --visdom --train_manifest data/libri_train_manifest.csv --val_manifest data/libri_val_manifest.csv --epochs 15 --num_workers $(nproc) --cuda --batch_size 10 --learning_anneal 1.1
```

| Dataset           | WER   | CER   |
|-------------------|-------|-------|
| Librispeech clean | 11.27 | 3.09  |
| Librispeech other | 30.74 | 10.97 |

Download here.

TEDLIUM

Commit hash: e2c2d832357a992f36e68b5f378c117dd270d6ff

Training command:

```sh
python train.py --rnn_type gru --hidden_size 800 --hidden_layers 5 --checkpoint --visdom --train_manifest data/ted_train_manifest.csv --val_manifest data/ted_val_manifest.csv --epochs 15 --num_workers $(nproc) --cuda --batch_size 10 --learning_anneal 1.1
```

| Dataset  | WER   | CER   |
|----------|-------|-------|
| Ted test | 31.04 | 10.00 |

Download here.
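On the `--learning_anneal` flag used in all three commands: in deepspeech.pytorch the learning rate is divided by this factor after each epoch, so 1.01 over 100 epochs (AN4) and 1.1 over 15 epochs (Librispeech/TEDLIUM) produce comparable overall decay. The sketch below illustrates the schedule; the initial rate of 3e-4 is an illustrative assumption, not a value from these release notes.

```python
def anneal_schedule(initial_lr, anneal, epochs):
    """Learning rates per epoch when lr is divided by `anneal` each epoch."""
    lrs = []
    lr = initial_lr
    for _ in range(epochs):
        lrs.append(lr)
        lr /= anneal  # applied at the end of each epoch
    return lrs

# AN4 command: 100 epochs with a gentle 1.01 anneal
an4 = anneal_schedule(3e-4, 1.01, 100)
# Librispeech/TEDLIUM commands: 15 epochs with a steeper 1.1 anneal
libri = anneal_schedule(3e-4, 1.1, 15)

print(an4[-1] / an4[0])    # ~0.37 of the initial rate after 100 epochs
print(libri[-1] / libri[0])  # ~0.26 of the initial rate after 15 epochs
```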