v0.1.0 #605
erogol started this conversation in Versions & Releases
Great additions, especially the Trainer API (it has made life a lot easier). I think the project still needs a simple config structure; with the newer version and very little documentation, things might be difficult for new developers trying to experiment. I am still tagged in a lot of discussions about simple inference and training commands. Nonetheless, great work.
🐸 v0.1.0
In a nutshell, there are a ton of updates in this release. I don't know if we can cover them all here, but let's try. After this release, 🐸TTS stands on the following architecture:
- Trainer API for training.
- Synthesizer API for inference (see the sketch below this list).
- ModelManager API for managing the 🐸TTS model zoo.
- SpeakerManager API for managing speakers in a multi-speaker setting.
- Exporter API for exporting models to ONNX, TorchScript, etc.
- Data Processing API for making a dataset ready for training.
- Model API for implementing models, compatible with all the other components above.
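To give a feel for how the inference-side pieces fit together, here is a minimal synthesis sketch. The file paths are placeholders, and the positional argument order follows the 0.1.x `Synthesizer`; later versions added more parameters, so check the signature of your installed version.

```python
from TTS.utils.synthesizer import Synthesizer

# All paths below are placeholders; point them at your own trained model files.
synthesizer = Synthesizer(
    "tts_model.pth.tar",      # tts checkpoint
    "tts_config.json",        # tts config
    "vocoder_model.pth.tar",  # vocoder checkpoint (optional; Griffin-Lim is used if omitted)
    "vocoder_config.json",    # vocoder config
)

# Run text-to-speech and write the result to disk.
wav = synthesizer.tts("Hello from 🐸TTS!")
synthesizer.save_wav(wav, "output.wav")
```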
Updates

💾 Code updates
Brand new Trainer API

We unified all the training code in a lightweight but feature-complete Trainer API. From now on, all the 🐸TTS models will use this new API for training. It provides mixed precision (with Nvidia's APEX or torch.amp) and multi-GPU training for all the models.
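Here is a training sketch in the spirit of the new API, modeled on the LJSpeech recipes shipped with this release. The dataset path is an assumption, and import paths moved around in later versions, so treat the recipes in the repo as the source of truth.

```python
import os

from TTS.trainer import Trainer, TrainingArgs, init_training
from TTS.tts.configs import BaseDatasetConfig, GlowTTSConfig

output_path = os.path.dirname(os.path.abspath(__file__))

# Assumed dataset location; point it at your own LJSpeech copy.
dataset_config = BaseDatasetConfig(
    name="ljspeech",
    meta_file_train="metadata.csv",
    path=os.path.join(output_path, "LJSpeech-1.1/"),
)

config = GlowTTSConfig(
    batch_size=32,
    eval_batch_size=16,
    run_eval=True,
    epochs=1000,
    mixed_precision=True,  # APEX / torch.amp mixed precision
    output_path=output_path,
    datasets=[dataset_config],
)

# init_training builds the model, loggers, and output folder from the config.
args, config, output_path, _, c_logger, tb_logger = init_training(TrainingArgs(), config)
trainer = Trainer(args, config, output_path, c_logger, tb_logger)
trainer.fit()
```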
Brand new Model API

The abstract BaseModel and its BaseTTS and BaseVocoder child classes now serve as the basis of the 🐸TTS models. Any model that implements one of these classes works seamlessly with the Trainer and Synthesizer.
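As a rough sketch of that contract: a new model plugs into the Trainer by subclassing a base class and implementing its step methods. Everything below except BaseTTS is invented for illustration, and the exact set of abstract methods should be checked against the BaseTTS source.

```python
from torch import nn

from TTS.tts.models.base_tts import BaseTTS


class MyTTSModel(BaseTTS):
    """Toy skeleton; the layer and loss wiring are placeholders."""

    def __init__(self, config):
        super().__init__()
        self.config = config
        self.net = nn.Linear(256, 80)  # stand-in for a real encoder/decoder stack

    def forward(self, text, text_lengths, mel_input=None):
        # A real model maps character/phoneme IDs to spectrogram frames here.
        return {"model_outputs": self.net(text)}

    def inference(self, text):
        return self.forward(text, None)

    def train_step(self, batch, criterion):
        outputs = self.forward(batch["text_input"], batch["text_lengths"], batch["mel_input"])
        loss_dict = {"loss": criterion(outputs["model_outputs"], batch["mel_input"])}
        return outputs, loss_dict

    def eval_step(self, batch, criterion):
        return self.train_step(batch, criterion)
```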
Brand new 🐸TTS recipes

We decided to merge the recipes into the main project. We now host recipes for the LJSpeech dataset, covering all the implemented models, so you can pick the model you want, change the parameters, and train your own model easily.
Thanks to the new Trainer API and 👩‍✈️ Coqpit integration, we could implement these recipes in pure Python.

Updated SpeakerManager API

SpeakerManager, under TTS.utils, is now the core unit for managing speakers in a multi-speaker model and for interfacing a SpeakerEncoder model with the tts and vocoder models.

Updated model training mechanics
You can now use pure Python to define your model and run the training, which is useful for training models in a Jupyter Notebook or any other Python environment (see the recipe sketch above). We also keep the old mechanics through `TTS/bin/train_tts.py` and `TTS/bin/train_vocoder.py`; you just need to swap the previous training script name for one of these two, depending on your model. For example, `python TTS/bin/train_glow_tts.py --config_path config.json` becomes `python TTS/bin/train_tts.py --config_path config.json`.
Use 👩‍✈️ Coqpit for managing model class arguments

Now all the model arguments are defined in a coqpit class and imported by the model config (see the config sketch below).

gruut-based character-to-phoneme conversion (👑 @synesthesiam)

This is a drop-in replacement for the previous solution and is compatible with the released models, so all these models are functional again without version nitpicking.
Set test_sentences in the config rather than providing a txt file.

Set the maximum number of decoder steps of the Tacotron 1-2 models in the config.
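The last three items can be illustrated with one hypothetical config. The class name and field defaults below are invented for the example; only the general Coqpit mechanics (a plain dataclass with serialization support) are the point.

```python
from dataclasses import dataclass, field
from typing import List

from coqpit import Coqpit


@dataclass
class MyTacotronConfig(Coqpit):
    """Hypothetical model config in the new Coqpit-based layout."""

    # Model class arguments are plain dataclass fields now.
    hidden_channels: int = 512
    # Maximum number of decoder steps, set in the config.
    max_decoder_steps: int = 500
    # Test sentences live in the config instead of a separate txt file.
    test_sentences: List[str] = field(
        default_factory=lambda: ["It took me quite a long time to develop a voice."]
    )


config = MyTacotronConfig()
config.save_json("config.json")  # serialize for the training scripts
```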
🏃‍♀️ Operational Updates

- FINALLY, DOCUMENTATION!! https://tts.readthedocs.io (pages will be visible after merging this PR; for now, you can check https://tts.readthedocs.io/en/dev/)
- Enable support for Python 3.9.
- Changes for PyTorch 1.9.0.
🏅 Model implementations
🚀 Model releases
We solved the compatibility issues and re-released some of the models. You can see them in the released binaries section. You don't need to change anything: if you use v0.1.0, it uses these new models by default.
This discussion was created from the release v0.1.0.