v0.1.0 #605
erogol started this conversation in Versions & Releases
Great additions, especially the Trainer API (it has made life a lot easier). I think the project still needs a simple config structure; with the newer version and very little documentation, things might be difficult for new developers trying to experiment. I am still tagged in a lot of discussions about simple inference and training commands. Nonetheless, great work.
🐸 v0.1.0
In a nutshell, there are a ton of updates in this release. I don't know if we can cover them all here, but let's try. After this release, 🐸TTS stands on the following architecture:
- Trainer API for training.
- Synthesizer API for inference (see the sketch below this list).
- ModelManager API for managing the 🐸TTS model zoo.
- SpeakerManager API for managing speakers in a multi-speaker setting.
- Exporter API for exporting models to ONNX, TorchScript, etc.
- Data Processing API for making a dataset ready for training.
- Model API for implementing models, compatible with all the other components above.
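To give a feel for how the inference-side pieces fit together, here is a minimal synthesis sketch. The file paths are placeholders, and the positional argument order follows the 0.1.x `Synthesizer`; later versions added more parameters, so check the signature of your installed version.

```python
from TTS.utils.synthesizer import Synthesizer

# All paths below are placeholders; point them at your own trained model files.
synthesizer = Synthesizer(
    "tts_model.pth.tar",      # tts checkpoint
    "tts_config.json",        # tts config
    "vocoder_model.pth.tar",  # vocoder checkpoint (optional; Griffin-Lim is used if omitted)
    "vocoder_config.json",    # vocoder config
)

# Run text-to-speech and write the result to disk.
wav = synthesizer.tts("Hello from 🐸TTS!")
synthesizer.save_wav(wav, "output.wav")
```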
Updates

💾 Code updates
Brand new Trainer API

We unified all the training code in a lightweight but feature-complete Trainer API. From now on, all the 🐸TTS models will use this new API for training. It provides mixed precision (with Nvidia's APEX or torch.amp) and multi-GPU training for all the models.
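Here is a training sketch in the spirit of the new API, modeled on the LJSpeech recipes shipped with this release. The dataset path is an assumption, and import paths moved around in later versions, so treat the recipes in the repo as the source of truth.

```python
import os

from TTS.trainer import Trainer, TrainingArgs, init_training
from TTS.tts.configs import BaseDatasetConfig, GlowTTSConfig

output_path = os.path.dirname(os.path.abspath(__file__))

# Assumed dataset location; point it at your own LJSpeech copy.
dataset_config = BaseDatasetConfig(
    name="ljspeech",
    meta_file_train="metadata.csv",
    path=os.path.join(output_path, "LJSpeech-1.1/"),
)

config = GlowTTSConfig(
    batch_size=32,
    eval_batch_size=16,
    run_eval=True,
    epochs=1000,
    mixed_precision=True,  # APEX / torch.amp mixed precision
    output_path=output_path,
    datasets=[dataset_config],
)

# init_training builds the model, loggers, and output folder from the config.
args, config, output_path, _, c_logger, tb_logger = init_training(TrainingArgs(), config)
trainer = Trainer(args, config, output_path, c_logger, tb_logger)
trainer.fit()
```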
Brand new Model API

The abstract BaseModel and its BaseTTS and BaseVocoder child classes now serve as the basis of the 🐸TTS models. Any model that implements one of these classes works seamlessly with the Trainer and Synthesizer.
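As a rough sketch of that contract: a new model plugs into the Trainer by subclassing a base class and implementing its step methods. Everything below except BaseTTS is invented for illustration, and the exact set of abstract methods should be checked against the BaseTTS source.

```python
from torch import nn

from TTS.tts.models.base_tts import BaseTTS


class MyTTSModel(BaseTTS):
    """Toy skeleton; the layer and loss wiring are placeholders."""

    def __init__(self, config):
        super().__init__()
        self.config = config
        self.net = nn.Linear(256, 80)  # stand-in for a real encoder/decoder stack

    def forward(self, text, text_lengths, mel_input=None):
        # A real model maps character/phoneme IDs to spectrogram frames here.
        return {"model_outputs": self.net(text)}

    def inference(self, text):
        return self.forward(text, None)

    def train_step(self, batch, criterion):
        outputs = self.forward(batch["text_input"], batch["text_lengths"], batch["mel_input"])
        loss_dict = {"loss": criterion(outputs["model_outputs"], batch["mel_input"])}
        return outputs, loss_dict

    def eval_step(self, batch, criterion):
        return self.train_step(batch, criterion)
```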
Brand new 🐸TTS recipes

We decided to merge the recipes into the main project. We now host recipes for the LJSpeech dataset, covering all the implemented models, so you can pick the model you want, change the parameters, and train your own model easily.
Thanks to the new Trainer API and 👩‍✈️ Coqpit integration, we could implement these recipes in pure Python.

Updated SpeakerManager API

SpeakerManager, under TTS.utils, is now the core unit for managing speakers in a multi-speaker model and for interfacing a SpeakerEncoder model with the tts and vocoder models.

Updated model training mechanics
You can now use pure Python to define your model and run the training, which is useful for training models in a Jupyter Notebook or any other Python environment (see the recipe sketch above). We also keep the old mechanics through `TTS/bin/train_tts.py` and `TTS/bin/train_vocoder.py`; you just need to swap the previous training script name for one of these two, depending on your model. For example, `python TTS/bin/train_glow_tts.py --config_path config.json` becomes `python TTS/bin/train_tts.py --config_path config.json`.
Use 👩‍✈️ Coqpit for managing model class arguments

Now all the model arguments are defined in a coqpit class and imported by the model config (see the config sketch below).

gruut-based character-to-phoneme conversion (👑 @synesthesiam)

This is a drop-in replacement for the previous solution and is compatible with the released models, so all these models are functional again without version nitpicking.
Set test_sentences in the config rather than providing a txt file.

Set the maximum number of decoder steps of the Tacotron 1-2 models in the config.
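The last three items can be illustrated with one hypothetical config. The class name and field defaults below are invented for the example; only the general Coqpit mechanics (a plain dataclass with serialization support) are the point.

```python
from dataclasses import dataclass, field
from typing import List

from coqpit import Coqpit


@dataclass
class MyTacotronConfig(Coqpit):
    """Hypothetical model config in the new Coqpit-based layout."""

    # Model class arguments are plain dataclass fields now.
    hidden_channels: int = 512
    # Maximum number of decoder steps, set in the config.
    max_decoder_steps: int = 500
    # Test sentences live in the config instead of a separate txt file.
    test_sentences: List[str] = field(
        default_factory=lambda: ["It took me quite a long time to develop a voice."]
    )


config = MyTacotronConfig()
config.save_json("config.json")  # serialize for the training scripts
```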
🏃‍♀️ Operational Updates

- FINALLY, DOCUMENTATION!! https://tts.readthedocs.io (pages will be visible after merging this PR; for now, you can check https://tts.readthedocs.io/en/dev/)
- Enable support for Python 3.9.
- Changes for PyTorch 1.9.0.
🏅 Model implementations
🚀 Model releases
We solved the compatibility issues and re-released some of the models. You can see them in the released binaries section. You don't need to change anything: if you use v0.1.0, it uses these new models by default.
This discussion was created from the release v0.1.0.