Hi Felix,

First of all, thanks for sharing this repo; it's amazingly interesting work. Since the datasets you used are, to my knowledge, paid and only available at 16 kHz and 8 kHz, respectively, I wanted to train your network on the VCTK dataset. For training, I only selected short files with high voice activity from the trimmed version, to avoid long sections of silence. However, when I do so, training goes well for a few epochs and the output becomes intelligible very quickly, but then the loss (I am using L1) explodes to a value orders of magnitude higher, as shown here. Have you observed this pattern during your trainings?
As a second question, I wanted to know why you used your own STFT implementation as opposed to the torchaudio one, which allows backprop.
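In case it's useful context: the standard guard against this kind of blow-up in PyTorch is gradient-norm clipping just before the optimizer step. A minimal sketch, with a toy model standing in for the real network and `max_norm=1.0` as an arbitrary example value:

```python
import torch
import torch.nn as nn

# Toy model standing in for the real network (hypothetical stand-in)
model = nn.Linear(10, 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()

x, y = torch.randn(8, 10), torch.randn(8, 1)
loss = loss_fn(model(x), y)

opt.zero_grad()
loss.backward()
# Clip the total gradient norm before stepping; 1.0 is an arbitrary example value
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```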
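For reference, what I mean by backprop support: `torch.stft` (which torchaudio builds on) is differentiable out of the box, so gradients flow through it to the waveform. A minimal sketch, where the sizes (`n_fft=512`, `hop_length=128`) are arbitrary example values:

```python
import torch

# Random waveform with gradient tracking; shape (batch, samples)
x = torch.randn(1, 16000, requires_grad=True)

# torch.stft supports autograd; window and sizes are arbitrary example values
window = torch.hann_window(512)
spec = torch.stft(x, n_fft=512, hop_length=128, window=window,
                  return_complex=True)

# Gradients flow back through the STFT to the input waveform
loss = spec.abs().mean()
loss.backward()
```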
Thanks!