STFT and exploding training #3

Open
eagomez2 opened this issue Feb 25, 2022 · 0 comments

Hi Felix,

First of all, thanks for sharing this repo; it's really interesting work. Since the datasets you used are, to my knowledge, paid and only available at 16 kHz and 8 kHz, respectively, I wanted to train your network on the VCTK dataset. For training, I chose only short files with high activity from the trimmed version of the dataset, to avoid long sections of silence. However, when I do so, training goes well for a few epochs and the message becomes intelligible very quickly, but then the loss (I am using L1) explodes to a value orders of magnitude higher, as shown here. Have you observed this pattern in your own training runs?

As a second question, I wanted to know why you used your own STFT implementation as opposed to the torchaudio ones, which allow backprop.
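
For reference, here is a minimal sketch of what I mean by backprop through a built-in STFT. It uses `torch.stft`, which as far as I know is what the torchaudio spectrogram transforms call internally. The signal length, FFT size, hop size and the toy loss are assumptions chosen only for illustration, not values taken from this repo.

```python
import torch

torch.manual_seed(0)

# Hypothetical input: 1 second of noise at 16 kHz, tracked by autograd.
wave = torch.randn(1, 16000, requires_grad=True)

# Assumed analysis parameters (not taken from the repo).
n_fft, hop_length = 512, 128
window = torch.hann_window(n_fft)

# Complex spectrogram with shape (batch, n_fft // 2 + 1, frames).
spec = torch.stft(
    wave,
    n_fft=n_fft,
    hop_length=hop_length,
    window=window,
    return_complex=True,
)

# Toy loss on the magnitude, only there to drive a backward pass.
loss = spec.abs().mean()
loss.backward()

# If the STFT is differentiable, the input now carries a finite gradient.
grad_ok = wave.grad is not None and bool(torch.isfinite(wave.grad).all())
print(tuple(spec.shape), grad_ok)
```

The gradient reaches `wave` without any custom backward code, which is why I was curious about the motivation for a hand-written implementation.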

Thanks!
