I hope this message finds you well. I came across your repository and found your paper implementation to be very interesting. I'm quite intrigued by your research and would like to follow your work more closely. In this regard, I have a question regarding the calculation of the Signal-to-Noise Ratio (SNR) mentioned in the paper.
While examining your code, I noticed that it provides two methods for waveform recovery: one using the original phase and another using the Griffin-Lim algorithm. However, the paper does not state how the message is recovered from the spectrogram and converted back to a time-domain waveform. I ran some experiments on the VCTK dataset with the default settings. Recovering the message waveform with the original phase gave a time-domain SNR better than the one reported in the paper (14.34 vs. 8.76). With the Griffin-Lim algorithm, however, the time-domain SNR was -2.53. Although the message remains intelligible, both results differ significantly from the paper's value (8.76).
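To illustrate why the phase choice can matter so much for a time-domain SNR, here is a minimal sketch. It assumes the common definition SNR = 10·log10(signal energy / error energy), which may not be the definition used in the paper. `snr_db` and `sine` are hypothetical helper names, and the sine waves are toy stand-ins, not VCTK data or output from the repository.

```python
import math


def snr_db(reference, estimate):
    """Time-domain SNR in dB (assumed definition):
    10 * log10(sum(ref^2) / sum((ref - est)^2))."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise_power == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(signal_power / noise_power)


def sine(num_samples, cycles, phase=0.0):
    """Toy signal: a sine with a whole number of cycles and a given phase."""
    return [
        math.sin(2 * math.pi * cycles * n / num_samples + phase)
        for n in range(num_samples)
    ]


reference = sine(1000, 10)

# Phase preserved, 10% magnitude error: loosely mimics reusing the original phase.
same_phase = [0.9 * x for x in reference]

# Same magnitude, phase off by 90 degrees: loosely mimics an estimated phase.
shifted = sine(1000, 10, phase=math.pi / 2)

print(snr_db(reference, same_phase))  # approximately 20 dB
print(snr_db(reference, shifted))     # approximately -3 dB
```

In this toy case the phase-shifted signal has exactly the same magnitude as the reference, yet its time-domain SNR is negative. That would be consistent with a recovered message sounding intelligible while scoring poorly on a sample-by-sample metric.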
I would greatly appreciate it if you could provide some clarification regarding how the message is returned to the time domain waveform and how the SNR is calculated in the context of your paper. Any additional insights or guidance you can provide would be highly valuable to me.
Thank you for your time and consideration. I look forward to your response.
I ran the experiments with the default values from the paper and the code, but I could not reproduce the results reported in the paper: the carrier SNR is only 17 at the highest. Why is this?