SOFA inference #29

Open
Arseny5 opened this issue Aug 9, 2024 · 5 comments

Arseny5 commented Aug 9, 2024

Hello!

Dear developers, thank you so much for your aligner SOFA. Do you have any inference scripts for SOFA?

qiuqiao (Owner) commented Aug 9, 2024

Thank you for reaching out and showing interest in SOFA!

The infer.py script is used for inference tasks, and its usage is documented in the README.md file.

Could you please provide more details on what you're trying to achieve or what information you're missing? This will help me better address your needs.
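
Roughly, an inference run looks something like the sketch below; the checkpoint path is a placeholder and the flag names other than --folder are written from memory, so please treat README.md as the authoritative reference for the full option list:

```
# illustrative only; check README.md for the exact flags and defaults
python infer.py --ckpt path/to/checkpoint.ckpt --folder segments_path
```

Here segments_path is the folder containing the wav files (and their lab files) that you want to align.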

Arseny5 (Author) commented Aug 9, 2024

I don't understand what data needs to be provided as input in order to run a proper inference. I saw that we use the --folder segments_path flag, where segments_path is the path to the data we want to apply SOFA to. From the repository's README, the data should consist of wav and lab files.

However, my problem is that I only have wav files. I don't have any lab files, I don't quite understand how to get them, and I don't understand exactly what such a file should look like. As far as I understand, the pipeline is: get the text transcription from the wav, use g2p to obtain phonemes, write those phonemes into the lab file, and then run SOFA.

Could you please share at least a few samples of what wav + lab files should ideally look like, or maybe share scripts for generating these lab files from wav files, so that SOFA inference can then be run on them?

qiuqiao (Owner) commented Aug 9, 2024

It appears that the description in the README.md is not clear enough, leading to a misunderstanding of the inference process. In fact, the g2p conversion is part of the SOFA inference process, so the lab files should record the text transcription of the corresponding wav files.

I will update the README.md and add a few examples of wav + lab files later.
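
In the meantime, here is a quick illustration of how a wav + lab pair is laid out; the file names and the lyrics are made up, and the lab file is just the plain-text transcription of the clip:

```
segments_path/
├── song_001.wav    # the audio clip to align
└── song_001.lab    # plain-text transcription of song_001.wav

# contents of song_001.lab, for example:
# twinkle twinkle little star
```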

Arseny5 (Author) commented Aug 10, 2024

Oh, that's very useful for me right now. Thank you very much!

Arseny5 (Author) commented Aug 10, 2024

And do you maybe have a script for generating a starter lab file with the wav transcription?
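
A minimal sketch of what I have in mind (not part of SOFA; it assumes the openai-whisper package, mostly English audio, and a segments_path folder of wav files, and the output would still need manual correction before alignment):

```python
# Sketch: generate starter .lab transcriptions for .wav files with Whisper.
# Assumes `pip install openai-whisper`; the transcripts will likely need
# manual correction before they are clean enough for forced alignment.
from pathlib import Path

import whisper

segments_path = Path("segments_path")  # folder containing the .wav files
model = whisper.load_model("base")     # larger models are more accurate but slower

for wav_file in sorted(segments_path.glob("*.wav")):
    result = model.transcribe(str(wav_file))
    lab_file = wav_file.with_suffix(".lab")
    lab_file.write_text(result["text"].strip().lower(), encoding="utf-8")
    print(f"{wav_file.name} -> {lab_file.name}")
```

Whisper is not tuned for singing voice, so I expect the generated lab files would need to be reviewed by hand.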
