Adding the num of segments into TranscriptionInfo #1053

Closed
stsfaroz wants to merge 2 commits


stsfaroz

Users will find it useful to know the number of segments.

@MahmoudAshraf97
Collaborator

MahmoudAshraf97 commented Oct 10, 2024

This will not work. segments is a generator whose length is unknown until it is exhausted, so counting it (for example with len(list(segments))) runs the whole transcription just to get the number of segments and leaves behind an empty generator. That is the cause of the error appearing in the CI: the transcription is empty.

Anyway, I don't see why this should be added to the codebase when you can easily do it later, after getting the transcription results.
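
The exhaustion behaviour described above can be reproduced without any model. The following is a minimal sketch in plain Python; `fake_segments` is a hypothetical stand-in for the lazy segments generator and is not part of faster-whisper.

```python
def fake_segments():
    """Hypothetical stand-in for a lazy segments generator."""
    for text in ("hello", "world", "again"):
        yield text


segments = fake_segments()

# Counting requires consuming the generator from start to finish.
count = len(list(segments))

# The same generator object is now exhausted, so a second pass yields nothing.
leftover = list(segments)

print(count, leftover)  # 3 []
```

Any caller that receives the generator after it has been counted this way sees an empty transcription, which matches the CI failure mentioned in the comment.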

@stsfaroz
Author

@MahmoudAshraf97 I want to do it on CPU, but len(list(segments)) takes longer than model.transcribe itself. Is there a way to get it faster?
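
A likely reason for this timing, given that segments is a generator, is that the call returning it does almost no work and the decoding only happens while the generator is iterated. The sketch below illustrates that with a hypothetical `lazy_transcribe` function; the name, the `work_log` list, and the uppercase "decoding" are all illustrative assumptions, not the library's implementation.

```python
work_log = []  # records each unit of (simulated) decoding work


def lazy_transcribe(chunks):
    """Hypothetical lazy API: returns a generator without decoding anything."""

    def generate():
        for chunk in chunks:
            work_log.append(chunk)  # stands in for the expensive decoding step
            yield chunk.upper()

    return generate()


segments = lazy_transcribe(["a", "b", "c"])
work_before = len(work_log)       # no work done yet: the call only built the generator

n_segments = len(list(segments))  # iterating is what triggers the work
work_after = len(work_log)

print(work_before, n_segments, work_after)  # 0 3 3
```

Under this assumption, timing the initial call measures only the setup, while timing `len(list(segments))` measures the transcription itself.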

@MahmoudAshraf97
Collaborator

As I explained, it is not possible to get the number of segments before transcription. It might be doable in batched mode, but it is impossible in sequential mode. Just get the length after transcription; that is the only guaranteed way of doing it.
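
Getting the length after transcription can be done in the same pass that consumes the results, so nothing is decoded twice. This is a minimal sketch; `fake_segments` and `consume_and_count` are hypothetical helpers written for illustration and operate on any iterable of segments.

```python
from typing import Iterable, Iterator, List, Tuple


def fake_segments() -> Iterator[str]:
    """Hypothetical stand-in for a lazy segments generator."""
    yield from ("hello", "world", "again")


def consume_and_count(segments: Iterable[str]) -> Tuple[List[str], int]:
    """Iterate once, keep every segment, and report how many there were."""
    collected: List[str] = []
    for segment in segments:
        collected.append(segment)  # per-segment handling (printing, saving) would go here
    return collected, len(collected)


texts, n_segments = consume_and_count(fake_segments())
print(n_segments)  # 3
```

Keeping the collected list means the segments remain available after counting, which avoids the empty-generator problem discussed earlier in the thread.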

stsfaroz closed this Oct 10, 2024