PhoWhisper: Automatic Speech Recognition for Vietnamese (2024)

VinAIResearch/PhoWhisper

PhoWhisper: Automatic Speech Recognition for Vietnamese

We introduce PhoWhisper in five versions for Vietnamese automatic speech recognition. PhoWhisper's robustness comes from fine-tuning the multilingual Whisper model on an 844-hour dataset that covers diverse Vietnamese accents. Our experiments show that PhoWhisper achieves state-of-the-art performance on benchmark Vietnamese ASR datasets.

Please cite our PhoWhisper paper when it is used to help produce published results or is incorporated into other software:

@inproceedings{PhoWhisper,
  title     = {{PhoWhisper: Automatic Speech Recognition for Vietnamese}},
  author    = {Thanh-Thien Le and Linh The Nguyen and Dat Quoc Nguyen},
  booktitle = {Proceedings of the ICLR 2024 Tiny Papers track},
  year      = {2024}
}

Model download & WER results

| Model | #params | CMV–Vi | VIVOS | VLSP 2020 Task-1 | VLSP 2020 Task-2 |
|---|---|---|---|---|---|
| vinai/PhoWhisper-tiny | 39M | 19.05 | 10.41 | 20.74 | 49.85 |
| vinai/PhoWhisper-base | 74M | 16.19 | 8.46 | 19.70 | 43.01 |
| vinai/PhoWhisper-small | 244M | 11.08 | 6.33 | 15.93 | 32.96 |
| vinai/PhoWhisper-medium | 769M | 8.27 | 4.97 | 14.12 | 26.85 |
| vinai/PhoWhisper-large | 1.55B | 8.14 | 4.67 | 13.75 | 26.68 |
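
For readers unfamiliar with the metric, the sketch below shows how a word error rate (WER) is commonly computed: the word-level edit distance between a reference transcript and a model hypothesis, divided by the number of reference words. It assumes the scores in the table are on a percentage scale and uses plain whitespace tokenization. The function name `word_error_rate` is hypothetical, and the text normalization used to produce the reported numbers is not described here, so this is not a way to reproduce them exactly.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr

    return 100.0 * prev[-1] / len(ref_words)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Lower is better: a hypothesis identical to the reference scores 0, and the score can exceed 100 when the hypothesis contains many inserted words.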

Run the model

Example usage with transformers

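from transformers import pipeline

# Load an ASR pipeline; any PhoWhisper model name from the table above can be used here.
transcriber = pipeline("automatic-speech-recognition", model="vinai/PhoWhisper-small")
# Placeholder: set this variable to the path of an audio file sampled at 16 kHz.
output = transcriber(path_to_audio_with_sampling_rate_16kHz)['text']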
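
Since the usage example expects audio sampled at 16 kHz, it can help to check a file before transcribing it. The following is a minimal sketch, assuming the input is a WAV file readable by Python's standard `wave` module. The helper names `wav_sample_rate` and `transcribe_wav` are hypothetical and not part of PhoWhisper or `transformers`.

```python
import wave

EXPECTED_RATE = 16_000  # sampling rate named in the usage example


def wav_sample_rate(path: str) -> int:
    """Return the sampling rate (Hz) stored in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def transcribe_wav(path: str, model_name: str = "vinai/PhoWhisper-small") -> str:
    """Transcribe a 16 kHz WAV file, raising early if the rate differs."""
    rate = wav_sample_rate(path)
    if rate != EXPECTED_RATE:
        raise ValueError(
            f"{path} is sampled at {rate} Hz, expected {EXPECTED_RATE} Hz"
        )
    # Imported lazily so the rate check works without transformers installed.
    from transformers import pipeline

    transcriber = pipeline("automatic-speech-recognition", model=model_name)
    return transcriber(path)["text"]
```

Calling `transcribe_wav("speech.wav")` (with a hypothetical file name) downloads the model on first use and returns the transcript as a string. Files at other sampling rates would need resampling first, which this sketch leaves out.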
