whisper.cpp on Apple Silicon

This guide installs whisper.cpp as a simple transcriber and subtitle generator on macOS with Apple Silicon.

Install Xcode, then run:

sudo xcode-select --switch /Applications/Xcode.app/Contents/Developer

Install Conda

curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
bash Miniforge3-MacOSX-arm64.sh

And if you don't want conda to be activated by default:

conda config --set auto_activate_base false

Then you can activate and deactivate conda:

conda activate
conda deactivate

Source

Set up the Conda environment:

conda create -n py310-whisper python=3.10 -y
conda activate py310-whisper

Install dependencies:

pip install ane_transformers
pip install openai-whisper
pip install coremltools

Clone whisper.cpp:

mkdir -p ~/.apps && cd ~/.apps
git clone https://github.com/ggerganov/whisper.cpp.git && cd whisper.cpp

Build for Apple Silicon:

make clean
WHISPER_COREML=1 make -j

Generate and build only the model you need (tiny works well):

./models/generate-coreml-model.sh tiny.en && make tiny.en
./models/generate-coreml-model.sh tiny && make tiny
./models/generate-coreml-model.sh base.en && make base.en
./models/generate-coreml-model.sh base && make base
./models/generate-coreml-model.sh small.en && make small.en
./models/generate-coreml-model.sh small && make small
./models/generate-coreml-model.sh medium.en && make medium.en
./models/generate-coreml-model.sh medium && make medium
./models/generate-coreml-model.sh large-v1 && make large-v1
./models/generate-coreml-model.sh large-v2 && make large-v2
./models/generate-coreml-model.sh large-v3 && make large-v3

Source

Model file names in models/ are the following:

ggml-tiny.en.bin
ggml-tiny.bin
ggml-base.en.bin
ggml-base.bin
ggml-small.en.bin
ggml-small.bin
ggml-medium.en.bin
ggml-medium.bin
ggml-large-v1.bin
ggml-large-v2.bin
ggml-large-v3.bin
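Since every file follows the same `ggml-<name>.bin` pattern, the path can be derived from the model name. The following is a minimal sketch, not part of whisper.cpp: `WHISPER_DIR`, `model_path`, and `model_exists` are hypothetical names, and the default directory assumes the clone location used above.

```shell
# Hypothetical helpers; WHISPER_DIR is an assumed variable, defaulting to
# the clone location used in this guide.
WHISPER_DIR="${WHISPER_DIR:-$HOME/.apps/whisper.cpp}"

# Print the ggml file path for a model name such as "tiny.en" or "large-v3".
model_path() {
	printf '%s/models/ggml-%s.bin\n' "$WHISPER_DIR" "$1"
}

# Succeed only if that model file is present on disk.
model_exists() {
	[ -f "$(model_path "$1")" ]
}
```

For example, `model_exists medium || echo "medium has not been built yet"` gives a quick check before running a transcription.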

Try whisper.cpp with your desired arguments:

./main \
-m models/ggml-medium.bin \
-l es \
-otxt \
-ovtt \
-osrt \
-olrc \
-owts \
-f /path/to/test.wav

To create an audio file compatible with whisper.cpp, you can use FFmpeg:

ffmpeg \
-i audio.m4a \
-ar 16000 \
-ac 1 \
-c:a pcm_s16le \
test.wav
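To convert several files with the same settings, the command above can be wrapped in a small function. This is only a sketch: `wav_name` and `to_wav` are hypothetical helper names, and the output is written next to the input with a `.wav` extension.

```shell
# Hypothetical helper: derive the output name by swapping the extension for .wav.
# The directory and file name are split first so a dot in a directory name
# is not mistaken for the file extension.
wav_name() {
	local dir base
	dir=$(dirname -- "$1")
	base=$(basename -- "$1")
	printf '%s/%s.wav\n' "$dir" "${base%.*}"
}

# Convert one file to 16 kHz mono 16-bit PCM, the same settings as above.
to_wav() {
	ffmpeg -i "$1" -ar 16000 -ac 1 -c:a pcm_s16le "$(wav_name "$1")"
}
```

A loop such as `for f in *.m4a; do to_wav "$f"; done` would then convert every M4A file in the current directory.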

Create the transcribe Bash script:

mkdir -p ~/.bin && cd ~/.bin
touch transcribe
chmod +x transcribe
nano transcribe

Contents of the transcribe script:

#!/usr/bin/env bash

# Modifying the internal field separator
IFS=$'\t\n'

if [ -z "$1" ]; then
	# Report the problem on stderr and exit with a non-zero status
	echo >&2
	echo "ERROR!" >&2
	echo "No input file specified." >&2
	echo >&2
	exit 1
else
	~/.apps/whisper.cpp/main \
	-m ~/.apps/whisper.cpp/models/ggml-tiny.bin \
	-l es \
	-otxt \
	-ovtt \
	-osrt \
	-olrc \
	-owts \
	--print-colors \
	-f "$1"
fi
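The script above hard-codes the tiny model and Spanish. One possible variation, sketched here under assumptions, reads both from environment variables. `WHISPER_DIR`, `WHISPER_BIN`, `WHISPER_MODEL`, and `WHISPER_LANG` are hypothetical names chosen for this example, not options that whisper.cpp itself recognizes; the flags passed to `main` are the same as in the script.

```shell
# Sketch of a configurable variant. All WHISPER_* variables are hypothetical;
# the defaults reproduce the behaviour of the script above.
transcribe_file() {
	local dir="${WHISPER_DIR:-$HOME/.apps/whisper.cpp}"
	"${WHISPER_BIN:-$dir/main}" \
		-m "$dir/models/ggml-${WHISPER_MODEL:-tiny}.bin" \
		-l "${WHISPER_LANG:-es}" \
		-otxt -ovtt -osrt -olrc -owts \
		--print-colors \
		-f "$1"
}
```

With this in place, `WHISPER_MODEL=medium WHISPER_LANG=en transcribe_file talk.wav` would use the medium model and English, assuming that model has been built.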

Alternative

Install Vosk

conda activate
pip install vosk

Help and models

vosk-transcriber --help
vosk-transcriber --list-models

Transcribe to SRT using the Spanish model, with the warn log level:

vosk-transcriber \
-n vosk-model-es-0.42 \
--log-level warn \
-i "audio.m4a" \
-t srt \
-o transcription.srt

Extra notes: