A web application that accepts an audio file, transcribes it with Google's Speech-to-Text API, and generates a downloadable Word document containing the transcription. It uses Flask for the web server and Google Cloud Storage for storing audio files. Google Cloud Firestore is planned for managing metadata but is not used yet.
- Flask Web Server: Manages file uploads and serves the generated Word documents.
- Google Cloud Storage (GCS): Stores the uploaded audio files.
- Google Speech-to-Text API: Performs the audio file transcriptions.
- python-docx: Generates Word documents from the transcription texts.
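The components above fit together as an upload, transcribe, and export pipeline. The following is a minimal sketch of that flow, not the repository's actual code: the function names (`gcs_uri`, `upload_audio`, `transcribe`, `write_docx`, `join_transcript`), the `en-US` language code, and the 600-second timeout are all assumptions for illustration. It also assumes WAV or FLAC input, where Speech-to-Text can read the encoding and sample rate from the file header. Third-party imports are placed inside the functions so the pure helpers can be used without Google Cloud credentials.

```python
from pathlib import Path


def gcs_uri(bucket_name: str, blob_name: str) -> str:
    """Build the gs:// URI that Speech-to-Text uses to read a stored file."""
    return f"gs://{bucket_name}/{blob_name}"


def join_transcript(parts) -> str:
    """Join transcript fragments into one string, skipping empty pieces."""
    return " ".join(p.strip() for p in parts if p.strip())


def upload_audio(local_path: str, bucket_name: str) -> str:
    """Upload an audio file to Cloud Storage and return its gs:// URI."""
    from google.cloud import storage  # pip install google-cloud-storage

    blob_name = Path(local_path).name
    bucket = storage.Client().bucket(bucket_name)
    bucket.blob(blob_name).upload_from_filename(local_path)
    return gcs_uri(bucket_name, blob_name)


def transcribe(uri: str, language_code: str = "en-US") -> str:
    """Run asynchronous recognition on a file stored in Cloud Storage."""
    from google.cloud import speech  # pip install google-cloud-speech

    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(uri=uri)
    # Assumption: WAV/FLAC input, so encoding and sample rate are omitted.
    config = speech.RecognitionConfig(language_code=language_code)
    operation = client.long_running_recognize(config=config, audio=audio)
    response = operation.result(timeout=600)  # hypothetical timeout
    return join_transcript(
        r.alternatives[0].transcript for r in response.results if r.alternatives
    )


def write_docx(text: str, out_path: str) -> None:
    """Write the transcription into a Word document."""
    from docx import Document  # pip install python-docx

    doc = Document()
    doc.add_heading("Transcription", level=1)
    doc.add_paragraph(text)
    doc.save(out_path)
```

A Flask upload handler could then call `upload_audio`, `transcribe`, and `write_docx` in that order and return the resulting file to the user.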
- Python 3.8 or later.
- A Google Cloud account.
- Flask installed in your Python environment.
- Google Cloud SDK installed for deployment (not needed to run locally).
- Clone this repository:
  git clone [email protected]:quarterlylearnings/reading-transcriber.git
  cd reading-transcriber
- Install dependencies:
  pip install -r requirements.txt
- Configure Google Cloud service account authentication:
  Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of your Google Cloud service account key file.
  export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/service-account-file.json"
- Run the application:
  flask run
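If the application fails at startup with an authentication error, the usual cause is that GOOGLE_APPLICATION_CREDENTIALS is unset or points to a missing file. The sketch below shows one way to check this before running the server. `check_credentials` is a hypothetical helper written for illustration and is not part of the repository.

```python
import os
from pathlib import Path


def check_credentials(env=os.environ) -> str:
    """Return the key file path, or raise if it is unset or missing."""
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not Path(key_path).is_file():
        raise RuntimeError(f"Service account key file not found: {key_path}")
    return key_path
```

Calling `check_credentials()` with no arguments reads the current environment. Passing a dictionary instead makes the check easy to exercise without changing real environment variables.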