Generate YouTube timestamps and subtitles from a video file with OpenAI Whisper and GPT-4

License

MIT, Apache-2.0 licenses found

Licenses found

MIT
LICENSE
Apache-2.0
LICENSE-APACHE
dtbuchholz/yt-timestamps-subtitles

YouTube Timestamp & Subtitle Generator

YouTube video timestamp and subtitle generator using OpenAI's whisper and GPT-4o models

Background

This project provides a simple script that generates YouTube timestamps and subtitles from a video file using OpenAI's whisper and GPT-4 models. It outputs a segments.srt file and a timestamps.txt file, which can be used when uploading a new video to YouTube. You can see an example of what the output looks like with an uploaded video here.

Usage

This project uses uv for package and script management. To install uv, follow the setup instructions: here. Python 3.11 is recommended: on 3.12, a transitive dependency of moviepy (ffmpeg) logs some syntax errors, although the script should still work.

Setup

First, clone this repo, initialize the virtual environment, and install dependencies:

git clone https://github.com/dtbuchholz/yt-timestamps-subtitles.git
cd yt-timestamps-subtitles
uv sync

If you need to install the specified version of Python (3.11), you can use the following command:

uv python install

Then, make sure you set up an OpenAI API key/organization and set the environment variables in your .env file (see .env.example):

OPENAI_API_ORG=your_org_id
OPENAI_API_KEY=your_api_key
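
Before running the script, it can help to confirm that both variables are visible to Python. The following is a minimal, stdlib-only sketch and not part of this repository: `missing_openai_vars` is a hypothetical helper name, and it assumes the variables have already been loaded into the process environment (for example, from your .env file).

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = ("OPENAI_API_ORG", "OPENAI_API_KEY")


def missing_openai_vars(env=None):
    """Return the names of required variables that are unset or empty.

    `env` defaults to os.environ; pass a plain dict to check other mappings.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_openai_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("OpenAI environment variables are set.")
```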

The dependencies include:

  • openai-whisper: OpenAI's whisper model for transcribing an mp4 file's subtitles (.srt format).
  • openai: Generate YouTube timestamps from the whisper model's output.
  • moviepy: Used for a utility function that gets the video duration and passes it to the prompt.
  • dotenv: Load environment variables from .env file for the OpenAI API key and organization ID.

Transcribe and generate timestamps

Run the script with uv, passing a path to your video file:

uv run main.py --file /path/to/video.mp4

The script generates a segments.srt file for subtitles, with contents like the following:

1
00:00:00,000 --> 00:00:09,599
The Tableland Studio is designed to let you interact with the Tableland network from the comfort of a web application or CLI to create teams, projects and tables.

2
00:00:10,599 --> 00:00:14,480
There are a number of features that it offers and we'll walk through them here today.
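
For reference, each entry above follows the SRT layout: an index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, and the subtitle text. The sketch below shows one way such an entry could be formatted, assuming each transcribed segment carries start and end times in seconds. The function names `srt_timestamp` and `srt_block` are hypothetical and not taken from this repository's main.py.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_block(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT entry: index line, time range line, then the text."""
    return (
        f"{index}\n"
        f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
        f"{text.strip()}\n"
    )


if __name__ == "__main__":
    print(srt_block(2, 10.599, 14.48, "There are a number of features that it offers."))
```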

The script then uses this transcript to generate the corresponding YouTube timestamps, which it writes to a timestamps.txt file for use in the YouTube video's description:

0:00 - Introduction to Tableland Studio
0:15 - Logging into Tableland web app
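
Each line in timestamps.txt pairs a `M:SS` (or `H:MM:SS`) offset with a chapter title. In this project the titles come from the GPT model. The sketch below covers only the formatting step, under the assumption that you already have an offset in seconds and a title. `youtube_timestamp` and `chapter_line` are hypothetical names, not functions from this repository.

```python
def youtube_timestamp(seconds: float) -> str:
    """Format seconds as M:SS, or H:MM:SS once the offset reaches an hour."""
    total = int(seconds)  # drop fractional seconds
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def chapter_line(seconds: float, title: str) -> str:
    """Build one description line in the 'timestamp - title' layout."""
    return f"{youtube_timestamp(seconds)} - {title}"


if __name__ == "__main__":
    print(chapter_line(0, "Introduction to Tableland Studio"))
    print(chapter_line(15, "Logging into Tableland web app"))
```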

Contributing

PRs accepted. Please review additional details explained in the contributing section of the docs site.

Small note: If editing the README, please conform to the standard-readme specification.

License

MIT AND Apache-2.0, © 2024 @dtbuchholz
