OpenAI Whisper-v3 API

Welcome to the OpenAI Whisper-v3 API! This API leverages the power of OpenAI's Whisper model to transcribe audio into text. Before diving in, ensure that your preferred PyTorch environment is set up—Conda is recommended.

  • Full guide available here
  • Live demo on Huggingface Spaces
  • Subtitle CLI available here

Built with Python, FastAPI, Librosa, and Uvicorn.

Introduction

Clone and set up the repository as follows:

git clone https://github.com/tdolan21/openai-whisper-v3-api
cd openai-whisper-v3-api
pip install -r requirements.txt
chmod +x run.sh

Alternatively, install the dependencies directly:

pip install transformers datasets fastapi uvicorn pillow soundfile librosa pydub yt-dlp

Whisper-large-v3 support may require a newer transformers build than you have installed; upgrade first:

pip install transformers --upgrade

If the feature has not landed in a release yet, install the development version directly from GitHub:

pip install git+https://github.com/huggingface/transformers

Usage

./run.sh
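Once the server is up, you can call it from Python. The sketch below is an assumption, not the documented API: it presumes the default Uvicorn address of localhost:8000, a hypothetical /transcribe endpoint that accepts a file upload, and that the requests package is available — check the interactive docs for the actual routes.

```python
BASE_URL = "http://localhost:8000"  # default Uvicorn host/port (assumption)

def endpoint(route: str) -> str:
    # Join the base URL and a route without doubling slashes.
    return f"{BASE_URL.rstrip('/')}/{route.lstrip('/')}"

def transcribe(path: str, route: str = "transcribe") -> str:
    # POST the audio file as multipart/form-data. The route name and
    # the response shape ("text" key) are illustrative assumptions.
    import requests  # imported here; not in the listed dependencies

    with open(path, "rb") as f:
        resp = requests.post(endpoint(route), files={"file": f})
    resp.raise_for_status()
    return resp.json()["text"]
```

Substitute the real endpoint paths from the running server's documentation before using this.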

Features

  • Short-Form Transcription: Quick and efficient transcription for short audio clips.
  • Long-Form Transcription: Tailored for longer audio sessions, ensuring high accuracy.
  • Batch Transcription: Process multiple audio files simultaneously with ease.
  • YouTube to MP3: Extract and transcribe audio from YouTube videos directly.
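The YouTube-to-MP3 step relies on yt-dlp (with ffmpeg for the audio conversion). A minimal sketch of how such an extraction can be done — the function names and output template are illustrative, not the app's internals:

```python
def build_opts(out_dir: str) -> dict:
    """yt-dlp options: grab the best audio stream and have ffmpeg
    convert it to mp3 (illustrative settings)."""
    return {
        "format": "bestaudio/best",
        "outtmpl": f"{out_dir}/%(title)s.%(ext)s",
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"},
        ],
    }

def download_mp3(url: str, out_dir: str = "downloads") -> None:
    # Requires yt-dlp installed and ffmpeg on the PATH.
    import yt_dlp  # imported here so build_opts stays usable on its own

    with yt_dlp.YoutubeDL(build_opts(out_dir)) as ydl:
        ydl.download([url])
```

The resulting .mp3 files can then go through the same transcription path as any other audio input.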

Batch Transcription

You can add any number of .mp3 files, or subfolders containing .mp3 files, and each will be normalized and transcribed with its own identifier in the application.
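A sketch of the directory walk this implies — recursively collecting .mp3 files and pairing each with a stable identifier. The identifier scheme (path relative to the batch root) is an assumption for illustration, not necessarily what the app uses:

```python
from pathlib import Path

def collect_mp3s(root: str) -> list[tuple[str, Path]]:
    # Recursively gather .mp3 files from the root and any subfolders,
    # sorted for deterministic ordering, and pair each with an
    # identifier derived from its path relative to the root.
    base = Path(root)
    files = sorted(p for p in base.rglob("*.mp3") if p.is_file())
    return [(str(p.relative_to(base)), p) for p in files]
```

Each (identifier, path) pair could then be handed to the normalization and transcription steps in turn.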

For detailed usage and the available endpoints, refer to FastAPI's interactive API documentation (served at /docs) once the server is running.