tdolan21/openai-whisper-v3-api

OpenAI Whisper-v3 API

Welcome to the OpenAI Whisper-v3 API! This API uses OpenAI's Whisper model to transcribe audio into text. Before you start, set up your preferred PyTorch environment; Conda is recommended.

  • Full guide available here
  • Live demo on Huggingface Spaces
  • Subtitle CLI available here

Python FastAPI Librosa Uvicorn

Introduction

Clone and set up the repository as follows:

git clone https://github.com/tdolan21/openai-whisper-v3-api
cd openai-whisper-v3-api
pip install -r requirements.txt
chmod +x run.sh

Alternatively, install the dependencies directly:

pip install transformers datasets fastapi uvicorn pillow soundfile librosa pydub yt-dlp

You may need to upgrade transformers to the latest release:

pip install transformers --upgrade

Usage

./run.sh

Features

  • Short-Form Transcription: Quick and efficient transcription for short audio clips.
  • Long-Form Transcription: Tailored for longer audio sessions, ensuring high accuracy.
  • Batch Transcription: Process multiple audio files simultaneously with ease.
  • YouTube to MP3: Extract and transcribe audio from YouTube videos directly.
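
The difference between short-form and long-form transcription can be sketched with the Hugging Face `transformers` pipeline. This is a minimal illustration, not the repository's actual code: the checkpoint name `openai/whisper-large-v3`, the helper names, and the chunk and batch sizes are assumptions, and reading a file path this way requires ffmpeg to be installed.

```python
from typing import Any, Dict

MODEL_ID = "openai/whisper-large-v3"  # assumed checkpoint name on the Hugging Face Hub


def pipeline_call_kwargs(long_form: bool) -> Dict[str, Any]:
    """Return keyword arguments for calling an ASR pipeline on one file (hypothetical helper)."""
    if long_form:
        # Chunked inference: split the audio into 30-second windows and
        # decode several windows per batch. The values are illustrative.
        return {"chunk_length_s": 30, "batch_size": 8, "return_timestamps": True}
    # Short clips (up to about 30 seconds) fit in a single window.
    return {}


def transcribe(path: str, long_form: bool = False) -> str:
    """Transcribe one audio file; the model is downloaded on first use."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(path, **pipeline_call_kwargs(long_form))
    return result["text"]
```

Calling `transcribe("clip.mp3")` would handle a short clip, while `transcribe("lecture.mp3", long_form=True)` would switch to chunked decoding; both file names are placeholders.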

Batch Transcription

You can add any number of .mp3 files, or subfolders containing .mp3 files; they will be normalized and transcribed with identifiers in the application.
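
One way the file discovery step could work is sketched below. It only covers finding the files and assigning identifiers; normalization and transcription are left out. The function name and the identifier scheme (relative path without extension) are assumptions, since the application's own scheme is not described here.

```python
from pathlib import Path
from typing import Dict


def collect_mp3_files(root: str) -> Dict[str, Path]:
    """Map an identifier to every .mp3 file under root, including subfolders."""
    base = Path(root)
    files: Dict[str, Path] = {}
    for path in sorted(base.rglob("*")):
        if path.is_file() and path.suffix.lower() == ".mp3":
            # Identifier: path relative to root, without the extension,
            # using forward slashes on every platform.
            identifier = path.relative_to(base).with_suffix("").as_posix()
            files[identifier] = path
    return files
```

A file at `audio/interviews/ep1.mp3` would get the identifier `interviews/ep1` when `collect_mp3_files("audio")` is called; the folder names are placeholders.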

For detailed usage and API endpoints, please refer to the API documentation once the server is running.
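
FastAPI applications serve interactive documentation at `/docs` and a machine-readable schema at `/openapi.json` by default. The sketch below lists the endpoints from that schema. It assumes the server listens on Uvicorn's default address `http://127.0.0.1:8000` and that the default documentation routes have not been changed; check `run.sh` for the actual host and port.

```python
import json
from typing import Dict, List
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # assumption: Uvicorn's default host and port


def list_endpoints(schema: Dict) -> List[str]:
    """Return 'METHOD /path' strings from an OpenAPI schema dictionary."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    endpoints = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            # Skip non-operation keys such as "parameters" or "summary".
            if method in http_methods:
                endpoints.append(f"{method.upper()} {path}")
    return endpoints


def fetch_schema(base_url: str = BASE_URL) -> Dict:
    """Download the OpenAPI schema from a running FastAPI server."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)
```

With the server running, `print("\n".join(list_endpoints(fetch_schema())))` would print one line per endpoint.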

About

FastAPI + Streamlit interface for OpenAI Whisper-large-v3 with youtube-to-mp3
