OpenAI Whisper-v3 API

Welcome to the OpenAI Whisper-v3 API! This API leverages the power of OpenAI's Whisper model to transcribe audio into text. Before diving in, ensure that your preferred PyTorch environment is set up—Conda is recommended.

  • Full guide available here
  • Live demo on Huggingface Spaces
  • Subtitle CLI available here

Built with Python, FastAPI, Librosa, and Uvicorn.

Introduction

Clone and set up the repository as follows:

git clone https://github.com/tdolan21/openai-whisper-v3-api
cd openai-whisper-v3-api
pip install -r requirements.txt
chmod +x run.sh

Alternatively, install the dependencies directly:

pip install transformers datasets fastapi uvicorn pillow soundfile librosa pydub yt-dlp

Whisper-large-v3 support may require a newer transformers build than you have installed; upgrade first:

pip install transformers --upgrade

If the feature has not landed in a release yet, install the development version directly from GitHub:

pip install git+https://github.com/huggingface/transformers

Usage

./run.sh
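Once the server is up, you can call it from Python. The sketch below is an assumption, not the documented API: it presumes the default Uvicorn address of localhost:8000, a hypothetical /transcribe endpoint that accepts a file upload, and that the requests package is available — check the interactive docs for the actual routes.

```python
BASE_URL = "http://localhost:8000"  # default Uvicorn host/port (assumption)

def endpoint(route: str) -> str:
    # Join the base URL and a route without doubling slashes.
    return f"{BASE_URL.rstrip('/')}/{route.lstrip('/')}"

def transcribe(path: str, route: str = "transcribe") -> str:
    # POST the audio file as multipart/form-data. The route name and
    # the response shape ("text" key) are illustrative assumptions.
    import requests  # imported here; not in the listed dependencies

    with open(path, "rb") as f:
        resp = requests.post(endpoint(route), files={"file": f})
    resp.raise_for_status()
    return resp.json()["text"]
```

Substitute the real endpoint paths from the running server's documentation before using this.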

Features

  • Short-Form Transcription: Quick and efficient transcription for short audio clips.
  • Long-Form Transcription: Tailored for longer audio sessions, ensuring high accuracy.
  • Batch Transcription: Process multiple audio files simultaneously with ease.
  • YouTube to MP3: Extract and transcribe audio from YouTube videos directly.
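The YouTube-to-MP3 step relies on yt-dlp (with ffmpeg for the audio conversion). A minimal sketch of how such an extraction can be done — the function names and output template are illustrative, not the app's internals:

```python
def build_opts(out_dir: str) -> dict:
    """yt-dlp options: grab the best audio stream and have ffmpeg
    convert it to mp3 (illustrative settings)."""
    return {
        "format": "bestaudio/best",
        "outtmpl": f"{out_dir}/%(title)s.%(ext)s",
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"},
        ],
    }

def download_mp3(url: str, out_dir: str = "downloads") -> None:
    # Requires yt-dlp installed and ffmpeg on the PATH.
    import yt_dlp  # imported here so build_opts stays usable on its own

    with yt_dlp.YoutubeDL(build_opts(out_dir)) as ydl:
        ydl.download([url])
```

The resulting .mp3 files can then go through the same transcription path as any other audio input.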

Batch Transcription

You can add any number of .mp3 files, or subfolders containing .mp3 files, and each will be normalized and transcribed with its own identifier in the application.
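A sketch of the directory walk this implies — recursively collecting .mp3 files and pairing each with a stable identifier. The identifier scheme (path relative to the batch root) is an assumption for illustration, not necessarily what the app uses:

```python
from pathlib import Path

def collect_mp3s(root: str) -> list[tuple[str, Path]]:
    # Recursively gather .mp3 files from the root and any subfolders,
    # sorted for deterministic ordering, and pair each with an
    # identifier derived from its path relative to the root.
    base = Path(root)
    files = sorted(p for p in base.rglob("*.mp3") if p.is_file())
    return [(str(p.relative_to(base)), p) for p in files]
```

Each (identifier, path) pair could then be handed to the normalization and transcription steps in turn.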

For detailed usage and the available endpoints, refer to FastAPI's interactive API documentation (served at /docs) once the server is running.