A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Updated Jul 3, 2024 - Python
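Streamer-Sales 销冠 (Sales Champion): a sales-livestreamer LLM 🛒🎁 that presents products based on their given features, with the aim of stimulating viewers' desire to buy. 🚀⭐ Includes a detailed data generation pipeline ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that uses web search for real-time information 🌐, and ASR (speech-to-text) 🎙️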
Voicegain Enterprise Speech-to-Text Platform (API, Portal, etc.)
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
A PyTorch-based Speech Toolkit
NVIDIA NeMo's stt_en_fastconformer_ctc_large fine-tuned on open-source Telugu data for Automatic Speech Recognition
Lingvo
Official Python SDK for Deepgram's automated speech recognition APIs.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Official JavaScript SDK for Deepgram's automated speech recognition APIs.
Production First and Production Ready End-to-End Speech Recognition Toolkit
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
An open-source NLP-as-a-service project focused on providing state-of-the-art systems with ease. Training and inference run through simple Docker commands.
🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual dataset and deployment to prod.
A speech recognition dataset
My Implementations' Archive
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
Transcribe audio to text and generate SRT and VTT files using Whisper models and wit.ai.
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper