🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Updated Jun 17, 2026 - Python
VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning
GUI for a Vocal Remover that uses Deep Neural Networks.
Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.
A PyTorch-based Speech Toolkit
Speech recognition module for Python, supporting several engines and APIs, online and offline.
Code for the paper "Jukebox: A Generative Model for Music"
Automagically synchronize subtitles with video.
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Picard is a cross-platform music tagger powered by the MusicBrainz database
Noise suppression using deep filtering
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
SimpleMem: Efficient Lifelong Memory for LLM Agents — Text & Multimodal
MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high‑fidelity, high‑expressiveness, and complex real‑world scenarios, covering stable long‑form speech, multi‑speaker dialogue, voice/character design, environmental sound effects, and real‑time streaming TTS.
Data manipulation and transformation for audio signal processing, powered by PyTorch