audio-transcriber
by Knuckles-Team
Overview
Provides an MCP micro-service and an A2A agent for audio transcription and processing using OpenAI Whisper models, capable of transcribing from files or live microphone input.
Installation
docker-compose up -d
Environment Variables
- HOST
- PORT
- TRANSPORT
- OPENAI_API_KEY
- MCP_URL
- PROVIDER
- OPENAI_BASE_URL
- MODEL_ID
- DEBUG
- ENABLE_WEB_UI
- WHISPER_MODEL
- TRANSCRIBE_DIRECTORY
Security Notes
- Unsafe deserialization: the `audio_transcriber/utils.py` module calls `pickle.load`. If `load_model` is given a file from an untrusted source, deserializing it can lead to arbitrary code execution.
- Path handling: the `transcribe_audio` MCP tool accepts `audio_file` and `directory` as parameters. If these are not properly sanitized and validated by the client, the server could be exposed to path traversal or arbitrary file processing, though `whisper.load_model` and `Path.exists()` offer some protection.
- Hardcoded secret: the `compose.yml` file sets `OPENAI_API_KEY=llama` for both the MCP and Agent services. This is likely intended for local Ollama-compatible setups, but it is still a secret hardcoded in the configuration.
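The path traversal concern noted for `audio_file` and `directory` can be illustrated with a small guard. This is a minimal sketch, not the project's actual code: the helper name `resolve_inside` and the idea of a single allowed base directory are assumptions made for illustration, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path against base_dir and reject anything that escapes it.

    Hypothetical helper: the name and the single-base-directory policy are
    illustrative assumptions, not part of audio-transcriber.
    """
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so absolute inputs
    # are caught by the containment check below as well.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes allowed directory: {user_path!r}")
    return candidate
```

A server-side tool handler could call such a helper on both parameters before touching the filesystem, so that inputs like `../secret.wav` or an absolute path outside the allowed directory are rejected regardless of what the client sends.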
Similar Servers
yt-dlp-mcp
Integrate video platform capabilities like search, metadata extraction, and content download into AI agents using yt-dlp.
glm-asr
An all-in-one service for high-accuracy speech recognition (ASR) across multiple languages, featuring Web UI, REST API, SSE streaming, and MCP server integration.
video-transcriber-mcp-rs
High-performance video transcription and audio extraction from over 1000 online platforms or local video files, generating transcripts in plain text, JSON, and Markdown formats.
kokoro-mcp-server
This server provides a comprehensive Text-to-Speech toolkit for content creators and developers, integrating with AI tools via the Model Context Protocol (MCP), offering CLI and Streamlit interfaces, and supporting audio enhancement and multi-engine TTS (Kokoro, Indic, OpenVoice).