voicemode
Verified Safe
by mbailey
Overview
Provides voice interaction for Model Context Protocol (MCP) agents: real-time speech-to-text (STT) and text-to-speech (TTS), with support for both local and cloud-based services. It also includes tools for audio playback (DJ), service management, and diagnostics.
Installation
voice-mode serve
Environment Variables
- OPENAI_API_KEY
- VOICEMODE_BASE_DIR
- VOICEMODE_DEBUG
- VOICEMODE_TTS_BASE_URLS
- VOICEMODE_STT_BASE_URLS
- VOICEMODE_VOICES
- VOICEMODE_TTS_MODELS
- VOICEMODE_WHISPER_MODEL
- VOICEMODE_WHISPER_PORT
- VOICEMODE_KOKORO_PORT
- VOICEMODE_PRONOUNCE
- VOICEMODE_SERVICE_AUTO_ENABLE
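The listing names these variables but does not document their values, so the sketch below only reports which of them are set in the current environment. The helper name `summarize_env` is hypothetical and not part of voicemode. It returns booleans rather than values so that a secret such as OPENAI_API_KEY is never echoed.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names copied from the list above; their meanings and valid
# values are not documented in this listing.
VOICEMODE_ENV_VARS = [
    "OPENAI_API_KEY",
    "VOICEMODE_BASE_DIR",
    "VOICEMODE_DEBUG",
    "VOICEMODE_TTS_BASE_URLS",
    "VOICEMODE_STT_BASE_URLS",
    "VOICEMODE_VOICES",
    "VOICEMODE_TTS_MODELS",
    "VOICEMODE_WHISPER_MODEL",
    "VOICEMODE_WHISPER_PORT",
    "VOICEMODE_KOKORO_PORT",
    "VOICEMODE_PRONOUNCE",
    "VOICEMODE_SERVICE_AUTO_ENABLE",
]


def summarize_env(environ: Optional[Mapping[str, str]] = None) -> Dict[str, bool]:
    """Hypothetical helper: map each variable name to True if it is set and non-empty."""
    env = os.environ if environ is None else environ
    return {name: bool(env.get(name)) for name in VOICEMODE_ENV_VARS}


if __name__ == "__main__":
    for name, is_set in summarize_env().items():
        print(f"{name}: {'set' if is_set else 'not set'}")
```

Passing a plain dictionary instead of the real environment makes the helper easy to check in isolation.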
Security Notes
The server relies heavily on `subprocess.run`/`Popen` for system integration (package installation, Git cloning, running services). This creates a command-injection risk if user-provided input is not rigorously sanitized, although the use of `Path` objects and `shlex.split` offers some protection. The `serve` command exposes the MCP server over HTTP/SSE, so IP allowlisting, a secret path, or token authentication must be configured explicitly to prevent unauthorized access. Installation assumes that the external repositories (whisper.cpp, kokoro-fastapi) are trustworthy and that their contents have not been tampered with.
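The command-injection concern can be illustrated with a minimal sketch. This is not voicemode's actual code: the function names, the https-only check, and the example values are assumptions made for illustration. The idea is to split only a trusted, fixed command string with `shlex.split`, append untrusted values as separate list elements, and never pass `shell=True`.

```python
import shlex
import subprocess
import sys
from pathlib import Path
from typing import List


def build_clone_command(repo_url: str, dest: Path) -> List[str]:
    """Hypothetical helper: build an argument list for `git clone` without a shell."""
    # Illustrative policy (an assumption, not voicemode's rule): https URLs only.
    if not repo_url.startswith("https://"):
        raise ValueError(f"refusing non-https repository URL: {repo_url!r}")
    # Split only the trusted, fixed part of the command.
    base = shlex.split("git clone --depth 1")
    # "--" ends option parsing, so an untrusted value cannot be read as a flag.
    return base + ["--", repo_url, str(dest)]


def echo_via_subprocess(user_text: str) -> str:
    """Pass untrusted text as one argv element to a child Python process."""
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_text]
    # A list argument with the default shell=False means no shell parses the text.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\r\n")


if __name__ == "__main__":
    # Shell metacharacters arrive in the child process as literal text.
    print(echo_via_subprocess("hello; echo injected"))
```

Because the untrusted string is a single list element, characters such as `;` or `$()` reach the child process as plain data and are not interpreted as shell syntax.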
Similar Servers
mcp-node
Enables natural language interaction with Algolia data through Claude Desktop by exposing Algolia APIs via the Model Context Protocol (MCP).
consult-llm-mcp
An MCP server that allows AI agents like Claude Code to consult stronger, more capable AI models (e.g., GPT-5.2, Gemini 3.0 Pro) for complex code analysis, debugging, and architectural advice.
mcp-tts
Provides Text-to-Speech (TTS) capabilities to MCP (Model Context Protocol) clients using various AI and system-level TTS engines.
groq-mcp-server
Provides a Model Context Protocol (MCP) server to access Groq's AI capabilities, including ultra-fast LLM chat, vision, text-to-speech, speech-to-text, agentic tooling, and batch processing, from clients like Claude Desktop and Cursor.