voicemode-windows

Verified Safe

by cescroca1976

Overview

An MCP server enabling real-time voice interaction (Speech-to-Text and Text-to-Speech) for AI agents, integrating local services like Whisper and Kokoro, with configurable cloud fallback (OpenAI).

Installation

Run Command
python -m voice_mode

Environment Variables

  • OPENAI_API_KEY
  • VOICEMODE_WHISPER_MODEL
  • VOICEMODE_WHISPER_PORT
  • VOICEMODE_KOKORO_PORT
  • VOICEMODE_TTS_BASE_URLS
  • VOICEMODE_STT_BASE_URLS
  • VOICEMODE_AUDIO_FEEDBACK
  • VOICEMODE_DISABLE_SILENCE_DETECTION
  • VOICEMODE_VAD_AGGRESSIVENESS
  • VOICEMODE_DEFAULT_LISTEN_DURATION
  • VOICEMODE_SERVICE_AUTO_ENABLE
  • VOICEMODE_SKIP_TTS
  • VOICEMODE_PRONOUNCE
  • VOICEMODE_PRONUNCIATION_ENABLED
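
To check which of these variables are set before starting the server, a small helper along the following lines may be useful. It is only a sketch: the variable names are copied from the list above, while `mask_secret` and `summarize_env` are hypothetical helpers and not part of voicemode-windows. Nothing is assumed about valid values or defaults.

```python
import os

# Names copied from the list above; their accepted values are not assumed here.
VOICEMODE_VARS = [
    "OPENAI_API_KEY",
    "VOICEMODE_WHISPER_MODEL",
    "VOICEMODE_WHISPER_PORT",
    "VOICEMODE_KOKORO_PORT",
    "VOICEMODE_TTS_BASE_URLS",
    "VOICEMODE_STT_BASE_URLS",
    "VOICEMODE_AUDIO_FEEDBACK",
    "VOICEMODE_DISABLE_SILENCE_DETECTION",
    "VOICEMODE_VAD_AGGRESSIVENESS",
    "VOICEMODE_DEFAULT_LISTEN_DURATION",
    "VOICEMODE_SERVICE_AUTO_ENABLE",
    "VOICEMODE_SKIP_TTS",
    "VOICEMODE_PRONOUNCE",
    "VOICEMODE_PRONUNCIATION_ENABLED",
]


def mask_secret(value, visible=4):
    """Hide all but the last `visible` characters of a secret (hypothetical helper)."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def summarize_env(environ=None):
    """Map each known variable to its value, "<unset>", or a masked API key."""
    env = os.environ if environ is None else environ
    summary = {}
    for name in VOICEMODE_VARS:
        value = env.get(name)
        if value is None:
            summary[name] = "<unset>"
        elif name == "OPENAI_API_KEY":
            summary[name] = mask_secret(value)
        else:
            summary[name] = value
    return summary


if __name__ == "__main__":
    for name, value in summarize_env().items():
        print(f"{name}={value}")
```

Running the script prints one `NAME=value` line per variable, so a missing or mistyped setting is visible without echoing the API key in full.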

Security Notes

The server makes extensive `subprocess` calls to install and manage external tools (git, cmake, package managers, whisper.cpp, kokoro-fastapi), which broadens the attack surface. Its security therefore depends heavily on the integrity of those external projects and on the user's system configuration. By default, `whisper-server` binds to `0.0.0.0`, which can expose it to other hosts if the port is not firewalled. `OPENAI_API_KEY` is read from the environment and masked in logs and outputs. No direct `eval` calls or malicious patterns were found, and `shlex.split` is used for parsing rules. The test suite includes measures to prevent dangerous commands, which indicates developer awareness of subprocess risks.
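
Two of the points above, wildcard bind addresses and `shlex.split` parsing, can be illustrated with the standard library alone. The sketch below is not taken from the server's code: `binds_all_interfaces` and `to_argv` are hypothetical helper names, and the command string in the usage note is made up for illustration.

```python
import ipaddress
import shlex


def binds_all_interfaces(host):
    """Return True when `host` is a wildcard address such as 0.0.0.0 or ::."""
    try:
        return ipaddress.ip_address(host).is_unspecified
    except ValueError:
        # Hostnames such as "localhost" are not IP literals, so not wildcards.
        return False


def to_argv(command):
    """Split a command string into an argument list without invoking a shell."""
    return shlex.split(command)
```

For example, `binds_all_interfaces("0.0.0.0")` is true while `binds_all_interfaces("127.0.0.1")` is false, which is the distinction that matters when deciding whether a firewall rule is needed. `to_argv('example-tool --name "two words" ; echo hi')` yields a plain list in which `;` is an ordinary token; passing such a list to `subprocess.run` without `shell=True` means shell metacharacters are not interpreted.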

Stats

Interest Score: 0
Security Score: 7
Cost Class: Low
Avg Tokens: 500
Stars: 0
Forks: 0
Last Update: 2026-01-18

Tags

voice, STT, TTS, MCP, Antigravity, local-services, cloud-services