voicemode

by mbailey

Overview

Enables voice interaction capabilities (Speech-to-Text and Text-to-Speech) for Model Context Protocol (MCP) servers, allowing for natural language conversations and voice-controlled actions.

Installation

Run Command
uvx voice-mode

Environment Variables

  • OPENAI_API_KEY
  • VOICEMODE_BASE_DIR
  • VOICEMODE_MODELS_DIR
  • VOICEMODE_DEBUG
  • VOICEMODE_SAVE_ALL
  • VOICEMODE_SAVE_AUDIO
  • VOICEMODE_SAVE_TRANSCRIPTIONS
  • VOICEMODE_AUDIO_FEEDBACK
  • VOICEMODE_TTS_BASE_URLS
  • VOICEMODE_STT_BASE_URLS
  • VOICEMODE_VOICES
  • VOICEMODE_TTS_MODELS
  • VOICEMODE_PREFER_LOCAL
  • VOICEMODE_ALWAYS_TRY_LOCAL
  • VOICEMODE_AUTO_START_KOKORO
  • VOICEMODE_WHISPER_MODEL
  • VOICEMODE_WHISPER_PORT
  • VOICEMODE_WHISPER_LANGUAGE
  • VOICEMODE_WHISPER_MODEL_PATH
  • VOICEMODE_KOKORO_PORT
  • VOICEMODE_KOKORO_MODELS_DIR
  • VOICEMODE_KOKORO_CACHE_DIR
  • VOICEMODE_KOKORO_DEFAULT_VOICE
  • LIVEKIT_URL
  • LIVEKIT_API_KEY
  • LIVEKIT_API_SECRET
  • LIVEKIT_ACCESS_PASSWORD
  • VOICEMODE_DISABLE_SILENCE_DETECTION
  • VOICEMODE_VAD_AGGRESSIVENESS
  • VOICEMODE_SILENCE_THRESHOLD_MS
  • VOICEMODE_MIN_RECORDING_DURATION
  • VOICEMODE_INITIAL_SILENCE_GRACE_PERIOD
  • VOICEMODE_DEFAULT_LISTEN_DURATION
  • VOICEMODE_STREAMING_ENABLED
  • VOICEMODE_STREAM_CHUNK_SIZE
  • VOICEMODE_STREAM_BUFFER_MS
  • VOICEMODE_STREAM_MAX_BUFFER
  • VOICEMODE_EVENT_LOG_ENABLED
  • VOICEMODE_EVENT_LOG_DIR
  • VOICEMODE_EVENT_LOG_ROTATION
  • VOICEMODE_PRONOUNCE
  • VOICEMODE_PRONOUNCE_ENABLED
  • VOICEMODE_PRONUNCIATION_LOG_SUBSTITUTIONS
  • VOICEMODE_CHIME_LEADING_SILENCE
  • VOICEMODE_CHIME_TRAILING_SILENCE
  • VOICEMODE_FRONTEND_PORT
  • VOICEMODE_FRONTEND_HOST
  • VOICEMODE_TOOLS_ENABLED
  • VOICEMODE_TOOLS_DISABLED
  • VOICEMODE_TOOLS
  • VOICEMODE_SERVICE_AUTO_ENABLE
  • FRONTEND_MODE
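
As a rough illustration of how the run command and these variables fit together, the sketch below builds a client configuration entry in the `mcpServers` shape that several MCP clients read. The server name `voicemode`, the helper `build_server_entry`, and the choice of which variables to forward are assumptions made for illustration; this listing does not specify them, and it does not document the accepted values of any variable.

```python
import json
import os

# Hypothetical subset of the variables listed above to forward to the server.
FORWARDED_VARS = ["OPENAI_API_KEY", "VOICEMODE_DEBUG", "VOICEMODE_PREFER_LOCAL"]


def build_server_entry(environ=None):
    """Return an mcpServers-style entry that launches `uvx voice-mode`."""
    environ = os.environ if environ is None else environ
    # Forward only variables that are actually set; unset ones are omitted
    # so the server can fall back to its own defaults.
    env = {name: environ[name] for name in FORWARDED_VARS if name in environ}
    return {
        "mcpServers": {
            "voicemode": {  # assumed server name, not taken from the listing
                "command": "uvx",
                "args": ["voice-mode"],
                "env": env,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_server_entry(), indent=2))
```

Forwarding only the variables that are set keeps secrets such as `OPENAI_API_KEY` out of the file unless they are needed, though writing any key into a config file is itself a trade-off to weigh.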

Security Notes

The installer and service-management components call `subprocess.run` and `subprocess.Popen` extensively and do not consistently apply `shlex.quote` to user-controlled inputs (e.g., `install_dir`, `model_name`, `version`). Wherever such a value reaches a shell command string, malicious input can cause shell injection.

Installation of `uv` and LiveKit (`livekit_install` in `voice_mode/tools/livekit/install.py`) pipes remote scripts directly into a shell via `curl | bash`. This runs whatever the remote host serves at install time, so a compromised or spoofed source results in arbitrary code execution.

The development modes of the LiveKit frontend and server use hardcoded default credentials (`voicemode123`, `devkey: secret`), which could be exposed inadvertently if those modes are reachable outside a development environment.
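
The injection risk can be illustrated with a small sketch. It is not taken from the voicemode code base: the function names and the `echo` command are placeholders. It contrasts interpolating untrusted text into a shell string, quoting it with `shlex.quote`, and passing an argument list so that no shell is involved at all.

```python
import shlex
import subprocess
import sys


def unsafe_command(model_name):
    # Vulnerable pattern: untrusted text is interpolated into a shell string.
    # If this string is run with shell=True, "; rm ..." becomes a second command.
    return f"echo installing {model_name}"


def quoted_command(model_name):
    # shlex.quote turns the value into a single, literal shell word.
    return f"echo installing {shlex.quote(model_name)}"


def run_without_shell(model_name):
    # Usually preferable: pass a list and leave shell=False (the default),
    # so shell metacharacters in model_name are never interpreted.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", model_name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

The list form avoids the need for quoting, which is why it tends to be the safer default; `shlex.quote` matters mainly where a shell string cannot be avoided.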

Stats

Interest Score: 97
Security Score: 3
Cost Class: Medium
Avg Tokens: 600
Stars: 480
Forks: 65
Last Update: 2025-12-04

Tags

AI, Voice, Speech-to-Text, Text-to-Speech, MCP, CLI