glm-asr
Verified Safe · by neosun100
Overview
An all-in-one service for high-accuracy speech recognition (ASR) across multiple languages, featuring Web UI, REST API, SSE streaming, and MCP server integration.
Installation
docker run -d --gpus all -p 7860:7860 neosun/glm-asr:v2.0.1
Environment Variables
- MODEL_CHECKPOINT — ASR model checkpoint to load
- PORT — port the service listens on (the run command above maps 7860)
- HF_HOME — Hugging Face cache directory
- NVIDIA_VISIBLE_DEVICES — GPUs exposed to the container
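These variables can be passed with docker's `-e` flag. A sketch of a fuller invocation; the checkpoint path, cache location, and volume mount are illustrative, not documented defaults:

```shell
docker run -d --gpus all \
  -p 7860:7860 \
  -e MODEL_CHECKPOINT=/models/glm-asr \
  -e PORT=7860 \
  -e HF_HOME=/root/.cache/huggingface \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -v "$PWD/models:/models" \
  neosun/glm-asr:v2.0.1
```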
Security Notes
CORS is broadly enabled (`allow_origins=["*"]`) in both the FastAPI (`main.py`) and Flask (`app.py`) implementations, which may be a concern if the API handles sensitive data beyond ASR. `mcp_server.py` consumes `audio_path` directly from the client, so a malicious MCP client could attempt to pass arbitrary file paths; `gpu_manager.transcribe` mitigates some direct exploitation by re-saving the audio to a temporary file before processing. The `eval` call in `inference.py` uses a hardcoded string and is not user-controllable, so it is not a direct vulnerability.
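One way to harden the `audio_path` handling described above is to resolve the client-supplied path against an allow-listed directory before touching the file. This is a minimal stdlib sketch, not the project's actual code; `ALLOWED_AUDIO_DIR` and `validate_audio_path` are hypothetical names:

```python
from pathlib import Path

# Hypothetical allow-list root; adjust to wherever uploads actually live.
ALLOWED_AUDIO_DIR = Path("/data/audio").resolve()

def validate_audio_path(audio_path: str) -> Path:
    """Resolve a client-supplied path and reject anything that escapes
    the allowed directory (e.g. via '..' components or absolute paths)."""
    candidate = (ALLOWED_AUDIO_DIR / audio_path).resolve()
    if not candidate.is_relative_to(ALLOWED_AUDIO_DIR):
        raise ValueError(f"audio path escapes allowed directory: {audio_path!r}")
    return candidate
```

`Path.is_relative_to` requires Python 3.9+; on older interpreters the same check can be done with `os.path.commonpath`.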
Similar Servers
stt-mcp-server-linux
Local speech-to-text server for Linux, designed to integrate with Claude Code via the MCP protocol or run in standalone mode to inject transcribed text into a Tmux session.
listenhub-mcp-server
An MCP server for ListenHub, enabling AI-powered podcast and FlowSpeech audio generation within various client applications.
kokoro-mcp-server
A comprehensive Text-to-Speech toolkit for content creators and developers: it integrates with AI tools via the Model Context Protocol (MCP), offers CLI and Streamlit interfaces, and supports audio enhancement and multi-engine TTS (Kokoro, Indic, OpenVoice).
video-transcriber-mcp-rs
High-performance video transcription and audio extraction from over 1000 online platforms or local video files, generating transcripts in plain text, JSON, and Markdown formats.