glm-asr
by neosun100
Overview
Provides an all-in-one speech recognition service with Web UI, REST API, and MCP integration.
Installation
docker run -d --gpus all -p 7860:7860 neosun/glm-asr:latest
Environment Variables
- PORT
- MODEL_CHECKPOINT
- NVIDIA_VISIBLE_DEVICES
- HF_HOME
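A sketch of a run command that sets each of the variables above explicitly. The values shown (port, cache path, GPU selection) are illustrative assumptions, not documented defaults, and the model checkpoint is left as a placeholder:

```shell
# Run glm-asr with the listed environment variables set explicitly.
# All values below are examples; substitute your own.
docker run -d \
  --gpus all \
  -p 7860:7860 \
  -e PORT=7860 \
  -e MODEL_CHECKPOINT=<huggingface-model-id> \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -e HF_HOME=/root/.cache/huggingface \
  -v "$PWD/hf-cache:/root/.cache/huggingface" \
  neosun/glm-asr:latest
```

Mounting a host directory over `HF_HOME` (assumed here to be the in-container cache location) keeps downloaded model weights across container restarts.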
Security Notes
The model loading (`AutoModelForCausalLM.from_pretrained`) and VAD model loading (`torch.hub.load`) use `trust_remote_code=True` and `trust_repo=True` respectively. These flags allow arbitrary code execution from the specified HuggingFace model or GitHub repository, which is a significant security risk if the external source is compromised or malicious. This pattern is common in the ML ecosystem for flexibility, but it requires explicit trust in the model and repository maintainers. Separately, the MCP server's `transcribe` tool accepts `audio_path` directly, which could enable path traversal if the MCP client is untrusted or improperly configured; the web API is less exposed because it handles file uploads through a temporary directory.
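One mitigation for the `audio_path` issue is to resolve the supplied path against an allow-listed base directory before opening it. The sketch below is a hypothetical helper, not part of glm-asr; the `/data/audio` base directory and function name are assumptions for illustration:

```python
from pathlib import Path


def resolve_audio_path(audio_path: str, allowed_dir: str = "/data/audio") -> Path:
    """Resolve a client-supplied path inside an allow-listed directory.

    Raises ValueError if the path escapes allowed_dir via '..' segments,
    absolute paths, or symlink-free traversal tricks.
    """
    base = Path(allowed_dir).resolve()
    # Joining an absolute audio_path replaces `base` entirely, so the
    # is_relative_to check below also rejects absolute inputs.
    candidate = (base / audio_path).resolve()
    if not candidate.is_relative_to(base):  # Python 3.9+
        raise ValueError(f"path escapes allowed directory: {audio_path!r}")
    return candidate
```

A safe input such as `resolve_audio_path("clip.wav")` resolves under the base directory, while `resolve_audio_path("../../etc/passwd")` raises `ValueError`. For the `trust_remote_code` risk, a common complementary step is pinning a specific `revision` in `from_pretrained` so the executed code cannot change silently upstream.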
Similar Servers
stt-mcp-server-linux
Provides local push-to-talk speech-to-text transcription for Linux, injecting transcribed text into a Tmux session for applications like Claude Code.
listenhub-mcp-server
An MCP server for ListenHub, enabling AI-powered podcast and FlowSpeech audio generation within various client applications.
kokoro-mcp-server
A comprehensive text-to-speech toolkit for content creators and developers that integrates with AI tools via the Model Context Protocol (MCP), offers CLI and Streamlit interfaces, and supports audio enhancement and multi-engine TTS (Kokoro, Indic, OpenVoice).
mcp-server
A web-based Docker management platform for deploying, managing, and building custom AI tools (MCP servers) for integration with language models.