vision-mcp-server
Verified Safe
by stex2005
Overview
This server processes videos, RTSP streams, and images using OpenAI's GPT-4.1 Vision models to generate summaries, perform custom analyses, and count specific objects.
Installation
python server.py
Environment Variables
- OPENAI_API_KEY
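As a minimal sketch of how a server like this might read the key at startup and fail fast when it is missing: the helper name `load_api_key` is hypothetical and is not taken from the project's code; only the variable name `OPENAI_API_KEY` comes from the listing above.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the OpenAI API key from the environment, or raise if it is unset.

    Hypothetical helper for illustration; the real server may load the key differently.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Failing at startup gives a clearer error than a rejected API call later.
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```

Accepting the environment as a parameter keeps the check easy to exercise without touching the real process environment.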
Security Notes
The server reads the OpenAI API key from an environment variable, which is good practice. No 'eval' or direct execution of arbitrary shell commands from user input was found. The file path arguments (video_path, image_path) are local paths, so the server expects the files to already exist on its own file system. The code does not explicitly sanitize these arguments against path traversal, so deployments should restrict client-provided paths to known safe directories. RTSP URLs receive basic format validation. The server relies on well-maintained libraries (OpenCV, OpenAI).
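The path restriction recommended above could look roughly like the following sketch. It is not the server's code: `ALLOWED_ROOT`, `resolve_safe_path`, and `is_rtsp_url` are hypothetical names, the directory is a placeholder, and the RTSP check only illustrates what a basic format validation might involve. It assumes Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical allow-listed media directory; choose one that fits the deployment.
ALLOWED_ROOT = Path("/srv/media")


def resolve_safe_path(user_path: str, root: Path = ALLOWED_ROOT) -> Path:
    """Resolve user_path under root and reject anything that escapes it."""
    root_resolved = root.resolve()
    # Joining with an absolute user_path discards root, so the check below
    # also rejects inputs such as "/etc/passwd".
    candidate = (root_resolved / user_path).resolve()
    if not candidate.is_relative_to(root_resolved):
        raise ValueError(f"path escapes allowed directory: {user_path}")
    return candidate


def is_rtsp_url(url: str) -> bool:
    """Basic format check: rtsp or rtsps scheme with a non-empty host."""
    parsed = urlparse(url)
    return parsed.scheme in ("rtsp", "rtsps") and bool(parsed.hostname)
```

A tool handler would call `resolve_safe_path(video_path)` before opening the file and return an error to the client when it raises `ValueError`. Because `resolve()` follows symlinks, a symlink inside the allowed directory that points outside it is rejected as well.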
Similar Servers
yt-dlp-mcp
Integrate video platform capabilities like search, metadata extraction, and content download into AI agents using yt-dlp.
luma-mcp
Provides multi-model vision understanding capabilities to AI assistants that lack native image understanding.
crawl-mcp
A comprehensive Model Context Protocol (MCP) server that wraps the crawl4ai library for advanced web crawling, content extraction, and AI-powered summarization from various sources including web pages, PDFs, Office documents, and YouTube videos.
cloudglue-mcp-server
Connects Cloudglue to AI assistants for video collection understanding, enabling LLMs to analyze videos, extract structured data, and gain insights from visual and audio content.