genai-mcp
Verified Safe
by adamydwang
Overview
This server acts as a Model Context Protocol (MCP) gateway for various GenAI image generation and editing services, with optional S3-compatible storage for generated images.
Installation
go build . && ./genai-mcp
Environment Variables
- GENAI_PROVIDER
- GENAI_BASE_URL
- GENAI_API_KEY
- GENAI_GEN_MODEL_NAME
- GENAI_EDIT_MODEL_NAME
- GENAI_TIMEOUT_SECONDS
- GENAI_IMAGE_FORMAT
- SERVER_ADDRESS
- SERVER_PORT
- OSS_ENDPOINT
- OSS_REGION
- OSS_ACCESS_KEY
- OSS_SECRET_KEY
- OSS_BUCKET
- LOG_LEVEL
- LOG_FORMAT
- LOG_OUTPUT
- LOG_FILE
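As a rough illustration of how a few of the variables above might be consumed, here is a minimal Go sketch. It is not taken from the genai-mcp source: the `Config` struct, the `LoadConfig` and `orDefault` helpers, the default values (`127.0.0.1`, `8080`, 60 seconds), and the rule that `GENAI_API_KEY` is required are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config holds a subset of the settings listed above (hypothetical type).
type Config struct {
	Provider string
	BaseURL  string
	APIKey   string
	Timeout  time.Duration
	Address  string
	Port     int
}

// orDefault returns def when v is empty.
func orDefault(v, def string) string {
	if v == "" {
		return def
	}
	return v
}

// LoadConfig reads settings through getenv so it can be exercised
// without touching the real process environment.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		Provider: getenv("GENAI_PROVIDER"),
		BaseURL:  getenv("GENAI_BASE_URL"),
		APIKey:   getenv("GENAI_API_KEY"),
		Address:  orDefault(getenv("SERVER_ADDRESS"), "127.0.0.1"), // assumed default
	}
	// Assumption: treat a missing API key as a fatal configuration error.
	if cfg.APIKey == "" {
		return Config{}, fmt.Errorf("GENAI_API_KEY is required")
	}
	secs, err := strconv.Atoi(orDefault(getenv("GENAI_TIMEOUT_SECONDS"), "60")) // assumed default
	if err != nil || secs <= 0 {
		return Config{}, fmt.Errorf("GENAI_TIMEOUT_SECONDS must be a positive integer")
	}
	cfg.Timeout = time.Duration(secs) * time.Second
	port, err := strconv.Atoi(orDefault(getenv("SERVER_PORT"), "8080")) // assumed default
	if err != nil || port < 1 || port > 65535 {
		return Config{}, fmt.Errorf("SERVER_PORT must be between 1 and 65535")
	}
	cfg.Port = port
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		return
	}
	// The API key is deliberately left out of the printed summary.
	fmt.Printf("provider=%q listen=%s:%d timeout=%s\n",
		cfg.Provider, cfg.Address, cfg.Port, cfg.Timeout)
}
```

Passing the lookup function in as a parameter keeps the parsing logic testable with a plain map. The real server may validate, default, or name these fields differently.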
Security Notes
The server loads sensitive API keys and OSS credentials from `.env` files or environment variables, which is good practice, and masks API keys in log output. The `DownloadImageFromURL` function fetches images from external URLs, so it could become a Server-Side Request Forgery (SSRF) vector unless user-provided URLs are validated to block access to internal networks and malicious external resources. This risk is inherent to any service that processes external URLs, and the code shows no obvious malicious patterns. S3/OSS upload paths use UUIDs, so object names are not predictable.
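The SSRF concern described here is usually addressed by checking a URL before it is fetched. The Go sketch below shows one possible pre-flight check; it is not the project's code. `ValidateImageURL`, `isPublicIP`, and `staticResolver` are hypothetical names, and the policy (http/https only, reject loopback, private, link-local, multicast, and unspecified addresses) is an assumption. `net.IP.IsPrivate` requires Go 1.17 or later.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
)

// isPublicIP reports whether ip lies outside the loopback, private,
// link-local, multicast, and unspecified ranges.
func isPublicIP(ip net.IP) bool {
	return !(ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() ||
		ip.IsMulticast() || ip.IsUnspecified())
}

// ValidateImageURL rejects URLs that are not http(s) or whose host
// resolves to any non-public address. The resolver is injected so the
// check can be tested offline; pass net.LookupIP in real use.
func ValidateImageURL(raw string, resolve func(host string) ([]net.IP, error)) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse url: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return errors.New("only http and https URLs are allowed")
	}
	host := u.Hostname()
	if host == "" {
		return errors.New("URL has no host")
	}
	ips, err := resolve(host)
	if err != nil {
		return fmt.Errorf("resolve %s: %w", host, err)
	}
	if len(ips) == 0 {
		return fmt.Errorf("host %s has no addresses", host)
	}
	for _, ip := range ips {
		if !isPublicIP(ip) {
			return fmt.Errorf("host %s resolves to non-public address %s", host, ip)
		}
	}
	return nil
}

// staticResolver returns a resolver that always yields the given
// addresses, which is handy for offline tests.
func staticResolver(addrs ...string) func(string) ([]net.IP, error) {
	return func(string) ([]net.IP, error) {
		ips := make([]net.IP, 0, len(addrs))
		for _, a := range addrs {
			ip := net.ParseIP(a)
			if ip == nil {
				return nil, fmt.Errorf("invalid IP %q", a)
			}
			ips = append(ips, ip)
		}
		return ips, nil
	}
}

func main() {
	// IP-literal hosts, so no DNS traffic is needed for this demo.
	for _, raw := range []string{
		"http://127.0.0.1/a.png",
		"http://169.254.169.254/latest/meta-data",
		"ftp://example.com/a.png",
	} {
		fmt.Println(raw, "->", ValidateImageURL(raw, net.LookupIP))
	}
}
```

A check like this only covers the first lookup. Because DNS answers can change between validation and the actual request, and because redirects can point elsewhere, a hardened downloader would typically also enforce the same address policy at connection time and on every redirect hop.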
Similar Servers
gemini-mcp-server
An MCP server providing a suite of 7 AI-powered tools (Image Gen/Edit, Chat, Audio Transcribe, Code Execute, Video/Image Analysis) powered by Google Gemini, featuring a self-learning "Smart Tool Intelligence" system for prompt enhancement and user preference adaptation.
nanobanana-api-mcp
An MCP server providing image generation and editing capabilities via the Google Gemini API, integrable with various AI coding assistants and IDEs.
ultimate-image-gen-mcp
A professional MCP server for Google's Gemini 3 Pro Image Preview, enabling state-of-the-art image generation with advanced reasoning, high-resolution output (1K-4K), up to 14 reference images, Google Search grounding, and automatic thinking mode.
gemini-mcp
The server provides a Model Context Protocol (MCP) interface to Google Gemini AI services, enabling multimodal generation including image creation, image editing, and video production.