docsynthai
Verified Safe
by raahulrawat
Overview
An intelligent document processing server that classifies documents using rule-based and AI (Gemini Vision) methods via the Model Context Protocol (MCP).
Installation
python server.py
Environment Variables
- DOCSYNTH_RULES_FILE
- DOCSYNTH_MAX_BASE64_BYTES
- DOCSYNTH_TRY_HTTP
- DOCSYNTH_HOST
- DOCSYNTH_PORT
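The listing names these variables but not their types or defaults. As a rough sketch only, a server could read them as shown below. The variable names come from the list above; the `load_settings` helper, the value types, and every default are assumptions made for illustration and are not taken from docsynthai's source.

```python
import os


def load_settings(env=None):
    """Read the DOCSYNTH_* variables into a plain dict.

    Hypothetical helper: the defaults and types below are assumptions,
    not values documented by the project.
    """
    env = os.environ if env is None else env
    truthy = {"1", "true", "yes", "on"}
    return {
        # Path to the JSON file holding classification rules (assumed default).
        "rules_file": env.get("DOCSYNTH_RULES_FILE", "rules.json"),
        # Upper bound on accepted base64 payload size (assumed 10 MiB).
        "max_base64_bytes": int(env.get("DOCSYNTH_MAX_BASE64_BYTES", str(10 * 1024 * 1024))),
        # Whether to attempt an HTTP transport (assumed to be a boolean flag).
        "try_http": env.get("DOCSYNTH_TRY_HTTP", "0").strip().lower() in truthy,
        # Bind address and port (assumed defaults).
        "host": env.get("DOCSYNTH_HOST", "127.0.0.1"),
        "port": int(env.get("DOCSYNTH_PORT", "8000")),
    }


if __name__ == "__main__":
    print(load_settings())
```

Passing a dict instead of relying on `os.environ` keeps the helper easy to test; check the project's own source for the actual semantics of each variable.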
Security Notes
The server persists rules with `json.load`/`json.dump`, which parse and write plain data and do not execute code. Base64-decoded images are processed with PIL, which is generally robust but not immune to image-based exploits; size limits (`MAX_IMAGE_BYTES`, `MAX_BASE64_BYTES`) are enforced to reduce that exposure. The Google API key is supplied at runtime via a tool rather than hardcoded, and is stored in memory, which is acceptable for server operation. No direct `eval` or `exec` of user-controlled input was found. Overall, the system appears to be designed with reasonable security considerations for its scope.
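To illustrate the kind of size limiting described above, here is a minimal sketch that checks a base64 payload before and after decoding. The constant names match those mentioned in the notes, but their values, the `decode_image_payload` function, and the error messages are assumptions for illustration; the server's actual checks may differ.

```python
import base64

# Assumed limits for illustration; the real values are not given in the listing.
MAX_BASE64_BYTES = 10 * 1024 * 1024
MAX_IMAGE_BYTES = 7 * 1024 * 1024


def decode_image_payload(payload, max_b64=MAX_BASE64_BYTES, max_img=MAX_IMAGE_BYTES):
    """Decode a base64 string, rejecting oversized or malformed input.

    Hypothetical helper, not docsynthai's implementation.
    """
    # Reject before decoding so a huge payload never gets expanded in memory.
    if len(payload) > max_b64:
        raise ValueError("base64 payload exceeds size limit")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        raw = base64.b64decode(payload, validate=True)
    except ValueError as exc:  # binascii.Error is a ValueError subclass
        raise ValueError("payload is not valid base64") from exc
    # Check the decoded size before handing bytes to an image library.
    if len(raw) > max_img:
        raise ValueError("decoded image exceeds size limit")
    return raw


if __name__ == "__main__":
    encoded = base64.b64encode(b"example bytes").decode("ascii")
    print(len(decode_image_payload(encoded)))
```

Checking the encoded length first is the cheaper test, since it runs before any decoding work; the second check guards the image-processing step itself.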
Similar Servers
kreuzberg
High-performance document intelligence platform for extracting text, metadata, and structured information (tables, images, chunks) from over 50 document formats (PDFs, Office, images, HTML, etc.). It offers advanced OCR capabilities, multilingual support, and features like chunking, embeddings, and keyword extraction. Functionality is exposed via multiple language bindings and a Model Context Protocol (MCP) server for flexible integration.
kreuzberg
High-performance document intelligence for extracting text, metadata, and structured information from a wide range of document formats including PDFs, Office documents, images, and HTML. It supports advanced features like OCR, table extraction, chunking, language detection, and embedding generation, powered by a Rust core for native performance.
mcp-documentation-server
A local-first MCP server for document management, semantic search, and AI-powered document intelligence.
mineru-tianshu
An enterprise-grade AI data preprocessing platform that converts unstructured data (documents, images, audio, video, bioinformatics formats) into AI-ready structured Markdown and JSON formats.