MCP
Verified Safe
by stevenpto
Overview
Extracts text from PDF documents, including support for OCR on scanned pages, and summarizes the extracted content using context-aware guidance.
Installation
python -m src.server
Security Notes
The server processes local PDF files. The main risks are vulnerabilities in the PyMuPDF or Tesseract libraries when they handle malformed or malicious PDF inputs, and exposure of unintended local files if the calling agent does not properly control the `file_path` parameter. The server code itself does not contain 'eval', obfuscation, or hardcoded secrets. NLTK data downloads are handled quietly.
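One way a caller might reduce the `file_path` risk described above is to resolve every requested path against a single allowed directory before passing it to the server. The sketch below is illustrative only and is not taken from the server's code: the function name `resolve_pdf_path` and the `allowed_root` parameter are hypothetical, and it assumes Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_pdf_path(file_path: str, allowed_root: str) -> Path:
    """Return a resolved path inside allowed_root, or raise.

    Hypothetical helper: neither the name nor the policy comes from
    the server itself.
    """
    root = Path(allowed_root).resolve()
    # Joining with an absolute file_path discards root, so absolute
    # inputs are caught by the containment check below.
    candidate = (root / file_path).resolve()

    # Reject anything that escapes the allowed directory, including
    # "../" traversal and symlinks that point outside it.
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes allowed directory: {file_path!r}")

    # Only accept files that at least claim to be PDFs.
    if candidate.suffix.lower() != ".pdf":
        raise ValueError(f"not a PDF file: {file_path!r}")

    if not candidate.is_file():
        raise FileNotFoundError(str(candidate))

    return candidate
```

A check like this limits which files can be read. It does not protect against malformed PDFs that exploit the parsing libraries, so untrusted documents would still call for sandboxing or up-to-date library versions.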
Similar Servers
kreuzberg
Extracts text, tables, images, and metadata from a wide range of document formats (PDF, Office, images, HTML, etc.), with support for multiple OCR backends and an extensible plugin system. Can be run as a Model Context Protocol (MCP) server.
kreuzberg
Extracts text, tables, images, and metadata from 56 file formats including PDF, Office documents, and images. Supports multiple OCR backends, extensible plugins, and is designed for data preprocessing in AI/ML workflows.
pdf-reader-mcp
Provides a robust server for AI agents to extract text, images, and metadata from PDF documents, preserving content order for better comprehension.
mcp-pdf-reader
Exposes local PDFs for reading, semantic search, chunking, and table extraction to MCP-compatible agents or via a CLI.