pdf-reader-mcp
by BACH-AI-Tools
Overview
Extracts text, images, and metadata from PDF files (local or URL) for AI agent consumption.
Installation
npx @sylphx/pdf-reader-mcp
Security Notes
The `resolvePath` function in `src/utils/pathUtils.ts` and the `README.md` explicitly permit and encourage absolute paths (e.g., `C:\path\to\file.pdf` or `/home/user/file.pdf`). Relative paths are resolved with `path.resolve(process.cwd(), userPath)` and receive no further validation against directory traversal (`../../`). As a result, an AI agent that is compromised or instructed to do so could read *any* file on the host filesystem that the server process can access, bypassing the stated "Secure Context" and "Context Confinement" design principles. This is a critical security vulnerability. Users *must* run the server with extremely restrictive filesystem permissions for the Node.js process and should assume that an agent can request any file via an absolute path. The server also makes outbound network requests to fetch URL-based PDFs, which is a risk when untrusted URLs are processed.
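To make the missing check concrete, here is a minimal sketch of the kind of confinement that would block both absolute paths and `../` traversal. It is not the project's code: `resolveInsideRoot` and its `rootDir` parameter are hypothetical names chosen for illustration, and the sketch assumes the server is meant to serve files from a single root directory only.

```typescript
import * as path from "node:path";

// Hypothetical helper (not part of pdf-reader-mcp): resolve userPath
// against rootDir and reject any result that falls outside rootDir.
export function resolveInsideRoot(rootDir: string, userPath: string): string {
  const root = path.resolve(rootDir);
  // An absolute userPath replaces root here, so it is caught by the check below.
  const candidate = path.resolve(root, userPath);
  const rel = path.relative(root, candidate);

  // rel begins with ".." when candidate sits above root, and is itself
  // absolute when candidate is on a different drive (Windows).
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`Path escapes the allowed root: ${userPath}`);
  }
  return candidate;
}
```

With a root of `/srv/pdfs`, a request for `docs/a.pdf` resolves normally, while `../../etc/passwd` and `/etc/passwd` both throw. This check is purely lexical: it does not follow symbolic links, so a stricter version would also compare real paths (for example via `fs.realpath`) before opening the file.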
Similar Servers
kreuzberg
Extracts text, tables, images, and metadata from a wide range of document formats (PDF, Office, images, HTML, etc.), with support for multiple OCR backends and an extensible plugin system. Can be run as a Model Context Protocol (MCP) server.
kreuzberg
Extracts text, tables, images, and metadata from 56 file formats including PDF, Office documents, and images. Supports multiple OCR backends, extensible plugins, and is designed for data preprocessing in AI/ML workflows.
pdf-reader-mcp
Provides a robust server for AI agents to extract text, images, and metadata from PDF documents, preserving content order for better comprehension.
mcp-pdf-reader
Exposes local PDFs for reading, semantic search, chunking, and table extraction to MCP-compatible agents or via a CLI.