mcp-pdf-reader
Verified Safe
by patriciomartinns
Overview
Exposes local PDFs for reading, semantic search, chunking, and table extraction to MCP-compatible agents or via a CLI.
Installation
uvx --from git+https://github.com/patriciomartinns/pdf-toolbox -- pdf-toolbox --quiet
Security Notes
The server sandboxes PDF paths to a configurable base directory (the current working directory by default) and accepts only files with a '.pdf' extension. It does not use 'eval' or other dangerous dynamic code execution. Network activity is limited to downloading SentenceTransformer models from Hugging Face for semantic search. `subprocess.run` appears only in a development/testing script (`scripts/check.py`), where it carries `nosec` security comments, and not in the server's runtime logic. Memory usage is bounded by limits on the document and index caches.
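The path sandboxing and extension check described above can be illustrated with a minimal sketch. It is not taken from the server's source: the names `resolve_pdf_path` and `PathNotAllowedError` are hypothetical, and the sketch assumes the check is done by resolving the requested path and comparing it against the resolved base directory.

```python
from pathlib import Path
from typing import Optional


class PathNotAllowedError(ValueError):
    """Hypothetical error for paths outside the sandbox or without a .pdf suffix."""


def resolve_pdf_path(requested: str, base_dir: Optional[Path] = None) -> Path:
    """Resolve `requested` against the base directory and reject unsafe paths.

    Assumption: the base directory defaults to the current working directory,
    as the security notes describe.
    """
    base = (base_dir or Path.cwd()).resolve()

    candidate = Path(requested)
    if not candidate.is_absolute():
        candidate = base / candidate

    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check below sees the real target location.
    resolved = candidate.resolve()

    if not resolved.is_relative_to(base):  # requires Python 3.9+
        raise PathNotAllowedError(f"{requested!r} is outside {base}")

    if resolved.suffix.lower() != ".pdf":
        raise PathNotAllowedError(f"{requested!r} is not a .pdf file")

    return resolved
```

Resolving before the containment check is the important design choice here: comparing unresolved strings would let a path such as `../secret.pdf` or a symlink slip past the base directory.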
Similar Servers
pdf-reader-mcp
Provides production-ready PDF processing capabilities for AI agents, including extraction of text, images, and metadata from local files or URLs.
pageindex-mcp
Provides vectorless, reasoning-based RAG capabilities for LLMs to navigate and retrieve information from hierarchical document structures, primarily for long PDFs.
Archive-Agent
An intelligent file indexer with powerful AI search (RAG engine), automatic OCR, and a seamless MCP interface for document retrieval and question answering.
pdflens-mcp
Provides an MCP server for AI agents to programmatically read and extract information (text, page count, images) from PDF documents within user-defined workspaces.