pdf-reader-mcp

Name: pdf-reader-mcp
Author: BACH-AI-Tools

Overview

Extracts text, images, and metadata from PDF files (local or URL) for AI agent consumption.

Installation

Run Command

npx @sylphx/pdf-reader-mcp

Security Notes

The `resolvePath` function in `src/utils/pathUtils.ts` and the `README.md` explicitly permit and encourage absolute paths (e.g., `C:\path\to\file.pdf` or `/home/user/file.pdf`). Relative paths are resolved with `path.resolve(process.cwd(), userPath)` and no further validation, so directory traversal sequences (`../../`) are not blocked. An AI agent that is compromised or instructed to do so could therefore read *any* file on the host that the server process can access, bypassing the stated "Secure Context" and "Context Confinement" design principles. This is a critical security vulnerability. Users *must* run the server with highly restrictive filesystem permissions for the Node.js process and should assume that an agent can request any file via an absolute path. The server also makes outbound network requests to fetch URL-based PDFs, which is a risk if untrusted URLs are processed.
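
For illustration, the kind of confinement check that the note describes as missing could look roughly like the sketch below. It is not the project's actual `resolvePath`; the helper name `resolveInsideRoot` and the `root` parameter are hypothetical, introduced only for this example. It resolves the requested path and rejects it when the result falls outside the allowed root, which covers both absolute paths and `../` traversal.

```typescript
import * as path from "node:path";

// Hypothetical helper (not part of pdf-reader-mcp): resolve userPath
// against root and refuse anything that ends up outside of root.
function resolveInsideRoot(root: string, userPath: string): string {
  const absoluteRoot = path.resolve(root);

  // If userPath is absolute, path.resolve discards absoluteRoot entirely,
  // so the containment check below is what actually enforces the boundary.
  const candidate = path.resolve(absoluteRoot, userPath);

  // A path inside root has a relative form that neither starts with ".."
  // nor is absolute (the latter can happen across Windows drives).
  const relative = path.relative(absoluteRoot, candidate);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);

  if (escapes) {
    throw new Error(`Path escapes the allowed root: ${userPath}`);
  }
  return candidate;
}
```

Under these assumptions, a call such as `resolveInsideRoot(process.cwd(), "docs/report.pdf")` returns a path under the working directory, while `"../../etc/passwd"` or an absolute path outside the root throws. The check is purely lexical: it does not follow symbolic links, so a stricter variant would likely also compare real paths (for example via `fs.realpathSync`) before reading the file.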

Similar Servers

kreuzberg

5420

Extracts text, tables, images, and metadata from a wide range of document formats (PDF, Office, images, HTML, etc.), with support for multiple OCR backends and an extensible plugin system. Can be run as a Model Context Protocol (MCP) server.

Other

$Medium

kreuzberg

5412

Extracts text, tables, images, and metadata from 56 file formats including PDF, Office documents, and images. Supports multiple OCR backends, extensible plugins, and is designed for data preprocessing in AI/ML workflows.

Other

$Medium