document-reader-mcp

Name: document-reader-mcp
Author: ggmenu

Verified Safe

Overview

Extracts text from PDF, DOCX, XLSX, CSV, TXT, JSON, and Markdown documents and converts it to Markdown.

Installation

Run Command

python -m server.main

Environment Variables

DOC_READER_RATE_LIMIT_PER_MINUTE
DOC_READER_MAX_OUTPUT_CHARS
DOC_READER_DEFAULT_MAX_ROWS
DOC_READER_DEFAULT_MAX_PAGES
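
The listing gives only the variable names, not their defaults or accepted ranges. As a rough sketch of how integer limits like these are commonly read, the snippet below parses each variable and falls back when it is unset or invalid. The fallback numbers and the helper names `read_positive_int` and `load_settings` are hypothetical and are not taken from the server's code.

```python
import os

# Variable names come from the list above. The fallback numbers are
# placeholders chosen for illustration, NOT the server's documented defaults.
PLACEHOLDER_FALLBACKS = {
    "DOC_READER_RATE_LIMIT_PER_MINUTE": 60,
    "DOC_READER_MAX_OUTPUT_CHARS": 50_000,
    "DOC_READER_DEFAULT_MAX_ROWS": 100,
    "DOC_READER_DEFAULT_MAX_PAGES": 10,
}


def read_positive_int(name, fallback, environ=None):
    """Return the variable as a positive int, or fallback if unset or invalid."""
    env = os.environ if environ is None else environ
    raw = env.get(name, "").strip()
    try:
        value = int(raw)
    except ValueError:
        # Unset (empty string) or non-numeric values fall back.
        return fallback
    return value if value > 0 else fallback


def load_settings(environ=None):
    """Collect all four limits into one dict (hypothetical helper)."""
    return {
        name: read_positive_int(name, fallback, environ)
        for name, fallback in PLACEHOLDER_FALLBACKS.items()
    }
```

Passing a plain dict as `environ` makes the parsing easy to check without touching the real process environment.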

Security Notes

The server states that it is designed for local, trusted environments and has no built-in authentication. It expands file paths with `os.path.expanduser` and reads local files directly, so an untrusted client that sends malicious paths could read any file the server process has permission to access.

The document parsing libraries it uses (pdfminer.six, openpyxl, python-docx, markitdown, PyMuPDF) run without internal sandboxing, so processing malformed or malicious documents carries the risks inherent to those parsers.

As mitigations, the project provides comprehensive security documentation (`SECURITY.md`), enforces a 100MB file size limit, implements rate limiting, and truncates output to protect the AI's context. The `convert_to_markdown` tool is the exception: it converts the *entire* document to a file, bypassing the output truncation applied to the AI's preview, so it could consume significant local resources.
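
One common way to narrow the arbitrary-path risk described here is to confine every requested path to an allowed directory before opening it. The sketch below shows that general technique together with a size check that mirrors the 100MB limit. It is not the server's actual code: the function `resolve_inside`, the constant `MAX_FILE_BYTES`, and the idea of a configured root directory are all assumptions made for illustration.

```python
from pathlib import Path

# Mirrors the 100MB limit mentioned in the notes; the constant name is hypothetical.
MAX_FILE_BYTES = 100 * 1024 * 1024


def resolve_inside(root, user_path):
    """Resolve user_path and refuse anything that escapes root.

    Hypothetical helper: expands "~", resolves ".." segments and symlinks,
    then checks that the result still lies under the allowed root.
    """
    base = Path(root).expanduser().resolve()
    candidate = Path(user_path).expanduser()
    if not candidate.is_absolute():
        candidate = base / candidate
    candidate = candidate.resolve()
    # Path.is_relative_to requires Python 3.9 or newer.
    if not candidate.is_relative_to(base):
        raise PermissionError(f"{user_path!r} is outside the allowed directory")
    return candidate


def check_size(path):
    """Raise ValueError if the file exceeds MAX_FILE_BYTES; return its size otherwise."""
    size = Path(path).stat().st_size
    if size > MAX_FILE_BYTES:
        raise ValueError(f"{path} is {size} bytes, above the {MAX_FILE_BYTES} byte limit")
    return size
```

Resolving before comparing matters: a check on the raw string would miss `..` segments and symlinks that point outside the allowed directory.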

Similar Servers

kreuzberg

5412

Extracts text, tables, images, and metadata from 56 file formats including PDF, Office documents, and images. Supports multiple OCR backends, extensible plugins, and is designed for data preprocessing in AI/ML workflows.

Other

Cost Class: Medium

html-to-markdown-mcp

Converts HTML content from web pages or raw strings into Markdown format, with options for including metadata, truncating content, and saving to files.

Other

Cost Class: Medium

defuddle-fetch-mcp-server

This server allows LLMs to fetch web content, automatically cleaning HTML into markdown, extracting key metadata like title and author, and supporting chunked reading.

Other

Cost Class: Medium

md-server

Converts various documents, webpages, and media files into markdown format, serving as an HTTP API or an MCP server for AI assistants to read and process content.

Other

Cost Class: Medium

Stats

Interest Score: 30
Security Score: 7
Cost Class: High
Avg Tokens: 15000
Stars: 1
Forks: 0
Last Update: 2026-01-19
