datasheet-to-md-mcp
Verified Safe
by monamaret
Overview
Converts PDF datasheets and documents into structured Markdown, extracts images, detects diagrams, and generates PlantUML for integration with AI coding assistants via the Model Context Protocol.
Installation
pdf-md-mcp
Environment Variables
- PDF_INPUT_DIR
- OUTPUT_BASE_DIR
- MCP_SERVER_NAME
- MCP_SERVER_VERSION
- IMAGE_MAX_DPI
- IMAGE_FORMAT
- PRESERVE_ASPECT_RATIO
- DETECT_DIAGRAMS
- DIAGRAM_CONFIDENCE
- PLANTUML_STYLE
- PLANTUML_COLOR_SCHEME
- INCLUDE_TOC
- BASE_HEADER_LEVEL
- EXTRACT_TABLES
- EXTRACT_IMAGES
- LOG_LEVEL
- MCP_TRANSPORT
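The listing names these variables but not their types or defaults. As a rough sketch of how a Go server might read a few of them, the example below loads a subset into a struct. The `Config` fields, the helper names, and every default value are assumptions made for illustration, not taken from the server's source.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Config holds a subset of the variables listed above.
// Field types and defaults are assumptions, not the server's real values.
type Config struct {
	PDFInputDir    string
	OutputBaseDir  string
	ImageMaxDPI    int
	DetectDiagrams bool
	LogLevel       string
}

// stringOr returns the variable's value, or fallback when it is unset or empty.
func stringOr(getenv func(string) string, key, fallback string) string {
	if v := getenv(key); v != "" {
		return v
	}
	return fallback
}

// intOr parses the variable as an integer, or returns fallback on any error.
func intOr(getenv func(string) string, key string, fallback int) int {
	v, err := strconv.Atoi(getenv(key))
	if err != nil {
		return fallback
	}
	return v
}

// boolOr parses the variable as a boolean ("true", "false", "1", "0", ...),
// or returns fallback on any error.
func boolOr(getenv func(string) string, key string, fallback bool) bool {
	v, err := strconv.ParseBool(getenv(key))
	if err != nil {
		return fallback
	}
	return v
}

// loadConfig takes the lookup function as a parameter so it can be
// exercised without touching the real process environment.
func loadConfig(getenv func(string) string) Config {
	return Config{
		PDFInputDir:    stringOr(getenv, "PDF_INPUT_DIR", "./input"),   // hypothetical default
		OutputBaseDir:  stringOr(getenv, "OUTPUT_BASE_DIR", "./output"), // hypothetical default
		ImageMaxDPI:    intOr(getenv, "IMAGE_MAX_DPI", 300),            // hypothetical default
		DetectDiagrams: boolOr(getenv, "DETECT_DIAGRAMS", true),        // hypothetical default
		LogLevel:       stringOr(getenv, "LOG_LEVEL", "info"),          // hypothetical default
	}
}

func main() {
	cfg := loadConfig(os.Getenv)
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly inside the helpers, keeps the parsing logic easy to check against a fixed set of values.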
Security Notes
The server communicates over standard I/O (stdio transport) and relies on Go libraries for PDF parsing and image processing. Input and output paths are built with `filepath.Clean` and `filepath.Join`, which mitigates simple path traversal. Image processing is subject to resource limits to prevent memory exhaustion. No direct execution of arbitrary commands or `eval`-like patterns was observed. Configuration is loaded from environment variables, which is sound practice provided those variables are managed securely during deployment. The primary risk is misconfiguration: pointing `PDF_INPUT_DIR` or `OUTPUT_BASE_DIR` at sensitive file system locations.
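To illustrate the path handling described above, here is a minimal sketch that combines `filepath.Clean` and `filepath.Join` with an explicit containment check via `filepath.Rel`. The function `resolveUnder` and the error value are hypothetical names, and the containment check is an addition for illustration; the listing does not say whether the server performs one.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// errOutsideBase is a hypothetical error for paths that leave the base directory.
var errOutsideBase = errors.New("path escapes base directory")

// resolveUnder joins name onto base and rejects results outside base.
// This is an illustrative helper, not code from the server.
func resolveUnder(base, name string) (string, error) {
	cleanBase := filepath.Clean(base)
	// Join cleans its result, so "a/../b" style segments are collapsed here.
	joined := filepath.Join(cleanBase, name)

	// Rel expresses joined relative to the base; a leading ".." element
	// means the cleaned path climbed out of the base directory.
	rel, err := filepath.Rel(cleanBase, joined)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errOutsideBase
	}
	return joined, nil
}

func main() {
	for _, name := range []string{"sensor.pdf", "../../etc/passwd"} {
		p, err := resolveUnder("/data/pdfs", name)
		fmt.Printf("%q -> %q, err=%v\n", name, p, err)
	}
}
```

`Clean` and `Join` only normalize a path; on their own they do not stop a `..` sequence from resolving above the base directory, which is why the sketch adds the `Rel` check.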
Similar Servers
html-to-markdown-mcp
Converts HTML content from web pages or raw strings into Markdown format, with options for including metadata, truncating content, and saving to files.
pdflens-mcp
This MCP server provides tools for reading and extracting information from PDF files, including text and images, designed for AI clients.
markitdown-mcp
A Model Context Protocol (MCP) server for converting 29+ file formats (e.g., PDF, Office, images, audio) to clean, structured Markdown, designed for integration with AI workflows and MCP clients like Claude Desktop.
md-server
Converts various documents, webpages, and media files into markdown format, serving as an HTTP API or an MCP server for AI assistants to read and process content.