docs-scraper-mcp
Verified Safe
by adwonnacott
Overview
Scrapes documentation from websites using Firecrawl, stores it locally, backs it up to GitHub, and provides tools for retrieval, search, and management.
Installation
node ~/docs-scraper-mcp/dist/index.js
Environment Variables
- FIRECRAWL_API_KEY
- GITHUB_TOKEN
- GITHUB_REPO
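A server that depends on these three variables would typically check them once at startup and stop with a clear message if any are missing. The following is a minimal sketch of that idea, not the project's actual code: the helper name `loadConfig` and the error wording are hypothetical, and only the variable names come from the list above.

```typescript
// Variable names taken from the listing; everything else here is illustrative.
const REQUIRED_VARS = ["FIRECRAWL_API_KEY", "GITHUB_TOKEN", "GITHUB_REPO"] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];

// Hypothetical helper: returns the required values or throws, naming every
// variable that is unset or empty.
function loadConfig(
  env: Record<string, string | undefined>
): Record<RequiredVar, string> {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const config = {} as Record<RequiredVar, string>;
  for (const name of REQUIRED_VARS) {
    // Safe cast: the filter above guarantees a non-empty string.
    config[name] = env[name] as string;
  }
  return config;
}
```

In a Node process this would be called as `loadConfig(process.env)` before any tool is registered, so a missing key surfaces immediately rather than on the first scrape.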
Security Notes
The server reads API keys and tokens from environment variables rather than hard-coding them. File system operations are confined to a dedicated 'scraped-docs' directory within the user's home directory, and paths to local files are constructed in a way that mitigates directory traversal. Tool arguments are validated with Zod schemas. There is no use of 'eval' and no apparent code obfuscation. Calls to external APIs (Firecrawl, GitHub) are well structured and include retry logic. The main residual risk is scraped content that has been crafted to exploit a downstream consumer; the server itself does not introduce new vulnerabilities in that regard.
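To illustrate what confining file operations to one directory can look like, here is a hedged sketch using Node's built-in `path` and `os` modules. It is an assumption about the general technique, not a copy of the server's implementation: the helper name `resolveInsideBase` is hypothetical, and only the 'scraped-docs' directory name comes from the notes above.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Directory name from the security notes; its exact location is assumed.
const BASE_DIR = path.join(os.homedir(), "scraped-docs");

// Hypothetical helper: resolves a caller-supplied relative path against the
// base directory and rejects anything that would land outside it.
function resolveInsideBase(relativePath: string, baseDir: string = BASE_DIR): string {
  const resolved = path.resolve(baseDir, relativePath);
  const rel = path.relative(baseDir, resolved);

  // A result of ".." or one beginning with "../" climbs out of the base
  // directory; an absolute result means a different root (e.g. another drive).
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`Path escapes base directory: ${relativePath}`);
  }
  return resolved;
}
```

This check is purely lexical. It does not follow symbolic links, so a real implementation that needs to defend against links inside the directory would also have to compare real paths (for example via `fs.realpath`).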
Similar Servers
scrapi-mcp
This MCP server enables AI agents to scrape web pages and retrieve their content as HTML or Markdown, with advanced browser interaction capabilities.
lyra-tool-discovery
This MCP server is designed to fetch, parse, and organize documentation from websites implementing the llms.txt standard. It transforms raw documentation into structured, agent-ready formats, exposing tools for AI agents, LLMs, and automation workflows to consume documentation programmatically.
firecrawl-mcp-server
A Model Context Protocol (MCP) server that provides web scraping, crawling, search, and structured data extraction capabilities using the Firecrawl API.
documan
A documentation tool that provides a built-in MCP server, allowing AI assistants to semantically search and understand documentation in real-time.