crawler-mcp-server
Verified Safe
by osins
Overview
An MCP server providing web crawling, browser automation, and content extraction capabilities with support for multiple output formats and LLM integration.
Installation
dev-tool-mcp
Environment Variables
- TEST_URL
- MCP_STREAMING_MODE
- MCP_SERVER_NAME
- MCP_DEFAULT_TOOL_TIMEOUT
- MCP_LOG_LEVEL
- OPENAI_API_KEY
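The listing names these variables but not their meanings, types, or defaults. As a minimal sketch of how a launcher script might collect them, the following assumes that MCP_DEFAULT_TOOL_TIMEOUT is a number of seconds, that MCP_LOG_LEVEL is a standard logging level name, and that every variable is optional. The `ServerConfig` class, the `load_config` helper, and all default values are hypothetical and are not taken from the server's code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical container for the variables listed above."""
    test_url: Optional[str]
    streaming_mode: Optional[str]
    server_name: str
    default_tool_timeout: float  # assumed to be seconds
    log_level: str
    openai_api_key: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read the documented variables; every default here is an assumption."""
    return ServerConfig(
        test_url=env.get("TEST_URL"),
        streaming_mode=env.get("MCP_STREAMING_MODE"),
        server_name=env.get("MCP_SERVER_NAME", "crawler-mcp-server"),
        default_tool_timeout=float(env.get("MCP_DEFAULT_TOOL_TIMEOUT", "60")),
        log_level=env.get("MCP_LOG_LEVEL", "INFO").upper(),
        # Read the key from the environment only; never hardcode it.
        openai_api_key=env.get("OPENAI_API_KEY"),
    )
```

Passing the environment as a mapping keeps the helper easy to exercise with a plain dictionary, for example `load_config({"MCP_LOG_LEVEL": "debug"})`.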
Security Notes
The server shows good security awareness: it validates and sanitizes URLs explicitly, runs the Playwright browser in headless mode, and states clearly that files are saved only to specified paths with no arbitrary file system access. Tool arguments are validated, including URL length limits, and there is no obvious use of 'eval' and there are no hardcoded secrets. The browser is also launched with no-sandbox; that flag disables Chromium's sandbox, so it is better read as a container compatibility setting than as a hardening measure.
Potential risks include: 1) if the 'save_path' argument for crawl_web_page comes from untrusted input, it could allow directory traversal unless the caller constrains it (os.path.join alone does not prevent this, because it passes '..' segments through and discards earlier components when a later one is an absolute path); 2) the LLM integration accepts an 'instruction' parameter, which could be susceptible to prompt injection if the calling agent forwards unsanitized user input.
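The directory traversal risk described for 'save_path' can be reduced by resolving the requested path and checking that it still lies under an allowed base directory. The sketch below illustrates that general technique under assumptions: `resolve_save_path` and its `base_dir` parameter are hypothetical names, and this is not the server's actual implementation.

```python
from pathlib import Path


def resolve_save_path(base_dir: str, save_path: str) -> Path:
    """Return an absolute path under base_dir, or raise ValueError.

    Hypothetical helper: resolves '..' segments and symlinks first,
    then verifies that the result is still inside the base directory.
    """
    base = Path(base_dir).resolve()
    # Joining with an absolute save_path discards base, so the
    # containment check below is what actually enforces the limit.
    candidate = (base / save_path).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"save_path escapes {base}: {save_path!r}") from None
    return candidate
```

A caller would run this on the untrusted argument before handing the result to the crawl tool, for example `resolve_save_path("/srv/crawl-output", user_supplied_path)`, and treat a `ValueError` as a rejected request.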
Similar Servers
scrapegraph-mcp
Provides AI-powered web scraping, structured data extraction, multi-page crawling, and agentic automation capabilities for language models.
webscraping-ai-mcp-server
Integrates with WebScraping.AI to provide LLM-powered web data extraction, including question answering, structured data extraction, and HTML/text retrieval, with advanced features like JavaScript rendering and proxy management.
crawl-mcp
A comprehensive Model Context Protocol (MCP) server that wraps the crawl4ai library for advanced web crawling, content extraction, and AI-powered summarization from various sources including web pages, PDFs, Office documents, and YouTube videos.
scrapi-mcp
This MCP server enables AI agents to scrape web pages and retrieve their content as HTML or Markdown, with advanced browser interaction capabilities.