crawl-mcp

Name: crawl-mcp
Author: walksoda

Verified Safe

Overview

A comprehensive Model Context Protocol (MCP) server that wraps the crawl4ai library for advanced web crawling, content extraction, and AI-powered summarization from various sources including web pages, PDFs, Office documents, and YouTube videos.

Installation

Run Command

uvx --from git+https://github.com/walksoda/crawl-mcp crawl-mcp

Environment Variables

FASTMCP_LOG_LEVEL
PYTHONUNBUFFERED
CRAWL4AI_BROWSER_TYPE
CRAWL4AI_HEADLESS
PLAYWRIGHT_BROWSERS_PATH
DISPLAY
CHROME_FLAGS
CRAWL4AI_LANG
OPENAI_API_KEY
ANTHROPIC_API_KEY
AZURE_OPENAI_API_KEY
AZURE_OPENAI_ENDPOINT
MCP_TRANSPORT
MCP_HOST
MCP_PORT
CRAWL4AI_VERBOSE
PLAYWRIGHT_SKIP_BROWSER_GC
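
As a rough illustration of how the run command and environment variables above might be combined, the sketch below builds a client configuration entry for the server. The `mcpServers` layout follows the convention used by several MCP clients and is an assumption here, as are the helper name `build_server_entry` and the example values `1` and `INFO`; the listing names the variables but does not document their accepted values.

```python
import json
import os

# Variables from the listing that carry credentials; they are forwarded only
# if already set in the caller's environment, never hard-coded.
SECRET_VARS = (
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
)


def build_server_entry(env_overrides=None):
    """Return a config entry that launches crawl-mcp through uvx.

    Hypothetical helper: the command and arguments come from the listing's
    run command; the default env values are illustrative assumptions.
    """
    env = {
        "PYTHONUNBUFFERED": "1",      # assumed value: unbuffered stdio
        "FASTMCP_LOG_LEVEL": "INFO",  # assumed value: moderate log verbosity
    }
    for name in SECRET_VARS:
        value = os.environ.get(name)
        if value:
            env[name] = value
    # Caller-supplied settings win over the defaults above.
    env.update(env_overrides or {})
    return {
        "command": "uvx",
        "args": [
            "--from",
            "git+https://github.com/walksoda/crawl-mcp",
            "crawl-mcp",
        ],
        "env": env,
    }


if __name__ == "__main__":
    config = {"mcpServers": {"crawl-mcp": build_server_entry()}}
    print(json.dumps(config, indent=2))
```

Reading secrets from the surrounding environment keeps API keys out of the configuration file, in line with the Security Notes below.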

Security Notes

The project shows deliberate security measures. Regex matching goes through `_safe_regex_findall`, which uses process-level timeouts to guard against ReDoS attacks, and session and cache data are written with restrictive file permissions (`0600`). Sensitive values such as API keys are supplied through environment variables. The `execute_js` parameter of the crawling tools is powerful: a client that misuses it could run arbitrary JavaScript in the browser context, although that code stays contained by the Playwright/Chromium sandbox. The Docker Compose setup passes `--no-sandbox`, a common practice for running Playwright in containers, which means browser isolation relies on Docker rather than on Chromium's own sandbox. Session data is stored locally in plaintext, albeit with restricted file permissions, which poses a minor risk if the host system is compromised.
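
To make the `0600` point concrete, here is a minimal sketch of writing a file so that only its owner can read or write it. It is not the project's code: the function name `write_private_file` and the sample file name are hypothetical, and the approach assumes a POSIX system.

```python
import os
import stat
import tempfile


def write_private_file(path, data):
    """Write bytes to path with owner-only permissions (0600).

    Hypothetical helper, assuming POSIX. The mode is set when the file is
    created, so there is no window in which it is world-readable.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "wb") as handle:
        # The mode passed to os.open only applies to newly created files,
        # so tighten permissions explicitly in case the file already existed.
        os.fchmod(handle.fileno(), 0o600)
        handle.write(data)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "session.json")
        write_private_file(target, b'{"cookies": []}')
        # Show the resulting permission bits.
        print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```

Restrictive permissions protect the file from other local users only. As the note above says, the contents remain plaintext and are readable by anything running as the owning user.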

Similar Servers

DevDocs

1989

DevDocs is a web crawling and content extraction platform designed to accelerate software development by converting documentation into LLM-ready formats for intelligent data querying and fine-tuning.

Other

Cost Class: High

mcp-omnisearch

261

Provides a unified interface for various search, AI response, content processing, and enhancement tools via Model Context Protocol (MCP).

Other

Cost Class: Medium

mcp-server

A Model Context Protocol (MCP) server that integrates with SerpApi to provide comprehensive search engine results and data extraction to an LLM.

Other

Cost Class: Medium

webscraping-ai-mcp-server

Integrates with WebScraping.AI to provide LLM-powered web data extraction, including question answering, structured data extraction, and HTML/text retrieval, with advanced features like JavaScript rendering and proxy management.

Other

Cost Class: Medium

Stats

Interest Score: 41

Security Score: 7

Cost Class: High

Avg Tokens: 2500

Stars: 24

Forks: 6

Last Update: 2026-01-18
