webclone
Verified Safe by ruslanmv
Overview
Clones and archives entire websites, including those that require authentication. It is accessible from the command line, through a desktop GUI, or as an AI agent tool via the Model Context Protocol (MCP).
Installation
python webclone-mcp.py
Environment Variables
- WEBCLONE_SELENIUM_HEADLESS
- WEBCLONE_SELENIUM_DISABLE_GPU
- WEBCLONE_SELENIUM_WINDOW_SIZE
- WEBCLONE_SELENIUM_USER_AGENT
- WEBCLONE_SELENIUM_TIMEOUT
- WEBCLONE_SELENIUM_NO_SANDBOX
- WEBCLONE_START_URL
- WEBCLONE_OUTPUT_DIR
- WEBCLONE_RECURSIVE
- WEBCLONE_MAX_DEPTH
- WEBCLONE_MAX_PAGES
- WEBCLONE_DELAY_MS
- WEBCLONE_WORKERS
- WEBCLONE_SAVE_PDF
- WEBCLONE_SAVE_SCREENSHOTS
- WEBCLONE_INCLUDE_ASSETS
- WEBCLONE_SAME_DOMAIN_ONLY
- WEBCLONE_DEFAULT_OUTPUT_DIR
- WEBCLONE_DEFAULT_WORKERS
- WEBCLONE_DEFAULT_DELAY_MS
- WEBCLONE_COOKIES_DIR
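The variables listed above would normally be set in the process environment before the server starts. The following is a minimal sketch of one way to do that from Python. The variable names come from the list, but the accepted value formats are not documented here, so every value shown (for example `true` for booleans and plain integers for limits) is an assumption, and `build_env` is a hypothetical helper, not part of webclone.

```python
import os

# Hypothetical values: the names are taken from the list above, but the
# accepted formats (e.g. "true" vs "1") are assumptions for illustration.
EXAMPLE_SETTINGS = {
    "WEBCLONE_SELENIUM_HEADLESS": "true",
    "WEBCLONE_OUTPUT_DIR": "./website_mirror",
    "WEBCLONE_MAX_DEPTH": "2",
    "WEBCLONE_MAX_PAGES": "50",
    "WEBCLONE_DELAY_MS": "500",
    "WEBCLONE_SAME_DOMAIN_ONLY": "true",
}


def build_env(overrides, base=None):
    """Return a copy of the base environment with WEBCLONE_* overrides applied."""
    env = dict(os.environ if base is None else base)
    for name, value in overrides.items():
        # Reject anything outside the WEBCLONE_ namespace to catch typos early.
        if not name.startswith("WEBCLONE_"):
            raise ValueError(f"unexpected variable name: {name}")
        env[name] = str(value)  # environment values must be strings
    return env


env = build_env(EXAMPLE_SETTINGS)
# To launch the server with these settings (not executed here):
#   import subprocess, sys
#   subprocess.run([sys.executable, "webclone-mcp.py"], env=env, check=True)
```

Building a separate dictionary rather than mutating `os.environ` keeps the settings scoped to the child process, which may be preferable when several crawls with different limits are started from the same script.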
Security Notes
The server's core function is to interact with external websites through a real browser (Selenium) and to make HTTP requests. This carries inherent risk if a user or AI agent supplies a malicious URL, because the browser can execute arbitrary JavaScript. However, the implementation includes input validation (Pydantic models) and explicit warnings about cookie file security. Sensitive session cookies are stored locally and are explicitly git-ignored. The `save_authentication` tool intentionally offloads manual browser interaction to the GUI/CLI so that AI agents are not blocked. Subprocess calls are limited to opening local file explorers. The code itself does not contain `eval`, obfuscation, or hardcoded secrets.
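The risk described above comes from handing untrusted URLs to a real browser. As an illustration only, here is a minimal sketch of a pre-flight check that a caller could run before passing a URL to the tool. It uses the standard library instead of the Pydantic models mentioned in the notes, and the function name `is_allowed_url` and the scheme allowlist are assumptions, not the project's own code.

```python
from urllib.parse import urlparse

# Assumption: only ordinary web URLs should ever reach the browser.
ALLOWED_SCHEMES = {"http", "https"}


def is_allowed_url(url: str) -> bool:
    """Return True only for absolute http(s) URLs that include a host."""
    try:
        parsed = urlparse(url.strip())
    except ValueError:
        # urlparse raises ValueError for some malformed inputs,
        # such as an unterminated IPv6 literal.
        return False
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.hostname)


for candidate in ("https://example.com/docs", "file:///etc/passwd", "javascript:alert(1)"):
    print(candidate, is_allowed_url(candidate))
```

A check like this only narrows which URLs are accepted, for example by rejecting `file:` and `javascript:` schemes. It does not stop a permitted page from running its own JavaScript in the browser, so the risk noted above still applies.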
Similar Servers
Scrapling
Provides adaptive web scraping capabilities to AI chatbots and agents, allowing them to fetch, parse, and extract targeted data from websites, including dynamic content and anti-bot protected sites.
brightdata-mcp
Enables AI agents to access, search, extract, and navigate the live web in real-time without being blocked.
mcp
This server provides Hyperbrowser's Model Context Protocol (MCP) interface, offering tools for web scraping, structured data extraction, crawling, and general-purpose browser automation using AI agents like OpenAI's CUA and Anthropic's Claude Computer Use.
scrapegraph-mcp
Provides AI-powered web scraping, structured data extraction, multi-page crawling, and agentic automation capabilities for language models.