webclone
Verified Safe
by ruslanmv
Overview
Clones and archives entire websites, including those that require authentication. It can be used from the command line, through a desktop GUI, or as an AI agent tool via the Model Context Protocol (MCP).
Installation
python webclone-mcp.py
Environment Variables
- WEBCLONE_SELENIUM_HEADLESS
- WEBCLONE_SELENIUM_DISABLE_GPU
- WEBCLONE_SELENIUM_WINDOW_SIZE
- WEBCLONE_SELENIUM_USER_AGENT
- WEBCLONE_SELENIUM_TIMEOUT
- WEBCLONE_SELENIUM_NO_SANDBOX
- WEBCLONE_START_URL
- WEBCLONE_OUTPUT_DIR
- WEBCLONE_RECURSIVE
- WEBCLONE_MAX_DEPTH
- WEBCLONE_MAX_PAGES
- WEBCLONE_DELAY_MS
- WEBCLONE_WORKERS
- WEBCLONE_SAVE_PDF
- WEBCLONE_SAVE_SCREENSHOTS
- WEBCLONE_INCLUDE_ASSETS
- WEBCLONE_SAME_DOMAIN_ONLY
- WEBCLONE_DEFAULT_OUTPUT_DIR
- WEBCLONE_DEFAULT_WORKERS
- WEBCLONE_DEFAULT_DELAY_MS
- WEBCLONE_COOKIES_DIR
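As a rough illustration of how a few of the variables above might be supplied when launching the server, here is a minimal Python sketch. The listing gives only the variable names, so every value below (the URL, the output path, the depth, and the `"true"` string for the headless flag) is an assumption, and `build_env` is a hypothetical helper, not part of webclone.

```python
import os

# Hypothetical values: the listing names these variables but does not
# document their accepted formats or defaults.
overrides = {
    "WEBCLONE_START_URL": "https://example.com",
    "WEBCLONE_OUTPUT_DIR": "./website_mirror",
    "WEBCLONE_MAX_DEPTH": "2",
    "WEBCLONE_SELENIUM_HEADLESS": "true",
}


def build_env(overrides, base=None):
    """Return a copy of the base environment with the overrides applied."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


env = build_env(overrides)
# The resulting mapping could then be passed to a child process, e.g.
# subprocess.run(["python", "webclone-mcp.py"], env=env)
```

Building a copy rather than mutating `os.environ` keeps the settings scoped to the launched process.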
Security Notes
The server's core function is to load external websites in a real browser (Selenium) and to make HTTP requests. This is inherently risky if the user or AI agent supplies a malicious URL, because the browser can execute arbitrary JavaScript. However, the implementation includes input validation (Pydantic models) and explicit warnings about cookie file security. Sensitive session cookies are stored locally and are explicitly git-ignored. The `save_authentication` tool intentionally hands manual browser interaction off to the GUI/CLI so that AI agents are not blocked. Subprocess calls are limited to opening local file explorers. The code itself contains no `eval`, obfuscation, or hardcoded secrets.
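To make the idea of validating agent-supplied input concrete, here is a minimal sketch of the kind of check such validation might perform. The project is described as using Pydantic models; this example swaps in a plain standard-library dataclass to stay dependency-free, and the class name `CloneRequest`, its fields, and its limits are all hypothetical rather than taken from webclone.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class CloneRequest:
    """Hypothetical request shape; names and limits are illustrative only."""

    start_url: str
    max_depth: int = 2
    max_pages: int = 100

    def __post_init__(self):
        parsed = urlparse(self.start_url)
        # Accept only web URLs, so schemes such as file: or javascript:
        # never reach the browser.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"unsupported or malformed URL: {self.start_url!r}")
        if self.max_depth < 0:
            raise ValueError("max_depth must be non-negative")
        if self.max_pages < 1:
            raise ValueError("max_pages must be at least 1")


def is_valid(url):
    """Return True if the URL passes the checks in CloneRequest."""
    try:
        CloneRequest(start_url=url)
    except ValueError:
        return False
    return True
```

A scheme check like this narrows what the browser will open, but it does not make a hostile page safe to load; the JavaScript risk described above still applies to any accepted URL.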
Similar Servers
Scrapling
Enables AI chatbots and agents to perform adaptive web scraping, extract targeted data, and bypass anti-bot protections conversationally.
brightdata-mcp
The MCP server enables AI agents to access real-time web data and perform browser automation for tasks like research, e-commerce intelligence, market analysis, and content creation, bypassing bot detection and CAPTCHAs.
mcp
This server provides Hyperbrowser's Model Context Protocol (MCP) interface, offering tools for web scraping, structured data extraction, crawling, and general-purpose browser automation using AI agents like OpenAI's CUA and Anthropic's Claude Computer Use.
scrapegraph-mcp
Provides a Model Context Protocol (MCP) server that integrates with ScrapeGraph AI, enabling language models to perform advanced AI-powered web scraping and structured data extraction across single pages, multiple pages, and search results.