scraping-mcp-server
by Readify-App
Overview
Web scraping for both static and dynamic web pages, including site structure analysis, content extraction, and integration with WordPress APIs (Rakuraku Media School, Cloud GYM) and Google Sheets.
Installation
uv run scraping-mcp-server
Environment Variables
- RAKURAKU_WP_USERNAME
- RAKURAKU_WP_APP_PASSWORD
- GOOGLE_APPLICATION_CREDENTIALS
Security Notes
WordPress API credentials for 'Rakuraku Media School' are hardcoded in `server.py` (RAKURAKU_WP_USERNAME, RAKURAKU_WP_APP_PASSWORD). Environment variables can override these values, but credentials committed to source code are a significant security vulnerability in their own right, because anyone with access to the repository can read them. In addition, the Google Sheets API requires a service account JSON file whose name (`braided-circuit-465415-m6-1cbbf338d9f0.json`) is hardcoded and searched for in specific local paths, which poses a risk if the file is not managed securely.
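One common way to avoid the risks described above is to read every secret from the environment and fail at startup when one is missing, with no fallback value in source. The following is a minimal sketch of that pattern, not the code in `server.py`: the helper names `require_env`, `load_settings`, and `MissingConfigError` are hypothetical, and only the three environment variable names come from this listing.

```python
import os
from pathlib import Path


class MissingConfigError(RuntimeError):
    """Raised when a required setting is absent (hypothetical helper type)."""


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail loudly.

    There is deliberately no default argument, so no credential
    ever needs to appear in source code.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise MissingConfigError(f"Environment variable {name} is not set")
    return value


def load_settings() -> dict:
    """Collect the settings named in this listing from the environment."""
    # The service account key path comes from the environment instead of
    # a hardcoded file name searched for in local directories.
    key_path = Path(require_env("GOOGLE_APPLICATION_CREDENTIALS"))
    if not key_path.is_file():
        raise MissingConfigError(f"Service account file not found: {key_path}")
    return {
        "wp_username": require_env("RAKURAKU_WP_USERNAME"),
        "wp_app_password": require_env("RAKURAKU_WP_APP_PASSWORD"),
        "google_key_path": key_path,
    }
```

Calling `load_settings()` once at startup surfaces a missing variable immediately rather than at the first WordPress or Google Sheets request. Any credential that has already been committed should also be rotated, since it remains readable in the repository history.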
Similar Servers
playwright-mcp
Provides a Model Context Protocol (MCP) server for LLMs to automate browser interactions using Playwright's accessibility tree, avoiding pixel-based vision models.
fetcher-mcp
This MCP server is designed for fetching web page content using a Playwright headless browser, enabling intelligent content extraction, JavaScript execution, and flexible output formats.
mcp-accessibility-scanner
Automated web accessibility scanning and browser automation using Playwright and Axe-core, enabling LLMs to perform WCAG compliance checks and generate reports.
browser-devtools-mcp
This MCP server provides AI coding assistants with comprehensive browser automation and debugging capabilities using Playwright, enabling execution-level and visual debugging for web pages.