sf-doc-scraper
Verified Safe
by salesforcebob
Overview
Scrapes Salesforce developer documentation and converts it to Markdown for consumption by MCP-compatible AI assistants.
Installation
npx @salesforcebob/sf-docs-mcp-server
Environment Variables
- PUPPETEER_EXECUTABLE_PATH
- PUPPETEER_SKIP_CHROMIUM_DOWNLOAD
- NODE_ENV
- PORT
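As a rough illustration of how a Node.js process might consume the variables listed above, here is a minimal sketch. The `ServerConfig` shape, the `loadConfig` helper, the `"development"` default, and the fallback port of 3000 are all assumptions made for this example; the listing does not document how the server actually reads or defaults these values.

```typescript
// Hypothetical config shape; not taken from the server's source.
interface ServerConfig {
  executablePath: string | undefined; // from PUPPETEER_EXECUTABLE_PATH
  skipChromiumDownload: boolean;      // from PUPPETEER_SKIP_CHROMIUM_DOWNLOAD
  nodeEnv: string;                    // from NODE_ENV
  port: number;                       // from PORT
}

// Assumed fallback port, used only when PORT is missing or invalid.
const DEFAULT_PORT = 3000;

// Builds a config object from an environment map.
// Taking the map as a parameter keeps the function easy to test.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const parsedPort = Number.parseInt(env.PORT ?? "", 10);
  const portIsValid =
    Number.isInteger(parsedPort) && parsedPort > 0 && parsedPort < 65536;

  return {
    // Treat an empty string the same as "not set".
    executablePath: env.PUPPETEER_EXECUTABLE_PATH || undefined,
    skipChromiumDownload: env.PUPPETEER_SKIP_CHROMIUM_DOWNLOAD === "true",
    nodeEnv: env.NODE_ENV ?? "development",
    port: portIsValid ? parsedPort : DEFAULT_PORT,
  };
}
```

In a running Node.js process this would typically be called as `loadConfig(process.env)`.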
Security Notes
The server uses Puppeteer to scrape web pages. It launches Puppeteer with the `--no-sandbox` and `--disable-setuid-sandbox` arguments so that it can run in resource-constrained environments like Heroku. This is common practice, but it turns off Chromium's process sandbox and therefore reduces isolation compared to a fully sandboxed browser. Input URLs are validated to ensure they belong to Salesforce documentation domains, which prevents arbitrary URL scraping. No obvious hardcoded secrets are present; environment variables are used for configuration. The project's disclaimer correctly advises hardening before production deployment, including authentication/authorization and rate limiting for HTTP endpoints.
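The domain validation described above could look something like the following sketch. The `ALLOWED_HOSTS` list and the `isAllowedDocUrl` helper are hypothetical: the listing does not say which hostnames the server accepts or how it performs the check, so treat this only as one reasonable way to implement an allowlist.

```typescript
// Assumed allowlist for illustration; the server's real list is not documented here.
const ALLOWED_HOSTS: readonly string[] = [
  "developer.salesforce.com",
  "help.salesforce.com",
];

// Returns true only for well-formed HTTPS URLs whose hostname is on the allowlist.
function isAllowedDocUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    // Not a parseable absolute URL.
    return false;
  }
  if (url.protocol !== "https:") {
    return false;
  }
  // Exact hostname match, so look-alike hosts such as
  // "developer.salesforce.com.evil.example" are rejected.
  return ALLOWED_HOSTS.includes(url.hostname);
}
```

Comparing the parsed hostname exactly, rather than checking the raw string with a substring or prefix test, avoids accepting URLs that merely contain an allowed domain somewhere in the text.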
Similar Servers
docfork
Provides live-synced, context-aware, and version-accurate documentation to AI models, preventing hallucinations and context bloat for developer tasks.
livewire-flux-mcp
This MCP server provides AI assistants with structured access to Livewire Flux component, layout, and icon documentation through web scraping.
lyra-tool-discovery
This MCP server is designed to fetch, parse, and organize documentation from websites implementing the llms.txt standard. It transforms raw documentation into structured, agent-ready formats, exposing tools for AI agents, LLMs, and automation workflows to consume documentation programmatically.
md-server
Converts various documents, webpages, and media files into markdown format, serving as an HTTP API or an MCP server for AI assistants to read and process content.