MegaMind
by oluwaeinstein007
Overview
A content ingestion and processing system for AI applications, capable of crawling web pages, parsing documents, chunking text, generating LLM embeddings, and storing data for semantic search.
Installation
node dist/index.js
Environment Variables
- LLM_PROVIDER
- LLM_API_KEY
- OPENAI_API_KEY
- GOOGLE_API_KEY
- GEMINI_EMBEDDING_MODEL
- EMBEDDING_VECTOR_SIZE
- QDRANT_HOST
- QDRANT_KEY
- QDRANT_ENABLED
- DATABASE_URL
- OPENWEATHER_API_KEY
- VISA_API_KEY
- IMMIGRATION_API_KEY
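The listing does not say which of these variables are mandatory, so a startup check can only report which ones are unset and leave the decision to the operator. The following is a minimal sketch under that assumption. `KNOWN_VARS` and `missingVars` are hypothetical names for illustration, not part of MegaMind.

```typescript
// Variable names copied from the list above. Whether each one is required
// depends on the chosen provider and storage backend (not stated in the source).
const KNOWN_VARS = [
  "LLM_PROVIDER", "LLM_API_KEY", "OPENAI_API_KEY", "GOOGLE_API_KEY",
  "GEMINI_EMBEDDING_MODEL", "EMBEDDING_VECTOR_SIZE",
  "QDRANT_HOST", "QDRANT_KEY", "QDRANT_ENABLED", "DATABASE_URL",
  "OPENWEATHER_API_KEY", "VISA_API_KEY", "IMMIGRATION_API_KEY",
] as const;

// Hypothetical helper: return the names that are unset or blank in `env`.
function missingVars(
  names: readonly string[],
  env: Record<string, string | undefined> = process.env,
): string[] {
  return names.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example: log unset variables before starting the server.
const unset = missingVars(KNOWN_VARS);
if (unset.length > 0) {
  console.warn(`Unset environment variables: ${unset.join(", ")}`);
}
```

Passing an explicit `env` object instead of relying on `process.env` keeps the helper easy to test.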
Security Notes
CRITICAL: The MCP server exposes tools like `INGEST_URL_TOOL` and `INGEST_FILE_TOOL` that accept arbitrary URLs for web crawling and arbitrary file paths for ingestion directly from the MCP client. This creates severe security vulnerabilities:
1. Server-Side Request Forgery (SSRF): An attacker could use `INGEST_URL_TOOL` to force the server to make requests to internal network resources, potentially disclosing sensitive information or exploiting internal services.
2. Local File Inclusion/Disclosure: An attacker could use `INGEST_FILE_TOOL` with paths like `/etc/passwd` or `../../.env` to read and ingest sensitive files from the server's filesystem.
These tools lack explicit input validation or sanitization within the provided code, making them highly dangerous if exposed to untrusted input. The `transportType: 'stdio'` setting might mitigate direct network exposure, but a compromised MCP client or malicious input via the stdio channel still poses these risks.
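One way to narrow both risks is to validate tool inputs before any crawling or file reading happens. The sketch below is illustrative only: `isUrlAllowed` and `resolveInsideBase` are hypothetical helpers, not functions from MegaMind, and the idea of a dedicated ingestion directory is an assumption. The URL check only inspects literal hosts, so a hostname that resolves to an internal address would still need a check on the resolved IP. The path check does not follow symlinks.

```typescript
import * as net from "node:net";
import * as path from "node:path";

// Hypothetical helper: reject non-HTTP(S) URLs and literal internal hosts.
function isUrlAllowed(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  // URL.hostname keeps the brackets around IPv6 literals, so strip them.
  const host = url.hostname.replace(/^\[|\]$/g, "").toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;

  if (net.isIPv4(host)) {
    const [a, b] = host.split(".").map(Number);
    if (a === 0 || a === 10 || a === 127) return false;  // this-host, private, loopback
    if (a === 169 && b === 254) return false;            // link-local (cloud metadata)
    if (a === 172 && b >= 16 && b <= 31) return false;   // private
    if (a === 192 && b === 168) return false;            // private
  }
  if (net.isIPv6(host)) {
    if (host === "::" || host === "::1") return false;   // unspecified, loopback
    if (host.startsWith("::ffff:")) return false;        // IPv4-mapped addresses
    if (/^f[cd]/.test(host) || /^fe[89ab]/.test(host)) return false; // ULA, link-local
  }
  return true;
}

// Hypothetical helper: resolve `requested` against `baseDir` and return the
// absolute path only if it stays strictly inside that directory.
function resolveInsideBase(baseDir: string, requested: string): string | null {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, requested);
  const rel = path.relative(base, target);
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (rel === "" || escapes) return null;
  return target;
}
```

A tool handler could call these helpers first and return an error to the MCP client when `isUrlAllowed` yields `false` or `resolveInsideBase` yields `null`. A deny-list like this is a starting point rather than a complete defense. An allow-list of permitted hosts is generally stricter.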
Similar Servers
firecrawl-mcp-server
A Model Context Protocol (MCP) server for integrating Firecrawl's web scraping, crawling, search, and structured data extraction capabilities with AI agents.
DevDocs
DevDocs is a web crawling and content extraction platform designed to accelerate software development by converting documentation into LLM-ready formats for intelligent data querying and fine-tuning.
kindly-web-search-mcp-server
Provides web search with robust, LLM-optimized content retrieval from various sources (StackExchange, GitHub, Wikipedia, arXiv, and general webpages) for AI coding assistants.
flexible-graphrag
The Flexible GraphRAG MCP Server integrates document processing, knowledge graph building, hybrid search, and AI query capabilities via the Model Context Protocol (MCP) for clients like Claude Desktop and MCP Inspector.