mcp-server-for-rag
Verified Safe
by yfujita
Overview
A Retrieval Augmented Generation (RAG) system that crawls web pages, indexes them in Elasticsearch, provides an MCP server for LLMs to query, and includes a chat UI for interaction.
Installation
./run.sh
Environment Variables
- OPENAI_API_KEY
- ENABLE_EMBEDDING
- EMBEDDING_API_URL
- MCP_SERVER_URL
- LLM_MODEL
- ES_URL
- MCP_TRANSPORT_TYPE
- CRAWLER_CONFIG_FILE
- ES_HOST
- ES_PORT
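As a rough illustration of how a service might read the variables listed above, here is a minimal sketch using only the Python standard library. The variable names come from the list. The `Settings` class, the `load_settings` helper, the boolean parsing rule for `ENABLE_EMBEDDING`, and the integer parsing of `ES_PORT` are assumptions made for this example, not the project's actual configuration code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(raw: Optional[str], default: bool = False) -> bool:
    """Interpret common truthy strings; this parsing rule is an assumption."""
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    openai_api_key: Optional[str]
    enable_embedding: bool
    embedding_api_url: Optional[str]
    mcp_server_url: Optional[str]
    llm_model: Optional[str]
    es_url: Optional[str]
    mcp_transport_type: Optional[str]
    crawler_config_file: Optional[str]
    es_host: Optional[str]
    es_port: Optional[int]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each variable if present; unset variables become None (or False)."""
    port = env.get("ES_PORT")
    return Settings(
        openai_api_key=env.get("OPENAI_API_KEY"),
        enable_embedding=_as_bool(env.get("ENABLE_EMBEDDING")),
        embedding_api_url=env.get("EMBEDDING_API_URL"),
        mcp_server_url=env.get("MCP_SERVER_URL"),
        llm_model=env.get("LLM_MODEL"),
        es_url=env.get("ES_URL"),
        mcp_transport_type=env.get("MCP_TRANSPORT_TYPE"),
        crawler_config_file=env.get("CRAWLER_CONFIG_FILE"),
        es_host=env.get("ES_HOST"),
        # Assumes ES_PORT holds a plain integer when it is set.
        es_port=int(port) if port else None,
    )


if __name__ == "__main__":
    settings = load_settings()
    # Avoid printing the API key itself; only report whether it is set.
    print("OPENAI_API_KEY set:", settings.openai_api_key is not None)
    print("ES_HOST:", settings.es_host, "ES_PORT:", settings.es_port)
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise without touching the real environment. No default values are supplied here because the listing does not state any.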
Security Notes
Sensitive values such as the OpenAI API key are supplied through environment variables (the key is read from `openai_token.txt`) rather than hardcoded. In `mcp-api`, tool arguments passed from LLMs are validated with Pydantic models, which mitigates some of the risk of arbitrary code execution through tool calls. `mcp-api` also includes middleware that forces the `Host: localhost:8000` header for internal Docker communication. This works within the controlled environment, but it is an unusual pattern that needs careful consideration in other deployment scenarios.
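To make the Host-header pattern concrete, the following is a minimal, framework-free sketch of ASGI middleware that overwrites the `Host` header on every request. It is not the project's actual middleware: the `ForceHostMiddleware` class, the `echo_host_app` demo application, and the `call_once` helper are hypothetical names, and the sketch assumes `mcp-api` is served as an ASGI application. Only the header value `localhost:8000` comes from the note above.

```python
import asyncio

# Header value taken from the security note; everything else is illustrative.
FORCED_HOST = b"localhost:8000"


class ForceHostMiddleware:
    """Hypothetical ASGI middleware that replaces any incoming Host header."""

    def __init__(self, app, host: bytes = FORCED_HOST):
        self.app = app
        self.host = host

    async def __call__(self, scope, receive, send):
        if scope["type"] in ("http", "websocket"):
            # Drop every existing Host header, then append the forced value.
            headers = [
                (name, value)
                for name, value in scope.get("headers", [])
                if name.lower() != b"host"
            ]
            headers.append((b"host", self.host))
            # Copy the scope so the caller's dictionary is not mutated.
            scope = dict(scope, headers=headers)
        await self.app(scope, receive, send)


async def echo_host_app(scope, receive, send):
    """Demo app: responds with the Host header it received."""
    host = dict(scope["headers"]).get(b"host", b"")
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": host})


async def call_once(app, headers):
    """Drive a single in-memory HTTP request and return the response body."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/", "headers": headers}
    await app(scope, receive, send)
    return sent[-1]["body"]


if __name__ == "__main__":
    wrapped = ForceHostMiddleware(echo_host_app)
    body = asyncio.run(call_once(wrapped, [(b"host", b"mcp-api:8000")]))
    print(body.decode())
```

The sketch also shows why the pattern deserves scrutiny outside Docker: the application never sees the Host value the client actually sent, so any host-based routing or allow-listing behind the middleware is effectively bypassed.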
Similar Servers
DevDocs
DevDocs is a web crawling and content extraction platform designed to accelerate software development by converting documentation into LLM-ready formats for intelligent data querying and fine-tuning.
mcp-local-rag
Provides a local, RAG-like web search tool for Large Language Models to retrieve current information and context.
mcp-raganything
Provides a FastAPI REST API and MCP server for Retrieval Augmented Generation (RAG) capabilities, integrating with the RAG-Anything and LightRAG libraries for multi-modal document processing and knowledge graph operations.
viberag
Local codebase semantic search (RAG) for AI coding assistants via MCP server.