simple_local_rag
Verified Safe
by dataML007
Overview
A multi-modal Retrieval-Augmented Generation (RAG) system for querying PDF documents with conversation memory via a Streamlit UI, FastAPI backend, and MCP server integration.
Installation
./start_mcp.sh
Environment Variables
- OPENAI_API_KEY
- API_HOST
- API_PORT
- STREAMLIT_SERVER_PORT
- MCP_SERVER_PORT
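As a rough illustration of how the variables above might be consumed, here is a minimal sketch that reads them from the environment and fails fast when the API key is missing. The `load_settings` helper and every fallback value in `ASSUMED_DEFAULTS` are hypothetical placeholders for this example; the project's actual defaults are not stated here.

```python
import os
from typing import Mapping

# Fallback values are placeholders chosen for this sketch,
# not defaults documented by the project.
ASSUMED_DEFAULTS = {
    "API_HOST": "127.0.0.1",
    "API_PORT": "8000",
    "STREAMLIT_SERVER_PORT": "8501",
    "MCP_SERVER_PORT": "8001",
}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the listed variables, failing fast if the API key is missing."""
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")

    settings = {
        "OPENAI_API_KEY": api_key,
        "API_HOST": env.get("API_HOST", ASSUMED_DEFAULTS["API_HOST"]),
    }
    # Ports arrive as strings from the environment, so convert them to int.
    for name in ("API_PORT", "STREAMLIT_SERVER_PORT", "MCP_SERVER_PORT"):
        settings[name] = int(env.get(name, ASSUMED_DEFAULTS[name]))
    return settings
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise without touching the real environment.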
Security Notes
The system reads API keys with `os.getenv` and recommends storing them in a `.env` file that is excluded from version control, which is good practice. File uploads are written to temporary files and accepted only after an explicit `.pdf` extension check, which reduces the risk of direct path traversal. The `VectorStore` persists chunk metadata with `pickle.dump` and reads it back with `pickle.load`. Because unpickling can execute arbitrary code, an attacker who can tamper with the `chunks.pkl` file could exploit this as a deserialization vulnerability. For a local RAG system, the risk is mitigated by the assumption that local file access is trusted. The FastAPI backend sets `allow_origins=["*"]` for CORS, which is noted as acceptable for local development but is a security risk in production deployments.
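One possible way to narrow the pickle tampering risk described above is to authenticate the serialized bytes before unpickling them. The sketch below is not the project's implementation: `dump_signed`, `load_signed`, and the idea of a secret key held outside the data directory are assumptions made for illustration. It prepends an HMAC-SHA256 tag to the pickled payload and refuses to load data whose tag does not match.

```python
import hashlib
import hmac
import pickle

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def dump_signed(obj, key: bytes) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 tag computed over the payload."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def load_signed(blob: bytes, key: bytes):
    """Verify the tag first; unpickle only if the payload is untampered."""
    tag, payload = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("chunk metadata failed integrity check")
    return pickle.loads(payload)
```

This only helps if the key is stored somewhere an attacker with write access to `chunks.pkl` cannot read. If the metadata is plain text and numbers, a format such as JSON that cannot execute code on load may be a simpler alternative.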
Similar Servers
pageindex-mcp
Provides vectorless, reasoning-based RAG capabilities for LLMs to navigate and retrieve information from hierarchical document structures, primarily for long PDFs.
mcp-local-rag
A privacy-first, local document search server that leverages semantic search for Model Context Protocol (MCP) clients.
Docker_MCPGUIApp
A conversational AI chatbot that uses the Model Context Protocol (MCP) with Docker to integrate with LLMs and perform tool-augmented searches (web, academic papers).
agent-tool
A full-stack AI agent platform that integrates Retrieval Augmented Generation (RAG), Model Context Protocol (MCP) tools, and multi-LLM support through a modern ChatGPT-like interface.