hybrid-rag-project

Verified Safe

by gwyer

Overview

A local, privacy-preserving Retrieval-Augmented Generation (RAG) system that combines semantic and keyword search to answer questions from user-provided documents, with an MCP server API for seamless integration with Claude Desktop.

Installation

Run Command
python scripts/mcp/server_claude.py

Security Notes

The system runs entirely locally, which mitigates network-based data exfiltration risks. Configuration lives in `config.yaml`, so no secrets are hardcoded in the codebase. User queries are processed through LangChain's RAG chain and a structured query engine. The `StructuredQueryEngine` internally uses `df.query()` for structured data, which could be an injection vector if arbitrary user input were passed to it directly. However, the exposed MCP tools (`count_by_field`, `filter_dataset`) take a field and a value rather than a raw query string and validate both, which significantly reduces this risk in the MCP context. The LLM prompt explicitly instructs the model to use 'ONLY the provided context' and to 'NEVER make up or infer information'. This aims to reduce hallucination and the chance that prompt injection leads to unintended actions, though prompt injection remains a residual risk for any RAG system.
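
The difference between field/value validation and a raw query string can be illustrated with a minimal sketch. This is not the project's code: the allow-list `ALLOWED_FIELDS`, the helper `validate_field`, and the function bodies are hypothetical, and only the tool names `count_by_field` and `filter_dataset` come from the notes above. Plain lists of dicts stand in for a DataFrame to keep the example dependency-free; with pandas, the same idea would be a boolean mask such as `df[df[field] == value]` in place of `df.query(...)`.

```python
from typing import Any

# Hypothetical allow-list; the real project's field names are not stated in the listing.
ALLOWED_FIELDS = {"category", "status", "year"}


def validate_field(field: str) -> str:
    """Reject any field name that is not on the allow-list."""
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field!r}")
    return field


def filter_dataset(rows: list[dict[str, Any]], field: str, value: Any) -> list[dict[str, Any]]:
    """Return rows whose `field` equals `value`.

    The value is only ever compared as data; it is never parsed or
    evaluated as part of a query expression.
    """
    field = validate_field(field)
    return [row for row in rows if row.get(field) == value]


def count_by_field(rows: list[dict[str, Any]], field: str) -> dict[Any, int]:
    """Count rows per distinct value of an allow-listed field."""
    field = validate_field(field)
    counts: dict[Any, int] = {}
    for row in rows:
        key = row.get(field)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Usage with made-up sample data.
rows = [
    {"category": "invoice", "status": "open", "year": 2024},
    {"category": "invoice", "status": "closed", "year": 2024},
    {"category": "report", "status": "open", "year": 2025},
]
open_rows = filter_dataset(rows, "status", "open")
status_counts = count_by_field(rows, "status")
```

Because the field must match an allow-list and the value is compared by equality, an input such as `open' or '1'=='1` is treated as an ordinary string that matches no rows, instead of being interpreted as query syntax.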

Stats

Interest Score: 0
Security Score: 8
Cost Class: Low
Avg Tokens: 4000
Stars: 0
Forks: 0
Last Update: 2025-12-14

Tags

RAG, LLM, Python, Ollama, Semantic Search, Hybrid Search, MCP Server, Claude Integration, Local AI, Document Q&A