mcp_server4j
Verified Safe
by jeremylem
Overview
A Model Context Protocol (MCP) server that provides a local knowledge base with hybrid search (BM25 + vector similarity) over various document formats (PDF, Markdown, TXT).
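To illustrate what hybrid search can mean in practice, here is a minimal sketch of one common approach: normalize the BM25 and vector-similarity scores separately, then combine them with a weighted sum. The listing does not say how mcp_server4j fuses its scores, so the class name HybridFusion, the min-max normalization, and the sample scores are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of weighted score fusion; not the server's actual code.
public class HybridFusion {

    // Min-max normalize raw scores into [0, 1] so that BM25 scores and
    // vector similarities end up on a comparable scale.
    static Map<String, Double> normalize(Map<String, Double> scores) {
        Map<String, Double> out = new HashMap<>();
        if (scores.isEmpty()) {
            return out;
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores.values()) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        double range = max - min;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            // If every score is identical, treat each document as a full match.
            out.put(e.getKey(), range == 0.0 ? 1.0 : (e.getValue() - min) / range);
        }
        return out;
    }

    // Weighted sum of the two normalized score lists. A document that is
    // missing from one list contributes nothing for that list.
    static Map<String, Double> fuse(Map<String, Double> bm25,
                                    Map<String, Double> vector,
                                    double bm25Weight,
                                    double vectorWeight) {
        Map<String, Double> fused = new HashMap<>();
        for (Map.Entry<String, Double> e : normalize(bm25).entrySet()) {
            fused.merge(e.getKey(), bm25Weight * e.getValue(), Double::sum);
        }
        for (Map.Entry<String, Double> e : normalize(vector).entrySet()) {
            fused.merge(e.getKey(), vectorWeight * e.getValue(), Double::sum);
        }
        return fused;
    }

    public static void main(String[] args) {
        // Made-up scores for three made-up documents.
        Map<String, Double> bm25 = Map.of("a.pdf", 12.0, "b.md", 4.0);
        Map<String, Double> vector = Map.of("b.md", 0.9, "c.txt", 0.5);
        System.out.println(fuse(bm25, vector, 0.5, 0.5));
    }
}
```

Min-max normalization pushes the lowest-scoring candidate in each list to zero, which is one reason some systems prefer rank-based fusion instead; either choice is a design decision rather than something this listing specifies.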
Installation
docker-compose up -d
Environment Variables
- SERVER_PORT
- CHROMA_HOST
- CHROMA_PORT
- CHROMA_COLLECTION
- SPRING_PROFILES_ACTIVE
- RETRIEVAL_BM25_WEIGHT
- RETRIEVAL_VECTOR_WEIGHT
- RETRIEVAL_CANDIDATE_POOL_SIZE
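As a rough illustration of how the retrieval variables above might be consumed, the sketch below reads three of them from the environment with fallback values. In a Spring Boot application the framework would normally bind such values itself, and how this server does so is not shown in the listing. The class RetrievalSettings and the fallback values (0.5, 0.5, 50) are hypothetical placeholders, not documented defaults.

```java
import java.util.Map;

// Hypothetical settings holder; names and fallback values are assumptions.
public class RetrievalSettings {
    final double bm25Weight;
    final double vectorWeight;
    final int candidatePoolSize;

    RetrievalSettings(double bm25Weight, double vectorWeight, int candidatePoolSize) {
        this.bm25Weight = bm25Weight;
        this.vectorWeight = vectorWeight;
        this.candidatePoolSize = candidatePoolSize;
    }

    // Parse a double, falling back when the variable is unset or malformed.
    static double parseDouble(String raw, double fallback) {
        if (raw == null || raw.isBlank()) {
            return fallback;
        }
        try {
            return Double.parseDouble(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Parse an int, falling back when the variable is unset or malformed.
    static int parseInt(String raw, int fallback) {
        if (raw == null || raw.isBlank()) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Takes the environment as a map so the logic can be exercised without
    // touching the real process environment.
    static RetrievalSettings fromEnv(Map<String, String> env) {
        return new RetrievalSettings(
            parseDouble(env.get("RETRIEVAL_BM25_WEIGHT"), 0.5),        // placeholder fallback
            parseDouble(env.get("RETRIEVAL_VECTOR_WEIGHT"), 0.5),      // placeholder fallback
            parseInt(env.get("RETRIEVAL_CANDIDATE_POOL_SIZE"), 50));   // placeholder fallback
    }

    public static void main(String[] args) {
        RetrievalSettings s = fromEnv(System.getenv());
        System.out.println(s.bm25Weight + " " + s.vectorWeight + " " + s.candidatePoolSize);
    }
}
```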
Security Notes
The server architecture follows sound security practices for a Java Spring Boot application. It does not use 'eval' or similar dangerous patterns. Input sanitization is applied to Lucene queries to prevent injection. Configuration is externalized through application.yml and environment variables, with no hardcoded secrets found in the provided code. External libraries like Apache Lucene, LangChain4j, Apache Tika, and PDFBox are well-established. The primary security consideration is the ingestion of untrusted documents through the Apache Tika/PDFBox parsers, which could expose any vulnerabilities present in those libraries. However, ingestion runs as a separate CLI tool invoked by the user, so documents are expected to come from a controlled, trusted source, and the parsers are not directly exposed to untrusted external clients of the running server.
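The Lucene query sanitization mentioned above can be pictured with a small sketch that backslash-escapes the characters Lucene's classic query syntax treats as special. This is an assumption about the general technique, not the server's own code; the class LuceneEscape is hypothetical, and a project that already depends on Lucene's query parser module would more likely call its QueryParserBase.escape helper than hand-roll this.

```java
// Hypothetical stand-alone sketch of Lucene query escaping (stdlib only).
public class LuceneEscape {

    // Characters with special meaning in Lucene's classic query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Prefix every special character with a backslash so it is treated
    // as literal text rather than as query syntax.
    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: title\:\(foo OR bar\)\*
        System.out.println(escape("title:(foo OR bar)*"));
    }
}
```

Character escaping alone does not neutralize operator keywords such as OR and AND, so a sanitizer may need to handle those separately depending on how the query is built.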
Similar Servers
qdrant-loader
The QDrant Loader MCP Server provides advanced Retrieval-Augmented Generation (RAG) capabilities to AI development tools by bridging a QDrant knowledge base. It offers intelligent search through semantic, hierarchy-aware, and attachment-focused tools, integrating seamlessly with MCP-compatible AI tools to provide context-aware code assistance, documentation lookup, and intelligent suggestions.
the-pensieve
The Pensieve server acts as a RAG-based knowledge management system, allowing users to store, query, and analyze their knowledge using natural language and LLM-powered insights.
concept-rag
A RAG-based conceptual search system that transforms simple vector search into sophisticated conceptual search for document libraries.
simple-mcp-rag
An MCP server for Retrieval Augmented Generation (RAG) that ingests documents into a vector database and retrieves relevant information based on queries.