ai-safety-mcp-server

Verified Safe

by Jack0319

Overview

A centralized Model Context Protocol (MCP) server for AI Safety research, providing knowledge base, safety evaluation, mechanistic interpretability, and governance tools for research assistants and agentic systems.

Installation

Run Command
docker-compose up -d

Environment Variables

  • LITELLM_API_KEY
  • SAFETY_EVAL_MODEL
  • KB_VECTORSTORE_URL
  • KB_COLLECTION
  • INTERP_MODEL_DIR
  • LOG_LEVEL
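
Before starting the container, it can help to confirm which of these variables are set in the current shell. The sketch below is only illustrative: the variable names come from the list above, but the helper `missing_vars` is hypothetical, and the listing does not say which variables are required or what their defaults are.

```python
import os

# Names copied from the Environment Variables list; whether each one is
# strictly required by the server is an assumption, not a documented fact.
EXPECTED_VARS = [
    "LITELLM_API_KEY",
    "SAFETY_EVAL_MODEL",
    "KB_VECTORSTORE_URL",
    "KB_COLLECTION",
    "INTERP_MODEL_DIR",
    "LOG_LEVEL",
]


def missing_vars(env=None):
    """Return the listed variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        # Only names are printed, never values, so secrets stay out of logs.
        print("Unset variables: " + ", ".join(missing))
    else:
        print("All listed variables are set.")
```

Running the script before `docker-compose up -d` reports any names that are still unset, without echoing secret values.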

Security Notes

Secrets such as LITELLM_API_KEY are supplied through environment variables. The server communicates over stdio (local IPC) by default; TCP transport is planned but not yet implemented. The README warns strongly against exposing the server directly to the internet and explicitly recommends deploying it behind an authenticated proxy and using VPNs or private networks. Interpretability tools load models from HuggingFace or local paths, so the model source must be trusted, which is standard practice in ML development. No direct `eval()` of user input or dangerous `subprocess` calls were identified.
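
One way an operator might narrow the trust requirement for local model paths is to accept only paths that resolve inside a single trusted directory. This is a sketch of that general technique, not a description of how the server behaves: the function `is_within` is hypothetical, and treating the value of INTERP_MODEL_DIR as the trusted root is an assumption. It does not cover models fetched from HuggingFace by identifier.

```python
from pathlib import Path


def is_within(trusted_root: str, candidate: str) -> bool:
    """Return True if candidate resolves to a path inside trusted_root.

    Hypothetical helper: relative candidates are interpreted against
    trusted_root, and ".." segments and symlinks are resolved before
    the comparison.
    """
    root = Path(trusted_root).resolve()
    # Joining with an absolute candidate yields the candidate itself,
    # so absolute paths outside the root are rejected below.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

For example, with a trusted root of `/srv/models` (an illustrative path), `is_within("/srv/models", "gpt2-small")` accepts the request, while `is_within("/srv/models", "../etc/passwd")` rejects it.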

Stats

  • Interest Score: 33
  • Security Score: 9
  • Cost Class: High
  • Avg Tokens: 1550
  • Stars: 1
  • Forks: 0
  • Last Update: 2025-11-24

Tags

  • AI Safety
  • MCP
  • Interpretability
  • LLM Evaluations
  • Knowledge Base