mcpbr
Verified Safe
by greynewell
Overview
A benchmark runner for evaluating Model Context Protocol (MCP) servers and LLM-based coding agents against real-world software engineering tasks (SWE-bench).
Installation
mcpbr run -c mcpbr.yaml
Environment Variables
- ANTHROPIC_API_KEY
- SUPERMODEL_API_KEY
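A pre-flight check can catch a missing key before a benchmark run starts. The sketch below is illustrative only: the listing does not say whether both variables are always required, so treating both as mandatory is an assumption, and `missing_env_vars` is a hypothetical helper, not part of mcpbr.

```python
import os

# Variable names taken from the list above. Treating both as mandatory
# is an assumption made for this sketch.
REQUIRED_VARS = ("ANTHROPIC_API_KEY", "SUPERMODEL_API_KEY")


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty.

    `environ` defaults to the process environment; passing a plain dict
    makes the function easy to exercise without touching real settings.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All expected environment variables are set.")
```

Running the script before `mcpbr run -c mcpbr.yaml` reports which of the listed variables still need to be exported.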
Security Notes
The tool runs code inside isolated Docker containers, which significantly reduces risk to the host system. It parses SWE-bench dataset inputs with `ast.literal_eval`, which accepts only Python literals and does not execute code, so it is safe for trusted data. The Claude Code CLI is invoked with `--dangerously-skip-permissions` inside the container to provide agent functionality, which is acceptable within that sandbox. Users must ensure that their Docker daemon is secure and use only trusted MCP server implementations, because those implementations determine what the external server executes.
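To illustrate why `ast.literal_eval` is a lower-risk choice than `eval` for dataset fields, the sketch below parses a string that holds a Python literal and rejects a string that holds a function call. The helper name `parse_literal_field` and the sample values are hypothetical and are not taken from mcpbr's source.

```python
import ast


def parse_literal_field(raw):
    """Parse a string containing a Python literal (list, dict, str, number...).

    Returns None when the input is not a plain literal. ast.literal_eval
    raises ValueError for non-literal nodes such as function calls, and
    SyntaxError for text that is not valid Python at all.
    """
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return None


# A made-up field value shaped like a list of test identifiers.
parsed = parse_literal_field('["tests/test_a.py::test_x", "tests/test_b.py::test_y"]')

# A call expression is rejected instead of being executed.
rejected = parse_literal_field('__import__("os").system("echo hi")')

print(parsed)
print(rejected)
```

Note that `ast.literal_eval` can still consume excessive memory or CPU on very large or deeply nested input, which is one reason to restrict it to trusted data.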
Similar Servers
toolsdk-mcp-registry
An API-driven registry for Model Context Protocol (MCP) servers, enabling discovery, detail retrieval, and execution of various AI tools and agents.
mcp-interviewer
A Python CLI tool to evaluate Model Context Protocol (MCP) servers for agentic use-cases, by inspecting capabilities, running functional tests, and providing LLM-as-a-judge evaluations.
1xn-vmcp
An open-source platform for composing, customizing, and extending multiple Model Context Protocol (MCP) servers into a single logical, virtual MCP server, enabling fine-grained context engineering for AI workflows and agents.
modular-mcp
A proxy server that efficiently manages and loads large tool collections from multiple Model Context Protocol (MCP) servers on-demand for LLMs, reducing context overhead.