gpt-researcher
by assafelovic
Overview
An autonomous AI agent for comprehensive research over online sources and local documents, capable of generating detailed, factual, and unbiased reports. It also integrates with AI assistants (such as Claude) via the Model Context Protocol (MCP) to provide deep research capabilities.
Installation
uvicorn main:app --reload
Environment Variables
- OPENAI_API_KEY
- TAVILY_API_KEY
- LANGCHAIN_API_KEY
- DOC_PATH
- RETRIEVER
- EMBEDDING
- FAST_LLM
- SMART_LLM
- STRATEGIC_LLM
- SCRAPER
- FIRECRAWL_API_KEY
- MCP_API_KEY
- OPENAI_BASE_URL
- AZURE_OPENAI_API_KEY
- AZURE_OPENAI_ENDPOINT
- OLLAMA_BASE_URL
- GROQ_API_KEY
- ANTHROPIC_API_KEY
- MISTRAL_API_KEY
- TOGETHER_API_KEY
- NETMIND_API_KEY
- GOOGLE_API_KEY
- COHERE_API_KEY
- FIREWORKS_API_KEY
- BEDROCK_API_KEY
- VOYAGE_API_KEY
- DEEPSEEK_API_KEY
- DASHSCOPE_API_KEY
- AIMLAPI_API_KEY
- VLLM_OPENAI_API_KEY
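The variables listed above are read from the process environment. Which of them are actually required depends on the retriever, scraper, and LLM provider you choose, so the following is only a minimal sketch of a startup check. The helper name `missing_env_vars` and the choice of `OPENAI_API_KEY` and `TAVILY_API_KEY` as the required pair are assumptions for illustration, not part of gpt-researcher.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_vars(
    names: Iterable[str],
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names that are unset or empty in the given environment.

    Hypothetical helper: `environ` defaults to os.environ, but a plain dict
    can be passed in to make the check easy to test.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    # Assumption: this pair is what a default OpenAI + Tavily setup needs.
    required = ["OPENAI_API_KEY", "TAVILY_API_KEY"]
    missing = missing_env_vars(required)
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required environment variables are set.")
```

Running a check like this before starting the server with `uvicorn` makes a misconfigured environment fail early rather than partway through a research run.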
Security Notes
Critical security vulnerabilities detected:
1. Hardcoded API Key: A LangChain API key (`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`) is hardcoded in `frontend/nextjs/components/Langgraph/Langgraph.js`. This is a severe security risk because it exposes a secret directly in the codebase.
2. Arbitrary Command Execution: The `mcp_configs` (Model Context Protocol configurations) in `backend/server/websocket_manager.py` and `gpt_researcher/mcp/client.py` can define a `command` and `args` to be executed. If these configurations can be controlled by untrusted input (e.g., a malicious LLM prompt or external user input), this could lead to arbitrary command execution on the host system where the MCP server is running.
3. File Operations: The server handles file uploads and deletions (`handle_file_upload`, `handle_file_deletion` in `backend/server/server_utils.py`) within the `DOC_PATH`. `sanitize_filename` helps, but without path validation stricter than `os.path.basename`, unintended file manipulation within the designated document path remains possible.
In addition, the project uses several web scraping methods, some of which run a full browser instance (Selenium, ZenDriver). This increases the attack surface if the browser or its environment is not properly sandboxed.
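Two of the risks above (command execution and file operations) are commonly mitigated with an allowlist and a path containment check. The sketch below illustrates both using only the Python standard library. It is not gpt-researcher's code: the function names, the `ALLOWED_COMMANDS` set, and the assumption that a config is a dict with `command` and `args` keys are hypothetical.

```python
import os
from typing import List, Mapping

# Assumption: the set of launchers you are willing to run. Adjust as needed.
ALLOWED_COMMANDS = frozenset({"python", "npx"})


def validate_mcp_command(config: Mapping, allowed=ALLOWED_COMMANDS) -> List[str]:
    """Return [command, *args] only if the command is explicitly allowlisted."""
    command = config.get("command")
    args = config.get("args", [])
    if command not in allowed:
        raise ValueError(f"command not allowed: {command!r}")
    if not isinstance(args, (list, tuple)) or not all(isinstance(a, str) for a in args):
        raise ValueError("args must be a list of strings")
    return [command, *args]


def resolve_inside(base_dir: str, user_filename: str) -> str:
    """Map a user-supplied filename to a path guaranteed to stay under base_dir."""
    # Drop any directory components the client sent.
    name = os.path.basename(user_filename)
    if name in ("", ".", ".."):
        raise ValueError("invalid filename")
    # Resolve symlinks on both sides, then confirm containment.
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, name))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes base directory")
    return candidate
```

The `realpath` plus `commonpath` step goes beyond `os.path.basename` alone: it also rejects a file inside the document directory that is a symlink pointing elsewhere. The allowlist restricts which executable is launched but does not validate the arguments' meaning, so it reduces rather than eliminates the risk.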
Similar Servers
deep-research
An AI-powered research assistant that generates comprehensive reports, leverages various LLMs and web search engines, and offers integration as a SaaS or MCP service.
DevDocs
Provides intelligent web crawling and documentation extraction, storing content in a Model Context Protocol server for LLM querying and accelerating developer research.
mcp-omnisearch
Provides a unified interface for LLMs to access multiple web search, AI response, content processing, and enhancement tools from various providers through the Model Context Protocol (MCP).
mcp-server
Provides a Model Context Protocol (MCP) server for AI agents to search and retrieve curated documentation for the Strands Agents framework, facilitating AI coding assistance.