gpt-researcher

by assafelovic

Overview

An autonomous AI agent for comprehensive research over online sources and local documents, capable of generating detailed, factual, and unbiased reports. It also integrates with AI assistants (such as Claude) via the Model Context Protocol (MCP) to provide deep research capabilities.

Installation

Run Command
uvicorn main:app --reload

Environment Variables

  • OPENAI_API_KEY
  • TAVILY_API_KEY
  • LANGCHAIN_API_KEY
  • DOC_PATH
  • RETRIEVER
  • EMBEDDING
  • FAST_LLM
  • SMART_LLM
  • STRATEGIC_LLM
  • SCRAPER
  • FIRECRAWL_API_KEY
  • MCP_API_KEY
  • OPENAI_BASE_URL
  • AZURE_OPENAI_API_KEY
  • AZURE_OPENAI_ENDPOINT
  • OLLAMA_BASE_URL
  • GROQ_API_KEY
  • ANTHROPIC_API_KEY
  • MISTRAL_API_KEY
  • TOGETHER_API_KEY
  • NETMIND_API_KEY
  • GOOGLE_API_KEY
  • COHERE_API_KEY
  • FIREWORKS_API_KEY
  • BEDROCK_API_KEY
  • VOYAGE_API_KEY
  • DEEPSEEK_API_KEY
  • DASHSCOPE_API_KEY
  • AIMLAPI_API_KEY
  • VLLM_OPENAI_API_KEY
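
Since the server reads its configuration from the variables listed above, a startup check can surface missing values early. The following is a minimal sketch, not code from the project: the helper `missing_env_vars` and the choice of `OPENAI_API_KEY` and `TAVILY_API_KEY` as the required pair are assumptions for illustration, and the variables actually needed depend on which providers are configured.

```python
import os
from typing import Iterable, Mapping, Optional

# Assumption: which variables are mandatory depends on the configured
# providers; this pair is only an illustrative choice.
REQUIRED_VARS = ("OPENAI_API_KEY", "TAVILY_API_KEY")


def missing_env_vars(
    required: Iterable[str],
    env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    # Fall back to the real process environment when no mapping is given.
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


# Usage with an explicit mapping, so the result does not depend on the shell.
example_env = {"OPENAI_API_KEY": "sk-placeholder", "TAVILY_API_KEY": ""}
print(missing_env_vars(REQUIRED_VARS, example_env))  # prints ['TAVILY_API_KEY']
```

Calling `missing_env_vars(REQUIRED_VARS)` with no second argument checks the process environment, which also keeps secrets such as `LANGCHAIN_API_KEY` out of source files.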

Security Notes

Critical security vulnerabilities detected:

1. Hardcoded API Key: A LangChain API key (`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`) is hardcoded in `frontend/nextjs/components/Langgraph/Langgraph.js`. This is a severe security risk because it exposes a secret directly in the codebase.

2. Arbitrary Command Execution: The `mcp_configs` (Model Context Protocol configurations) in `backend/server/websocket_manager.py` and `gpt_researcher/mcp/client.py` can define a `command` and `args` to be executed. If untrusted input (e.g., a malicious LLM prompt or external user input) can control these configurations, it could lead to arbitrary command execution on the host system where the MCP server is running.

3. File Operations: The server handles file uploads and deletions (`handle_file_upload`, `handle_file_deletion` in `backend/server/server_utils.py`) within the `DOC_PATH`. `sanitize_filename` helps sanitize filenames, but without stringent path validation beyond `os.path.basename`, unintended file manipulation within the designated document path remains a risk.

In addition, the project uses various web scraping methods, some of which run a full browser instance (Selenium, ZenDriver). This increases the attack surface if the browser or its environment is not properly sandboxed.
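
The command-execution and file-operation risks described above are commonly mitigated with an allowlist for executables and a containment check for paths. The sketch below illustrates both under stated assumptions. It is not the project's code: `ALLOWED_COMMANDS`, `is_allowed_command`, and `resolve_inside` are hypothetical names, and the allowlist contents would be deployment-specific.

```python
from pathlib import Path

# Hypothetical allowlist; the real set of trusted MCP launchers is
# deployment-specific.
ALLOWED_COMMANDS = frozenset({"python", "node", "npx"})


def is_allowed_command(command: str, allowed: frozenset = ALLOWED_COMMANDS) -> bool:
    """Accept only bare executable names that appear in the allowlist."""
    # Rejecting path separators blocks values such as "/tmp/evil".
    if "/" in command or "\\" in command:
        return False
    return command in allowed


def resolve_inside(base_dir: str, filename: str) -> Path:
    """Resolve `filename` under `base_dir`; raise ValueError if it escapes."""
    base = Path(base_dir).resolve()
    candidate = (base / filename).resolve()
    # relative_to raises ValueError when candidate is not located under base,
    # which covers "../" traversal and absolute paths alike.
    candidate.relative_to(base)
    return candidate
```

A caller would check `is_allowed_command(config["command"])` before spawning a process and pass every upload or deletion target through `resolve_inside(doc_path, name)`, treating a `ValueError` as a rejected request. Arguments in `args` would still need their own validation, which this sketch does not attempt.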

Stats

  • Interest Score: 100
  • Security Score: 2
  • Cost Class: Medium
  • Avg Tokens: 250000
  • Stars: 24374
  • Forks: 3224
  • Last Update: 2025-12-03

Tags

  • AI Agent
  • Research
  • Web Scraping
  • LLM
  • Multi-Agent
  • MCP Integration
  • Claude Integration
  • Deep Research
  • Report Generation