largefile
Verified Safe
by peteretelej
Overview
Navigate, search, and edit large codebases, logs, and data files that exceed AI context limits, using semantic code analysis and robust editing features.
Installation
uvx --from largefile largefile-mcp
Environment Variables
- LARGEFILE_MEMORY_THRESHOLD_MB
- LARGEFILE_MMAP_THRESHOLD_MB
- LARGEFILE_MAX_LINE_LENGTH
- LARGEFILE_TRUNCATE_LENGTH
- LARGEFILE_FUZZY_THRESHOLD
- LARGEFILE_MAX_SEARCH_RESULTS
- LARGEFILE_CONTEXT_LINES
- LARGEFILE_SIMILAR_MATCH_LIMIT
- LARGEFILE_SIMILAR_MATCH_THRESHOLD
- LARGEFILE_STREAMING_CHUNK_SIZE
- LARGEFILE_BACKUP_DIR
- LARGEFILE_MAX_BACKUPS
- LARGEFILE_MAX_BATCH_CHANGES
- LARGEFILE_ENABLE_TREE_SITTER
- LARGEFILE_TREE_SITTER_TIMEOUT
- LARGEFILE_LOG_LEVEL
- LARGEFILE_ENABLE_METRICS
- LARGEFILE_LOG_FILE
- LARGEFILE_ENABLE_PARALLEL_SEARCH
- LARGEFILE_ENABLE_AST_CACHE
Security Notes
The server primarily performs file I/O, string processing, and AST parsing. It normalizes paths with `os.path.expanduser` and `os.path.abspath` but does not restrict file access to a specific directory. If the AI supplies an arbitrary absolute path, the server can therefore read or write sensitive system files (e.g., `/etc/passwd`) whenever the server process has the necessary operating system permissions.
For data integrity, the server writes files atomically (temp file + rename) and creates automatic backups.
Semantic parsing relies on Tree-sitter, a native extension, which carries the inherent risk of native-code vulnerabilities when processing malformed input. The server mitigates this with error handling and parsing timeouts.
No direct `eval()` or `os.system()` calls with unchecked user input were found. The `is_binary_file` function helps prevent text operations on binary files but does not mitigate the risks of executing them.
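The path handling and atomic-write behavior described above can be sketched roughly as follows. This is a minimal illustration, not largefile's actual code: the function names `normalize_path`, `is_within`, and `atomic_write`, the `.bak` backup naming, and the `backup_dir` parameter are all assumptions made for the example. In particular, `is_within` is a hypothetical containment check of the kind the notes say the server does not perform.

```python
import os
import shutil
import tempfile
from typing import Optional


def normalize_path(path: str) -> str:
    """Expand '~' and make the path absolute.

    Normalization alone does not confine the path to any directory.
    """
    return os.path.abspath(os.path.expanduser(path))


def is_within(path: str, root: str) -> bool:
    """Hypothetical containment check (not part of largefile).

    Returns True only if `path` resolves to a location inside `root`.
    """
    real_path = os.path.realpath(normalize_path(path))
    real_root = os.path.realpath(normalize_path(root))
    try:
        return os.path.commonpath([real_path, real_root]) == real_root
    except ValueError:
        # Raised on Windows when the paths are on different drives.
        return False


def atomic_write(path: str, text: str, backup_dir: Optional[str] = None) -> None:
    """Write `text` to `path` via a temp file and rename, with an optional backup."""
    target = normalize_path(path)

    # Copy the existing file aside before replacing it (assumed naming scheme).
    if backup_dir is not None and os.path.exists(target):
        os.makedirs(backup_dir, exist_ok=True)
        backup = os.path.join(backup_dir, os.path.basename(target) + ".bak")
        shutil.copy2(target, backup)

    # Create the temp file in the target's directory so the rename stays
    # on one filesystem, which is what makes the replace atomic.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(target), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp, target)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

With these helpers, `normalize_path("/tmp/../etc/passwd")` yields `/etc/passwd` on a POSIX system, which shows why normalization by itself is not an access control. A deployment that wants to confine the server would need something like `is_within` against an allowed root, or OS-level restrictions on the server process.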
Similar Servers
chunkhound
Provides local-first codebase intelligence, extracting architecture, patterns, and institutional knowledge for AI assistants.
code-index-mcp
Intelligent code indexing and analysis for Large Language Models, enabling tasks such as code review, refactoring, documentation generation, debugging assistance, and architectural analysis.
consult7
Enables AI agents to analyze extensive file collections (e.g., codebases) using large context window models via OpenRouter, overcoming agent context limits.
aleph
Aleph is an MCP server that provides LLMs programmatic access to gigabytes of local data without consuming context, implementing the Recursive Language Model (RLM) architecture.