kreuzberg

Verified Safe

by Goldziher

Overview

High-performance document intelligence for extracting text, metadata, and structured information from a wide range of document formats including PDFs, Office documents, images, and HTML. It supports advanced features like OCR, table extraction, chunking, language detection, and embedding generation, powered by a Rust core for native performance.

Installation

Run Command
kreuzberg mcp

Environment Variables

  • KREUZBERG_BENCHMARK_DEBUG
  • KREUZBERG_DEBUG_GUTEN
  • KREUZBERG_ENCODING_CACHE_MAX_ENTRIES
  • KREUZBERG_ENCODING_CACHE_MAX_BYTES
  • TESSDATA_PREFIX
  • LD_LIBRARY_PATH
  • DYLD_LIBRARY_PATH
  • PATH
  • NODE_PATH
  • PYTHONPATH
  • RUBYLIB
  • KREUZBERG_FFI_DIR
  • KREUZBERG_GMFT_ISOLATED
  • DOTPRODUCT
  • SCROLLVIEW_PATH
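
The listing gives the variable names and the run command but not how they fit together. The sketch below shows one way to start the server with only the listed variables forwarded from the parent environment. `build_env` and `launch_server` are hypothetical helper names, not part of kreuzberg. The sketch assumes the `kreuzberg` CLI is installed and on `PATH`, and it assumes the server talks over stdin/stdout, which this listing does not confirm.

```python
import os
import subprocess

# Variable names copied from the list above. What each one does is not
# documented in this listing, so the sketch only forwards them unchanged.
LISTED_VARS = (
    "KREUZBERG_BENCHMARK_DEBUG",
    "KREUZBERG_DEBUG_GUTEN",
    "KREUZBERG_ENCODING_CACHE_MAX_ENTRIES",
    "KREUZBERG_ENCODING_CACHE_MAX_BYTES",
    "TESSDATA_PREFIX",
    "LD_LIBRARY_PATH",
    "DYLD_LIBRARY_PATH",
    "PATH",
    "NODE_PATH",
    "PYTHONPATH",
    "RUBYLIB",
    "KREUZBERG_FFI_DIR",
    "KREUZBERG_GMFT_ISOLATED",
    "DOTPRODUCT",
    "SCROLLVIEW_PATH",
)


def build_env(source=None):
    """Return only the listed variables that are set in `source`.

    `source` defaults to the current process environment.
    """
    source = os.environ if source is None else source
    return {name: source[name] for name in LISTED_VARS if name in source}


def launch_server(env=None):
    """Start `kreuzberg mcp` as a child process with a filtered environment.

    Assumption: the server communicates over stdin/stdout, so both are piped.
    """
    return subprocess.Popen(
        ["kreuzberg", "mcp"],
        env=build_env() if env is None else env,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )


if __name__ == "__main__":
    # Show what would be forwarded, without starting the server.
    for name, value in sorted(build_env().items()):
        print(f"{name}={value}")
```

Filtering the environment this way keeps unrelated variables, such as credentials held by the parent process, out of the child process.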

Security Notes

The project includes an API server and an MCP (Model Context Protocol) server that lets MCP clients call it as a tool. Both are robust for local and internal use, but exposing either interface publicly without additional security measures (e.g., authentication, access control, network segmentation) could introduce vulnerabilities. The `biome.json` linter configuration contains some exceptions for `noExplicitAny`, which points to places where type strictness is relaxed. On the other hand, dependencies are managed with `pnpm` and a lockfile, and code quality is checked with `oxlint`. No obvious malicious patterns or hardcoded critical secrets were found in the provided snippets.

Stats

Interest Score: 100
Security Score: 7
Cost Class: High
Avg Tokens: 3000
Stars: 2565
Forks: 113
Last Update: 2025-12-06

Tags

document intelligence, PDF extraction, OCR, data extraction, text processing, metadata extraction, Python library, Rust core, multilingual, embeddings