semango

by omarkamali

Overview

A hybrid search engine for codebases, documentation, and knowledge bases, combining lexical (BM25) and semantic (vector) search with a web UI and REST API.

Installation

Run Command
docker run -p 8181:8181 -v "$(pwd)":/data ghcr.io/omarkamali/semango:latest

Environment Variables

  • SEMANGO_TOKENS
  • OPENAI_API_KEY
  • SEMANGO_ENV_FILE
  • SEMANGO_MODEL_DIR

Security Notes

Critical vulnerabilities identified:

1. Authentication Bypass: The REST API (`/api/v1/search`, `/api/v1/health`, `/api/v1/stats`) is advertised as 'Token-authenticated' in the README, but the provided `api/server.go` code does not implement any authentication middleware. Requests from `ui/App.tsx` also do not include an `Authorization` header. The API is therefore accessible without any token, which is a critical vulnerability allowing unauthorized access to search and statistics. This server is unsafe to run in any exposed environment.

2. Arbitrary Code Execution (Plugins): The system supports dynamic loading of plugins (`.so` shared object files) specified in `semango.yml`. This feature allows arbitrary native code execution and presents a severe security risk if plugins are sourced from untrusted origins or if an attacker can tamper with the configuration.

3. External Binary Downloads: Installation scripts (`install_faiss.sh`, `install_onnxruntime.sh`) use `curl -L` to download external libraries (FAISS, ONNX Runtime) from GitHub releases. While GitHub is generally trusted, this makes the build depend on the integrity of external artifacts and is a supply chain risk.

4. CGO Usage: The project extensively uses CGO for FAISS and ONNX Runtime. This exposes the system to C/C++ vulnerabilities (e.g., buffer overflows, memory corruption) that Go's memory safety features would otherwise mitigate. Careful review of CGO-bound code is essential.

Stats

  • Interest Score: 0
  • Security Score: 2
  • Cost Class: Low
  • Avg Tokens: 20
  • Stars: 0
  • Forks: 0
  • Last Update: 2025-12-14

Tags

Hybrid Search, Semantic Search, Lexical Search, Knowledge Base, Code Search