lemonade
Verified Safe
by lemonade-sdk
Overview
The Lemonade C++ Server provides a lightweight, high-performance HTTP API for local Large Language Model (LLM) inference and model management, leveraging hardware accelerators like AMD Ryzen AI NPU, integrated GPUs, and discrete GPUs.
Installation
./lemonade-router
Environment Variables
- HF_TOKEN
- LEMONADE_CACHE_DIR
- LEMONADE_OFFLINE
- LEMONADE_CI_MODE
- LEMONADE_LLAMACPP_BACKEND
- LEMONADE_LLAMACPP_<BACKEND>_BIN
- RYZENAI_SKIP_PROCESSOR_CHECK
- OCL_SET_SVM_SIZE
- LD_LIBRARY_PATH
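When debugging a launch, it can help to see which of the variables listed above are actually set before starting the server. The following is a minimal sketch, not part of Lemonade itself: the helper names `collect_lemonade_env` and `redact` are hypothetical, and the sketch assumes nothing about what values each variable accepts. The `LEMONADE_LLAMACPP_<BACKEND>_BIN` entry is treated as a name pattern, since the backend part varies.

```python
import os

# Variable names copied from the list above. The <BACKEND> entry is a
# pattern rather than a fixed name, so it is matched separately below.
KNOWN_VARS = [
    "HF_TOKEN",
    "LEMONADE_CACHE_DIR",
    "LEMONADE_OFFLINE",
    "LEMONADE_CI_MODE",
    "LEMONADE_LLAMACPP_BACKEND",
    "RYZENAI_SKIP_PROCESSOR_CHECK",
    "OCL_SET_SVM_SIZE",
    "LD_LIBRARY_PATH",
]


def collect_lemonade_env(environ):
    """Return the subset of `environ` whose names appear in the list above."""
    found = {name: environ[name] for name in KNOWN_VARS if name in environ}
    # Pick up any LEMONADE_LLAMACPP_<BACKEND>_BIN variables by prefix and suffix.
    for name, value in environ.items():
        if name.startswith("LEMONADE_LLAMACPP_") and name.endswith("_BIN"):
            found[name] = value
    return found


def redact(name, value):
    """Hide the Hugging Face token so it is never printed in plain text."""
    return "<set>" if name == "HF_TOKEN" else value


if __name__ == "__main__":
    for name, value in sorted(collect_lemonade_env(os.environ).items()):
        print(f"{name}={redact(name, value)}")
```

Run in the same shell that will launch `./lemonade-router`, this prints only the variables that are set, with the token value masked.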
Security Notes
The server uses `system()` calls to check for external tools (e.g., `where flm`, `vulkaninfo`) and to install or extract binaries (e.g., `unzip`, PowerShell `Expand-Archive`). The commands and paths are largely constructed internally or derived from trusted sources (GitHub and Hugging Face releases), and some user input is validated (e.g., custom llama-server args), but any interaction with external processes carries inherent risk. The HTTP server binds to `localhost` by default, which limits the exposure created by its permissive CORS policy (`Access-Control-Allow-Origin: *`). Single-instance protection is implemented via system-wide mutexes or file locks.
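The file-lock form of single-instance protection mentioned above can be illustrated with a short sketch. This is not Lemonade's implementation (the server is written in C++, and its lock path and mechanism are not described here). It is a POSIX-only Python illustration of the general technique using an advisory `flock`, with a hypothetical helper name and lock file name.

```python
import fcntl
import os
import tempfile


def try_acquire_single_instance_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if it is already held.

    The lock lives as long as the returned file object stays open, so the
    caller must keep a reference to it for the lifetime of the process.
    """
    handle = open(lock_path, "a+")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting
        # when another open file description already holds the lock.
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


if __name__ == "__main__":
    # Hypothetical lock file name, chosen only for this example.
    lock_path = os.path.join(tempfile.gettempdir(), "example-server.lock")
    lock = try_acquire_single_instance_lock(lock_path)
    if lock is None:
        print("another instance appears to be running")
    else:
        print("lock acquired; this instance may start")
        lock.close()
```

Because the operating system releases the lock when the holding process exits, a crashed instance does not leave a stale lock behind, which is one reason this approach is often preferred over checking for a PID file.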
Similar Servers
osaurus
Osaurus is a native macOS LLM server running local language models with OpenAI and Ollama compatible APIs, enabling tool calling and a plugin ecosystem for AI agents.
claude-prompts-mcp
Manages hot-reloadable prompt templates, structured reasoning, and multi-step chain workflows to enhance AI assistant interactions through a Model Context Protocol (MCP) compatible server.
finance-trading-ai-agents-mcp
A specialized MCP server for financial analysis and quantitative trading, designed to deploy local financial MCP services with a departmental architecture for LLM integration and algorithmic trading.
remembrances-mcp
Provides long-term memory capabilities to AI agents through key-value, vector/RAG, and graph database layers, with advanced code indexing for semantic search and navigation.