wayland-mcp
by kurojs
Overview
Enables AI assistants to automate a Wayland desktop through screenshot capture, VLM analysis of screenshots, and mouse and keyboard control.
Installation
uvx wayland-mcp
Environment Variables
- OPENROUTER_API_KEY
- GEMINI_API_KEY
- VLM_PROVIDER
- VLM_MODEL
- XDG_RUNTIME_DIR
- WAYLAND_DISPLAY
- WAYLAND_MCP_PORT
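The listing does not say which of these variables are required or what values they accept. As a minimal sketch, the following Python snippet only reports which of the listed names are set in the current environment before the server is launched; the helper name `report_env` is hypothetical and not part of wayland-mcp.

```python
import os

# Names copied from the list above. Whether each one is required or optional
# is NOT stated in the source, so this script only reports status.
LISTED_VARS = [
    "OPENROUTER_API_KEY",
    "GEMINI_API_KEY",
    "VLM_PROVIDER",
    "VLM_MODEL",
    "XDG_RUNTIME_DIR",
    "WAYLAND_DISPLAY",
    "WAYLAND_MCP_PORT",
]


def report_env(environ=None):
    """Map each listed variable to True if it is set and non-empty."""
    env = os.environ if environ is None else environ
    return {name: bool(env.get(name)) for name in LISTED_VARS}


if __name__ == "__main__":
    for name, is_set in report_env().items():
        print(f"{name}: {'set' if is_set else 'unset'}")
```

Passing a plain dict instead of `os.environ` makes it possible to check a planned configuration without exporting anything.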
Security Notes
The `setup.sh` script performs highly privileged operations: it installs `evemu-tools`, sets the `setuid` bit on `evemu-event`, adds a `NOPASSWD` sudoers rule for `evemu-event`, and changes permissions on `/dev/input/event*` to `0666`. Together these give the Wayland MCP server low-level control over keyboard and mouse input, and mode `0666` makes the input devices readable and writable by every local user. This is a significant security risk if the server or a connected AI client is compromised. The README explicitly warns about this.
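To see whether the permission change described above is in effect on a machine, one option is to check the input device nodes for the world-writable bit. This is a rough sketch using only the Python standard library; the function name `world_writable` is hypothetical, and it inspects file modes only, not the sudoers rule or the `setuid` bit.

```python
import glob
import os
import stat


def world_writable(paths):
    """Return the paths whose mode grants write access to 'other' users."""
    flagged = []
    for path in paths:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            continue  # node vanished or cannot be inspected; skip it
        if mode & stat.S_IWOTH:
            flagged.append(path)
    return flagged


if __name__ == "__main__":
    # /dev/input/event* is the device pattern named in the security note.
    for path in world_writable(sorted(glob.glob("/dev/input/event*"))):
        print(f"world-writable: {path}")
```

An empty result means none of the inspected nodes is writable by arbitrary local users; any printed path has the kind of permission that `0666` would produce.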
Similar Servers
UI-TARS-desktop
UI-TARS-desktop is a native GUI Agent application powered by multimodal AI models, enabling users to control their computer and browser through natural language instructions.
Windows-MCP
This MCP server enables AI agents to directly interact with the Windows operating system, performing tasks such as file navigation, application control, UI interaction, and QA testing.
mcp-server-browserbase
Enables LLMs to perform cloud browser automation tasks such as navigating, interacting with elements, extracting data, and capturing screenshots on web pages.
Peekaboo
macOS automation server that integrates AI for screen capture analysis, UI interaction, and agentic workflows.