UI-TARS-desktop

Name: UI-TARS-desktop
Author: alaa-nadi

Overview

A GUI agent application that lets users control their computer and perform tasks using natural language. It relies on Vision-Language Models (VLMs) and the Model Context Protocol (MCP) for interaction.

Installation

Run Command

pnpm run dev:agent-tars

Environment Variables

VLM_PROVIDER
VLM_BASE_URL
VLM_API_KEY
VLM_MODEL_NAME
PORT
START_MINIMIZED
ELECTRON_RENDERER_URL
CI
UPGRADE_EXTENSIONS
OPENAI_API_KEY
OPENAI_API_BASE_URL
OPENAI_DEFAULT_MODEL
AZURE_OPENAI_ENDPOINT
AZURE_OPENAI_API_VERSION
AZURE_OPENAI_MODEL
AZURE_OPENAI_API_KEY
ANTHROPIC_API_KEY
ANTHROPIC_API_BASE_URL
ANTHROPIC_DEFAULT_MODEL
GEMINI_API_KEY
GEMINI_API_BASE_URL
GEMINI_DEFAULT_MODEL
MISTRAL_API_KEY
MISTRAL_API_BASE_URL
MISTRAL_DEFAULT_MODEL
TAVILY_API_KEY
BING_SEARCH_API_KEY
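
The listing above does not say which variables are required or how the application reads them. As a minimal sketch, assuming the four `VLM_*` variables must all be set before the agent can reach a model, a loader could look like the following. The `loadVlmConfig` function and the `VlmConfig` shape are hypothetical names used for illustration; they are not taken from the project.

```typescript
// Hypothetical config shape: one field per VLM_* variable listed above.
interface VlmConfig {
  provider: string;
  baseUrl: string;
  apiKey: string;
  modelName: string;
}

// An environment-like object, e.g. process.env in Node.js.
type Env = Record<string, string | undefined>;

// Assumption: all four VLM_* variables are mandatory. The source does not
// state which variables are required.
function loadVlmConfig(env: Env): VlmConfig {
  const names = ["VLM_PROVIDER", "VLM_BASE_URL", "VLM_API_KEY", "VLM_MODEL_NAME"];
  // Treat unset and whitespace-only values as missing.
  const missing = names.filter((name) => (env[name] ?? "").trim() === "");
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    provider: env["VLM_PROVIDER"] as string,
    baseUrl: env["VLM_BASE_URL"] as string,
    apiKey: env["VLM_API_KEY"] as string,
    modelName: env["VLM_MODEL_NAME"] as string,
  };
}
```

In a Node.js process this would be called as `loadVlmConfig(process.env)`. Failing early with the full list of missing names is a design choice that makes a misconfigured environment easier to diagnose than a later request error.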

Security Notes

The `ui-tars-desktop` application has critical Electron security vulnerabilities, including:

1) `preload/index.ts` directly exposes `ipcRenderer` methods to the renderer process (`contextIsolation` bypassed for `window.electron`), allowing potential full Node.js API access if a script is injected.

2) `apps/ui-tars/src/main/window/ScreenMarker.ts` creates new `BrowserWindow` instances with `nodeIntegration: true` and `contextIsolation: false`, making these windows highly vulnerable to arbitrary code execution.

3) `apps/ui-tars/src/main/window/createWindow.ts` uses `sandbox: false`.

The `agent-tars-app` part, while using `contextIsolation: true` and a Content Security Policy, sets `webSecurity: false` for its main window. This allows unrestricted cross-origin requests, which is a significant risk.

The integration with the `mcp-servers/commands` package allows execution of arbitrary shell commands, posing a severe risk if LLM output is not perfectly sanitized.

File system access (`ipcRoutes/filesystem.ts`) can be configured via `setAllowedDirectories`, but improper configuration or a bypass could lead to unauthorized file operations.

`shell.openExternal` and `shell.openPath` calls can open arbitrary URLs or local files from agent actions.
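
The `webPreferences` flags called out in these notes can be checked mechanically. The sketch below is a hypothetical audit helper, not code from the project: `WebPrefs` mirrors only the four Electron `webPreferences` fields mentioned above, and `auditWebPreferences` is an illustrative name. It assumes that unset fields fall back to the secure defaults of recent Electron versions, so only explicitly insecure values are flagged.

```typescript
// Subset of Electron's webPreferences fields named in the security notes.
interface WebPrefs {
  nodeIntegration?: boolean;
  contextIsolation?: boolean;
  sandbox?: boolean;
  webSecurity?: boolean;
}

// Returns one finding per explicitly insecure setting.
// Assumption: unset fields use recent Electron defaults
// (nodeIntegration off; contextIsolation, sandbox, webSecurity on).
function auditWebPreferences(prefs: WebPrefs): string[] {
  const findings: string[] = [];
  if (prefs.nodeIntegration === true) {
    findings.push("nodeIntegration should be false");
  }
  if (prefs.contextIsolation === false) {
    findings.push("contextIsolation should be true");
  }
  if (prefs.sandbox === false) {
    findings.push("sandbox should be true");
  }
  if (prefs.webSecurity === false) {
    findings.push("webSecurity should be true");
  }
  return findings;
}

// The combination reported for ScreenMarker.ts yields two findings.
const screenMarkerFindings = auditWebPreferences({
  nodeIntegration: true,
  contextIsolation: false,
});
console.log(screenMarkerFindings);
```

A check like this could run in a unit test over every options object passed to `BrowserWindow`, so that a regression in any window's settings fails the build rather than shipping.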

Similar Servers

UI-TARS-desktop
24014
UI-TARS-desktop is a native GUI Agent application powered by multimodal AI models, enabling users to control their computer and browser through natural language instructions.
Other
Cost class: High

Windows-MCP
3968
This MCP server enables AI agents to directly interact with the Windows operating system, performing tasks such as file navigation, application control, UI interaction, and QA testing.
Other
Cost class: Medium

Windows-MCP.Net
223
Enabling AI assistants to automate tasks and interact with the Windows desktop environment.
Other
Cost class: Medium

mcp-vnc
An MCP server for AI agents to remotely control VNC-enabled desktops (Windows, Linux, macOS) through mouse, keyboard, text input, and screen capture commands.
Other
Cost class: High

Stats

Interest Score: 34
Security Score: 2
Cost Class: High
Avg Tokens: 3500
Stars: 4
Forks: 0
Last Update: 2025-12-15
