
UI-TARS-desktop

by alaa-nadi

Overview

A GUI agent application that lets users control their computer and perform tasks using natural language, leveraging Vision-Language Models (VLMs) and the Model Context Protocol (MCP) for interaction.

Installation

Run Command
pnpm run dev:agent-tars

Environment Variables

  • VLM_PROVIDER
  • VLM_BASE_URL
  • VLM_API_KEY
  • VLM_MODEL_NAME
  • PORT
  • START_MINIMIZED
  • ELECTRON_RENDERER_URL
  • CI
  • UPGRADE_EXTENSIONS
  • OPENAI_API_KEY
  • OPENAI_API_BASE_URL
  • OPENAI_DEFAULT_MODEL
  • AZURE_OPENAI_ENDPOINT
  • AZURE_OPENAI_API_VERSION
  • AZURE_OPENAI_MODEL
  • AZURE_OPENAI_API_KEY
  • ANTHROPIC_API_KEY
  • ANTHROPIC_API_BASE_URL
  • ANTHROPIC_DEFAULT_MODEL
  • GEMINI_API_KEY
  • GEMINI_API_BASE_URL
  • GEMINI_DEFAULT_MODEL
  • MISTRAL_API_KEY
  • MISTRAL_API_BASE_URL
  • MISTRAL_DEFAULT_MODEL
  • TAVILY_API_KEY
  • BING_SEARCH_API_KEY
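
As a rough illustration of how the `VLM_*` variables might be consumed, the sketch below reads them from an environment map and fails fast when one is missing. The variable names come from the list above; the `loadVlmConfig` helper, the `VlmConfig` shape, and the assumption that all four variables are required are hypothetical and not taken from the project.

```typescript
// Hypothetical helper: the variable names come from the list above, but which
// ones are required and how they are validated is an assumption.
interface VlmConfig {
  provider: string;
  baseUrl: string;
  apiKey: string;
  modelName: string;
}

type Env = Record<string, string | undefined>;

function loadVlmConfig(env: Env): VlmConfig {
  // Return the trimmed value, or throw if it is unset or empty.
  const read = (name: string): string => {
    const value = env[name]?.trim();
    if (!value) {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  const baseUrl = read("VLM_BASE_URL");
  // The URL constructor throws a TypeError on malformed input.
  new URL(baseUrl);

  return {
    provider: read("VLM_PROVIDER"),
    baseUrl,
    apiKey: read("VLM_API_KEY"),
    modelName: read("VLM_MODEL_NAME"),
  };
}
```

In a Node.js or Electron main process this would typically be called as `loadVlmConfig(process.env)` before the agent starts, so that a misconfigured environment is reported at startup rather than on the first model call.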

Security Notes

The `ui-tars-desktop` application has critical Electron security vulnerabilities, including:

  • `preload/index.ts` directly exposes `ipcRenderer` methods to the renderer process (`contextIsolation` bypassed for `window.electron`), allowing potential full Node.js API access if a script is injected.
  • `apps/ui-tars/src/main/window/ScreenMarker.ts` creates new `BrowserWindow` instances with `nodeIntegration: true` and `contextIsolation: false`, making these windows highly vulnerable to arbitrary code execution.
  • `apps/ui-tars/src/main/window/createWindow.ts` uses `sandbox: false`.

The `agent-tars-app` part, while using `contextIsolation: true` and a Content Security Policy, sets `webSecurity: false` for its main window, allowing unrestricted cross-origin requests, which is a significant risk.

Further risks:

  • The integration with the `mcp-servers/commands` package allows execution of arbitrary shell commands, posing a severe risk if LLM output is not perfectly sanitized.
  • File system access (`ipcRoutes/filesystem.ts`) can be configured via `setAllowedDirectories`, but improper configuration or a bypass could lead to unauthorized file operations.
  • `shell.openExternal` and `shell.openPath` calls can open arbitrary URLs or local files from agent actions.
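
The Electron flags called out in these notes can be checked mechanically. The following is a minimal sketch of such a check, not code from the project: `auditWebPreferences` and `WebPreferencesFlags` are hypothetical names, and only the four `webPreferences` flags mentioned above are modeled. Flags that are left unset are not reported, since their defaults depend on the Electron version.

```typescript
// Models only the Electron `webPreferences` flags named in the notes above.
interface WebPreferencesFlags {
  nodeIntegration?: boolean;
  contextIsolation?: boolean;
  sandbox?: boolean;
  webSecurity?: boolean;
}

// Hypothetical helper, not part of the project: returns one finding per
// flag that is explicitly set to its risky value.
function auditWebPreferences(prefs: WebPreferencesFlags): string[] {
  const findings: string[] = [];
  if (prefs.nodeIntegration === true) {
    findings.push("nodeIntegration is enabled");
  }
  if (prefs.contextIsolation === false) {
    findings.push("contextIsolation is disabled");
  }
  if (prefs.sandbox === false) {
    findings.push("sandbox is disabled");
  }
  if (prefs.webSecurity === false) {
    findings.push("webSecurity is disabled");
  }
  return findings;
}

// Settings as described for ScreenMarker.ts in the notes.
const screenMarkerFindings = auditWebPreferences({
  nodeIntegration: true,
  contextIsolation: false,
});
console.log(screenMarkerFindings);
```

A check like this could run in a unit test over every options object passed to `new BrowserWindow(...)`, so that a regression in any of these flags fails the build instead of shipping.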

Stats

  • Interest Score: 34
  • Security Score: 2
  • Cost Class: High
  • Avg Tokens: 3500
  • Stars: 4
  • Forks: 0
  • Last Update: 2025-12-15

Tags

GUI Automation, Vision-Language Model, Electron, Natural Language Processing, Agent