---
name: "Mac GUI Automation"
description: "Control macOS GUI remotely — screenshots, mouse clicks, keyboard input, and screen reading via SSH + osascript/cliclick. For automating GUI tasks on headless Mac Minis."
version: "1.0.0"
author: "skynet"
category: "ops"
agents: ["claude-code", "codex", "gemini"]
tags: ["mac", "gui", "automation", "applescript", "cliclick"]
tools_required: ["bash", "ssh"]
---

# Mac GUI Automation

Automate macOS GUI interactions remotely via SSH using AppleScript (osascript) and cliclick.

## Machine Capabilities

| Machine | Screenshots | cliclick | AppleScript GUI | Notes |
|---------|-------------|----------|-----------------|-------|
| vault | YES | YES (with warning) | Limited — System Events hangs | Accessibility not fully granted |
| bots | YES | YES | YES | Best machine for GUI automation |
| jarvis | NO (no display) | YES (with warning) | YES | No screencapture, use for non-visual automation |

**Recommended: Use `bots` for GUI automation tasks.** It has full support.

## Prerequisites

- SSH access to target Mac
- cliclick installed: `ssh bots 'eval "$(/opt/homebrew/bin/brew shellenv)" && brew install cliclick'`
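- Homebrew-installed tools (cliclick, tesseract) must be called by full path or after sourcing `brew shellenv`. `osascript`, `screencapture`, `pbcopy`, and `pbpaste` ship with macOS and are on the default PATH.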

## IMPORTANT: PATH Setup

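Non-interactive SSH commands typically do not load the shell profile that adds Homebrew to PATH, so `cliclick` and other Homebrew binaries are not found by name. Use full paths: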
```bash
ssh bots '/opt/homebrew/bin/cliclick p'
```
Or source brew:
```bash
ssh bots 'eval "$(/opt/homebrew/bin/brew shellenv)" && cliclick p'
```
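
To avoid repeating the full path in every call, a small local wrapper can prepend the Homebrew directory to the remote PATH. This is a minimal sketch, not an existing tool: `remote_cmd` and `mac_run` are hypothetical names, and `/opt/homebrew/bin` assumes the same Homebrew prefix used in the examples above.

```bash
# Hypothetical helpers: prepend Homebrew's bin directory to the remote PATH.
BREW_BIN=/opt/homebrew/bin   # assumption: same prefix as the examples above

# Print the command line that will run on the remote Mac.
# Arguments are joined with spaces, so pre-quote anything containing spaces.
remote_cmd() {
  printf 'export PATH=%s:$PATH; %s\n' "$BREW_BIN" "$*"
}

# Run it: mac_run HOST COMMAND [ARGS...]
mac_run() {
  local host=$1
  shift
  ssh "$host" "$(remote_cmd "$@")"
}
```

With these defined, `mac_run bots cliclick p` would behave like the full-path form above.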

## Screenshots

```bash
# Take a screenshot (works on vault and bots, NOT jarvis)
ssh bots 'screencapture /tmp/screen.png'
scp bots:/tmp/screen.png ./screen.png

# Screenshot of specific region (x,y,width,height)
ssh bots 'screencapture -R 0,0,800,600 /tmp/region.png'
```
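
Because `-R` takes four comma-separated numbers, checking them locally can catch typos before the SSH round trip. The sketch below uses a hypothetical helper, `region_arg`, and assumes non-negative integer coordinates (setups with displays at negative offsets would need a looser check).

```bash
# Hypothetical helper: validate X Y WIDTH HEIGHT and print the -R argument.
region_arg() {
  local v
  [ "$#" -eq 4 ] || { echo "usage: region_arg X Y WIDTH HEIGHT" >&2; return 2; }
  for v in "$@"; do
    case "$v" in
      ''|*[!0-9]*) echo "not a non-negative integer: $v" >&2; return 1 ;;
    esac
  done
  printf -- '-R %s,%s,%s,%s\n' "$1" "$2" "$3" "$4"
}

# Usage (the command substitution expands locally, before ssh runs):
# ssh bots "screencapture $(region_arg 0 0 800 600) /tmp/region.png"
```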

## Mouse Control with cliclick

```bash
# Click at coordinates
ssh bots '/opt/homebrew/bin/cliclick c:500,300'

# Double-click
ssh bots '/opt/homebrew/bin/cliclick dc:500,300'

# Right-click
ssh bots '/opt/homebrew/bin/cliclick rc:500,300'

# Move mouse
ssh bots '/opt/homebrew/bin/cliclick m:500,300'

# Click and drag
ssh bots '/opt/homebrew/bin/cliclick dd:100,100 du:500,500'

# Get current mouse position
ssh bots '/opt/homebrew/bin/cliclick p'

# Multiple actions with delays (w:ms)
ssh bots '/opt/homebrew/bin/cliclick c:500,300 w:500 c:600,400'
```

Note: on vault and jarvis, cliclick may print an "Accessibility privileges not enabled" warning. Some clicks still work on those machines, but not all. On bots, everything works.
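
When a task needs several clicks in a row, the action chain can be generated instead of typed by hand. A minimal sketch, assuming a fixed wait between clicks; `click_chain` is a hypothetical helper name.

```bash
# Hypothetical helper: turn "x,y" points into a cliclick action chain with
# a fixed wait (in milliseconds) between consecutive clicks.
# Usage: click_chain WAIT_MS X,Y [X,Y...]
click_chain() {
  local wait_ms=$1
  shift
  local chain="" point
  for point in "$@"; do
    # Insert a wait before every click except the first
    [ -n "$chain" ] && chain="$chain w:$wait_ms "
    chain="${chain}c:$point"
  done
  printf '%s\n' "$chain"
}

# Usage:
# ssh bots "/opt/homebrew/bin/cliclick $(click_chain 500 500,300 600,400)"
```

For the two points shown, this produces the same `c:500,300 w:500 c:600,400` chain as the multiple-actions example above.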

## Keyboard Input with cliclick

```bash
# Type text
ssh bots '/opt/homebrew/bin/cliclick t:"Hello World"'

# Press keys
ssh bots '/opt/homebrew/bin/cliclick kp:return'
ssh bots '/opt/homebrew/bin/cliclick kp:tab'
ssh bots '/opt/homebrew/bin/cliclick kp:esc'
ssh bots '/opt/homebrew/bin/cliclick kp:space'

# Key combinations
ssh bots '/opt/homebrew/bin/cliclick kd:cmd t:a ku:cmd'  # Cmd+A (select all)
ssh bots '/opt/homebrew/bin/cliclick kd:cmd t:c ku:cmd'  # Cmd+C (copy)
ssh bots '/opt/homebrew/bin/cliclick kd:cmd t:v ku:cmd'  # Cmd+V (paste)
ssh bots '/opt/homebrew/bin/cliclick kd:cmd t:t ku:cmd'  # Cmd+T (new tab)
ssh bots '/opt/homebrew/bin/cliclick kd:cmd t:w ku:cmd'  # Cmd+W (close tab)

# Modifier combos: cmd, alt, ctrl, shift, fn
ssh bots '/opt/homebrew/bin/cliclick kd:cmd,shift t:n ku:cmd,shift'  # Cmd+Shift+N
```
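
The `kd:… t:… ku:…` pattern repeats for every shortcut, so it can be generated. The sketch below uses a hypothetical helper, `key_combo`, that rejects modifier names outside the list above.

```bash
# Hypothetical helper: build the kd/t/ku triple for a keyboard shortcut.
# Usage: key_combo MODIFIERS KEY   (MODIFIERS is comma-separated)
key_combo() {
  local mods=$1 key=$2 m
  local IFS=,
  for m in $mods; do
    case "$m" in
      cmd|alt|ctrl|shift|fn) ;;
      *) echo "unknown modifier: $m" >&2; return 1 ;;
    esac
  done
  printf 'kd:%s t:%s ku:%s\n' "$mods" "$key" "$mods"
}

# Usage:
# ssh bots "/opt/homebrew/bin/cliclick $(key_combo cmd,shift n)"
```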

## AppleScript (osascript)

### CRITICAL: Never use `tell app "AppName"` for window operations

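Direct app scripting of windows and tabs (e.g., `tell app "Google Chrome" to count windows`) **hangs indefinitely** over SSH because the app's main thread doesn't respond to Apple Events from SSH sessions. Simple commands such as `activate` and `quit` are the exception and work when sent directly to the app.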

**Always use the System Events process wrapper instead:**

```bash
# WRONG — will hang:
ssh bots 'osascript -e "tell app \"Google Chrome\" to count windows"'

# RIGHT — works:
ssh bots 'osascript -e "tell app \"System Events\" to tell process \"Google Chrome\" to get name of every window"'
```

### Commands that DO work

```bash
# Get frontmost app
ssh bots 'osascript -e "tell app \"System Events\" to get name of first process whose frontmost is true"'

# List visible apps
ssh bots 'osascript -e "tell app \"System Events\" to get name of every process whose visible is true"'

# Get window names for an app
ssh bots 'osascript -e "tell app \"System Events\" to tell process \"Google Chrome\" to get name of every window"'

# Get window count via JXA (JavaScript for Automation)
ssh bots 'osascript -l JavaScript -e "Application(\"System Events\").processes.byName(\"Google Chrome\").windows.length"'

# Open a URL (safe — uses open command, not AppleScript)
ssh bots 'open https://example.com'
ssh bots 'open -a "Google Chrome" https://example.com'

# Send notification
ssh bots 'osascript -e "display notification \"Task complete\" with title \"Factory\""'

# Activate/bring app to front
ssh bots 'osascript -e "tell app \"Google Chrome\" to activate"'

# Quit an app (safe — simple quit command works)
ssh bots 'osascript -e "tell app \"Safari\" to quit"'
```
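
The nested quoting in these one-liners is easy to get wrong. One option is to build the script locally and pipe it to `osascript`, which reads the script from standard input when no file or `-e` is given. A minimal sketch; `window_names_script` is a hypothetical helper, and it rejects names containing quotes or backslashes rather than trying to escape them.

```bash
# Hypothetical helper: print an AppleScript that lists a process's window
# names through the System Events wrapper.
window_names_script() {
  local proc=$1
  case "$proc" in
    *\"*|*\\*) echo "unsupported character in process name" >&2; return 1 ;;
  esac
  printf 'tell application "System Events" to tell process "%s" to get name of every window\n' "$proc"
}

# Usage: the script travels over ssh's stdin, so no escaped quotes are needed.
# window_names_script "Google Chrome" | ssh bots osascript
```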

### Commands that HANG (avoid over SSH)

```applescript
-- These all hang over SSH. Do NOT use:
tell app "Google Chrome" to count windows
tell app "Google Chrome" to get URL of active tab
tell app "Finder" to count windows
tell app "Google Chrome" to set bounds of window 1
```

## Clipboard

```bash
ssh bots 'pbpaste'                    # Get clipboard
ssh bots 'printf "%s" "Hello" | pbcopy'  # Set clipboard (printf avoids the trailing newline echo adds)
ssh bots 'pbpaste > /tmp/clipboard.txt'  # Save clipboard to file
```

## Screen Reading with OCR

```bash
# Install tesseract (if not already)
ssh bots 'eval "$(/opt/homebrew/bin/brew shellenv)" && brew install tesseract'

# Screenshot + OCR
ssh bots 'screencapture /tmp/screen.png && /opt/homebrew/bin/tesseract /tmp/screen.png /tmp/screen_text'
ssh bots 'cat /tmp/screen_text.txt'

# OCR a specific region
ssh bots 'screencapture -R 100,200,600,400 /tmp/region.png && /opt/homebrew/bin/tesseract /tmp/region.png /tmp/region_text'
```
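
Plain-text OCR says what is on screen but not where. Tesseract's `tsv` output mode adds a bounding box per word, which can be turned into a click target. The sketch below assumes the installed Tesseract ships the standard `tsv` config; `word_center` is a hypothetical helper. Coordinates are in screenshot pixels, so on a HiDPI display they may need to be halved before being passed to cliclick.

```bash
# Hypothetical helper: read Tesseract TSV on stdin and print the center "x y"
# of the first word that matches exactly. Exits non-zero if nothing matches.
# TSV columns: level page_num block_num par_num line_num word_num
#              left top width height conf text
word_center() {
  awk -F '\t' -v want="$1" '
    NR > 1 && $12 == want {
      printf "%d %d\n", $7 + $9 / 2, $8 + $10 / 2
      found = 1
      exit
    }
    END { exit !found }
  '
}

# Usage: "stdout" as the output base makes tesseract print instead of writing a file.
# ssh bots '/opt/homebrew/bin/tesseract /tmp/screen.png stdout tsv' | word_center Submit
```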

## Common Workflow: Screenshot → Analyze → Act

```bash
# 1. Take screenshot
ssh bots 'screencapture /tmp/screen.png'
scp bots:/tmp/screen.png ./screen.png
# 2. Read/analyze the screenshot (use vision capabilities)
# 3. Decide what to click/type based on what you see
# 4. Execute the action
ssh bots '/opt/homebrew/bin/cliclick c:X,Y'
# 5. Repeat
```
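
For the "repeat" step, polling until the screen shows the expected result is often more robust than a fixed sleep. A minimal sketch under stated assumptions: `poll_until` and `screen_has_text` are hypothetical helpers, and the check relies on the tesseract setup from the OCR section.

```bash
# Hypothetical helper: retry a check until it succeeds or attempts run out.
# Usage: poll_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
poll_until() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    # Sleep between attempts, but not after the last one
    [ "$i" -lt "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Hypothetical check: does the OCR text of the remote screen contain a string?
screen_has_text() {
  ssh bots 'screencapture /tmp/screen.png && /opt/homebrew/bin/tesseract /tmp/screen.png stdout' 2>/dev/null |
    grep -qF -- "$1"
}

# Usage: try up to 5 times, 2 seconds apart, then click.
# poll_until 5 2 screen_has_text "Save" && ssh bots '/opt/homebrew/bin/cliclick c:X,Y'
```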

## Common Workflow: Type into a Text Field

```bash
ssh bots bash <<'EOF'
/opt/homebrew/bin/cliclick c:400,300   # Click the field
sleep 0.5
/opt/homebrew/bin/cliclick kd:cmd t:a ku:cmd  # Select all
sleep 0.2
/opt/homebrew/bin/cliclick t:"New text here"   # Type
sleep 0.2
/opt/homebrew/bin/cliclick kp:return            # Submit
EOF
```

## Timeouts

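Always wrap `osascript` commands that might hang in `timeout`, run on the controlling machine (`timeout` is part of GNU coreutils and may need to be installed separately on macOS). The `||` branch fires on any non-zero exit, not only on a timeout:
```bash
timeout 5 ssh bots 'osascript -e "..."' 2>&1 || echo "Command failed or timed out"
```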
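
To tell a timeout apart from other failures, the exit status can be inspected: GNU `timeout` exits with 124 when the limit is hit, `ssh` exits with 255 when the connection itself fails, and any other value is the remote command's own status. A minimal sketch; `describe_exit` is a hypothetical helper name.

```bash
# Hypothetical helper: map the exit status of `timeout N ssh HOST ...`
# to a short label.
describe_exit() {
  case "$1" in
    0)   echo "ok" ;;
    124) echo "timed out" ;;                    # GNU timeout hit its limit
    255) echo "ssh error" ;;                    # ssh could not connect or authenticate
    *)   echo "remote command failed ($1)" ;;   # status of the remote command
  esac
}

# Usage:
# timeout 5 ssh bots 'osascript -e "..."'; describe_exit $?
```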
