Mac GUI Automation
Control macOS GUI remotely — screenshots, mouse clicks, keyboard input, and screen reading via SSH + osascript/cliclick. For automating GUI tasks on headless Mac Minis.
by skynet, v1.0.0
Tags: mac, gui, automation, applescript, cliclick
Compatible Agents
claude-code, codex, gemini
Required Tools
bash, ssh
Instruction
# Mac GUI Automation
Automate macOS GUI interactions remotely via SSH using AppleScript (osascript) and cliclick.
## Machine Capabilities
| Machine | Screenshots | cliclick | AppleScript GUI | Notes |
|---------|-------------|----------|-----------------|-------|
| vault | YES | YES (with warning) | Limited — System Events hangs | Accessibility not fully granted |
| bots | YES | YES | YES | Best machine for GUI automation |
| jarvis | NO (no display) | YES (with warning) | YES | No screencapture, use for non-visual automation |
**Recommended: Use `bots` for GUI automation tasks.** It has full support.
## Prerequisites
- SSH access to target Mac
- cliclick installed: `ssh bots 'eval "$(/opt/homebrew/bin/brew shellenv)" && brew install cliclick'`
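- Homebrew-installed tools such as cliclick and tesseract must be called by full path, or after sourcing `brew shellenv`. `osascript` ships with macOS at `/usr/bin/osascript` and needs neither.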
## IMPORTANT: PATH Setup
Non-interactive SSH sessions typically don't have Homebrew's `/opt/homebrew/bin` in PATH. Use full paths:
```bash
ssh bots '/opt/homebrew/bin/cliclick p'
```
Or source brew:
```bash
ssh bots 'eval "$(/opt/homebrew/bin/brew shellenv)" && cliclick p'
```
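One way to avoid repeating the full path in every command is a small wrapper. The sketch below is hypothetical: `remote_cliclick`, `shell_quote`, and the `SSH_BIN` override (added only so the command can be dry-run with `echo`) are illustration names, not part of cliclick or of this skill.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: call cliclick on a remote Mac by absolute path.
CLICLICK_PATH="/opt/homebrew/bin/cliclick"

# Wrap one argument in single quotes so the remote shell sees it as one word.
shell_quote() {
  local s="$1"
  s=${s//\'/\'\\\'\'}   # turn each ' into '\''
  printf "'%s'" "$s"
}

# remote_cliclick HOST ACTION...
# SSH_BIN is an assumption for testing; it defaults to the real ssh.
remote_cliclick() {
  local host="$1"; shift
  local cmd="$CLICLICK_PATH" arg
  for arg in "$@"; do
    cmd+=" $(shell_quote "$arg")"
  done
  "${SSH_BIN:-ssh}" "$host" "$cmd"
}

# Usage:
#   remote_cliclick bots c:500,300 w:500 't:Hello World'
# Dry run (prints the ssh arguments instead of connecting):
#   SSH_BIN=echo remote_cliclick bots p
```

Quoting each argument keeps text with spaces intact when the remote shell re-parses the command line.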
## Screenshots
```bash
# Take a screenshot (works on vault and bots, NOT jarvis)
ssh bots 'screencapture /tmp/screen.png'
scp bots:/tmp/screen.png ./screen.png
# Screenshot of specific region (x,y,width,height)
ssh bots 'screencapture -R 0,0,800,600 /tmp/region.png'
```
## Mouse Control with cliclick
```bash
# Click at coordinates
ssh bots '/opt/homebrew/bin/cliclick c:500,300'
# Double-click
ssh bots '/opt/homebrew/bin/cliclick dc:500,300'
# Right-click
ssh bots '/opt/homebrew/bin/cliclick rc:500,300'
# Move mouse
ssh bots '/opt/homebrew/bin/cliclick m:500,300'
# Click and drag
ssh bots '/opt/homebrew/bin/cliclick dd:100,100 du:500,500'
# Get current mouse position
ssh bots '/opt/homebrew/bin/cliclick p'
# Multiple actions with delays (w:ms)
ssh bots '/opt/homebrew/bin/cliclick c:500,300 w:500 c:600,400'
```
Note: on vault and jarvis, cliclick may print an "Accessibility privileges not enabled" warning. Some actions still work there, but not reliably. On bots, everything works.
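To check that a move or click landed where intended, the output of `cliclick p` has to be split into numbers. The sketch below assumes the output ends with an `X,Y` pair, possibly preceded by a text label (the exact format may differ between cliclick versions); `parse_mouse_pos` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract "X Y" from the output of `cliclick p`.
# Assumption: the output ends with "X,Y", optionally after a text prefix.
parse_mouse_pos() {
  local raw="$1"
  local re='(-?[0-9]+),(-?[0-9]+)[[:space:]]*$'
  if [[ "$raw" =~ $re ]]; then
    printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    return 1   # no coordinate pair found
  fi
}

# Usage:
#   read -r x y < <(parse_mouse_pos "$(ssh bots '/opt/homebrew/bin/cliclick p')")
```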
## Keyboard Input with cliclick
```bash
# Type text
ssh bots '/opt/homebrew/bin/cliclick t:"Hello World"'
# Press keys
ssh bots '/opt/homebrew/bin/cliclick kp:return'
ssh bots '/opt/homebrew/bin/cliclick kp:tab'
ssh bots '/opt/homebrew/bin/cliclick kp:esc'
ssh bots '/opt/homebrew/bin/cliclick kp:space'
# Key combinations
ssh bots '/opt/homebrew/bin/cliclick kd:cmd t:a ku:cmd' # Cmd+A (select all)
ssh bots '/opt/homebrew/bin/cliclick kd:cmd t:c ku:cmd' # Cmd+C (copy)
ssh bots '/opt/homebrew/bin/cliclick kd:cmd t:v ku:cmd' # Cmd+V (paste)
ssh bots '/opt/homebrew/bin/cliclick kd:cmd t:t ku:cmd' # Cmd+T (new tab)
ssh bots '/opt/homebrew/bin/cliclick kd:cmd t:w ku:cmd' # Cmd+W (close tab)
# Modifier combos: cmd, alt, ctrl, shift, fn
ssh bots '/opt/homebrew/bin/cliclick kd:cmd,shift t:n ku:cmd,shift' # Cmd+Shift+N
```
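The key combinations above all follow the same pattern: hold the modifiers with `kd:`, type the key with `t:`, release with `ku:`. A minimal sketch that builds those three arguments; `key_combo` is a hypothetical helper name, not a cliclick feature.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the cliclick arguments for a modifier+key chord.
# MODS is a comma-separated list (cmd, alt, ctrl, shift, fn); KEY is one character.
key_combo() {
  local mods="$1" key="$2"
  printf 'kd:%s t:%s ku:%s\n' "$mods" "$key" "$mods"
}

# Usage:
#   ssh bots "/opt/homebrew/bin/cliclick $(key_combo cmd,shift n)"
```

Releasing the same modifier list that was pressed avoids leaving a modifier key stuck down on the remote machine.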
## AppleScript (osascript)
### CRITICAL: Never use `tell app "AppName"` for window operations
Direct app scripting (e.g., `tell app "Google Chrome" to count windows`) **hangs indefinitely** over SSH because the app's main thread doesn't respond to Apple Events from SSH sessions.
**Always use the System Events process wrapper instead:**
```bash
# WRONG — will hang:
ssh bots 'osascript -e "tell app \"Google Chrome\" to count windows"'
# RIGHT — works:
ssh bots 'osascript -e "tell app \"System Events\" to tell process \"Google Chrome\" to get name of every window"'
```
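The nested quoting in these one-liners is easy to get wrong. Because `osascript` reads the script from standard input when it is given neither `-e` nor a script file, the System Events wrapper can be generated as plain text and piped over SSH. A minimal sketch; `window_names_script` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print an AppleScript that lists a process's window
# names through the System Events process wrapper.
window_names_script() {
  local process_name="$1"
  # Escape backslashes and double quotes for an AppleScript string literal.
  process_name=${process_name//\\/\\\\}
  process_name=${process_name//\"/\\\"}
  cat <<EOF
tell application "System Events"
  tell process "${process_name}"
    get name of every window
  end tell
end tell
EOF
}

# Usage (script travels over ssh's stdin, so no nested shell escaping):
#   window_names_script "Google Chrome" | ssh bots osascript
```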
### Commands that DO work
```bash
# Get frontmost app
ssh bots 'osascript -e "tell app \"System Events\" to get name of first process whose frontmost is true"'
# List visible apps
ssh bots 'osascript -e "tell app \"System Events\" to get name of every process whose visible is true"'
# Get window names for an app
ssh bots 'osascript -e "tell app \"System Events\" to tell process \"Google Chrome\" to get name of every window"'
# Get window count via JXA (JavaScript for Automation)
ssh bots 'osascript -l JavaScript -e "Application(\"System Events\").processes.byName(\"Google Chrome\").windows.length"'
# Open a URL (safe — uses open command, not AppleScript)
ssh bots 'open https://example.com'
ssh bots 'open -a "Google Chrome" https://example.com'
# Send notification
ssh bots 'osascript -e "display notification \"Task complete\" with title \"Factory\""'
# Activate/bring app to front
ssh bots 'osascript -e "tell app \"Google Chrome\" to activate"'
# Quit an app (safe — simple quit command works)
ssh bots 'osascript -e "tell app \"Safari\" to quit"'
```
### Commands that HANG (avoid over SSH)
```applescript
-- These all hang over SSH; do NOT use:
tell app "Google Chrome" to count windows
tell app "Google Chrome" to get URL of active tab
tell app "Finder" to count windows
tell app "Google Chrome" to set bounds of window 1
```
## Clipboard
```bash
ssh bots 'pbpaste' # Get clipboard
ssh bots 'printf "%s" "Hello" | pbcopy' # Set clipboard (printf avoids the trailing newline that echo adds)
ssh bots 'pbpaste > /tmp/clipboard.txt' # Save clipboard to file
```
## Screen Reading with OCR
```bash
# Install tesseract (if not already)
ssh bots 'eval "$(/opt/homebrew/bin/brew shellenv)" && brew install tesseract'
# Screenshot + OCR
ssh bots 'screencapture /tmp/screen.png && /opt/homebrew/bin/tesseract /tmp/screen.png /tmp/screen_text'
ssh bots 'cat /tmp/screen_text.txt'
# OCR a specific region
ssh bots 'screencapture -R 100,200,600,400 /tmp/region.png && /opt/homebrew/bin/tesseract /tmp/region.png /tmp/region_text'
```
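Plain-text OCR output says what is on screen but not where. Tesseract can also write a TSV file with one bounding box per word when `tsv` is passed as the final argument, for example `tesseract /tmp/screen.png /tmp/screen_words tsv`. The sketch below assumes the standard 12-column TSV layout, in which word rows have `level` 5; `find_word_center` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "X,Y" for the center of the first exact match
# of WORD in a tesseract TSV file. Returns non-zero if the word is absent.
# Assumed columns: level page_num block_num par_num line_num word_num
#                  left top width height conf text
find_word_center() {
  local word="$1" tsv_file="$2"
  awk -F'\t' -v w="$word" '
    NR > 1 && $1 == 5 && $12 == w {
      printf "%d,%d\n", $7 + $9 / 2, $8 + $10 / 2
      found = 1
      exit
    }
    END { exit !found }
  ' "$tsv_file"
}

# Usage (after copying /tmp/screen_words.tsv to the local machine):
#   find_word_center "Submit" ./screen_words.tsv
```

The coordinates are in image pixels, so they may need scaling before being passed to cliclick if the display is HiDPI.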
## Common Workflow: Screenshot → Analyze → Act
```bash
# 1. Take screenshot
ssh bots 'screencapture /tmp/screen.png'
scp bots:/tmp/screen.png ./screen.png
# 2. Read/analyze the screenshot (use vision capabilities)
# 3. Decide what to click/type based on what you see
# 4. Execute the action
ssh bots '/opt/homebrew/bin/cliclick c:X,Y'
# 5. Repeat
```
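One caveat for step 4, stated as an assumption because the document does not say how the displays are configured: on a HiDPI (Retina) display, a screenshot usually has more pixels than the points cliclick addresses, by the display's scale factor. If clicks land off target, converting pixel coordinates to points may help. `px_to_pt` is a hypothetical helper; a scale of 1 leaves coordinates unchanged.

```bash
#!/usr/bin/env bash
# Hypothetical helper: convert screenshot pixel coordinates to the point
# coordinates used for clicking. SCALE defaults to 1 (no conversion);
# 2 is typical for Retina displays.
px_to_pt() {
  local px_x="$1" px_y="$2" scale="${3:-1}"
  printf '%d,%d\n' $(( px_x / scale )) $(( px_y / scale ))
}

# Usage:
#   ssh bots "/opt/homebrew/bin/cliclick c:$(px_to_pt 1000 600 2)"
```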
## Common Workflow: Type into a Text Field
```bash
ssh bots bash <<'EOF'
/opt/homebrew/bin/cliclick c:400,300 # Click the field
sleep 0.5
/opt/homebrew/bin/cliclick kd:cmd t:a ku:cmd # Select all
sleep 0.2
/opt/homebrew/bin/cliclick t:"New text here" # Type
sleep 0.2
/opt/homebrew/bin/cliclick kp:return # Submit
EOF
```
## Timeouts
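Always wrap osascript commands that might hang in `timeout`, run on the local machine. `timeout` is part of GNU coreutils and is not included with macOS; on a Mac client, `brew install coreutils` provides it as `gtimeout`. An exit status of 124 means the time limit was hit; any other non-zero status is an ordinary failure:
```bash
timeout 5 ssh bots 'osascript -e "..."' 2>&1 || echo "Command failed or timed out (exit $?)"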
```
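To tell a timeout apart from an ordinary failure, the exit status can be checked explicitly. A minimal sketch, assuming GNU `timeout` semantics (exit status 124 on expiry); `run_with_timeout` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command with a time limit and report why it failed.
run_with_timeout() {
  local seconds="$1"; shift
  local timeout_bin rc=0
  # Prefer `timeout`; fall back to `gtimeout` (Homebrew coreutils on macOS).
  timeout_bin=$(command -v timeout || command -v gtimeout) || {
    echo "no timeout binary found" >&2
    return 127
  }
  "$timeout_bin" "$seconds" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "timed out after ${seconds}s" >&2
  elif [ "$rc" -ne 0 ]; then
    echo "failed with exit status ${rc}" >&2
  fi
  return "$rc"
}

# Usage:
#   run_with_timeout 5 ssh bots 'osascript -e "..."'
```

The limit only stops the local `ssh` client. The remote `osascript` process may keep running, so it can be worth checking for leftovers on the target machine.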
Install
curl -s https://skills.skynet.ceo/api/skills/mac-gui-automation/skill.md