✨ Add agent-browser skill · davidgasquez.com/dotfiles@6a569f4

davidgasquez.com / dotfiles

🔧 Where my dotfiles live in harmony and peace, most of the time

davidgasquez.com 2 months ago 6a569f45 a2d55ee9

+143

1 changed file

agents/skills/agent-browser/SKILL.md

---
name: agent-browser
description: Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.
---

# Browser Automation with agent-browser

## Quick start

```bash
agent-browser open <url>        # Navigate to page
agent-browser snapshot -i       # Get interactive elements with refs
agent-browser click @e1         # Click element by ref
agent-browser fill @e2 "text"   # Fill input by ref
agent-browser close             # Close browser
```

## Core workflow

1. Navigate: `agent-browser open <url>`
2. Snapshot: `agent-browser snapshot -i` (returns elements with refs like `@e1`, `@e2`)
3. Interact using refs from the snapshot
4. Re-snapshot after navigation or significant DOM changes

## Commands

### Navigation
```bash
agent-browser open <url>      # Navigate to URL
agent-browser back            # Go back
agent-browser forward         # Go forward
agent-browser reload          # Reload page
agent-browser close           # Close browser
```

### Snapshot (page analysis)
```bash
agent-browser snapshot        # Full accessibility tree
agent-browser snapshot -i     # Interactive elements only (recommended)
agent-browser snapshot -c     # Compact output
agent-browser snapshot -d 3   # Limit depth to 3
```

### Interactions (use @refs from snapshot)
```bash
agent-browser click @e1           # Click
agent-browser dblclick @e1        # Double-click
agent-browser fill @e2 "text"     # Clear and type
agent-browser type @e2 "text"     # Type without clearing
agent-browser press Enter         # Press key
agent-browser press Control+a     # Key combination
agent-browser hover @e1           # Hover
agent-browser check @e1           # Check checkbox
agent-browser uncheck @e1         # Uncheck checkbox
agent-browser select @e1 "value"  # Select dropdown
agent-browser scroll down 500     # Scroll page
agent-browser scrollintoview @e1  # Scroll element into view
```

### Get information
```bash
agent-browser get text @e1    # Get element text
agent-browser get value @e1   # Get input value
agent-browser get title       # Get page title
agent-browser get url         # Get current URL
```

### Screenshots
```bash
agent-browser screenshot            # Screenshot to stdout
agent-browser screenshot path.png   # Save to file
agent-browser screenshot --full     # Full page
```

### Wait
```bash
agent-browser wait @e1                 # Wait for element
agent-browser wait 2000                # Wait milliseconds
agent-browser wait --text "Success"    # Wait for text
agent-browser wait --load networkidle  # Wait for network idle
```

### Semantic locators (alternative to refs)
```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
```

## Example: Form submission

```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result
```

## Example: Authentication with saved state

```bash
# Login once
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json

# Later sessions: load saved state
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```

## Sessions (parallel browsers)

```bash
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list
```

## JSON output (for parsing)

Add `--json` for machine-readable output:
```bash
agent-browser snapshot -i --json
agent-browser get text @e1 --json
```

## Debugging

```bash
agent-browser open example.com --headed  # Show browser window
agent-browser console                    # View console messages
agent-browser errors                     # View page errors
```
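
Returning to the core workflow (snapshot, then interact by ref): refs are reassigned on each snapshot, so a script should look them up rather than hard-code `@e1`. The sketch below shows one way to do that with a small helper. The function name `ref_for` is hypothetical and not part of agent-browser. The sample snapshot text is hard-coded and assumes the `role "name" [ref=eN]` line shape quoted in the form submission example; real snapshot output may differ.

```bash
#!/usr/bin/env bash
# Sketch only: ref_for is a hypothetical helper, not an agent-browser command.
# Assumes each element appears on its own line shaped like:
#   textbox "Email" [ref=e1]
# Role and name are pasted into a sed pattern, so names containing
# regex metacharacters or "/" would need escaping first.

ref_for() {
  local role="$1" name="$2"
  # Read snapshot text on stdin; print the first matching ref as "@eN".
  sed -n "s/.*${role} \"${name}\" \[ref=\([A-Za-z0-9]*\)].*/@\1/p" | head -n 1
}

# Sample snapshot text, hard-coded so the sketch runs without a browser.
snapshot='textbox "Email" [ref=e1]
textbox "Password" [ref=e2]
button "Submit" [ref=e3]'

email_ref=$(printf '%s\n' "$snapshot" | ref_for textbox Email)
submit_ref=$(printf '%s\n' "$snapshot" | ref_for button Submit)

echo "$email_ref $submit_ref"
```

In a live session the same helper could read from `agent-browser snapshot -i` instead of the sample text, and its result could be passed to `agent-browser fill` or `agent-browser click`. Whether the live text output matches the assumed line shape is unverified here; parsing the `--json` output is probably the more robust route.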