🔧 Where my dotfiles live in harmony and peace, most of the time

🔧 Link pi skills directory

+186 -191
+1
agents/setup.sh
···
 ln -sf "${AGENTS_DIR}/AGENTS.md" "${HOME}/.pi/agent/AGENTS.md"
 ln -sf "${AGENTS_DIR}/pi/settings.json" "${HOME}/.pi/agent/settings.json"
 ln -sfT "${AGENTS_DIR}/pi/extensions" "${HOME}/.pi/agent/extensions"
+ln -sfT "${AGENTS_DIR}/skills" "${HOME}/.pi/agent/skills"
 }
 
 setup_amp() {
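The added line uses `ln -sfT` for a directory target. A minimal sketch of why `-T` matters when the setup script is re-run, assuming GNU coreutils `ln` (the `-T` flag is not available in BSD/macOS `ln`). The `link_dir` helper and the scratch directory layout are hypothetical and exist only for illustration; nothing under `$HOME` is touched.

```bash
# Hypothetical helper: link a directory so that re-running is safe.
# -T treats the link path as a plain file, so an existing symlink is
# replaced instead of being followed into the directory it points at.
link_dir() {
  local target="$1" link="$2"
  mkdir -p "$(dirname "$link")"
  ln -sfT "$target" "$link"
}

# Demo in a throwaway directory (assumed layout, mirrors the diff above).
demo_root="$(mktemp -d)"
mkdir -p "${demo_root}/agents/skills"

link_dir "${demo_root}/agents/skills" "${demo_root}/home/.pi/agent/skills"
# Second run: with plain `ln -sf` this would create a nested
# agents/skills/skills link; with -T the existing link is replaced.
link_dir "${demo_root}/agents/skills" "${demo_root}/home/.pi/agent/skills"

readlink "${demo_root}/home/.pi/agent/skills"
```

On systems without GNU `ln`, `ln -sfn` is a commonly used near-equivalent for the re-run case.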
+131 -191
agents/skills/agent-browser/SKILL.md
···
 ---
 name: agent-browser
-description: Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.
+description: Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.
 ---
 
 # Browser Automation with agent-browser
 
-## Quick start
+## Core Workflow
+
+Every browser automation follows this pattern:
+
+1. **Navigate**: `agent-browser open <url>`
+2. **Snapshot**: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`)
+3. **Interact**: Use refs to click, fill, select
+4. **Re-snapshot**: After navigation or DOM changes, get fresh refs
 
 ```bash
-agent-browser open <url> # Navigate to page
-agent-browser snapshot -i # Get interactive elements with refs
-agent-browser click @e1 # Click element by ref
-agent-browser fill @e2 "text" # Fill input by ref
-agent-browser close # Close browser
+agent-browser open https://example.com/form
+agent-browser snapshot -i
+# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"
+
+agent-browser fill @e1 "user@example.com"
+agent-browser fill @e2 "password123"
+agent-browser click @e3
+agent-browser wait --load networkidle
+agent-browser snapshot -i # Check result
 ```
 
-## Core workflow
+## Essential Commands
 
-1. Navigate: `agent-browser open <url>`
-2. Snapshot: `agent-browser snapshot -i` (returns elements with refs like `@e1`, `@e2`)
-3. Interact using refs from the snapshot
-4. Re-snapshot after navigation or significant DOM changes
+```bash
+# Navigation
+agent-browser open <url> # Navigate (aliases: goto, navigate)
+agent-browser close # Close browser
 
-## Commands
+# Snapshot
+agent-browser snapshot -i # Interactive elements with refs (recommended)
+agent-browser snapshot -i -C # Include cursor-interactive elements (divs with onclick, cursor:pointer)
+agent-browser snapshot -s "#selector" # Scope to CSS selector
 
-### Navigation
-```bash
-agent-browser open <url> # Navigate to URL
-agent-browser back # Go back
-agent-browser forward # Go forward
-agent-browser reload # Reload page
-agent-browser close # Close browser
-```
+# Interaction (use @refs from snapshot)
+agent-browser click @e1 # Click element
+agent-browser fill @e2 "text" # Clear and type text
+agent-browser type @e2 "text" # Type without clearing
+agent-browser select @e1 "option" # Select dropdown option
+agent-browser check @e1 # Check checkbox
+agent-browser press Enter # Press key
+agent-browser scroll down 500 # Scroll page
 
-### Snapshot (page analysis)
-```bash
-agent-browser snapshot # Full accessibility tree
-agent-browser snapshot -i # Interactive elements only (recommended)
-agent-browser snapshot -c # Compact output
-agent-browser snapshot -d 3 # Limit depth to 3
-agent-browser snapshot -s "#main" # Scope to CSS selector
-```
+# Get information
+agent-browser get text @e1 # Get element text
+agent-browser get url # Get current URL
+agent-browser get title # Get page title
 
-### Interactions (use @refs from snapshot)
-```bash
-agent-browser click @e1 # Click
-agent-browser dblclick @e1 # Double-click
-agent-browser focus @e1 # Focus element
-agent-browser fill @e2 "text" # Clear and type
-agent-browser type @e2 "text" # Type without clearing
-agent-browser press Enter # Press key
-agent-browser press Control+a # Key combination
-agent-browser keydown Shift # Hold key down
-agent-browser keyup Shift # Release key
-agent-browser hover @e1 # Hover
-agent-browser check @e1 # Check checkbox
-agent-browser uncheck @e1 # Uncheck checkbox
-agent-browser select @e1 "value" # Select dropdown
-agent-browser scroll down 500 # Scroll page
-agent-browser scrollintoview @e1 # Scroll element into view
-agent-browser drag @e1 @e2 # Drag and drop
-agent-browser upload @e1 file.pdf # Upload files
-```
+# Wait
+agent-browser wait @e1 # Wait for element
+agent-browser wait --load networkidle # Wait for network idle
+agent-browser wait --url "**/page" # Wait for URL pattern
+agent-browser wait 2000 # Wait milliseconds
 
-### Get information
-```bash
-agent-browser get text @e1 # Get element text
-agent-browser get html @e1 # Get innerHTML
-agent-browser get value @e1 # Get input value
-agent-browser get attr @e1 href # Get attribute
-agent-browser get title # Get page title
-agent-browser get url # Get current URL
-agent-browser get count ".item" # Count matching elements
-agent-browser get box @e1 # Get bounding box
+# Capture
+agent-browser screenshot # Screenshot to temp dir
+agent-browser screenshot --full # Full page screenshot
+agent-browser pdf output.pdf # Save as PDF
 ```
 
-### Check state
-```bash
-agent-browser is visible @e1 # Check if visible
-agent-browser is enabled @e1 # Check if enabled
-agent-browser is checked @e1 # Check if checked
-```
+## Common Patterns
 
-### Screenshots & PDF
-```bash
-agent-browser screenshot # Screenshot to stdout
-agent-browser screenshot path.png # Save to file
-agent-browser screenshot --full # Full page
-agent-browser pdf output.pdf # Save as PDF
-```
+### Form Submission
 
-### Video recording
 ```bash
-agent-browser record start ./demo.webm # Start recording (uses current URL + state)
-agent-browser click @e1 # Perform actions
-agent-browser record stop # Stop and save video
-agent-browser record restart ./take2.webm # Stop current + start new recording
+agent-browser open https://example.com/signup
+agent-browser snapshot -i
+agent-browser fill @e1 "Jane Doe"
+agent-browser fill @e2 "jane@example.com"
+agent-browser select @e3 "California"
+agent-browser check @e4
+agent-browser click @e5
+agent-browser wait --load networkidle
 ```
-Recording creates a fresh context but preserves cookies/storage from your session. If no URL is provided, it automatically returns to your current page. For smooth demos, explore first, then start recording.
 
-### Wait
-```bash
-agent-browser wait @e1 # Wait for element
-agent-browser wait 2000 # Wait milliseconds
-agent-browser wait --text "Success" # Wait for text
-agent-browser wait --url "**/dashboard" # Wait for URL pattern
-agent-browser wait --load networkidle # Wait for network idle
-agent-browser wait --fn "window.ready" # Wait for JS condition
-```
+### Authentication with State Persistence
 
-### Mouse control
 ```bash
-agent-browser mouse move 100 200 # Move mouse
-agent-browser mouse down left # Press button
-agent-browser mouse up left # Release button
-agent-browser mouse wheel 100 # Scroll wheel
-```
+# Login once and save state
+agent-browser open https://app.example.com/login
+agent-browser snapshot -i
+agent-browser fill @e1 "$USERNAME"
+agent-browser fill @e2 "$PASSWORD"
+agent-browser click @e3
+agent-browser wait --url "**/dashboard"
+agent-browser state save auth.json
 
-### Semantic locators (alternative to refs)
-```bash
-agent-browser find role button click --name "Submit"
-agent-browser find text "Sign In" click
-agent-browser find label "Email" fill "user@test.com"
-agent-browser find first ".item" click
-agent-browser find nth 2 "a" text
+# Reuse in future sessions
+agent-browser state load auth.json
+agent-browser open https://app.example.com/dashboard
 ```
 
-### Browser settings
+### Data Extraction
+
 ```bash
-agent-browser set viewport 1920 1080 # Set viewport size
-agent-browser set device "iPhone 14" # Emulate device
-agent-browser set geo 37.7749 -122.4194 # Set geolocation
-agent-browser set offline on # Toggle offline mode
-agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
-agent-browser set credentials user pass # HTTP basic auth
-agent-browser set media dark # Emulate color scheme
+agent-browser open https://example.com/products
+agent-browser snapshot -i
+agent-browser get text @e5 # Get specific element text
+agent-browser get text body > page.txt # Get all page text
+
+# JSON output for parsing
+agent-browser snapshot -i --json
+agent-browser get text @e1 --json
 ```
 
-### Cookies & Storage
+### Parallel Sessions
+
 ```bash
-agent-browser cookies # Get all cookies
-agent-browser cookies set name value # Set cookie
-agent-browser cookies clear # Clear cookies
-agent-browser storage local # Get all localStorage
-agent-browser storage local key # Get specific key
-agent-browser storage local set k v # Set value
-agent-browser storage local clear # Clear all
-```
+agent-browser --session site1 open https://site-a.com
+agent-browser --session site2 open https://site-b.com
+
+agent-browser --session site1 snapshot -i
+agent-browser --session site2 snapshot -i
 
-### Network
-```bash
-agent-browser network route <url> # Intercept requests
-agent-browser network route <url> --abort # Block requests
-agent-browser network route <url> --body '{}' # Mock response
-agent-browser network unroute [url] # Remove routes
-agent-browser network requests # View tracked requests
-agent-browser network requests --filter api # Filter requests
+agent-browser session list
 ```
 
-### Tabs & Windows
-```bash
-agent-browser tab # List tabs
-agent-browser tab new [url] # New tab
-agent-browser tab 2 # Switch to tab
-agent-browser tab close # Close tab
-agent-browser window new # New window
-```
+### Visual Browser (Debugging)
 
-### Frames
 ```bash
-agent-browser frame "#iframe" # Switch to iframe
-agent-browser frame main # Back to main frame
+agent-browser --headed open https://example.com
+agent-browser highlight @e1 # Highlight element
+agent-browser record start demo.webm # Record session
 ```
 
-### Dialogs
-```bash
-agent-browser dialog accept [text] # Accept dialog
-agent-browser dialog dismiss # Dismiss dialog
-```
+### Local Files (PDFs, HTML)
 
-### JavaScript
 ```bash
-agent-browser eval "document.title" # Run JavaScript
+# Open local files with file:// URLs
+agent-browser --allow-file-access open file:///path/to/document.pdf
+agent-browser --allow-file-access open file:///path/to/page.html
+agent-browser screenshot output.png
 ```
 
-## Example: Form submission
+### iOS Simulator (Mobile Safari)
 
 ```bash
-agent-browser open https://example.com/form
-agent-browser snapshot -i
-# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
+# List available iOS simulators
+agent-browser device list
 
-agent-browser fill @e1 "user@example.com"
-agent-browser fill @e2 "password123"
-agent-browser click @e3
-agent-browser wait --load networkidle
-agent-browser snapshot -i # Check result
-```
+# Launch Safari on a specific device
+agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
 
-## Example: Authentication with saved state
+# Same workflow as desktop - snapshot, interact, re-snapshot
+agent-browser -p ios snapshot -i
+agent-browser -p ios tap @e1 # Tap (alias for click)
+agent-browser -p ios fill @e2 "text"
+agent-browser -p ios swipe up # Mobile-specific gesture
 
-```bash
-# Login once
-agent-browser open https://app.example.com/login
-agent-browser snapshot -i
-agent-browser fill @e1 "username"
-agent-browser fill @e2 "password"
-agent-browser click @e3
-agent-browser wait --url "**/dashboard"
-agent-browser state save auth.json
+# Take screenshot
+agent-browser -p ios screenshot mobile.png
 
-# Later sessions: load saved state
-agent-browser state load auth.json
-agent-browser open https://app.example.com/dashboard
+# Close session (shuts down simulator)
+agent-browser -p ios close
 ```
 
-## Sessions (parallel browsers)
+**Requirements:** macOS with Xcode, Appium (`npm install -g appium && appium driver install xcuitest`)
 
-```bash
-agent-browser --session test1 open site-a.com
-agent-browser --session test2 open site-b.com
-agent-browser session list
-```
+**Real devices:** Works with physical iOS devices if pre-configured. Use `--device "<UDID>"` where UDID is from `xcrun xctrace list devices`.
 
-## JSON output (for parsing)
+## Ref Lifecycle (Important)
 
-Add `--json` for machine-readable output:
+Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after:
+
+- Clicking links or buttons that navigate
+- Form submissions
+- Dynamic content loading (dropdowns, modals)
+
 ```bash
-agent-browser snapshot -i --json
-agent-browser get text @e1 --json
+agent-browser click @e5 # Navigates to new page
+agent-browser snapshot -i # MUST re-snapshot
+agent-browser click @e1 # Use new refs
 ```
 
-## Debugging
+## Semantic Locators (Alternative to Refs)
+
+When refs are unavailable or unreliable, use semantic locators:
 
 ```bash
-agent-browser open example.com --headed # Show browser window
-agent-browser console # View console messages
-agent-browser errors # View page errors
-agent-browser record start ./debug.webm # Record from current page
-agent-browser record stop # Save recording
-agent-browser open example.com --headed # Show browser window
-agent-browser --cdp 9222 snapshot # Connect via CDP
-agent-browser console # View console messages
-agent-browser console --clear # Clear console
-agent-browser errors # View page errors
-agent-browser errors --clear # Clear errors
-agent-browser highlight @e1 # Highlight element
-agent-browser trace start # Start recording trace
-agent-browser trace stop trace.zip # Stop and save trace
+agent-browser find text "Sign In" click
+agent-browser find label "Email" fill "user@test.com"
+agent-browser find role button click --name "Submit"
+agent-browser find placeholder "Search" type "query"
+agent-browser find testid "submit-btn" click
 ```
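The "Ref Lifecycle" rule in the new SKILL.md (re-snapshot after anything that changes the page) can be folded into a small wrapper so the step is hard to forget. This is only a sketch: it uses the `click`, `wait --load networkidle`, and `snapshot -i` subcommands exactly as they appear in the diff, while the `ab` and `click_and_resnapshot` function names and the `AGENT_BROWSER` override variable are hypothetical and not part of agent-browser.

```bash
# Hypothetical indirection: AGENT_BROWSER lets the sketch be dry-run with a
# stub (for example AGENT_BROWSER=echo); it defaults to the real CLI.
ab() {
  "${AGENT_BROWSER:-agent-browser}" "$@"
}

# Click a ref, wait for the page to settle, then print fresh refs.
# Refs captured before the click should be treated as stale afterwards.
click_and_resnapshot() {
  ab click "$1" &&
    ab wait --load networkidle &&
    ab snapshot -i
}
```

Usage would look like `click_and_resnapshot @e5`, after which only refs from the newly printed snapshot are used.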
agents/skills/ask-questions-if-underspecified/SKILL.md → agents/disabled-skills/ask-questions-if-underspecified/SKILL.md
agents/skills/duckdb/SKILL.md → agents/disabled-skills/duckdb/SKILL.md
+54
agents/skills/web-search/SKILL.md
···
+---
+name: web-search
+description: Use when you need fast, headless web search or readable page content without a browser, especially when you only have curl/wget and need a deterministic URL-based endpoint
+---
+
+# Jina AI Search + Reader
+
+## Overview
+Use Jina AI’s public endpoints to (1) search the web and (2) fetch readable page content as plain text/markdown via a single URL.
+
+## When to Use
+- You need quick web search results without an API key or browser
+- You need readable page content from a URL for summarization or analysis
+- You’re in a headless environment and only have HTTP tools (curl/wget)
+
+When NOT to use:
+- You need advanced ranking, filters, or custom search parameters (use a full search API)
+
+## Quick Reference
+
+### Search (s.jina.ai)
+```
+https://s.jina.ai/YOUR_SEARCH_QUERY
+```
+
+### Reader (r.jina.ai)
+```
+https://r.jina.ai/YOUR_URL
+```
+
+## Implementation
+
+### 1) Search for pages
+```bash
+curl "https://s.jina.ai/jina%20ai%20reader%20usage"
+```
+- URL-encode spaces and special characters in the query.
+- Output returns search results with titles/snippets/links (plain text).
+
+### 2) Fetch readable page content
+```bash
+curl "https://r.jina.ai/https://example.com/article"
+```
+- Prepend `https://r.jina.ai/` to any HTTP/HTTPS URL.
+- Output is readable text/markdown for the target page.
+
+### 3) Typical workflow
+1. Use `s.jina.ai` to discover relevant links.
+2. Use `r.jina.ai` to fetch readable content from those links.
+
+## Common Mistakes
+- Forgetting to URL-encode the search query → results in malformed requests.
+- Omitting the original URL scheme (http/https) after `r.jina.ai/`.
+- Assuming `r.jina.ai` performs search (it only reads a specific URL).
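The first two "Common Mistakes" in the new web-search skill (unencoded queries, missing URL scheme) are easy to guard against in a wrapper. The sketch below assumes bash and that the two endpoints behave as the SKILL.md describes; the function names `urlencode`, `jina_search`, and `jina_read` are hypothetical, and the encoder is only exercised here on ASCII input.

```bash
# Percent-encode everything except RFC 3986 unreserved characters.
# Bash-specific (substring expansion, printf -v).
urlencode() {
  local LC_ALL=C
  local s="$1" out="" c hex i
  for (( i = 0; i < ${#s}; i++ )); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9.~_-]) out+="$c" ;;
      *) printf -v hex '%%%02X' "'$c"; out+="$hex" ;;
    esac
  done
  printf '%s\n' "$out"
}

# Search: the query goes into the URL path, so it must be encoded first.
jina_search() {
  curl -fsS "https://s.jina.ai/$(urlencode "$1")"
}

# Reader: refuse URLs that lack an explicit http:// or https:// scheme.
jina_read() {
  case "$1" in
    http://*|https://*) curl -fsS "https://r.jina.ai/$1" ;;
    *) echo "jina_read: URL must start with http:// or https://" >&2
       return 2 ;;
  esac
}
```

With these defined, `jina_search "jina ai reader usage"` requests the same URL as the `curl` example in the diff, and `jina_read "https://example.com/article"` mirrors the reader example.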