rudimentry _redirects support, incremental uploading for cli #3

+31 -1

README.md

··· 50 50 cargo build 51 51 ``` 52 52 53 + ## Features 54 + 55 + ### URL Redirects and Rewrites 56 + 57 + The hosting service supports Netlify-style `_redirects` files for managing URLs. Place a `_redirects` file in your site root to enable: 58 + 59 + - **301/302 Redirects**: Permanent and temporary URL redirects 60 + - **200 Rewrites**: Serve different content without changing the URL 61 + - **404 Custom Pages**: Custom error pages for specific paths 62 + - **Splats & Placeholders**: Dynamic path matching (`/blog/:year/:month/:day`, `/news/*`) 63 + - **Query Parameter Matching**: Redirect based on URL parameters 64 + - **Conditional Redirects**: Route by country, language, or cookie presence 65 + - **Force Redirects**: Override existing files with redirects 66 + 67 + Example `_redirects`: 68 + ``` 69 + # Single-page app routing (React, Vue, etc.) 70 + /* /index.html 200 71 + 72 + # Simple redirects 73 + /home / 74 + /old-blog/* /blog/:splat 75 + 76 + # API proxy 77 + /api/* https://api.example.com/:splat 200 78 + 79 + # Country-based routing 80 + / /us/ 302 Country=us 81 + / /uk/ 302 Country=gb 82 + ``` 83 + 53 84 ## Limits 54 85 55 86 - Max file size: 100MB (PDS limit) 56 - - Max site size: 300MB 57 87 - Max files: 2000 58 88 59 89 ## Tech Stack

-123

hosting-service/EXAMPLE.md

··· 1 - # HTML Path Rewriting Example 2 - 3 - This document demonstrates how HTML path rewriting works when serving sites via the `/s/:identifier/:site/*` route. 4 - 5 - ## Problem 6 - 7 - When you create a static site with absolute paths like `/style.css` or `/images/logo.png`, these paths work fine when served from the root domain. However, when served from a subdirectory like `/s/alice.bsky.social/mysite/`, these absolute paths break because they resolve to the server root instead of the site root. 8 - 9 - ## Solution 10 - 11 - The hosting service automatically rewrites absolute paths in HTML files to work correctly in the subdirectory context. 12 - 13 - ## Example 14 - 15 - **Original HTML file (index.html):** 16 - ```html 17 - <!DOCTYPE html> 18 - <html> 19 - <head> 20 - <meta charset="UTF-8"> 21 - <title>My Site</title> 22 - <link rel="stylesheet" href="/style.css"> 23 - <link rel="icon" href="/favicon.ico"> 24 - <script src="/app.js"></script> 25 - </head> 26 - <body> 27 - <header> 28 - <img src="/images/logo.png" alt="Logo"> 29 - <nav> 30 - <a href="/">Home</a> 31 - <a href="/about">About</a> 32 - <a href="/contact">Contact</a> 33 - </nav> 34 - </header> 35 - 36 - <main> 37 - <h1>Welcome</h1> 38 - <img src="/images/hero.jpg" 39 - srcset="/images/hero.jpg 1x, /images/hero@2x.jpg 2x" 40 - alt="Hero"> 41 - 42 - <form action="/submit" method="post"> 43 - <input type="text" name="email"> 44 - <button>Submit</button> 45 - </form> 46 - </main> 47 - 48 - <footer> 49 - <a href="https://example.com">External Link</a> 50 - <a href="#top">Back to Top</a> 51 - </footer> 52 - </body> 53 - </html> 54 - ``` 55 - 56 - **When accessed via `/s/alice.bsky.social/mysite/`, the HTML is rewritten to:** 57 - ```html 58 - <!DOCTYPE html> 59 - <html> 60 - <head> 61 - <meta charset="UTF-8"> 62 - <title>My Site</title> 63 - <link rel="stylesheet" href="/s/alice.bsky.social/mysite/style.css"> 64 - <link rel="icon" href="/s/alice.bsky.social/mysite/favicon.ico"> 65 - <script src="/s/alice.bsky.social/mysite/app.js"></script> 66 - </head> 67 - <body> 68 - <header> 69 - <img src="/s/alice.bsky.social/mysite/images/logo.png" alt="Logo"> 70 - <nav> 71 - <a href="/s/alice.bsky.social/mysite/">Home</a> 72 - <a href="/s/alice.bsky.social/mysite/about">About</a> 73 - <a href="/s/alice.bsky.social/mysite/contact">Contact</a> 74 - </nav> 75 - </header> 76 - 77 - <main> 78 - <h1>Welcome</h1> 79 - <img src="/s/alice.bsky.social/mysite/images/hero.jpg" 80 - srcset="/s/alice.bsky.social/mysite/images/hero.jpg 1x, /s/alice.bsky.social/mysite/images/hero@2x.jpg 2x" 81 - alt="Hero"> 82 - 83 - <form action="/s/alice.bsky.social/mysite/submit" method="post"> 84 - <input type="text" name="email"> 85 - <button>Submit</button> 86 - </form> 87 - </main> 88 - 89 - <footer> 90 - <a href="https://example.com">External Link</a> 91 - <a href="#top">Back to Top</a> 92 - </footer> 93 - </body> 94 - </html> 95 - ``` 96 - 97 - ## What's Preserved 98 - 99 - Notice that: 100 - - ✅ Absolute paths are rewritten: `/style.css` → `/s/alice.bsky.social/mysite/style.css` 101 - - ✅ External URLs are preserved: `https://example.com` stays the same 102 - - ✅ Anchors are preserved: `#top` stays the same 103 - - ✅ The rewriting is safe and won't break your site 104 - 105 - ## Supported Attributes 106 - 107 - The rewriter handles these HTML attributes: 108 - - `src` - images, scripts, iframes, videos, audio 109 - - `href` - links, stylesheets 110 - - `action` - forms 111 - - `data` - objects 112 - - `poster` - video posters 113 - - `srcset` - responsive images 114 - 115 - ## Testing Your Site 116 - 117 - To test if your site works with path rewriting: 118 - 119 - 1. Upload your site to your PDS as a `place.wisp.fs` record 120 - 2. Access it via: `https://hosting.wisp.place/s/YOUR_HANDLE/SITE_NAME/` 121 - 3. Check that all resources load correctly 122 - 123 - If you're using relative paths already (like `./style.css` or `../images/logo.png`), they'll work without any rewriting.

+134

hosting-service/example-_redirects

··· 1 + # Example _redirects file for Wisp hosting 2 + # Place this file in the root directory of your site as "_redirects" 3 + # Lines starting with # are comments 4 + 5 + # =================================== 6 + # SIMPLE REDIRECTS 7 + # =================================== 8 + 9 + # Redirect home page 10 + # /home / 11 + 12 + # Redirect old URLs to new ones 13 + # /old-blog /blog 14 + # /about-us /about 15 + 16 + # =================================== 17 + # SPLAT REDIRECTS (WILDCARDS) 18 + # =================================== 19 + 20 + # Redirect entire directories 21 + # /news/* /blog/:splat 22 + # /old-site/* /new-site/:splat 23 + 24 + # =================================== 25 + # PLACEHOLDER REDIRECTS 26 + # =================================== 27 + 28 + # Restructure blog URLs 29 + # /blog/:year/:month/:day/:slug /posts/:year-:month-:day/:slug 30 + 31 + # Capture multiple parameters 32 + # /products/:category/:id /shop/:category/item/:id 33 + 34 + # =================================== 35 + # STATUS CODES 36 + # =================================== 37 + 38 + # Permanent redirect (301) - default if not specified 39 + # /permanent-move /new-location 301 40 + 41 + # Temporary redirect (302) 42 + # /temp-redirect /temp-location 302 43 + 44 + # Rewrite (200) - serves different content, URL stays the same 45 + # /api/* /functions/:splat 200 46 + 47 + # Custom 404 page 48 + # /shop/* /shop-closed.html 404 49 + 50 + # =================================== 51 + # FORCE REDIRECTS 52 + # =================================== 53 + 54 + # Force redirect even if file exists (note the ! after status code) 55 + # /override-file /other-file.html 200! 56 + 57 + # =================================== 58 + # CONDITIONAL REDIRECTS 59 + # =================================== 60 + 61 + # Country-based redirects (ISO 3166-1 alpha-2 codes) 62 + # / /us/ 302 Country=us 63 + # / /uk/ 302 Country=gb 64 + # / /anz/ 302 Country=au,nz 65 + 66 + # Language-based redirects 67 + # /products /en/products 301 Language=en 68 + # /products /de/products 301 Language=de 69 + # /products /fr/products 301 Language=fr 70 + 71 + # Cookie-based redirects (checks if cookie exists) 72 + # /* /legacy/:splat 200 Cookie=is_legacy 73 + 74 + # =================================== 75 + # QUERY PARAMETERS 76 + # =================================== 77 + 78 + # Match specific query parameters 79 + # /store id=:id /blog/:id 301 80 + 81 + # Multiple parameters 82 + # /search q=:query category=:cat /find/:cat/:query 301 83 + 84 + # =================================== 85 + # DOMAIN-LEVEL REDIRECTS 86 + # =================================== 87 + 88 + # Redirect to different domain (must include protocol) 89 + # /external https://example.com/path 90 + 91 + # Redirect entire subdomain 92 + # http://blog.example.com/* https://example.com/blog/:splat 301! 93 + # https://blog.example.com/* https://example.com/blog/:splat 301! 94 + 95 + # =================================== 96 + # COMMON PATTERNS 97 + # =================================== 98 + 99 + # Remove .html extensions 100 + # /page.html /page 101 + 102 + # Add trailing slash 103 + # /about /about/ 104 + 105 + # Single-page app fallback (serve index.html for all paths) 106 + # /* /index.html 200 107 + 108 + # API proxy 109 + # /api/* https://api.example.com/:splat 200 110 + 111 + # =================================== 112 + # CUSTOM ERROR PAGES 113 + # =================================== 114 + 115 + # Language-specific 404 pages 116 + # /en/* /en/404.html 404 117 + # /de/* /de/404.html 404 118 + 119 + # Section-specific 404 pages 120 + # /shop/* /shop/not-found.html 404 121 + # /blog/* /blog/404.html 404 122 + 123 + # =================================== 124 + # NOTES 125 + # =================================== 126 + # 127 + # - Rules are processed in order (first match wins) 128 + # - More specific rules should come before general ones 129 + # - Splats (*) can only be used at the end of a path 130 + # - Query parameters are automatically preserved for 200, 301, 302 131 + # - Trailing slashes are normalized (/ and no / are treated the same) 132 + # - Default status code is 301 if not specified 133 + # 134 +

+215

hosting-service/src/lib/redirects.test.ts

··· 1 + import { describe, it, expect } from 'bun:test' 2 + import { parseRedirectsFile, matchRedirectRule } from './redirects'; 3 + 4 + describe('parseRedirectsFile', () => { 5 + it('should parse simple redirects', () => { 6 + const content = ` 7 + # Comment line 8 + /old-path /new-path 9 + /home / 301 10 + `; 11 + const rules = parseRedirectsFile(content); 12 + expect(rules).toHaveLength(2); 13 + expect(rules[0]).toMatchObject({ 14 + from: '/old-path', 15 + to: '/new-path', 16 + status: 301, 17 + force: false, 18 + }); 19 + expect(rules[1]).toMatchObject({ 20 + from: '/home', 21 + to: '/', 22 + status: 301, 23 + force: false, 24 + }); 25 + }); 26 + 27 + it('should parse redirects with different status codes', () => { 28 + const content = ` 29 + /temp-redirect /target 302 30 + /rewrite /content 200 31 + /not-found /404 404 32 + `; 33 + const rules = parseRedirectsFile(content); 34 + expect(rules).toHaveLength(3); 35 + expect(rules[0]?.status).toBe(302); 36 + expect(rules[1]?.status).toBe(200); 37 + expect(rules[2]?.status).toBe(404); 38 + }); 39 + 40 + it('should parse force redirects', () => { 41 + const content = `/force-path /target 301!`; 42 + const rules = parseRedirectsFile(content); 43 + expect(rules[0]?.force).toBe(true); 44 + expect(rules[0]?.status).toBe(301); 45 + }); 46 + 47 + it('should parse splat redirects', () => { 48 + const content = `/news/* /blog/:splat`; 49 + const rules = parseRedirectsFile(content); 50 + expect(rules[0]?.from).toBe('/news/*'); 51 + expect(rules[0]?.to).toBe('/blog/:splat'); 52 + }); 53 + 54 + it('should parse placeholder redirects', () => { 55 + const content = `/blog/:year/:month/:day /posts/:year-:month-:day`; 56 + const rules = parseRedirectsFile(content); 57 + expect(rules[0]?.from).toBe('/blog/:year/:month/:day'); 58 + expect(rules[0]?.to).toBe('/posts/:year-:month-:day'); 59 + }); 60 + 61 + it('should parse country-based redirects', () => { 62 + const content = `/ /anz 302 Country=au,nz`; 63 + const rules = parseRedirectsFile(content); 64 + expect(rules[0]?.conditions?.country).toEqual(['au', 'nz']); 65 + }); 66 + 67 + it('should parse language-based redirects', () => { 68 + const content = `/products /en/products 301 Language=en`; 69 + const rules = parseRedirectsFile(content); 70 + expect(rules[0]?.conditions?.language).toEqual(['en']); 71 + }); 72 + 73 + it('should parse cookie-based redirects', () => { 74 + const content = `/* /legacy/:splat 200 Cookie=is_legacy,my_cookie`; 75 + const rules = parseRedirectsFile(content); 76 + expect(rules[0]?.conditions?.cookie).toEqual(['is_legacy', 'my_cookie']); 77 + }); 78 + }); 79 + 80 + describe('matchRedirectRule', () => { 81 + it('should match exact paths', () => { 82 + const rules = parseRedirectsFile('/old-path /new-path'); 83 + const match = matchRedirectRule('/old-path', rules); 84 + expect(match).toBeTruthy(); 85 + expect(match?.targetPath).toBe('/new-path'); 86 + expect(match?.status).toBe(301); 87 + }); 88 + 89 + it('should match paths with trailing slash', () => { 90 + const rules = parseRedirectsFile('/old-path /new-path'); 91 + const match = matchRedirectRule('/old-path/', rules); 92 + expect(match).toBeTruthy(); 93 + expect(match?.targetPath).toBe('/new-path'); 94 + }); 95 + 96 + it('should match splat patterns', () => { 97 + const rules = parseRedirectsFile('/news/* /blog/:splat'); 98 + const match = matchRedirectRule('/news/2024/01/15/my-post', rules); 99 + expect(match).toBeTruthy(); 100 + expect(match?.targetPath).toBe('/blog/2024/01/15/my-post'); 101 + }); 102 + 103 + it('should match placeholder patterns', () => { 104 + const rules = parseRedirectsFile('/blog/:year/:month/:day /posts/:year-:month-:day'); 105 + const match = matchRedirectRule('/blog/2024/01/15', rules); 106 + expect(match).toBeTruthy(); 107 + expect(match?.targetPath).toBe('/posts/2024-01-15'); 108 + }); 109 + 110 + it('should preserve query strings for 301/302 redirects', () => { 111 + const rules = parseRedirectsFile('/old /new 301'); 112 + const match = matchRedirectRule('/old', rules, { 113 + queryParams: { foo: 'bar', baz: 'qux' }, 114 + }); 115 + expect(match?.targetPath).toContain('?'); 116 + expect(match?.targetPath).toContain('foo=bar'); 117 + expect(match?.targetPath).toContain('baz=qux'); 118 + }); 119 + 120 + it('should match based on query parameters', () => { 121 + const rules = parseRedirectsFile('/store id=:id /blog/:id 301'); 122 + const match = matchRedirectRule('/store', rules, { 123 + queryParams: { id: 'my-post' }, 124 + }); 125 + expect(match).toBeTruthy(); 126 + expect(match?.targetPath).toContain('/blog/my-post'); 127 + }); 128 + 129 + it('should not match when query params are missing', () => { 130 + const rules = parseRedirectsFile('/store id=:id /blog/:id 301'); 131 + const match = matchRedirectRule('/store', rules, { 132 + queryParams: {}, 133 + }); 134 + expect(match).toBeNull(); 135 + }); 136 + 137 + it('should match based on country header', () => { 138 + const rules = parseRedirectsFile('/ /aus 302 Country=au'); 139 + const match = matchRedirectRule('/', rules, { 140 + headers: { 'cf-ipcountry': 'AU' }, 141 + }); 142 + expect(match).toBeTruthy(); 143 + expect(match?.targetPath).toBe('/aus'); 144 + }); 145 + 146 + it('should not match wrong country', () => { 147 + const rules = parseRedirectsFile('/ /aus 302 Country=au'); 148 + const match = matchRedirectRule('/', rules, { 149 + headers: { 'cf-ipcountry': 'US' }, 150 + }); 151 + expect(match).toBeNull(); 152 + }); 153 + 154 + it('should match based on language header', () => { 155 + const rules = parseRedirectsFile('/products /en/products 301 Language=en'); 156 + const match = matchRedirectRule('/products', rules, { 157 + headers: { 'accept-language': 'en-US,en;q=0.9' }, 158 + }); 159 + expect(match).toBeTruthy(); 160 + expect(match?.targetPath).toBe('/en/products'); 161 + }); 162 + 163 + it('should match based on cookie presence', () => { 164 + const rules = parseRedirectsFile('/* /legacy/:splat 200 Cookie=is_legacy'); 165 + const match = matchRedirectRule('/some-path', rules, { 166 + cookies: { is_legacy: 'true' }, 167 + }); 168 + expect(match).toBeTruthy(); 169 + expect(match?.targetPath).toBe('/legacy/some-path'); 170 + }); 171 + 172 + it('should return first matching rule', () => { 173 + const content = ` 174 + /path /first 175 + /path /second 176 + `; 177 + const rules = parseRedirectsFile(content); 178 + const match = matchRedirectRule('/path', rules); 179 + expect(match?.targetPath).toBe('/first'); 180 + }); 181 + 182 + it('should match more specific rules before general ones', () => { 183 + const content = ` 184 + /jobs/customer-ninja /careers/support 185 + /jobs/* /careers/:splat 186 + `; 187 + const rules = parseRedirectsFile(content); 188 + 189 + const match1 = matchRedirectRule('/jobs/customer-ninja', rules); 190 + expect(match1?.targetPath).toBe('/careers/support'); 191 + 192 + const match2 = matchRedirectRule('/jobs/developer', rules); 193 + expect(match2?.targetPath).toBe('/careers/developer'); 194 + }); 195 + 196 + it('should handle SPA routing pattern', () => { 197 + const rules = parseRedirectsFile('/* /index.html 200'); 198 + 199 + // Should match any path 200 + const match1 = matchRedirectRule('/about', rules); 201 + expect(match1).toBeTruthy(); 202 + expect(match1?.targetPath).toBe('/index.html'); 203 + expect(match1?.status).toBe(200); 204 + 205 + const match2 = matchRedirectRule('/users/123/profile', rules); 206 + expect(match2).toBeTruthy(); 207 + expect(match2?.targetPath).toBe('/index.html'); 208 + expect(match2?.status).toBe(200); 209 + 210 + const match3 = matchRedirectRule('/', rules); 211 + expect(match3).toBeTruthy(); 212 + expect(match3?.targetPath).toBe('/index.html'); 213 + }); 214 + }); 215 +

+413

hosting-service/src/lib/redirects.ts

··· 1 + import { readFile } from 'fs/promises'; 2 + import { existsSync } from 'fs'; 3 + 4 + export interface RedirectRule { 5 + from: string; 6 + to: string; 7 + status: number; 8 + force: boolean; 9 + conditions?: { 10 + country?: string[]; 11 + language?: string[]; 12 + role?: string[]; 13 + cookie?: string[]; 14 + }; 15 + // For pattern matching 16 + fromPattern?: RegExp; 17 + fromParams?: string[]; // Named parameters from the pattern 18 + queryParams?: Record<string, string>; // Expected query parameters 19 + } 20 + 21 + export interface RedirectMatch { 22 + rule: RedirectRule; 23 + targetPath: string; 24 + status: number; 25 + } 26 + 27 + /** 28 + * Parse a _redirects file into an array of redirect rules 29 + */ 30 + export function parseRedirectsFile(content: string): RedirectRule[] { 31 + const lines = content.split('\n'); 32 + const rules: RedirectRule[] = []; 33 + 34 + for (let lineNum = 0; lineNum < lines.length; lineNum++) { 35 + const lineRaw = lines[lineNum]; 36 + if (!lineRaw) continue; 37 + 38 + const line = lineRaw.trim(); 39 + 40 + // Skip empty lines and comments 41 + if (!line || line.startsWith('#')) { 42 + continue; 43 + } 44 + 45 + try { 46 + const rule = parseRedirectLine(line); 47 + if (rule && rule.fromPattern) { 48 + rules.push(rule); 49 + } 50 + } catch (err) { 51 + console.warn(`Failed to parse redirect rule on line ${lineNum + 1}: ${line}`, err); 52 + } 53 + } 54 + 55 + return rules; 56 + } 57 + 58 + /** 59 + * Parse a single redirect rule line 60 + * Format: /from [query_params] /to [status] [conditions] 61 + */ 62 + function parseRedirectLine(line: string): RedirectRule | null { 63 + // Split by whitespace, but respect quoted strings (though not commonly used) 64 + const parts = line.split(/\s+/); 65 + 66 + if (parts.length < 2) { 67 + return null; 68 + } 69 + 70 + let idx = 0; 71 + const from = parts[idx++]; 72 + 73 + if (!from) { 74 + return null; 75 + } 76 + 77 + let status = 301; // Default status 78 + let force = false; 79 + const conditions: NonNullable<RedirectRule['conditions']> = {}; 80 + const queryParams: Record<string, string> = {}; 81 + 82 + // Parse query parameters that come before the destination path 83 + // They look like: key=:value (and don't start with /) 84 + while (idx < parts.length) { 85 + const part = parts[idx]; 86 + if (!part) { 87 + idx++; 88 + continue; 89 + } 90 + 91 + // If it starts with / or http, it's the destination path 92 + if (part.startsWith('/') || part.startsWith('http://') || part.startsWith('https://')) { 93 + break; 94 + } 95 + 96 + // If it contains = and comes before the destination, it's a query param 97 + if (part.includes('=')) { 98 + const splitIndex = part.indexOf('='); 99 + const key = part.slice(0, splitIndex); 100 + const value = part.slice(splitIndex + 1); 101 + 102 + if (key && value) { 103 + queryParams[key] = value; 104 + } 105 + idx++; 106 + } else { 107 + // Not a query param, must be destination or something else 108 + break; 109 + } 110 + } 111 + 112 + // Next part should be the destination 113 + if (idx >= parts.length) { 114 + return null; 115 + } 116 + 117 + const to = parts[idx++]; 118 + if (!to) { 119 + return null; 120 + } 121 + 122 + // Parse remaining parts for status code and conditions 123 + for (let i = idx; i < parts.length; i++) { 124 + const part = parts[i]; 125 + 126 + if (!part) continue; 127 + 128 + // Check for status code (with optional ! for force) 129 + if (/^\d+!?$/.test(part)) { 130 + if (part.endsWith('!')) { 131 + force = true; 132 + status = parseInt(part.slice(0, -1)); 133 + } else { 134 + status = parseInt(part); 135 + } 136 + continue; 137 + } 138 + 139 + // Check for condition parameters (Country=, Language=, Role=, Cookie=) 140 + if (part.includes('=')) { 141 + const splitIndex = part.indexOf('='); 142 + const key = part.slice(0, splitIndex); 143 + const value = part.slice(splitIndex + 1); 144 + 145 + if (!key || !value) continue; 146 + 147 + const keyLower = key.toLowerCase(); 148 + 149 + if (keyLower === 'country') { 150 + conditions.country = value.split(',').map(v => v.trim().toLowerCase()); 151 + } else if (keyLower === 'language') { 152 + conditions.language = value.split(',').map(v => v.trim().toLowerCase()); 153 + } else if (keyLower === 'role') { 154 + conditions.role = value.split(',').map(v => v.trim()); 155 + } else if (keyLower === 'cookie') { 156 + conditions.cookie = value.split(',').map(v => v.trim().toLowerCase()); 157 + } 158 + } 159 + } 160 + 161 + // Parse the 'from' pattern 162 + const { pattern, params } = convertPathToRegex(from); 163 + 164 + return { 165 + from, 166 + to, 167 + status, 168 + force, 169 + conditions: Object.keys(conditions).length > 0 ? conditions : undefined, 170 + queryParams: Object.keys(queryParams).length > 0 ? queryParams : undefined, 171 + fromPattern: pattern, 172 + fromParams: params, 173 + }; 174 + } 175 + 176 + /** 177 + * Convert a path pattern with placeholders and splats to a regex 178 + * Examples: 179 + * /blog/:year/:month/:day -> captures year, month, day 180 + * /news/* -> captures splat 181 + */ 182 + function convertPathToRegex(pattern: string): { pattern: RegExp; params: string[] } { 183 + const params: string[] = []; 184 + let regexStr = '^'; 185 + 186 + // Split by query string if present 187 + const pathPart = pattern.split('?')[0] || pattern; 188 + 189 + // Escape special regex characters except * and : 190 + let escaped = pathPart.replace(/[.+^${}()|[\]\\]/g, '\\$&'); 191 + 192 + // Replace :param with named capture groups 193 + escaped = escaped.replace(/:([a-zA-Z_][a-zA-Z0-9_]*)/g, (match, paramName) => { 194 + params.push(paramName); 195 + // Match path segment (everything except / and ?) 196 + return '([^/?]+)'; 197 + }); 198 + 199 + // Replace * with splat capture (matches everything including /) 200 + if (escaped.includes('*')) { 201 + escaped = escaped.replace(/\*/g, '(.*)'); 202 + params.push('splat'); 203 + } 204 + 205 + regexStr += escaped; 206 + 207 + // Make trailing slash optional 208 + if (!regexStr.endsWith('.*')) { 209 + regexStr += '/?'; 210 + } 211 + 212 + regexStr += '$'; 213 + 214 + return { 215 + pattern: new RegExp(regexStr), 216 + params, 217 + }; 218 + } 219 + 220 + /** 221 + * Match a request path against redirect rules 222 + */ 223 + export function matchRedirectRule( 224 + requestPath: string, 225 + rules: RedirectRule[], 226 + context?: { 227 + queryParams?: Record<string, string>; 228 + headers?: Record<string, string>; 229 + cookies?: Record<string, string>; 230 + } 231 + ): RedirectMatch | null { 232 + // Normalize path: ensure leading slash, remove trailing slash (except for root) 233 + let normalizedPath = requestPath.startsWith('/') ? requestPath : `/${requestPath}`; 234 + 235 + for (const rule of rules) { 236 + // Check query parameter conditions first (if any) 237 + if (rule.queryParams) { 238 + // If rule requires query params but none provided, skip this rule 239 + if (!context?.queryParams) { 240 + continue; 241 + } 242 + 243 + const queryMatches = Object.entries(rule.queryParams).every(([key, value]) => { 244 + const actualValue = context.queryParams?.[key]; 245 + return actualValue !== undefined; 246 + }); 247 + 248 + if (!queryMatches) { 249 + continue; 250 + } 251 + } 252 + 253 + // Check conditional redirects (country, language, role, cookie) 254 + if (rule.conditions) { 255 + if (rule.conditions.country && context?.headers) { 256 + const cfCountry = context.headers['cf-ipcountry']; 257 + const xCountry = context.headers['x-country']; 258 + const country = (cfCountry?.toLowerCase() || xCountry?.toLowerCase()); 259 + if (!country || !rule.conditions.country.includes(country)) { 260 + continue; 261 + } 262 + } 263 + 264 + if (rule.conditions.language && context?.headers) { 265 + const acceptLang = context.headers['accept-language']; 266 + if (!acceptLang) { 267 + continue; 268 + } 269 + // Parse accept-language header (simplified) 270 + const langs = acceptLang.split(',').map(l => { 271 + const langPart = l.split(';')[0]; 272 + return langPart ? langPart.trim().toLowerCase() : ''; 273 + }).filter(l => l !== ''); 274 + const hasMatch = rule.conditions.language.some(lang => 275 + langs.some(l => l === lang || l.startsWith(lang + '-')) 276 + ); 277 + if (!hasMatch) { 278 + continue; 279 + } 280 + } 281 + 282 + if (rule.conditions.cookie && context?.cookies) { 283 + const hasCookie = rule.conditions.cookie.some(cookieName => 284 + context.cookies && cookieName in context.cookies 285 + ); 286 + if (!hasCookie) { 287 + continue; 288 + } 289 + } 290 + 291 + // Role-based redirects would need JWT verification - skip for now 292 + if (rule.conditions.role) { 293 + continue; 294 + } 295 + } 296 + 297 + // Match the path pattern 298 + const match = rule.fromPattern?.exec(normalizedPath); 299 + if (!match) { 300 + continue; 301 + } 302 + 303 + // Build the target path by replacing placeholders 304 + let targetPath = rule.to; 305 + 306 + // Replace captured parameters 307 + if (rule.fromParams && match.length > 1) { 308 + for (let i = 0; i < rule.fromParams.length; i++) { 309 + const paramName = rule.fromParams[i]; 310 + const paramValue = match[i + 1]; 311 + 312 + if (!paramName || !paramValue) continue; 313 + 314 + if (paramName === 'splat') { 315 + targetPath = targetPath.replace(':splat', paramValue); 316 + } else { 317 + targetPath = targetPath.replace(`:${paramName}`, paramValue); 318 + } 319 + } 320 + } 321 + 322 + // Handle query parameter replacements 323 + if (rule.queryParams && context?.queryParams) { 324 + for (const [key, placeholder] of Object.entries(rule.queryParams)) { 325 + const actualValue = context.queryParams[key]; 326 + if (actualValue && placeholder && placeholder.startsWith(':')) { 327 + const paramName = placeholder.slice(1); 328 + if (paramName) { 329 + targetPath = targetPath.replace(`:${paramName}`, actualValue); 330 + } 331 + } 332 + } 333 + } 334 + 335 + // Preserve query string for 200, 301, 302 redirects (unless target already has one) 336 + if ([200, 301, 302].includes(rule.status) && context?.queryParams && !targetPath.includes('?')) { 337 + const queryString = Object.entries(context.queryParams) 338 + .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`) 339 + .join('&'); 340 + if (queryString) { 341 + targetPath += `?${queryString}`; 342 + } 343 + } 344 + 345 + return { 346 + rule, 347 + targetPath, 348 + status: rule.status, 349 + }; 350 + } 351 + 352 + return null; 353 + } 354 + 355 + /** 356 + * Load redirect rules from a cached site 357 + */ 358 + export async function loadRedirectRules(did: string, rkey: string): Promise<RedirectRule[]> { 359 + const CACHE_DIR = process.env.CACHE_DIR || './cache/sites'; 360 + const redirectsPath = `${CACHE_DIR}/${did}/${rkey}/_redirects`; 361 + 362 + if (!existsSync(redirectsPath)) { 363 + return []; 364 + } 365 + 366 + try { 367 + const content = await readFile(redirectsPath, 'utf-8'); 368 + return parseRedirectsFile(content); 369 + } catch (err) { 370 + console.error('Failed to load _redirects file', err); 371 + return []; 372 + } 373 + } 374 + 375 + /** 376 + * Parse cookies from Cookie header 377 + */ 378 + export function parseCookies(cookieHeader?: string): Record<string, string> { 379 + if (!cookieHeader) return {}; 380 + 381 + const cookies: Record<string, string> = {}; 382 + const parts = cookieHeader.split(';'); 383 + 384 + for (const part of parts) { 385 + const [key, ...valueParts] = part.split('='); 386 + if (key && valueParts.length > 0) { 387 + cookies[key.trim()] = valueParts.join('=').trim(); 388 + } 389 + } 390 + 391 + return cookies; 392 + } 393 + 394 + /** 395 + * Parse query string into object 396 + */ 397 + export function parseQueryString(url: string): Record<string, string> { 398 + const queryStart = url.indexOf('?'); 399 + if (queryStart === -1) return {}; 400 + 401 + const queryString = url.slice(queryStart + 1); 402 + const params: Record<string, string> = {}; 403 + 404 + for (const pair of queryString.split('&')) { 405 + const [key, value] = pair.split('='); 406 + if (key) { 407 + params[decodeURIComponent(key)] = value ? decodeURIComponent(value) : ''; 408 + } 409 + } 410 + 411 + return params; 412 + } 413 +

+168 -6

hosting-service/src/server.ts

··· 7 7 import { lookup } from 'mime-types'; 8 8 import { logger, observabilityMiddleware, observabilityErrorHandler, logCollector, errorTracker, metricsCollector } from './lib/observability'; 9 9 import { fileCache, metadataCache, rewrittenHtmlCache, getCacheKey, type FileMetadata } from './lib/cache'; 10 + import { loadRedirectRules, matchRedirectRule, parseCookies, parseQueryString, type RedirectRule } from './lib/redirects'; 10 11 11 12 const BASE_HOST = process.env.BASE_HOST || 'wisp.place'; 12 13 ··· 35 36 } 36 37 } 37 38 39 + // Cache for redirect rules (per site) 40 + const redirectRulesCache = new Map<string, RedirectRule[]>(); 41 + 42 + /** 43 + * Clear redirect rules cache for a specific site 44 + * Should be called when a site is updated/recached 45 + */ 46 + export function clearRedirectRulesCache(did: string, rkey: string) { 47 + const cacheKey = `${did}:${rkey}`; 48 + redirectRulesCache.delete(cacheKey); 49 + } 50 + 38 51 // Helper to serve files from cache 39 - async function serveFromCache(did: string, rkey: string, filePath: string) { 52 + async function serveFromCache( 53 + did: string, 54 + rkey: string, 55 + filePath: string, 56 + fullUrl?: string, 57 + headers?: Record<string, string> 58 + ) { 59 + // Check for redirect rules first 60 + const redirectCacheKey = `${did}:${rkey}`; 61 + let redirectRules = redirectRulesCache.get(redirectCacheKey); 62 + 63 + if (redirectRules === undefined) { 64 + // Load rules for the first time 65 + redirectRules = await loadRedirectRules(did, rkey); 66 + redirectRulesCache.set(redirectCacheKey, redirectRules); 67 + } 68 + 69 + // Apply redirect rules if any exist 70 + if (redirectRules.length > 0) { 71 + const requestPath = '/' + (filePath || ''); 72 + const queryParams = fullUrl ? parseQueryString(fullUrl) : {}; 73 + const cookies = parseCookies(headers?.['cookie']); 74 + 75 + const redirectMatch = matchRedirectRule(requestPath, redirectRules, { 76 + queryParams, 77 + headers, 78 + cookies, 79 + }); 80 + 81 + if (redirectMatch) { 82 + const { targetPath, status } = redirectMatch; 83 + 84 + // Handle different status codes 85 + if (status === 200) { 86 + // Rewrite: serve different content but keep URL the same 87 + // Remove leading slash for internal path resolution 88 + const rewritePath = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath; 89 + return serveFileInternal(did, rkey, rewritePath); 90 + } else if (status === 301 || status === 302) { 91 + // External redirect: change the URL 92 + return new Response(null, { 93 + status, 94 + headers: { 95 + 'Location': targetPath, 96 + 'Cache-Control': status === 301 ? 'public, max-age=31536000' : 'public, max-age=0', 97 + }, 98 + }); 99 + } else if (status === 404) { 100 + // Custom 404 page 101 + const custom404Path = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath; 102 + const response = await serveFileInternal(did, rkey, custom404Path); 103 + // Override status to 404 104 + return new Response(response.body, { 105 + status: 404, 106 + headers: response.headers, 107 + }); 108 + } 109 + } 110 + } 111 + 112 + // No redirect matched, serve normally 113 + return serveFileInternal(did, rkey, filePath); 114 + } 115 + 116 + // Internal function to serve a file (used by both normal serving and rewrites) 117 + async function serveFileInternal(did: string, rkey: string, filePath: string) { 40 118 // Default to index.html if path is empty or ends with / 41 119 let requestPath = filePath || 'index.html'; 42 120 if (requestPath.endsWith('/')) { ··· 138 216 did: string, 139 217 rkey: string, 140 218 filePath: string, 141 - basePath: string 219 + basePath: string, 220 + fullUrl?: string, 221 + headers?: Record<string, string> 142 222 ) { 223 + // Check for redirect rules first 224 + const redirectCacheKey = `${did}:${rkey}`; 225 + let redirectRules = redirectRulesCache.get(redirectCacheKey); 226 + 227 + if (redirectRules === undefined) { 228 + // Load rules for the first time 229 + redirectRules = await loadRedirectRules(did, rkey); 230 + redirectRulesCache.set(redirectCacheKey, redirectRules); 231 + } 232 + 233 + // Apply redirect rules if any exist 234 + if (redirectRules.length > 0) { 235 + const requestPath = '/' + (filePath || ''); 236 + const queryParams = fullUrl ? parseQueryString(fullUrl) : {}; 237 + const cookies = parseCookies(headers?.['cookie']); 238 + 239 + const redirectMatch = matchRedirectRule(requestPath, redirectRules, { 240 + queryParams, 241 + headers, 242 + cookies, 243 + }); 244 + 245 + if (redirectMatch) { 246 + const { targetPath, status } = redirectMatch; 247 + 248 + // Handle different status codes 249 + if (status === 200) { 250 + // Rewrite: serve different content but keep URL the same 251 + const rewritePath = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath; 252 + return serveFileInternalWithRewrite(did, rkey, rewritePath, basePath); 253 + } else if (status === 301 || status === 302) { 254 + // External redirect: change the URL 255 + // For sites.wisp.place, we need to adjust the target path to include the base path 256 + // unless it's an absolute URL 257 + let redirectTarget = targetPath; 258 + if (!targetPath.startsWith('http://') && !targetPath.startsWith('https://')) { 259 + redirectTarget = basePath + (targetPath.startsWith('/') ? targetPath.slice(1) : targetPath); 260 + } 261 + return new Response(null, { 262 + status, 263 + headers: { 264 + 'Location': redirectTarget, 265 + 'Cache-Control': status === 301 ? 'public, max-age=31536000' : 'public, max-age=0', 266 + }, 267 + }); 268 + } else if (status === 404) { 269 + // Custom 404 page 270 + const custom404Path = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath; 271 + const response = await serveFileInternalWithRewrite(did, rkey, custom404Path, basePath); 272 + // Override status to 404 273 + return new Response(response.body, { 274 + status: 404, 275 + headers: response.headers, 276 + }); 277 + } 278 + } 279 + } 280 + 281 + // No redirect matched, serve normally 282 + return serveFileInternalWithRewrite(did, rkey, filePath, basePath); 283 + } 284 + 285 + // Internal function to serve a file with rewriting 286 + async function serveFileInternalWithRewrite(did: string, rkey: string, filePath: string, basePath: string) { 143 287 // Default to index.html if path is empty or ends with / 144 288 let requestPath = filePath || 'index.html'; 145 289 if (requestPath.endsWith('/')) { ··· 317 461 318 462 try { 319 463 await downloadAndCacheSite(did, rkey, siteData.record, pdsEndpoint, siteData.cid); 464 + // Clear redirect rules cache since the site was updated 465 + clearRedirectRulesCache(did, rkey); 320 466 logger.info('Site cached successfully', { did, rkey }); 321 467 return true; 322 468 } catch (err) { ··· 384 530 385 531 // Serve with HTML path rewriting to handle absolute paths 386 532 const basePath = `/${identifier}/${site}/`; 387 - return serveFromCacheWithRewrite(did, site, filePath, basePath); 533 + const headers: Record<string, string> = {}; 534 + c.req.raw.headers.forEach((value, key) => { 535 + headers[key.toLowerCase()] = value; 536 + }); 537 + return serveFromCacheWithRewrite(did, site, filePath, basePath, c.req.url, headers); 388 538 } 389 539 390 540 // Check if this is a DNS hash subdomain ··· 420 570 return c.text('Site not found', 404); 421 571 } 422 572 423 - return serveFromCache(customDomain.did, rkey, path); 573 + const headers: Record<string, string> = {}; 574 + c.req.raw.headers.forEach((value, key) => { 575 + headers[key.toLowerCase()] = value; 576 + }); 577 + return serveFromCache(customDomain.did, rkey, path, c.req.url, headers); 424 578 } 425 579 426 580 // Route 2: Registered subdomains - /*.wisp.place/* ··· 444 598 return c.text('Site not found', 404); 445 599 } 446 600 447 - return serveFromCache(domainInfo.did, rkey, path); 601 + const headers: Record<string, string> = {}; 602 + c.req.raw.headers.forEach((value, key) => { 603 + headers[key.toLowerCase()] = value; 604 + }); 605 + return serveFromCache(domainInfo.did, rkey, path, c.req.url, headers); 448 606 } 449 607 450 608 // Route 1: Custom domains - /* ··· 467 625 return c.text('Site not found', 404); 468 626 } 469 627 470 - return serveFromCache(customDomain.did, rkey, path); 628 + const headers: Record<string, string> = {}; 629 + c.req.raw.headers.forEach((value, key) => { 630 + headers[key.toLowerCase()] = value; 631 + }); 632 + return serveFromCache(customDomain.did, rkey, path, c.req.url, headers); 471 633 }); 472 634 473 635 // Internal observability endpoints (for admin panel)

+1

cli/.gitignore

··· 1 + test/ 1 2 .DS_STORE 2 3 jacquard/ 3 4 binaries/

+21 -12

cli/Cargo.toml

··· 1 + [package] 2 + name = "wisp-cli" 3 + version = "0.2.0" 4 + edition = "2024" 1 5 6 + [features] 2 7 3 8 4 9 5 10 11 + jacquard = { git = "https://tangled.org/@nonbinary.computer/jacquard", features = ["loopback"] } 12 + jacquard-oauth = { git = "https://tangled.org/@nonbinary.computer/jacquard" } 13 + jacquard-api = { git = "https://tangled.org/@nonbinary.computer/jacquard" } 14 + jacquard-common = { git = "https://tangled.org/@nonbinary.computer/jacquard", features = ["websocket"] } 15 + jacquard-identity = { git = "https://tangled.org/@nonbinary.computer/jacquard", features = ["dns"] } 16 + jacquard-derive = { git = "https://tangled.org/@nonbinary.computer/jacquard" } 17 + jacquard-lexicon = { git = "https://tangled.org/@nonbinary.computer/jacquard" } 6 18 7 19 8 20 ··· 15 27 16 28 17 29 18 - 19 - 20 - 21 - 22 - 23 - 24 - 25 - 26 - 27 - 28 - 29 - 30 30 mime_guess = "2.0" 31 31 bytes = "1.10" 32 32 futures = "0.3.31" 33 + multihash = "0.19.3" 34 + multibase = "0.9" 35 + sha2 = "0.10" 36 + axum = "0.7" 37 + tower-http = { version = "0.5", features = ["fs", "compression-gzip"] } 38 + tower = "0.4" 39 + n0-future = "0.1" 40 + chrono = "0.4" 41 + url = "2.5"

+85

cli/src/blob_map.rs

··· 1 + use jacquard_common::types::blob::BlobRef; 2 + use jacquard_common::IntoStatic; 3 + use std::collections::HashMap; 4 + 5 + use crate::place_wisp::fs::{Directory, EntryNode}; 6 + 7 + /// Extract blob information from a directory tree 8 + /// Returns a map of file paths to their blob refs and CIDs 9 + /// 10 + /// This mirrors the TypeScript implementation in src/lib/wisp-utils.ts lines 275-302 11 + pub fn extract_blob_map( 12 + directory: &Directory, 13 + ) -> HashMap<String, (BlobRef<'static>, String)> { 14 + extract_blob_map_recursive(directory, String::new()) 15 + } 16 + 17 + fn extract_blob_map_recursive( 18 + directory: &Directory, 19 + current_path: String, 20 + ) -> HashMap<String, (BlobRef<'static>, String)> { 21 + let mut blob_map = HashMap::new(); 22 + 23 + for entry in &directory.entries { 24 + let full_path = if current_path.is_empty() { 25 + entry.name.to_string() 26 + } else { 27 + format!("{}/{}", current_path, entry.name) 28 + }; 29 + 30 + match &entry.node { 31 + EntryNode::File(file_node) => { 32 + // Extract CID from blob ref 33 + // BlobRef is an enum with Blob variant, which has a ref field (CidLink) 34 + let blob_ref = &file_node.blob; 35 + let cid_string = blob_ref.blob().r#ref.to_string(); 36 + 37 + // Store with full path (mirrors TypeScript implementation) 38 + blob_map.insert( 39 + full_path, 40 + (blob_ref.clone().into_static(), cid_string) 41 + ); 42 + } 43 + EntryNode::Directory(subdir) => { 44 + let sub_map = extract_blob_map_recursive(subdir, full_path); 45 + blob_map.extend(sub_map); 46 + } 47 + EntryNode::Unknown(_) => { 48 + // Skip unknown node types 49 + } 50 + } 51 + } 52 + 53 + blob_map 54 + } 55 + 56 + /// Normalize file path by removing base folder prefix 57 + /// Example: "cobblemon/index.html" -> "index.html" 58 + /// 59 + /// Note: This function is kept for reference but is no longer used in production code. 60 + /// The TypeScript server has a similar normalization (src/routes/wisp.ts line 291) to handle 61 + /// uploads that include a base folder prefix, but our CLI doesn't need this since we 62 + /// track full paths consistently. 63 + #[allow(dead_code)] 64 + pub fn normalize_path(path: &str) -> String { 65 + // Remove base folder prefix (everything before first /) 66 + if let Some(idx) = path.find('/') { 67 + path[idx + 1..].to_string() 68 + } else { 69 + path.to_string() 70 + } 71 + } 72 + 73 + #[cfg(test)] 74 + mod tests { 75 + use super::*; 76 + 77 + #[test] 78 + fn test_normalize_path() { 79 + assert_eq!(normalize_path("index.html"), "index.html"); 80 + assert_eq!(normalize_path("cobblemon/index.html"), "index.html"); 81 + assert_eq!(normalize_path("folder/subfolder/file.txt"), "subfolder/file.txt"); 82 + assert_eq!(normalize_path("a/b/c/d.txt"), "b/c/d.txt"); 83 + } 84 + } 85 +

+66

cli/src/cid.rs

··· 1 + use jacquard_common::types::cid::IpldCid; 2 + use sha2::{Digest, Sha256}; 3 + 4 + /// Compute CID (Content Identifier) for blob content 5 + /// Uses the same algorithm as AT Protocol: CIDv1 with raw codec (0x55) and SHA-256 6 + /// 7 + /// CRITICAL: This must be called on BASE64-ENCODED GZIPPED content, not just gzipped content 8 + /// 9 + /// Based on @atproto/common/src/ipld.ts sha256RawToCid implementation 10 + pub fn compute_cid(content: &[u8]) -> String { 11 + // Use node crypto to compute sha256 hash (same as AT Protocol) 12 + let hash = Sha256::digest(content); 13 + 14 + // Create multihash (code 0x12 = sha2-256) 15 + let multihash = multihash::Multihash::wrap(0x12, &hash) 16 + .expect("SHA-256 hash should always fit in multihash"); 17 + 18 + // Create CIDv1 with raw codec (0x55) 19 + let cid = IpldCid::new_v1(0x55, multihash); 20 + 21 + // Convert to base32 string representation 22 + cid.to_string_of_base(multibase::Base::Base32Lower) 23 + .unwrap_or_else(|_| cid.to_string()) 24 + } 25 + 26 + #[cfg(test)] 27 + mod tests { 28 + use super::*; 29 + use base64::Engine; 30 + 31 + #[test] 32 + fn test_compute_cid() { 33 + // Test with a simple string: "hello" 34 + let content = b"hello"; 35 + let cid = compute_cid(content); 36 + 37 + // CID should start with 'baf' for raw codec base32 38 + assert!(cid.starts_with("baf")); 39 + } 40 + 41 + #[test] 42 + fn test_compute_cid_base64_encoded() { 43 + // Simulate the actual use case: gzipped then base64 encoded 44 + use flate2::write::GzEncoder; 45 + use flate2::Compression; 46 + use std::io::Write; 47 + 48 + let original = b"hello world"; 49 + 50 + // Gzip compress 51 + let mut encoder = GzEncoder::new(Vec::new(), Compression::default()); 52 + encoder.write_all(original).unwrap(); 53 + let gzipped = encoder.finish().unwrap(); 54 + 55 + // Base64 encode the gzipped data 56 + let base64_bytes = base64::prelude::BASE64_STANDARD.encode(&gzipped).into_bytes(); 57 + 58 + // Compute CID on the base64 bytes 59 + let cid = compute_cid(&base64_bytes); 60 + 61 + // Should be a valid CID 62 + assert!(cid.starts_with("baf")); 63 + assert!(cid.len() > 10); 64 + } 65 + } 66 +

+253 -66

cli/src/main.rs

··· 1 1 mod builder_types; 2 2 mod place_wisp; 3 + mod cid; 4 + mod blob_map; 5 + mod metadata; 6 + mod download; 7 + mod pull; 8 + mod serve; 3 9 4 - use clap::Parser; 10 + use clap::{Parser, Subcommand}; 5 11 use jacquard::CowStr; 6 - use jacquard::client::{Agent, FileAuthStore, AgentSessionExt, MemoryCredentialSession}; 12 + use jacquard::client::{Agent, FileAuthStore, AgentSessionExt, MemoryCredentialSession, AgentSession}; 7 13 use jacquard::oauth::client::OAuthClient; 8 14 use jacquard::oauth::loopback::LoopbackConfig; 9 15 use jacquard::prelude::IdentityResolver; ··· 11 17 use jacquard_common::types::blob::MimeType; 12 18 use miette::IntoDiagnostic; 13 19 use std::path::{Path, PathBuf}; 20 + use std::collections::HashMap; 14 21 use flate2::Compression; 15 22 use flate2::write::GzEncoder; 16 23 use std::io::Write; 17 24 18 25 19 26 27 + use place_wisp::fs::*; 20 28 29 + #[derive(Parser, Debug)] 30 + #[command(author, version, about = "wisp.place CLI tool")] 31 + struct Args { 32 + #[command(subcommand)] 33 + command: Option<Commands>, 34 + 35 + // Deploy arguments (when no subcommand is specified) 36 + /// Handle (e.g., alice.bsky.social), DID, or PDS URL 37 + #[arg(global = true, conflicts_with = "command")] 38 + input: Option<CowStr<'static>>, 21 39 40 + /// Path to the directory containing your static site 41 + #[arg(short, long, global = true, conflicts_with = "command")] 42 + path: Option<PathBuf>, 22 43 44 + /// Site name (defaults to directory name) 45 + #[arg(short, long, global = true, conflicts_with = "command")] 46 + site: Option<String>, 23 47 48 + /// Path to auth store file 49 + #[arg(long, global = true, conflicts_with = "command")] 50 + store: Option<String>, 24 51 52 + /// App Password for authentication 53 + #[arg(long, global = true, conflicts_with = "command")] 54 + password: Option<CowStr<'static>>, 55 + } 25 56 57 + #[derive(Subcommand, Debug)] 58 + enum Commands { 59 + /// Deploy a static site to wisp.place (default command) 60 + Deploy { 61 + /// Handle (e.g., alice.bsky.social), DID, or PDS URL 62 + input: CowStr<'static>, 26 63 64 + /// Path to the directory containing your static site 65 + #[arg(short, long, default_value = ".")] 66 + path: PathBuf, 27 67 68 + /// Site name (defaults to directory name) 69 + #[arg(short, long)] 70 + site: Option<String>, 28 71 72 + /// Path to auth store file (will be created if missing, only used with OAuth) 73 + #[arg(long, default_value = "/tmp/wisp-oauth-session.json")] 74 + store: String, 29 75 76 + /// App Password for authentication (alternative to OAuth) 77 + #[arg(long)] 78 + password: Option<CowStr<'static>>, 79 + }, 80 + /// Pull a site from the PDS to a local directory 81 + Pull { 82 + /// Handle (e.g., alice.bsky.social) or DID 83 + input: CowStr<'static>, 30 84 85 + /// Site name (record key) 86 + #[arg(short, long)] 87 + site: String, 31 88 89 + /// Output directory for the downloaded site 90 + #[arg(short, long, default_value = ".")] 91 + output: PathBuf, 92 + }, 93 + /// Serve a site locally with real-time firehose updates 94 + Serve { 95 + /// Handle (e.g., alice.bsky.social) or DID 96 + input: CowStr<'static>, 32 97 98 + /// Site name (record key) 99 + #[arg(short, long)] 100 + site: String, 33 101 102 + /// Output directory for the site files 103 + #[arg(short, long, default_value = ".")] 104 + output: PathBuf, 34 105 106 + /// Port to serve on 107 + #[arg(short, long, default_value = "8080")] 108 + port: u16, 109 + }, 110 + } 35 111 112 + #[tokio::main] 113 + async fn main() -> miette::Result<()> { 114 + let args = Args::parse(); 36 115 116 + match args.command { 117 + Some(Commands::Deploy { input, path, site, store, password }) => { 118 + // Dispatch to appropriate authentication method 119 + if let Some(password) = password { 120 + run_with_app_password(input, password, path, site).await 121 + } else { 122 + run_with_oauth(input, store, path, site).await 123 + } 124 + } 125 + Some(Commands::Pull { input, site, output }) => { 126 + pull::pull_site(input, CowStr::from(site), output).await 127 + } 128 + Some(Commands::Serve { input, site, output, port }) => { 129 + serve::serve_site(input, CowStr::from(site), output, port).await 130 + } 131 + None => { 132 + // Legacy mode: if input is provided, assume deploy command 133 + if let Some(input) = args.input { 134 + let path = args.path.unwrap_or_else(|| PathBuf::from(".")); 135 + let store = args.store.unwrap_or_else(|| "/tmp/wisp-oauth-session.json".to_string()); 136 + 137 + // Dispatch to appropriate authentication method 138 + if let Some(password) = args.password { 139 + run_with_app_password(input, password, path, args.site).await 140 + } else { 141 + run_with_oauth(input, store, path, args.site).await 142 + } 143 + } else { 144 + // No command and no input, show help 145 + use clap::CommandFactory; 146 + Args::command().print_help().into_diagnostic()?; 147 + Ok(()) 148 + } 149 + } 150 + } 151 + } 37 152 38 153 39 154 ··· 86 201 87 202 88 203 89 - 90 - 91 - 92 - 93 - 94 - 95 - 96 - 97 - 98 - 99 - 100 - 101 - 102 - 103 - 104 - 105 - 106 - 107 - 108 204 println!("Deploying site '{}'...", site_name); 109 205 206 + // Try to fetch existing manifest for incremental updates 207 + let existing_blob_map: HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)> = { 208 + use jacquard_common::types::string::AtUri; 209 + 210 + // Get the DID for this session 211 + let session_info = agent.session_info().await; 212 + if let Some((did, _)) = session_info { 213 + // Construct the AT URI for the record 214 + let uri_string = format!("at://{}/place.wisp.fs/{}", did, site_name); 215 + if let Ok(uri) = AtUri::new(&uri_string) { 216 + match agent.get_record::<Fs>(&uri).await { 217 + Ok(response) => { 218 + match response.into_output() { 219 + Ok(record_output) => { 220 + let existing_manifest = record_output.value; 221 + let blob_map = blob_map::extract_blob_map(&existing_manifest.root); 222 + println!("Found existing manifest with {} files, checking for changes...", blob_map.len()); 223 + blob_map 224 + } 225 + Err(_) => { 226 + println!("No existing manifest found, uploading all files..."); 227 + HashMap::new() 228 + } 229 + } 230 + } 231 + Err(_) => { 232 + // Record doesn't exist yet - this is a new site 233 + println!("No existing manifest found, uploading all files..."); 234 + HashMap::new() 235 + } 236 + } 237 + } else { 238 + println!("No existing manifest found (invalid URI), uploading all files..."); 239 + HashMap::new() 240 + } 241 + } else { 242 + println!("No existing manifest found (could not get DID), uploading all files..."); 243 + HashMap::new() 244 + } 245 + }; 246 + 110 247 // Build directory tree 111 - let root_dir = build_directory(agent, &path).await?; 248 + let (root_dir, total_files, reused_count) = build_directory(agent, &path, &existing_blob_map, String::new()).await?; 249 + let uploaded_count = total_files - reused_count; 112 250 113 - // Count total files 114 - let file_count = count_files(&root_dir); 115 - 116 251 // Create the Fs record 117 252 let fs_record = Fs::new() 118 253 .site(CowStr::from(site_name.clone())) 119 254 .root(root_dir) 120 - .file_count(file_count as i64) 255 + .file_count(total_files as i64) 121 256 .created_at(Datetime::now()) 122 257 .build(); 123 258 ··· 132 267 .and_then(|s| s.split('/').next()) 133 268 .ok_or_else(|| miette::miette!("Failed to parse DID from URI"))?; 134 269 135 - println!("Deployed site '{}': {}", site_name, output.uri); 136 - println!("Available at: https://sites.wisp.place/{}/{}", did, site_name); 270 + println!("\n✓ Deployed site '{}': {}", site_name, output.uri); 271 + println!(" Total files: {} ({} reused, {} uploaded)", total_files, reused_count, uploaded_count); 272 + println!(" Available at: https://sites.wisp.place/{}/{}", did, site_name); 137 273 138 274 Ok(()) 139 275 } 140 276 141 - 277 + /// Recursively build a Directory from a filesystem path 278 + /// current_path is the path from the root of the site (e.g., "" for root, "config" for config dir) 142 279 fn build_directory<'a>( 143 280 agent: &'a Agent<impl jacquard::client::AgentSession + IdentityResolver + 'a>, 144 281 dir_path: &'a Path, 145 - ) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<Directory<'static>>> + 'a>> 282 + existing_blobs: &'a HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)>, 283 + current_path: String, 284 + ) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<(Directory<'static>, usize, usize)>> + 'a>> 146 285 { 147 286 Box::pin(async move { 148 287 // Collect all directory entries first ··· 167 306 168 307 169 308 309 + let metadata = entry.metadata().into_diagnostic()?; 170 310 171 - 172 - 173 - 174 - 175 - 176 - 311 + if metadata.is_file() { 312 + // Construct full path for this file (for blob map lookup) 313 + let full_path = if current_path.is_empty() { 314 + name_str.clone() 315 + } else { 316 + format!("{}/{}", current_path, name_str) 317 + }; 318 + file_tasks.push((name_str, path, full_path)); 319 + } else if metadata.is_dir() { 320 + dir_tasks.push((name_str, path)); 321 + } 177 322 } 178 323 179 324 // Process files concurrently with a limit of 5 180 - let file_entries: Vec<Entry> = stream::iter(file_tasks) 181 - .map(|(name, path)| async move { 182 - let file_node = process_file(agent, &path).await?; 183 - Ok::<_, miette::Report>(Entry::new() 325 + let file_results: Vec<(Entry<'static>, bool)> = stream::iter(file_tasks) 326 + .map(|(name, path, full_path)| async move { 327 + let (file_node, reused) = process_file(agent, &path, &full_path, existing_blobs).await?; 328 + let entry = Entry::new() 184 329 .name(CowStr::from(name)) 185 330 .node(EntryNode::File(Box::new(file_node))) 186 - .build()) 331 + .build(); 332 + Ok::<_, miette::Report>((entry, reused)) 187 333 }) 188 334 .buffer_unordered(5) 189 335 .collect::<Vec<_>>() 190 336 .await 191 337 .into_iter() 192 338 .collect::<miette::Result<Vec<_>>>()?; 339 + 340 + let mut file_entries = Vec::new(); 341 + let mut reused_count = 0; 342 + let mut total_files = 0; 343 + 344 + for (entry, reused) in file_results { 345 + file_entries.push(entry); 346 + total_files += 1; 347 + if reused { 348 + reused_count += 1; 349 + } 350 + } 193 351 194 352 // Process directories recursively (sequentially to avoid too much nesting) 195 353 let mut dir_entries = Vec::new(); 196 354 for (name, path) in dir_tasks { 197 - let subdir = build_directory(agent, &path).await?; 355 + // Construct full path for subdirectory 356 + let subdir_path = if current_path.is_empty() { 357 + name.clone() 358 + } else { 359 + format!("{}/{}", current_path, name) 360 + }; 361 + let (subdir, sub_total, sub_reused) = build_directory(agent, &path, existing_blobs, subdir_path).await?; 198 362 dir_entries.push(Entry::new() 199 363 .name(CowStr::from(name)) 200 364 .node(EntryNode::Directory(Box::new(subdir))) 201 365 .build()); 366 + total_files += sub_total; 367 + reused_count += sub_reused; 202 368 } 203 369 204 370 // Combine file and directory entries 205 371 let mut entries = file_entries; 206 372 entries.extend(dir_entries); 207 373 208 - Ok(Directory::new() 374 + let directory = Directory::new() 209 375 .r#type(CowStr::from("directory")) 210 376 .entries(entries) 211 - .build()) 377 + .build(); 378 + 379 + Ok((directory, total_files, reused_count)) 212 380 }) 213 381 } 214 382 215 - /// Process a single file: gzip -> base64 -> upload blob 383 + /// Process a single file: gzip -> base64 -> upload blob (or reuse existing) 384 + /// Returns (File, reused: bool) 385 + /// file_path_key is the full path from the site root (e.g., "config/file.json") for blob map lookup 216 386 async fn process_file( 217 387 agent: &Agent<impl jacquard::client::AgentSession + IdentityResolver>, 218 388 file_path: &Path, 219 - ) -> miette::Result<File<'static>> 389 + file_path_key: &str, 390 + existing_blobs: &HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)>, 391 + ) -> miette::Result<(File<'static>, bool)> 220 392 { 221 393 // Read file 222 394 let file_data = std::fs::read(file_path).into_diagnostic()?; ··· 234 406 // Base64 encode the gzipped data 235 407 let base64_bytes = base64::prelude::BASE64_STANDARD.encode(&gzipped).into_bytes(); 236 408 237 - // Upload blob as octet-stream 409 + // Compute CID for this file (CRITICAL: on base64-encoded gzipped content) 410 + let file_cid = cid::compute_cid(&base64_bytes); 411 + 412 + // Check if we have an existing blob with the same CID 413 + let existing_blob = existing_blobs.get(file_path_key); 414 + 415 + if let Some((existing_blob_ref, existing_cid)) = existing_blob { 416 + if existing_cid == &file_cid { 417 + // CIDs match - reuse existing blob 418 + println!(" ✓ Reusing blob for {} (CID: {})", file_path_key, file_cid); 419 + return Ok(( 420 + File::new() 421 + .r#type(CowStr::from("file")) 422 + .blob(existing_blob_ref.clone()) 423 + .encoding(CowStr::from("gzip")) 424 + .mime_type(CowStr::from(original_mime)) 425 + .base64(true) 426 + .build(), 427 + true 428 + )); 429 + } 430 + } 431 + 432 + // File is new or changed - upload it 433 + println!(" ↑ Uploading {} ({} bytes, CID: {})", file_path_key, base64_bytes.len(), file_cid); 238 434 let blob = agent.upload_blob( 239 435 base64_bytes, 240 436 MimeType::new_static("application/octet-stream"), 241 437 ).await?; 242 438 243 - Ok(File::new() 244 - .r#type(CowStr::from("file")) 245 - .blob(blob) 246 - .encoding(CowStr::from("gzip")) 247 - .mime_type(CowStr::from(original_mime)) 248 - .base64(true) 249 - .build()) 439 + Ok(( 440 + File::new() 441 + .r#type(CowStr::from("file")) 442 + .blob(blob) 443 + .encoding(CowStr::from("gzip")) 444 + .mime_type(CowStr::from(original_mime)) 445 + .base64(true) 446 + .build(), 447 + false 448 + )) 250 449 } 251 450 252 - /// Count total files in a directory tree 253 - fn count_files(dir: &Directory) -> usize { 254 - let mut count = 0; 255 - for entry in &dir.entries { 256 - match &entry.node { 257 - EntryNode::File(_) => count += 1, 258 - EntryNode::Directory(subdir) => count += count_files(subdir), 259 - _ => {} // Unknown variants 260 - } 261 - } 262 - count 263 - }

+71

cli/src/download.rs

··· 1 + use base64::Engine; 2 + use bytes::Bytes; 3 + use flate2::read::GzDecoder; 4 + use jacquard_common::types::blob::BlobRef; 5 + use miette::IntoDiagnostic; 6 + use std::io::Read; 7 + use url::Url; 8 + 9 + /// Download a blob from the PDS 10 + pub async fn download_blob(pds_url: &Url, blob_ref: &BlobRef<'_>, did: &str) -> miette::Result<Bytes> { 11 + // Extract CID from blob ref 12 + let cid = blob_ref.blob().r#ref.to_string(); 13 + 14 + // Construct blob download URL 15 + // The correct endpoint is: /xrpc/com.atproto.sync.getBlob?did={did}&cid={cid} 16 + let blob_url = pds_url 17 + .join(&format!("/xrpc/com.atproto.sync.getBlob?did={}&cid={}", did, cid)) 18 + .into_diagnostic()?; 19 + 20 + let client = reqwest::Client::new(); 21 + let response = client 22 + .get(blob_url) 23 + .send() 24 + .await 25 + .into_diagnostic()?; 26 + 27 + if !response.status().is_success() { 28 + return Err(miette::miette!( 29 + "Failed to download blob: {}", 30 + response.status() 31 + )); 32 + } 33 + 34 + let bytes = response.bytes().await.into_diagnostic()?; 35 + Ok(bytes) 36 + } 37 + 38 + /// Decompress and decode a blob (base64 + gzip) 39 + pub fn decompress_blob(data: &[u8], is_base64: bool, is_gzipped: bool) -> miette::Result<Vec<u8>> { 40 + let mut current_data = data.to_vec(); 41 + 42 + // First, decode base64 if needed 43 + if is_base64 { 44 + current_data = base64::prelude::BASE64_STANDARD 45 + .decode(&current_data) 46 + .into_diagnostic()?; 47 + } 48 + 49 + // Then, decompress gzip if needed 50 + if is_gzipped { 51 + let mut decoder = GzDecoder::new(&current_data[..]); 52 + let mut decompressed = Vec::new(); 53 + decoder.read_to_end(&mut decompressed).into_diagnostic()?; 54 + current_data = decompressed; 55 + } 56 + 57 + Ok(current_data) 58 + } 59 + 60 + /// Download and decompress a blob 61 + pub async fn download_and_decompress_blob( 62 + pds_url: &Url, 63 + blob_ref: &BlobRef<'_>, 64 + did: &str, 65 + is_base64: bool, 66 + is_gzipped: bool, 67 + ) -> miette::Result<Vec<u8>> { 68 + let data = download_blob(pds_url, blob_ref, did).await?; 69 + decompress_blob(&data, is_base64, is_gzipped) 70 + } 71 +

+46

cli/src/metadata.rs

··· 1 + use serde::{Deserialize, Serialize}; 2 + use std::collections::HashMap; 3 + use std::path::Path; 4 + use miette::IntoDiagnostic; 5 + 6 + /// Metadata tracking file CIDs for incremental updates 7 + #[derive(Debug, Clone, Serialize, Deserialize)] 8 + pub struct SiteMetadata { 9 + /// Record CID from the PDS 10 + pub record_cid: String, 11 + /// Map of file paths to their blob CIDs 12 + pub file_cids: HashMap<String, String>, 13 + /// Timestamp when the site was last synced 14 + pub last_sync: i64, 15 + } 16 + 17 + impl SiteMetadata { 18 + pub fn new(record_cid: String, file_cids: HashMap<String, String>) -> Self { 19 + Self { 20 + record_cid, 21 + file_cids, 22 + last_sync: chrono::Utc::now().timestamp(), 23 + } 24 + } 25 + 26 + /// Load metadata from a directory 27 + pub fn load(dir: &Path) -> miette::Result<Option<Self>> { 28 + let metadata_path = dir.join(".wisp-metadata.json"); 29 + if !metadata_path.exists() { 30 + return Ok(None); 31 + } 32 + 33 + let contents = std::fs::read_to_string(&metadata_path).into_diagnostic()?; 34 + let metadata: SiteMetadata = serde_json::from_str(&contents).into_diagnostic()?; 35 + Ok(Some(metadata)) 36 + } 37 + 38 + /// Save metadata to a directory 39 + pub fn save(&self, dir: &Path) -> miette::Result<()> { 40 + let metadata_path = dir.join(".wisp-metadata.json"); 41 + let contents = serde_json::to_string_pretty(self).into_diagnostic()?; 42 + std::fs::write(&metadata_path, contents).into_diagnostic()?; 43 + Ok(()) 44 + } 45 + } 46 +

+305

cli/src/pull.rs

··· 1 + use crate::blob_map; 2 + use crate::download; 3 + use crate::metadata::SiteMetadata; 4 + use crate::place_wisp::fs::*; 5 + use jacquard::CowStr; 6 + use jacquard::prelude::IdentityResolver; 7 + use jacquard_common::types::string::Did; 8 + use jacquard_common::xrpc::XrpcExt; 9 + use jacquard_identity::PublicResolver; 10 + use miette::IntoDiagnostic; 11 + use std::collections::HashMap; 12 + use std::path::{Path, PathBuf}; 13 + use url::Url; 14 + 15 + /// Pull a site from the PDS to a local directory 16 + pub async fn pull_site( 17 + input: CowStr<'static>, 18 + rkey: CowStr<'static>, 19 + output_dir: PathBuf, 20 + ) -> miette::Result<()> { 21 + println!("Pulling site {} from {}...", rkey, input); 22 + 23 + // Resolve handle to DID if needed 24 + let resolver = PublicResolver::default(); 25 + let did = if input.starts_with("did:") { 26 + Did::new(&input).into_diagnostic()? 27 + } else { 28 + // It's a handle, resolve it 29 + let handle = jacquard_common::types::string::Handle::new(&input).into_diagnostic()?; 30 + resolver.resolve_handle(&handle).await.into_diagnostic()? 31 + }; 32 + 33 + // Resolve PDS endpoint for the DID 34 + let pds_url = resolver.pds_for_did(&did).await.into_diagnostic()?; 35 + println!("Resolved PDS: {}", pds_url); 36 + 37 + // Fetch the place.wisp.fs record 38 + 39 + println!("Fetching record from PDS..."); 40 + let client = reqwest::Client::new(); 41 + 42 + // Use com.atproto.repo.getRecord 43 + use jacquard::api::com_atproto::repo::get_record::GetRecord; 44 + use jacquard_common::types::string::Rkey as RkeyType; 45 + let rkey_parsed = RkeyType::new(&rkey).into_diagnostic()?; 46 + 47 + use jacquard_common::types::ident::AtIdentifier; 48 + use jacquard_common::types::string::RecordKey; 49 + let request = GetRecord::new() 50 + .repo(AtIdentifier::Did(did.clone())) 51 + .collection(CowStr::from("place.wisp.fs")) 52 + .rkey(RecordKey::from(rkey_parsed)) 53 + .build(); 54 + 55 + let response = client 56 + .xrpc(pds_url.clone()) 57 + .send(&request) 58 + .await 59 + .into_diagnostic()?; 60 + 61 + let record_output = response.into_output().into_diagnostic()?; 62 + let record_cid = record_output.cid.as_ref().map(|c| c.to_string()).unwrap_or_default(); 63 + 64 + // Parse the record value as Fs 65 + use jacquard_common::types::value::from_data; 66 + let fs_record: Fs = from_data(&record_output.value).into_diagnostic()?; 67 + 68 + let file_count = fs_record.file_count.map(|c| c.to_string()).unwrap_or_else(|| "?".to_string()); 69 + println!("Found site '{}' with {} files", fs_record.site, file_count); 70 + 71 + // Load existing metadata for incremental updates 72 + let existing_metadata = SiteMetadata::load(&output_dir)?; 73 + let existing_file_cids = existing_metadata 74 + .as_ref() 75 + .map(|m| m.file_cids.clone()) 76 + .unwrap_or_default(); 77 + 78 + // Extract blob map from the new manifest 79 + let new_blob_map = blob_map::extract_blob_map(&fs_record.root); 80 + let new_file_cids: HashMap<String, String> = new_blob_map 81 + .iter() 82 + .map(|(path, (_blob_ref, cid))| (path.clone(), cid.clone())) 83 + .collect(); 84 + 85 + // Clean up any leftover temp directories from previous failed attempts 86 + let parent = output_dir.parent().unwrap_or_else(|| std::path::Path::new(".")); 87 + let output_name = output_dir.file_name().unwrap_or_else(|| std::ffi::OsStr::new("site")).to_string_lossy(); 88 + let temp_prefix = format!(".tmp-{}-", output_name); 89 + 90 + if let Ok(entries) = parent.read_dir() { 91 + for entry in entries.flatten() { 92 + let name = entry.file_name(); 93 + if name.to_string_lossy().starts_with(&temp_prefix) { 94 + let _ = std::fs::remove_dir_all(entry.path()); 95 + } 96 + } 97 + } 98 + 99 + // Check if we need to update (but only if output directory actually exists with files) 100 + if let Some(metadata) = &existing_metadata { 101 + if metadata.record_cid == record_cid { 102 + // Verify that the output directory actually exists and has content 103 + let has_content = output_dir.exists() && 104 + output_dir.read_dir() 105 + .map(|mut entries| entries.any(|e| { 106 + if let Ok(entry) = e { 107 + !entry.file_name().to_string_lossy().starts_with(".wisp-metadata") 108 + } else { 109 + false 110 + } 111 + })) 112 + .unwrap_or(false); 113 + 114 + if has_content { 115 + println!("Site is already up to date!"); 116 + return Ok(()); 117 + } 118 + } 119 + } 120 + 121 + // Create temporary directory for atomic update 122 + // Place temp dir in parent directory to avoid issues with non-existent output_dir 123 + let parent = output_dir.parent().unwrap_or_else(|| std::path::Path::new(".")); 124 + let temp_dir_name = format!( 125 + ".tmp-{}-{}", 126 + output_dir.file_name().unwrap_or_else(|| std::ffi::OsStr::new("site")).to_string_lossy(), 127 + chrono::Utc::now().timestamp() 128 + ); 129 + let temp_dir = parent.join(temp_dir_name); 130 + std::fs::create_dir_all(&temp_dir).into_diagnostic()?; 131 + 132 + println!("Downloading files..."); 133 + let mut downloaded = 0; 134 + let mut reused = 0; 135 + 136 + // Download files recursively 137 + let download_result = download_directory( 138 + &fs_record.root, 139 + &temp_dir, 140 + &pds_url, 141 + did.as_str(), 142 + &new_blob_map, 143 + &existing_file_cids, 144 + &output_dir, 145 + String::new(), 146 + &mut downloaded, 147 + &mut reused, 148 + ) 149 + .await; 150 + 151 + // If download failed, clean up temp directory 152 + if let Err(e) = download_result { 153 + let _ = std::fs::remove_dir_all(&temp_dir); 154 + return Err(e); 155 + } 156 + 157 + println!( 158 + "Downloaded {} files, reused {} files", 159 + downloaded, reused 160 + ); 161 + 162 + // Save metadata 163 + let metadata = SiteMetadata::new(record_cid, new_file_cids); 164 + metadata.save(&temp_dir)?; 165 + 166 + // Move files from temp to output directory 167 + let output_abs = std::fs::canonicalize(&output_dir).unwrap_or_else(|_| output_dir.clone()); 168 + let current_dir = std::env::current_dir().into_diagnostic()?; 169 + 170 + // Special handling for pulling to current directory 171 + if output_abs == current_dir { 172 + // Move files from temp to current directory 173 + for entry in std::fs::read_dir(&temp_dir).into_diagnostic()? { 174 + let entry = entry.into_diagnostic()?; 175 + let dest = current_dir.join(entry.file_name()); 176 + 177 + // Remove existing file/dir if it exists 178 + if dest.exists() { 179 + if dest.is_dir() { 180 + std::fs::remove_dir_all(&dest).into_diagnostic()?; 181 + } else { 182 + std::fs::remove_file(&dest).into_diagnostic()?; 183 + } 184 + } 185 + 186 + // Move from temp to current dir 187 + std::fs::rename(entry.path(), dest).into_diagnostic()?; 188 + } 189 + 190 + // Clean up temp directory 191 + std::fs::remove_dir_all(&temp_dir).into_diagnostic()?; 192 + } else { 193 + // If output directory exists and has content, remove it first 194 + if output_dir.exists() { 195 + std::fs::remove_dir_all(&output_dir).into_diagnostic()?; 196 + } 197 + 198 + // Ensure parent directory exists 199 + if let Some(parent) = output_dir.parent() { 200 + if !parent.as_os_str().is_empty() && !parent.exists() { 201 + std::fs::create_dir_all(parent).into_diagnostic()?; 202 + } 203 + } 204 + 205 + // Rename temp to final location 206 + match std::fs::rename(&temp_dir, &output_dir) { 207 + Ok(_) => {}, 208 + Err(e) => { 209 + // Clean up temp directory on failure 210 + let _ = std::fs::remove_dir_all(&temp_dir); 211 + return Err(miette::miette!("Failed to move temp directory: {}", e)); 212 + } 213 + } 214 + } 215 + 216 + println!("✓ Site pulled successfully to {}", output_dir.display()); 217 + 218 + Ok(()) 219 + } 220 + 221 + /// Recursively download a directory 222 + fn download_directory<'a>( 223 + dir: &'a Directory<'_>, 224 + output_dir: &'a Path, 225 + pds_url: &'a Url, 226 + did: &'a str, 227 + new_blob_map: &'a HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)>, 228 + existing_file_cids: &'a HashMap<String, String>, 229 + existing_output_dir: &'a Path, 230 + path_prefix: String, 231 + downloaded: &'a mut usize, 232 + reused: &'a mut usize, 233 + ) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<()>> + Send + 'a>> { 234 + Box::pin(async move { 235 + for entry in &dir.entries { 236 + let entry_name = entry.name.as_str(); 237 + let current_path = if path_prefix.is_empty() { 238 + entry_name.to_string() 239 + } else { 240 + format!("{}/{}", path_prefix, entry_name) 241 + }; 242 + 243 + match &entry.node { 244 + EntryNode::File(file) => { 245 + let output_path = output_dir.join(entry_name); 246 + 247 + // Check if file CID matches existing 248 + if let Some((_blob_ref, new_cid)) = new_blob_map.get(&current_path) { 249 + if let Some(existing_cid) = existing_file_cids.get(&current_path) { 250 + if existing_cid == new_cid { 251 + // File unchanged, copy from existing directory 252 + let existing_path = existing_output_dir.join(&current_path); 253 + if existing_path.exists() { 254 + std::fs::copy(&existing_path, &output_path).into_diagnostic()?; 255 + *reused += 1; 256 + println!(" ✓ Reused {}", current_path); 257 + continue; 258 + } 259 + } 260 + } 261 + } 262 + 263 + // File is new or changed, download it 264 + println!(" ↓ Downloading {}", current_path); 265 + let data = download::download_and_decompress_blob( 266 + pds_url, 267 + &file.blob, 268 + did, 269 + file.base64.unwrap_or(false), 270 + file.encoding.as_ref().map(|e| e.as_str() == "gzip").unwrap_or(false), 271 + ) 272 + .await?; 273 + 274 + std::fs::write(&output_path, data).into_diagnostic()?; 275 + *downloaded += 1; 276 + } 277 + EntryNode::Directory(subdir) => { 278 + let subdir_path = output_dir.join(entry_name); 279 + std::fs::create_dir_all(&subdir_path).into_diagnostic()?; 280 + 281 + download_directory( 282 + subdir, 283 + &subdir_path, 284 + pds_url, 285 + did, 286 + new_blob_map, 287 + existing_file_cids, 288 + existing_output_dir, 289 + current_path, 290 + downloaded, 291 + reused, 292 + ) 293 + .await?; 294 + } 295 + EntryNode::Unknown(_) => { 296 + // Skip unknown node types 297 + println!(" ⚠ Skipping unknown node type for {}", current_path); 298 + } 299 + } 300 + } 301 + 302 + Ok(()) 303 + }) 304 + } 305 +

+202

cli/src/serve.rs

··· 1 + use crate::pull::pull_site; 2 + use axum::Router; 3 + use jacquard::CowStr; 4 + use jacquard_common::jetstream::{CommitOperation, JetstreamMessage, JetstreamParams}; 5 + use jacquard_common::types::string::Did; 6 + use jacquard_common::xrpc::{SubscriptionClient, TungsteniteSubscriptionClient}; 7 + use miette::IntoDiagnostic; 8 + use n0_future::StreamExt; 9 + use std::path::PathBuf; 10 + use std::sync::Arc; 11 + use tokio::sync::RwLock; 12 + use tower_http::compression::CompressionLayer; 13 + use tower_http::services::ServeDir; 14 + use url::Url; 15 + 16 + /// Shared state for the server 17 + #[derive(Clone)] 18 + struct ServerState { 19 + did: CowStr<'static>, 20 + rkey: CowStr<'static>, 21 + output_dir: PathBuf, 22 + last_cid: Arc<RwLock<Option<String>>>, 23 + } 24 + 25 + /// Serve a site locally with real-time firehose updates 26 + pub async fn serve_site( 27 + input: CowStr<'static>, 28 + rkey: CowStr<'static>, 29 + output_dir: PathBuf, 30 + port: u16, 31 + ) -> miette::Result<()> { 32 + println!("Serving site {} from {} on port {}...", rkey, input, port); 33 + 34 + // Resolve handle to DID if needed 35 + use jacquard_identity::PublicResolver; 36 + use jacquard::prelude::IdentityResolver; 37 + 38 + let resolver = PublicResolver::default(); 39 + let did = if input.starts_with("did:") { 40 + Did::new(&input).into_diagnostic()? 41 + } else { 42 + // It's a handle, resolve it 43 + let handle = jacquard_common::types::string::Handle::new(&input).into_diagnostic()?; 44 + resolver.resolve_handle(&handle).await.into_diagnostic()? 45 + }; 46 + 47 + println!("Resolved to DID: {}", did.as_str()); 48 + 49 + // Create output directory if it doesn't exist 50 + std::fs::create_dir_all(&output_dir).into_diagnostic()?; 51 + 52 + // Initial pull of the site 53 + println!("Performing initial pull..."); 54 + let did_str = CowStr::from(did.as_str().to_string()); 55 + pull_site(did_str.clone(), rkey.clone(), output_dir.clone()).await?; 56 + 57 + // Create shared state 58 + let state = ServerState { 59 + did: did_str.clone(), 60 + rkey: rkey.clone(), 61 + output_dir: output_dir.clone(), 62 + last_cid: Arc::new(RwLock::new(None)), 63 + }; 64 + 65 + // Start firehose listener in background 66 + let firehose_state = state.clone(); 67 + tokio::spawn(async move { 68 + if let Err(e) = watch_firehose(firehose_state).await { 69 + eprintln!("Firehose error: {}", e); 70 + } 71 + }); 72 + 73 + // Create HTTP server with gzip compression 74 + let app = Router::new() 75 + .fallback_service( 76 + ServeDir::new(&output_dir) 77 + .precompressed_gzip() 78 + ) 79 + .layer(CompressionLayer::new()) 80 + .with_state(state); 81 + 82 + let addr = format!("0.0.0.0:{}", port); 83 + let listener = tokio::net::TcpListener::bind(&addr) 84 + .await 85 + .into_diagnostic()?; 86 + 87 + println!("\n✓ Server running at http://localhost:{}", port); 88 + println!(" Watching for updates on the firehose...\n"); 89 + 90 + axum::serve(listener, app).await.into_diagnostic()?; 91 + 92 + Ok(()) 93 + } 94 + 95 + /// Watch the firehose for updates to the specific site 96 + fn watch_firehose(state: ServerState) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<()>> + Send>> { 97 + Box::pin(async move { 98 + let jetstream_url = Url::parse("wss://jetstream1.us-east.fire.hose.cam") 99 + .into_diagnostic()?; 100 + 101 + println!("[Firehose] Connecting to Jetstream..."); 102 + 103 + // Create subscription client 104 + let client = TungsteniteSubscriptionClient::from_base_uri(jetstream_url); 105 + 106 + // Subscribe with no filters (we'll filter manually) 107 + // Jetstream doesn't support filtering by collection in the params builder 108 + let params = JetstreamParams::new().build(); 109 + 110 + let stream = client.subscribe(&params).await.into_diagnostic()?; 111 + println!("[Firehose] Connected! Watching for updates..."); 112 + 113 + // Convert to typed message stream 114 + let (_sink, mut messages) = stream.into_stream(); 115 + 116 + loop { 117 + match messages.next().await { 118 + Some(Ok(msg)) => { 119 + if let Err(e) = handle_firehose_message(&state, msg).await { 120 + eprintln!("[Firehose] Error handling message: {}", e); 121 + } 122 + } 123 + Some(Err(e)) => { 124 + eprintln!("[Firehose] Stream error: {}", e); 125 + // Try to reconnect after a delay 126 + tokio::time::sleep(tokio::time::Duration::from_secs(5)).await; 127 + return Box::pin(watch_firehose(state)).await; 128 + } 129 + None => { 130 + println!("[Firehose] Stream ended, reconnecting..."); 131 + tokio::time::sleep(tokio::time::Duration::from_secs(5)).await; 132 + return Box::pin(watch_firehose(state)).await; 133 + } 134 + } 135 + } 136 + }) 137 + } 138 + 139 + /// Handle a firehose message 140 + async fn handle_firehose_message( 141 + state: &ServerState, 142 + msg: JetstreamMessage<'_>, 143 + ) -> miette::Result<()> { 144 + match msg { 145 + JetstreamMessage::Commit { 146 + did, 147 + commit, 148 + .. 149 + } => { 150 + // Check if this is our site 151 + if did.as_str() == state.did.as_str() 152 + && commit.collection.as_str() == "place.wisp.fs" 153 + && commit.rkey.as_str() == state.rkey.as_str() 154 + { 155 + match commit.operation { 156 + CommitOperation::Create | CommitOperation::Update => { 157 + let new_cid = commit.cid.as_ref().map(|c| c.to_string()); 158 + 159 + // Check if CID changed 160 + let should_update = { 161 + let last_cid = state.last_cid.read().await; 162 + new_cid != *last_cid 163 + }; 164 + 165 + if should_update { 166 + println!("\n[Update] Detected change to site {} (CID: {:?})", state.rkey, new_cid); 167 + println!("[Update] Pulling latest version..."); 168 + 169 + // Pull the updated site 170 + match pull_site( 171 + state.did.clone(), 172 + state.rkey.clone(), 173 + state.output_dir.clone(), 174 + ) 175 + .await 176 + { 177 + Ok(_) => { 178 + // Update last CID 179 + let mut last_cid = state.last_cid.write().await; 180 + *last_cid = new_cid; 181 + println!("[Update] ✓ Site updated successfully!\n"); 182 + } 183 + Err(e) => { 184 + eprintln!("[Update] Failed to pull site: {}", e); 185 + } 186 + } 187 + } 188 + } 189 + CommitOperation::Delete => { 190 + println!("\n[Update] Site {} was deleted", state.rkey); 191 + } 192 + } 193 + } 194 + } 195 + _ => { 196 + // Ignore identity and account messages 197 + } 198 + } 199 + 200 + Ok(()) 201 + } 202 +

-3

.gitmodules

··· 1 - [submodule "cli/jacquard"] 2 - path = cli/jacquard 3 - url = https://tangled.org/@nonbinary.computer/jacquard

-1

cli/jacquard

··· 1 - Subproject commit d533482a61f540586b1eea620b8e9a01a59d5650

+1 -1

cli/Cargo.lock

··· 4913 4913 4914 4914 [[package]] 4915 4915 name = "wisp-cli" 4916 - version = "0.1.0" 4916 + version = "0.2.0" 4917 4917 dependencies = [ 4918 4918 "axum", 4919 4919 "base64 0.22.1",

+28 -1

crates.nix

··· 19 19 targets.x86_64-pc-windows-gnu.latest.rust-std 20 20 targets.x86_64-unknown-linux-gnu.latest.rust-std 21 21 targets.aarch64-apple-darwin.latest.rust-std 22 + targets.aarch64-unknown-linux-gnu.latest.rust-std 22 23 ]; 23 24 # configure crates 24 25 nci.crates."wisp-cli" = { ··· 26 27 dev.runTests = false; 27 28 release.runTests = false; 28 29 }; 29 - targets."x86_64-unknown-linux-gnu" = { 30 + targets."x86_64-unknown-linux-gnu" = let 31 + targetPkgs = pkgs.pkgsCross.gnu64; 32 + targetCC = targetPkgs.stdenv.cc; 33 + targetCargoEnvVarTarget = targetPkgs.stdenv.hostPlatform.rust.cargoEnvVarTarget; 34 + in rec { 30 35 default = true; 36 + depsDrvConfig.mkDerivation = { 37 + nativeBuildInputs = [targetCC]; 38 + }; 39 + depsDrvConfig.env = rec { 40 + TARGET_CC = "${targetCC.targetPrefix}cc"; 41 + "CARGO_TARGET_${targetCargoEnvVarTarget}_LINKER" = TARGET_CC; 42 + }; 43 + drvConfig = depsDrvConfig; 31 44 }; 32 45 targets."x86_64-pc-windows-gnu" = let 33 46 targetPkgs = pkgs.pkgsCross.mingwW64; ··· 58 71 }; 59 72 drvConfig = depsDrvConfig; 60 73 }; 74 + targets."aarch64-unknown-linux-gnu" = let 75 + targetPkgs = pkgs.pkgsCross.aarch64-multiplatform; 76 + targetCC = targetPkgs.stdenv.cc; 77 + targetCargoEnvVarTarget = targetPkgs.stdenv.hostPlatform.rust.cargoEnvVarTarget; 78 + in rec { 79 + depsDrvConfig.mkDerivation = { 80 + nativeBuildInputs = [targetCC]; 81 + }; 82 + depsDrvConfig.env = rec { 83 + TARGET_CC = "${targetCC.targetPrefix}cc"; 84 + "CARGO_TARGET_${targetCargoEnvVarTarget}_LINKER" = TARGET_CC; 85 + }; 86 + drvConfig = depsDrvConfig; 87 + }; 61 88 }; 62 89 }; 63 90 }

+17 -2

flake.nix

··· 26 26 ... 27 27 }: let 28 28 crateOutputs = config.nci.outputs."wisp-cli"; 29 + mkRenamedPackage = name: pkg: pkgs.runCommand name {} '' 30 + mkdir -p $out/bin 31 + cp ${pkg}/bin/wisp-cli $out/bin/${name} 32 + ''; 29 33 in { 30 34 devShells.default = crateOutputs.devShell; 31 35 packages.default = crateOutputs.packages.release; 32 - packages.wisp-cli-windows = crateOutputs.allTargets."x86_64-pc-windows-gnu".packages.release; 33 - packages.wisp-cli-darwin = crateOutputs.allTargets."aarch64-apple-darwin".packages.release; 36 + packages.wisp-cli-x86_64-linux = mkRenamedPackage "wisp-cli-x86_64-linux" crateOutputs.packages.release; 37 + packages.wisp-cli-aarch64-linux = mkRenamedPackage "wisp-cli-aarch64-linux" crateOutputs.allTargets."aarch64-unknown-linux-gnu".packages.release; 38 + packages.wisp-cli-x86_64-windows = mkRenamedPackage "wisp-cli-x86_64-windows.exe" crateOutputs.allTargets."x86_64-pc-windows-gnu".packages.release; 39 + packages.wisp-cli-aarch64-darwin = mkRenamedPackage "wisp-cli-aarch64-darwin" crateOutputs.allTargets."aarch64-apple-darwin".packages.release; 40 + packages.all = pkgs.symlinkJoin { 41 + name = "wisp-cli-all"; 42 + paths = [ 43 + config.packages.wisp-cli-x86_64-linux 44 + config.packages.wisp-cli-aarch64-linux 45 + config.packages.wisp-cli-x86_64-windows 46 + config.packages.wisp-cli-aarch64-darwin 47 + ]; 48 + }; 34 49 }; 35 50 }; 36 51 }

History