Monorepo for wisp.place. A static site hosting service built on top of the AT Protocol. wisp.place

rudimentry _redirects support, incremental uploading for cli #3

closed opened by nekomimi.pet targeting main

TODO _headers file place.wisp.settings lexicon as a lexiconal way of configuring this

Labels

None yet.

assignee

None yet.

Participants 1
AT URI
at://did:plc:ttdrpj45ibqunmfhdsb4zdwq/sh.tangled.repo.pull/3m5htcylipq22
+2057 -216
Diff #2
+31 -1
README.md
··· 50 50 cargo build 51 51 ``` 52 52 53 + ## Features 54 + 55 + ### URL Redirects and Rewrites 56 + 57 + The hosting service supports Netlify-style `_redirects` files for managing URLs. Place a `_redirects` file in your site root to enable: 58 + 59 + - **301/302 Redirects**: Permanent and temporary URL redirects 60 + - **200 Rewrites**: Serve different content without changing the URL 61 + - **404 Custom Pages**: Custom error pages for specific paths 62 + - **Splats & Placeholders**: Dynamic path matching (`/blog/:year/:month/:day`, `/news/*`) 63 + - **Query Parameter Matching**: Redirect based on URL parameters 64 + - **Conditional Redirects**: Route by country, language, or cookie presence 65 + - **Force Redirects**: Override existing files with redirects 66 + 67 + Example `_redirects`: 68 + ``` 69 + # Single-page app routing (React, Vue, etc.) 70 + /* /index.html 200 71 + 72 + # Simple redirects 73 + /home / 74 + /old-blog/* /blog/:splat 75 + 76 + # API proxy 77 + /api/* https://api.example.com/:splat 200 78 + 79 + # Country-based routing 80 + / /us/ 302 Country=us 81 + / /uk/ 302 Country=gb 82 + ``` 83 + 53 84 ## Limits 54 85 55 86 - Max file size: 100MB (PDS limit) 56 - - Max site size: 300MB 57 87 - Max files: 2000 58 88 59 89 ## Tech Stack
-123
hosting-service/EXAMPLE.md
··· 1 - # HTML Path Rewriting Example 2 - 3 - This document demonstrates how HTML path rewriting works when serving sites via the `/s/:identifier/:site/*` route. 4 - 5 - ## Problem 6 - 7 - When you create a static site with absolute paths like `/style.css` or `/images/logo.png`, these paths work fine when served from the root domain. However, when served from a subdirectory like `/s/alice.bsky.social/mysite/`, these absolute paths break because they resolve to the server root instead of the site root. 8 - 9 - ## Solution 10 - 11 - The hosting service automatically rewrites absolute paths in HTML files to work correctly in the subdirectory context. 12 - 13 - ## Example 14 - 15 - **Original HTML file (index.html):** 16 - ```html 17 - <!DOCTYPE html> 18 - <html> 19 - <head> 20 - <meta charset="UTF-8"> 21 - <title>My Site</title> 22 - <link rel="stylesheet" href="/style.css"> 23 - <link rel="icon" href="/favicon.ico"> 24 - <script src="/app.js"></script> 25 - </head> 26 - <body> 27 - <header> 28 - <img src="/images/logo.png" alt="Logo"> 29 - <nav> 30 - <a href="/">Home</a> 31 - <a href="/about">About</a> 32 - <a href="/contact">Contact</a> 33 - </nav> 34 - </header> 35 - 36 - <main> 37 - <h1>Welcome</h1> 38 - <img src="/images/hero.jpg" 39 - srcset="/images/hero.jpg 1x, /images/hero@2x.jpg 2x" 40 - alt="Hero"> 41 - 42 - <form action="/submit" method="post"> 43 - <input type="text" name="email"> 44 - <button>Submit</button> 45 - </form> 46 - </main> 47 - 48 - <footer> 49 - <a href="https://example.com">External Link</a> 50 - <a href="#top">Back to Top</a> 51 - </footer> 52 - </body> 53 - </html> 54 - ``` 55 - 56 - **When accessed via `/s/alice.bsky.social/mysite/`, the HTML is rewritten to:** 57 - ```html 58 - <!DOCTYPE html> 59 - <html> 60 - <head> 61 - <meta charset="UTF-8"> 62 - <title>My Site</title> 63 - <link rel="stylesheet" href="/s/alice.bsky.social/mysite/style.css"> 64 - <link rel="icon" href="/s/alice.bsky.social/mysite/favicon.ico"> 65 - <script src="/s/alice.bsky.social/mysite/app.js"></script> 66 - </head> 67 - <body> 68 - <header> 69 - <img src="/s/alice.bsky.social/mysite/images/logo.png" alt="Logo"> 70 - <nav> 71 - <a href="/s/alice.bsky.social/mysite/">Home</a> 72 - <a href="/s/alice.bsky.social/mysite/about">About</a> 73 - <a href="/s/alice.bsky.social/mysite/contact">Contact</a> 74 - </nav> 75 - </header> 76 - 77 - <main> 78 - <h1>Welcome</h1> 79 - <img src="/s/alice.bsky.social/mysite/images/hero.jpg" 80 - srcset="/s/alice.bsky.social/mysite/images/hero.jpg 1x, /s/alice.bsky.social/mysite/images/hero@2x.jpg 2x" 81 - alt="Hero"> 82 - 83 - <form action="/s/alice.bsky.social/mysite/submit" method="post"> 84 - <input type="text" name="email"> 85 - <button>Submit</button> 86 - </form> 87 - </main> 88 - 89 - <footer> 90 - <a href="https://example.com">External Link</a> 91 - <a href="#top">Back to Top</a> 92 - </footer> 93 - </body> 94 - </html> 95 - ``` 96 - 97 - ## What's Preserved 98 - 99 - Notice that: 100 - - ✅ Absolute paths are rewritten: `/style.css` → `/s/alice.bsky.social/mysite/style.css` 101 - - ✅ External URLs are preserved: `https://example.com` stays the same 102 - - ✅ Anchors are preserved: `#top` stays the same 103 - - ✅ The rewriting is safe and won't break your site 104 - 105 - ## Supported Attributes 106 - 107 - The rewriter handles these HTML attributes: 108 - - `src` - images, scripts, iframes, videos, audio 109 - - `href` - links, stylesheets 110 - - `action` - forms 111 - - `data` - objects 112 - - `poster` - video posters 113 - - `srcset` - responsive images 114 - 115 - ## Testing Your Site 116 - 117 - To test if your site works with path rewriting: 118 - 119 - 1. Upload your site to your PDS as a `place.wisp.fs` record 120 - 2. Access it via: `https://hosting.wisp.place/s/YOUR_HANDLE/SITE_NAME/` 121 - 3. Check that all resources load correctly 122 - 123 - If you're using relative paths already (like `./style.css` or `../images/logo.png`), they'll work without any rewriting.
+134
hosting-service/example-_redirects
··· 1 + # Example _redirects file for Wisp hosting 2 + # Place this file in the root directory of your site as "_redirects" 3 + # Lines starting with # are comments 4 + 5 + # =================================== 6 + # SIMPLE REDIRECTS 7 + # =================================== 8 + 9 + # Redirect home page 10 + # /home / 11 + 12 + # Redirect old URLs to new ones 13 + # /old-blog /blog 14 + # /about-us /about 15 + 16 + # =================================== 17 + # SPLAT REDIRECTS (WILDCARDS) 18 + # =================================== 19 + 20 + # Redirect entire directories 21 + # /news/* /blog/:splat 22 + # /old-site/* /new-site/:splat 23 + 24 + # =================================== 25 + # PLACEHOLDER REDIRECTS 26 + # =================================== 27 + 28 + # Restructure blog URLs 29 + # /blog/:year/:month/:day/:slug /posts/:year-:month-:day/:slug 30 + 31 + # Capture multiple parameters 32 + # /products/:category/:id /shop/:category/item/:id 33 + 34 + # =================================== 35 + # STATUS CODES 36 + # =================================== 37 + 38 + # Permanent redirect (301) - default if not specified 39 + # /permanent-move /new-location 301 40 + 41 + # Temporary redirect (302) 42 + # /temp-redirect /temp-location 302 43 + 44 + # Rewrite (200) - serves different content, URL stays the same 45 + # /api/* /functions/:splat 200 46 + 47 + # Custom 404 page 48 + # /shop/* /shop-closed.html 404 49 + 50 + # =================================== 51 + # FORCE REDIRECTS 52 + # =================================== 53 + 54 + # Force redirect even if file exists (note the ! after status code) 55 + # /override-file /other-file.html 200! 56 + 57 + # =================================== 58 + # CONDITIONAL REDIRECTS 59 + # =================================== 60 + 61 + # Country-based redirects (ISO 3166-1 alpha-2 codes) 62 + # / /us/ 302 Country=us 63 + # / /uk/ 302 Country=gb 64 + # / /anz/ 302 Country=au,nz 65 + 66 + # Language-based redirects 67 + # /products /en/products 301 Language=en 68 + # /products /de/products 301 Language=de 69 + # /products /fr/products 301 Language=fr 70 + 71 + # Cookie-based redirects (checks if cookie exists) 72 + # /* /legacy/:splat 200 Cookie=is_legacy 73 + 74 + # =================================== 75 + # QUERY PARAMETERS 76 + # =================================== 77 + 78 + # Match specific query parameters 79 + # /store id=:id /blog/:id 301 80 + 81 + # Multiple parameters 82 + # /search q=:query category=:cat /find/:cat/:query 301 83 + 84 + # =================================== 85 + # DOMAIN-LEVEL REDIRECTS 86 + # =================================== 87 + 88 + # Redirect to different domain (must include protocol) 89 + # /external https://example.com/path 90 + 91 + # Redirect entire subdomain 92 + # http://blog.example.com/* https://example.com/blog/:splat 301! 93 + # https://blog.example.com/* https://example.com/blog/:splat 301! 94 + 95 + # =================================== 96 + # COMMON PATTERNS 97 + # =================================== 98 + 99 + # Remove .html extensions 100 + # /page.html /page 101 + 102 + # Add trailing slash 103 + # /about /about/ 104 + 105 + # Single-page app fallback (serve index.html for all paths) 106 + # /* /index.html 200 107 + 108 + # API proxy 109 + # /api/* https://api.example.com/:splat 200 110 + 111 + # =================================== 112 + # CUSTOM ERROR PAGES 113 + # =================================== 114 + 115 + # Language-specific 404 pages 116 + # /en/* /en/404.html 404 117 + # /de/* /de/404.html 404 118 + 119 + # Section-specific 404 pages 120 + # /shop/* /shop/not-found.html 404 121 + # /blog/* /blog/404.html 404 122 + 123 + # =================================== 124 + # NOTES 125 + # =================================== 126 + # 127 + # - Rules are processed in order (first match wins) 128 + # - More specific rules should come before general ones 129 + # - Splats (*) can only be used at the end of a path 130 + # - Query parameters are automatically preserved for 200, 301, 302 131 + # - Trailing slashes are normalized (/ and no / are treated the same) 132 + # - Default status code is 301 if not specified 133 + # 134 +
+215
hosting-service/src/lib/redirects.test.ts
··· 1 + import { describe, it, expect } from 'bun:test' 2 + import { parseRedirectsFile, matchRedirectRule } from './redirects'; 3 + 4 + describe('parseRedirectsFile', () => { 5 + it('should parse simple redirects', () => { 6 + const content = ` 7 + # Comment line 8 + /old-path /new-path 9 + /home / 301 10 + `; 11 + const rules = parseRedirectsFile(content); 12 + expect(rules).toHaveLength(2); 13 + expect(rules[0]).toMatchObject({ 14 + from: '/old-path', 15 + to: '/new-path', 16 + status: 301, 17 + force: false, 18 + }); 19 + expect(rules[1]).toMatchObject({ 20 + from: '/home', 21 + to: '/', 22 + status: 301, 23 + force: false, 24 + }); 25 + }); 26 + 27 + it('should parse redirects with different status codes', () => { 28 + const content = ` 29 + /temp-redirect /target 302 30 + /rewrite /content 200 31 + /not-found /404 404 32 + `; 33 + const rules = parseRedirectsFile(content); 34 + expect(rules).toHaveLength(3); 35 + expect(rules[0]?.status).toBe(302); 36 + expect(rules[1]?.status).toBe(200); 37 + expect(rules[2]?.status).toBe(404); 38 + }); 39 + 40 + it('should parse force redirects', () => { 41 + const content = `/force-path /target 301!`; 42 + const rules = parseRedirectsFile(content); 43 + expect(rules[0]?.force).toBe(true); 44 + expect(rules[0]?.status).toBe(301); 45 + }); 46 + 47 + it('should parse splat redirects', () => { 48 + const content = `/news/* /blog/:splat`; 49 + const rules = parseRedirectsFile(content); 50 + expect(rules[0]?.from).toBe('/news/*'); 51 + expect(rules[0]?.to).toBe('/blog/:splat'); 52 + }); 53 + 54 + it('should parse placeholder redirects', () => { 55 + const content = `/blog/:year/:month/:day /posts/:year-:month-:day`; 56 + const rules = parseRedirectsFile(content); 57 + expect(rules[0]?.from).toBe('/blog/:year/:month/:day'); 58 + expect(rules[0]?.to).toBe('/posts/:year-:month-:day'); 59 + }); 60 + 61 + it('should parse country-based redirects', () => { 62 + const content = `/ /anz 302 Country=au,nz`; 63 + const rules = parseRedirectsFile(content); 64 + expect(rules[0]?.conditions?.country).toEqual(['au', 'nz']); 65 + }); 66 + 67 + it('should parse language-based redirects', () => { 68 + const content = `/products /en/products 301 Language=en`; 69 + const rules = parseRedirectsFile(content); 70 + expect(rules[0]?.conditions?.language).toEqual(['en']); 71 + }); 72 + 73 + it('should parse cookie-based redirects', () => { 74 + const content = `/* /legacy/:splat 200 Cookie=is_legacy,my_cookie`; 75 + const rules = parseRedirectsFile(content); 76 + expect(rules[0]?.conditions?.cookie).toEqual(['is_legacy', 'my_cookie']); 77 + }); 78 + }); 79 + 80 + describe('matchRedirectRule', () => { 81 + it('should match exact paths', () => { 82 + const rules = parseRedirectsFile('/old-path /new-path'); 83 + const match = matchRedirectRule('/old-path', rules); 84 + expect(match).toBeTruthy(); 85 + expect(match?.targetPath).toBe('/new-path'); 86 + expect(match?.status).toBe(301); 87 + }); 88 + 89 + it('should match paths with trailing slash', () => { 90 + const rules = parseRedirectsFile('/old-path /new-path'); 91 + const match = matchRedirectRule('/old-path/', rules); 92 + expect(match).toBeTruthy(); 93 + expect(match?.targetPath).toBe('/new-path'); 94 + }); 95 + 96 + it('should match splat patterns', () => { 97 + const rules = parseRedirectsFile('/news/* /blog/:splat'); 98 + const match = matchRedirectRule('/news/2024/01/15/my-post', rules); 99 + expect(match).toBeTruthy(); 100 + expect(match?.targetPath).toBe('/blog/2024/01/15/my-post'); 101 + }); 102 + 103 + it('should match placeholder patterns', () => { 104 + const rules = parseRedirectsFile('/blog/:year/:month/:day /posts/:year-:month-:day'); 105 + const match = matchRedirectRule('/blog/2024/01/15', rules); 106 + expect(match).toBeTruthy(); 107 + expect(match?.targetPath).toBe('/posts/2024-01-15'); 108 + }); 109 + 110 + it('should preserve query strings for 301/302 redirects', () => { 111 + const rules = parseRedirectsFile('/old /new 301'); 112 + const match = matchRedirectRule('/old', rules, { 113 + queryParams: { foo: 'bar', baz: 'qux' }, 114 + }); 115 + expect(match?.targetPath).toContain('?'); 116 + expect(match?.targetPath).toContain('foo=bar'); 117 + expect(match?.targetPath).toContain('baz=qux'); 118 + }); 119 + 120 + it('should match based on query parameters', () => { 121 + const rules = parseRedirectsFile('/store id=:id /blog/:id 301'); 122 + const match = matchRedirectRule('/store', rules, { 123 + queryParams: { id: 'my-post' }, 124 + }); 125 + expect(match).toBeTruthy(); 126 + expect(match?.targetPath).toContain('/blog/my-post'); 127 + }); 128 + 129 + it('should not match when query params are missing', () => { 130 + const rules = parseRedirectsFile('/store id=:id /blog/:id 301'); 131 + const match = matchRedirectRule('/store', rules, { 132 + queryParams: {}, 133 + }); 134 + expect(match).toBeNull(); 135 + }); 136 + 137 + it('should match based on country header', () => { 138 + const rules = parseRedirectsFile('/ /aus 302 Country=au'); 139 + const match = matchRedirectRule('/', rules, { 140 + headers: { 'cf-ipcountry': 'AU' }, 141 + }); 142 + expect(match).toBeTruthy(); 143 + expect(match?.targetPath).toBe('/aus'); 144 + }); 145 + 146 + it('should not match wrong country', () => { 147 + const rules = parseRedirectsFile('/ /aus 302 Country=au'); 148 + const match = matchRedirectRule('/', rules, { 149 + headers: { 'cf-ipcountry': 'US' }, 150 + }); 151 + expect(match).toBeNull(); 152 + }); 153 + 154 + it('should match based on language header', () => { 155 + const rules = parseRedirectsFile('/products /en/products 301 Language=en'); 156 + const match = matchRedirectRule('/products', rules, { 157 + headers: { 'accept-language': 'en-US,en;q=0.9' }, 158 + }); 159 + expect(match).toBeTruthy(); 160 + expect(match?.targetPath).toBe('/en/products'); 161 + }); 162 + 163 + it('should match based on cookie presence', () => { 164 + const rules = parseRedirectsFile('/* /legacy/:splat 200 Cookie=is_legacy'); 165 + const match = matchRedirectRule('/some-path', rules, { 166 + cookies: { is_legacy: 'true' }, 167 + }); 168 + expect(match).toBeTruthy(); 169 + expect(match?.targetPath).toBe('/legacy/some-path'); 170 + }); 171 + 172 + it('should return first matching rule', () => { 173 + const content = ` 174 + /path /first 175 + /path /second 176 + `; 177 + const rules = parseRedirectsFile(content); 178 + const match = matchRedirectRule('/path', rules); 179 + expect(match?.targetPath).toBe('/first'); 180 + }); 181 + 182 + it('should match more specific rules before general ones', () => { 183 + const content = ` 184 + /jobs/customer-ninja /careers/support 185 + /jobs/* /careers/:splat 186 + `; 187 + const rules = parseRedirectsFile(content); 188 + 189 + const match1 = matchRedirectRule('/jobs/customer-ninja', rules); 190 + expect(match1?.targetPath).toBe('/careers/support'); 191 + 192 + const match2 = matchRedirectRule('/jobs/developer', rules); 193 + expect(match2?.targetPath).toBe('/careers/developer'); 194 + }); 195 + 196 + it('should handle SPA routing pattern', () => { 197 + const rules = parseRedirectsFile('/* /index.html 200'); 198 + 199 + // Should match any path 200 + const match1 = matchRedirectRule('/about', rules); 201 + expect(match1).toBeTruthy(); 202 + expect(match1?.targetPath).toBe('/index.html'); 203 + expect(match1?.status).toBe(200); 204 + 205 + const match2 = matchRedirectRule('/users/123/profile', rules); 206 + expect(match2).toBeTruthy(); 207 + expect(match2?.targetPath).toBe('/index.html'); 208 + expect(match2?.status).toBe(200); 209 + 210 + const match3 = matchRedirectRule('/', rules); 211 + expect(match3).toBeTruthy(); 212 + expect(match3?.targetPath).toBe('/index.html'); 213 + }); 214 + }); 215 +
+413
hosting-service/src/lib/redirects.ts
··· 1 + import { readFile } from 'fs/promises'; 2 + import { existsSync } from 'fs'; 3 + 4 + export interface RedirectRule { 5 + from: string; 6 + to: string; 7 + status: number; 8 + force: boolean; 9 + conditions?: { 10 + country?: string[]; 11 + language?: string[]; 12 + role?: string[]; 13 + cookie?: string[]; 14 + }; 15 + // For pattern matching 16 + fromPattern?: RegExp; 17 + fromParams?: string[]; // Named parameters from the pattern 18 + queryParams?: Record<string, string>; // Expected query parameters 19 + } 20 + 21 + export interface RedirectMatch { 22 + rule: RedirectRule; 23 + targetPath: string; 24 + status: number; 25 + } 26 + 27 + /** 28 + * Parse a _redirects file into an array of redirect rules 29 + */ 30 + export function parseRedirectsFile(content: string): RedirectRule[] { 31 + const lines = content.split('\n'); 32 + const rules: RedirectRule[] = []; 33 + 34 + for (let lineNum = 0; lineNum < lines.length; lineNum++) { 35 + const lineRaw = lines[lineNum]; 36 + if (!lineRaw) continue; 37 + 38 + const line = lineRaw.trim(); 39 + 40 + // Skip empty lines and comments 41 + if (!line || line.startsWith('#')) { 42 + continue; 43 + } 44 + 45 + try { 46 + const rule = parseRedirectLine(line); 47 + if (rule && rule.fromPattern) { 48 + rules.push(rule); 49 + } 50 + } catch (err) { 51 + console.warn(`Failed to parse redirect rule on line ${lineNum + 1}: ${line}`, err); 52 + } 53 + } 54 + 55 + return rules; 56 + } 57 + 58 + /** 59 + * Parse a single redirect rule line 60 + * Format: /from [query_params] /to [status] [conditions] 61 + */ 62 + function parseRedirectLine(line: string): RedirectRule | null { 63 + // Split by whitespace, but respect quoted strings (though not commonly used) 64 + const parts = line.split(/\s+/); 65 + 66 + if (parts.length < 2) { 67 + return null; 68 + } 69 + 70 + let idx = 0; 71 + const from = parts[idx++]; 72 + 73 + if (!from) { 74 + return null; 75 + } 76 + 77 + let status = 301; // Default status 78 + let force = false; 79 + const conditions: NonNullable<RedirectRule['conditions']> = {}; 80 + const queryParams: Record<string, string> = {}; 81 + 82 + // Parse query parameters that come before the destination path 83 + // They look like: key=:value (and don't start with /) 84 + while (idx < parts.length) { 85 + const part = parts[idx]; 86 + if (!part) { 87 + idx++; 88 + continue; 89 + } 90 + 91 + // If it starts with / or http, it's the destination path 92 + if (part.startsWith('/') || part.startsWith('http://') || part.startsWith('https://')) { 93 + break; 94 + } 95 + 96 + // If it contains = and comes before the destination, it's a query param 97 + if (part.includes('=')) { 98 + const splitIndex = part.indexOf('='); 99 + const key = part.slice(0, splitIndex); 100 + const value = part.slice(splitIndex + 1); 101 + 102 + if (key && value) { 103 + queryParams[key] = value; 104 + } 105 + idx++; 106 + } else { 107 + // Not a query param, must be destination or something else 108 + break; 109 + } 110 + } 111 + 112 + // Next part should be the destination 113 + if (idx >= parts.length) { 114 + return null; 115 + } 116 + 117 + const to = parts[idx++]; 118 + if (!to) { 119 + return null; 120 + } 121 + 122 + // Parse remaining parts for status code and conditions 123 + for (let i = idx; i < parts.length; i++) { 124 + const part = parts[i]; 125 + 126 + if (!part) continue; 127 + 128 + // Check for status code (with optional ! for force) 129 + if (/^\d+!?$/.test(part)) { 130 + if (part.endsWith('!')) { 131 + force = true; 132 + status = parseInt(part.slice(0, -1)); 133 + } else { 134 + status = parseInt(part); 135 + } 136 + continue; 137 + } 138 + 139 + // Check for condition parameters (Country=, Language=, Role=, Cookie=) 140 + if (part.includes('=')) { 141 + const splitIndex = part.indexOf('='); 142 + const key = part.slice(0, splitIndex); 143 + const value = part.slice(splitIndex + 1); 144 + 145 + if (!key || !value) continue; 146 + 147 + const keyLower = key.toLowerCase(); 148 + 149 + if (keyLower === 'country') { 150 + conditions.country = value.split(',').map(v => v.trim().toLowerCase()); 151 + } else if (keyLower === 'language') { 152 + conditions.language = value.split(',').map(v => v.trim().toLowerCase()); 153 + } else if (keyLower === 'role') { 154 + conditions.role = value.split(',').map(v => v.trim()); 155 + } else if (keyLower === 'cookie') { 156 + conditions.cookie = value.split(',').map(v => v.trim().toLowerCase()); 157 + } 158 + } 159 + } 160 + 161 + // Parse the 'from' pattern 162 + const { pattern, params } = convertPathToRegex(from); 163 + 164 + return { 165 + from, 166 + to, 167 + status, 168 + force, 169 + conditions: Object.keys(conditions).length > 0 ? conditions : undefined, 170 + queryParams: Object.keys(queryParams).length > 0 ? queryParams : undefined, 171 + fromPattern: pattern, 172 + fromParams: params, 173 + }; 174 + } 175 + 176 + /** 177 + * Convert a path pattern with placeholders and splats to a regex 178 + * Examples: 179 + * /blog/:year/:month/:day -> captures year, month, day 180 + * /news/* -> captures splat 181 + */ 182 + function convertPathToRegex(pattern: string): { pattern: RegExp; params: string[] } { 183 + const params: string[] = []; 184 + let regexStr = '^'; 185 + 186 + // Split by query string if present 187 + const pathPart = pattern.split('?')[0] || pattern; 188 + 189 + // Escape special regex characters except * and : 190 + let escaped = pathPart.replace(/[.+^${}()|[\]\\]/g, '\\$&'); 191 + 192 + // Replace :param with named capture groups 193 + escaped = escaped.replace(/:([a-zA-Z_][a-zA-Z0-9_]*)/g, (match, paramName) => { 194 + params.push(paramName); 195 + // Match path segment (everything except / and ?) 196 + return '([^/?]+)'; 197 + }); 198 + 199 + // Replace * with splat capture (matches everything including /) 200 + if (escaped.includes('*')) { 201 + escaped = escaped.replace(/\*/g, '(.*)'); 202 + params.push('splat'); 203 + } 204 + 205 + regexStr += escaped; 206 + 207 + // Make trailing slash optional 208 + if (!regexStr.endsWith('.*')) { 209 + regexStr += '/?'; 210 + } 211 + 212 + regexStr += '$'; 213 + 214 + return { 215 + pattern: new RegExp(regexStr), 216 + params, 217 + }; 218 + } 219 + 220 + /** 221 + * Match a request path against redirect rules 222 + */ 223 + export function matchRedirectRule( 224 + requestPath: string, 225 + rules: RedirectRule[], 226 + context?: { 227 + queryParams?: Record<string, string>; 228 + headers?: Record<string, string>; 229 + cookies?: Record<string, string>; 230 + } 231 + ): RedirectMatch | null { 232 + // Normalize path: ensure leading slash, remove trailing slash (except for root) 233 + let normalizedPath = requestPath.startsWith('/') ? requestPath : `/${requestPath}`; 234 + 235 + for (const rule of rules) { 236 + // Check query parameter conditions first (if any) 237 + if (rule.queryParams) { 238 + // If rule requires query params but none provided, skip this rule 239 + if (!context?.queryParams) { 240 + continue; 241 + } 242 + 243 + const queryMatches = Object.entries(rule.queryParams).every(([key, value]) => { 244 + const actualValue = context.queryParams?.[key]; 245 + return actualValue !== undefined; 246 + }); 247 + 248 + if (!queryMatches) { 249 + continue; 250 + } 251 + } 252 + 253 + // Check conditional redirects (country, language, role, cookie) 254 + if (rule.conditions) { 255 + if (rule.conditions.country && context?.headers) { 256 + const cfCountry = context.headers['cf-ipcountry']; 257 + const xCountry = context.headers['x-country']; 258 + const country = (cfCountry?.toLowerCase() || xCountry?.toLowerCase()); 259 + if (!country || !rule.conditions.country.includes(country)) { 260 + continue; 261 + } 262 + } 263 + 264 + if (rule.conditions.language && context?.headers) { 265 + const acceptLang = context.headers['accept-language']; 266 + if (!acceptLang) { 267 + continue; 268 + } 269 + // Parse accept-language header (simplified) 270 + const langs = acceptLang.split(',').map(l => { 271 + const langPart = l.split(';')[0]; 272 + return langPart ? langPart.trim().toLowerCase() : ''; 273 + }).filter(l => l !== ''); 274 + const hasMatch = rule.conditions.language.some(lang => 275 + langs.some(l => l === lang || l.startsWith(lang + '-')) 276 + ); 277 + if (!hasMatch) { 278 + continue; 279 + } 280 + } 281 + 282 + if (rule.conditions.cookie && context?.cookies) { 283 + const hasCookie = rule.conditions.cookie.some(cookieName => 284 + context.cookies && cookieName in context.cookies 285 + ); 286 + if (!hasCookie) { 287 + continue; 288 + } 289 + } 290 + 291 + // Role-based redirects would need JWT verification - skip for now 292 + if (rule.conditions.role) { 293 + continue; 294 + } 295 + } 296 + 297 + // Match the path pattern 298 + const match = rule.fromPattern?.exec(normalizedPath); 299 + if (!match) { 300 + continue; 301 + } 302 + 303 + // Build the target path by replacing placeholders 304 + let targetPath = rule.to; 305 + 306 + // Replace captured parameters 307 + if (rule.fromParams && match.length > 1) { 308 + for (let i = 0; i < rule.fromParams.length; i++) { 309 + const paramName = rule.fromParams[i]; 310 + const paramValue = match[i + 1]; 311 + 312 + if (!paramName || !paramValue) continue; 313 + 314 + if (paramName === 'splat') { 315 + targetPath = targetPath.replace(':splat', paramValue); 316 + } else { 317 + targetPath = targetPath.replace(`:${paramName}`, paramValue); 318 + } 319 + } 320 + } 321 + 322 + // Handle query parameter replacements 323 + if (rule.queryParams && context?.queryParams) { 324 + for (const [key, placeholder] of Object.entries(rule.queryParams)) { 325 + const actualValue = context.queryParams[key]; 326 + if (actualValue && placeholder && placeholder.startsWith(':')) { 327 + const paramName = placeholder.slice(1); 328 + if (paramName) { 329 + targetPath = targetPath.replace(`:${paramName}`, actualValue); 330 + } 331 + } 332 + } 333 + } 334 + 335 + // Preserve query string for 200, 301, 302 redirects (unless target already has one) 336 + if ([200, 301, 302].includes(rule.status) && context?.queryParams && !targetPath.includes('?')) { 337 + const queryString = Object.entries(context.queryParams) 338 + .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`) 339 + .join('&'); 340 + if (queryString) { 341 + targetPath += `?${queryString}`; 342 + } 343 + } 344 + 345 + return { 346 + rule, 347 + targetPath, 348 + status: rule.status, 349 + }; 350 + } 351 + 352 + return null; 353 + } 354 + 355 + /** 356 + * Load redirect rules from a cached site 357 + */ 358 + export async function loadRedirectRules(did: string, rkey: string): Promise<RedirectRule[]> { 359 + const CACHE_DIR = process.env.CACHE_DIR || './cache/sites'; 360 + const redirectsPath = `${CACHE_DIR}/${did}/${rkey}/_redirects`; 361 + 362 + if (!existsSync(redirectsPath)) { 363 + return []; 364 + } 365 + 366 + try { 367 + const content = await readFile(redirectsPath, 'utf-8'); 368 + return parseRedirectsFile(content); 369 + } catch (err) { 370 + console.error('Failed to load _redirects file', err); 371 + return []; 372 + } 373 + } 374 + 375 + /** 376 + * Parse cookies from Cookie header 377 + */ 378 + export function parseCookies(cookieHeader?: string): Record<string, string> { 379 + if (!cookieHeader) return {}; 380 + 381 + const cookies: Record<string, string> = {}; 382 + const parts = cookieHeader.split(';'); 383 + 384 + for (const part of parts) { 385 + const [key, ...valueParts] = part.split('='); 386 + if (key && valueParts.length > 0) { 387 + cookies[key.trim()] = valueParts.join('=').trim(); 388 + } 389 + } 390 + 391 + return cookies; 392 + } 393 + 394 + /** 395 + * Parse query string into object 396 + */ 397 + export function parseQueryString(url: string): Record<string, string> { 398 + const queryStart = url.indexOf('?'); 399 + if (queryStart === -1) return {}; 400 + 401 + const queryString = url.slice(queryStart + 1); 402 + const params: Record<string, string> = {}; 403 + 404 + for (const pair of queryString.split('&')) { 405 + const [key, value] = pair.split('='); 406 + if (key) { 407 + params[decodeURIComponent(key)] = value ? decodeURIComponent(value) : ''; 408 + } 409 + } 410 + 411 + return params; 412 + } 413 +
+168 -6
hosting-service/src/server.ts
··· 7 7 import { lookup } from 'mime-types'; 8 8 import { logger, observabilityMiddleware, observabilityErrorHandler, logCollector, errorTracker, metricsCollector } from './lib/observability'; 9 9 import { fileCache, metadataCache, rewrittenHtmlCache, getCacheKey, type FileMetadata } from './lib/cache'; 10 + import { loadRedirectRules, matchRedirectRule, parseCookies, parseQueryString, type RedirectRule } from './lib/redirects'; 10 11 11 12 const BASE_HOST = process.env.BASE_HOST || 'wisp.place'; 12 13 ··· 35 36 } 36 37 } 37 38 39 + // Cache for redirect rules (per site) 40 + const redirectRulesCache = new Map<string, RedirectRule[]>(); 41 + 42 + /** 43 + * Clear redirect rules cache for a specific site 44 + * Should be called when a site is updated/recached 45 + */ 46 + export function clearRedirectRulesCache(did: string, rkey: string) { 47 + const cacheKey = `${did}:${rkey}`; 48 + redirectRulesCache.delete(cacheKey); 49 + } 50 + 38 51 // Helper to serve files from cache 39 - async function serveFromCache(did: string, rkey: string, filePath: string) { 52 + async function serveFromCache( 53 + did: string, 54 + rkey: string, 55 + filePath: string, 56 + fullUrl?: string, 57 + headers?: Record<string, string> 58 + ) { 59 + // Check for redirect rules first 60 + const redirectCacheKey = `${did}:${rkey}`; 61 + let redirectRules = redirectRulesCache.get(redirectCacheKey); 62 + 63 + if (redirectRules === undefined) { 64 + // Load rules for the first time 65 + redirectRules = await loadRedirectRules(did, rkey); 66 + redirectRulesCache.set(redirectCacheKey, redirectRules); 67 + } 68 + 69 + // Apply redirect rules if any exist 70 + if (redirectRules.length > 0) { 71 + const requestPath = '/' + (filePath || ''); 72 + const queryParams = fullUrl ? parseQueryString(fullUrl) : {}; 73 + const cookies = parseCookies(headers?.['cookie']); 74 + 75 + const redirectMatch = matchRedirectRule(requestPath, redirectRules, { 76 + queryParams, 77 + headers, 78 + cookies, 79 + }); 80 + 81 + if (redirectMatch) { 82 + const { targetPath, status } = redirectMatch; 83 + 84 + // Handle different status codes 85 + if (status === 200) { 86 + // Rewrite: serve different content but keep URL the same 87 + // Remove leading slash for internal path resolution 88 + const rewritePath = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath; 89 + return serveFileInternal(did, rkey, rewritePath); 90 + } else if (status === 301 || status === 302) { 91 + // External redirect: change the URL 92 + return new Response(null, { 93 + status, 94 + headers: { 95 + 'Location': targetPath, 96 + 'Cache-Control': status === 301 ? 'public, max-age=31536000' : 'public, max-age=0', 97 + }, 98 + }); 99 + } else if (status === 404) { 100 + // Custom 404 page 101 + const custom404Path = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath; 102 + const response = await serveFileInternal(did, rkey, custom404Path); 103 + // Override status to 404 104 + return new Response(response.body, { 105 + status: 404, 106 + headers: response.headers, 107 + }); 108 + } 109 + } 110 + } 111 + 112 + // No redirect matched, serve normally 113 + return serveFileInternal(did, rkey, filePath); 114 + } 115 + 116 + // Internal function to serve a file (used by both normal serving and rewrites) 117 + async function serveFileInternal(did: string, rkey: string, filePath: string) { 40 118 // Default to index.html if path is empty or ends with / 41 119 let requestPath = filePath || 'index.html'; 42 120 if (requestPath.endsWith('/')) { ··· 138 216 did: string, 139 217 rkey: string, 140 218 filePath: string, 141 - basePath: string 219 + basePath: string, 220 + fullUrl?: string, 221 + headers?: Record<string, string> 142 222 ) { 223 + // Check for redirect rules first 224 + const redirectCacheKey = `${did}:${rkey}`; 225 + let redirectRules = redirectRulesCache.get(redirectCacheKey); 226 + 227 + if (redirectRules === undefined) { 228 + // Load rules for the first time 229 + redirectRules = await loadRedirectRules(did, rkey); 230 + redirectRulesCache.set(redirectCacheKey, redirectRules); 231 + } 232 + 233 + // Apply redirect rules if any exist 234 + if (redirectRules.length > 0) { 235 + const requestPath = '/' + (filePath || ''); 236 + const queryParams = fullUrl ? parseQueryString(fullUrl) : {}; 237 + const cookies = parseCookies(headers?.['cookie']); 238 + 239 + const redirectMatch = matchRedirectRule(requestPath, redirectRules, { 240 + queryParams, 241 + headers, 242 + cookies, 243 + }); 244 + 245 + if (redirectMatch) { 246 + const { targetPath, status } = redirectMatch; 247 + 248 + // Handle different status codes 249 + if (status === 200) { 250 + // Rewrite: serve different content but keep URL the same 251 + const rewritePath = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath; 252 + return serveFileInternalWithRewrite(did, rkey, rewritePath, basePath); 253 + } else if (status === 301 || status === 302) { 254 + // External redirect: change the URL 255 + // For sites.wisp.place, we need to adjust the target path to include the base path 256 + // unless it's an absolute URL 257 + let redirectTarget = targetPath; 258 + if (!targetPath.startsWith('http://') && !targetPath.startsWith('https://')) { 259 + redirectTarget = basePath + (targetPath.startsWith('/') ? targetPath.slice(1) : targetPath); 260 + } 261 + return new Response(null, { 262 + status, 263 + headers: { 264 + 'Location': redirectTarget, 265 + 'Cache-Control': status === 301 ? 'public, max-age=31536000' : 'public, max-age=0', 266 + }, 267 + }); 268 + } else if (status === 404) { 269 + // Custom 404 page 270 + const custom404Path = targetPath.startsWith('/') ? targetPath.slice(1) : targetPath; 271 + const response = await serveFileInternalWithRewrite(did, rkey, custom404Path, basePath); 272 + // Override status to 404 273 + return new Response(response.body, { 274 + status: 404, 275 + headers: response.headers, 276 + }); 277 + } 278 + } 279 + } 280 + 281 + // No redirect matched, serve normally 282 + return serveFileInternalWithRewrite(did, rkey, filePath, basePath); 283 + } 284 + 285 + // Internal function to serve a file with rewriting 286 + async function serveFileInternalWithRewrite(did: string, rkey: string, filePath: string, basePath: string) { 143 287 // Default to index.html if path is empty or ends with / 144 288 let requestPath = filePath || 'index.html'; 145 289 if (requestPath.endsWith('/')) { ··· 317 461 318 462 try { 319 463 await downloadAndCacheSite(did, rkey, siteData.record, pdsEndpoint, siteData.cid); 464 + // Clear redirect rules cache since the site was updated 465 + clearRedirectRulesCache(did, rkey); 320 466 logger.info('Site cached successfully', { did, rkey }); 321 467 return true; 322 468 } catch (err) { ··· 384 530 385 531 // Serve with HTML path rewriting to handle absolute paths 386 532 const basePath = `/${identifier}/${site}/`; 387 - return serveFromCacheWithRewrite(did, site, filePath, basePath); 533 + const headers: Record<string, string> = {}; 534 + c.req.raw.headers.forEach((value, key) => { 535 + headers[key.toLowerCase()] = value; 536 + }); 537 + return serveFromCacheWithRewrite(did, site, filePath, basePath, c.req.url, headers); 388 538 } 389 539 390 540 // Check if this is a DNS hash subdomain ··· 420 570 return c.text('Site not found', 404); 421 571 } 422 572 423 - return serveFromCache(customDomain.did, rkey, path); 573 + const headers: Record<string, string> = {}; 574 + c.req.raw.headers.forEach((value, key) => { 575 + headers[key.toLowerCase()] = value; 576 + }); 577 + return serveFromCache(customDomain.did, rkey, path, c.req.url, headers); 424 578 } 425 579 426 580 // Route 2: Registered subdomains - /*.wisp.place/* ··· 444 598 return c.text('Site not found', 404); 445 599 } 446 600 447 - return serveFromCache(domainInfo.did, rkey, path); 601 + const headers: Record<string, string> = {}; 602 + c.req.raw.headers.forEach((value, key) => { 603 + headers[key.toLowerCase()] = value; 604 + }); 605 + return serveFromCache(domainInfo.did, rkey, path, c.req.url, headers); 448 606 } 449 607 450 608 // Route 1: Custom domains - /* ··· 467 625 return c.text('Site not found', 404); 468 626 } 469 627 470 - return serveFromCache(customDomain.did, rkey, path); 628 + const headers: Record<string, string> = {}; 629 + c.req.raw.headers.forEach((value, key) => { 630 + headers[key.toLowerCase()] = value; 631 + }); 632 + return serveFromCache(customDomain.did, rkey, path, c.req.url, headers); 471 633 }); 472 634 473 635 // Internal observability endpoints (for admin panel)
+1
cli/.gitignore
··· 1 + test/ 1 2 .DS_STORE 2 3 jacquard/ 3 4 binaries/
+21 -12
cli/Cargo.toml
··· 1 + [package] 2 + name = "wisp-cli" 3 + version = "0.2.0" 4 + edition = "2024" 1 5 6 + [features] 2 7 3 8 4 9 5 10 11 + jacquard = { git = "https://tangled.org/@nonbinary.computer/jacquard", features = ["loopback"] } 12 + jacquard-oauth = { git = "https://tangled.org/@nonbinary.computer/jacquard" } 13 + jacquard-api = { git = "https://tangled.org/@nonbinary.computer/jacquard" } 14 + jacquard-common = { git = "https://tangled.org/@nonbinary.computer/jacquard", features = ["websocket"] } 15 + jacquard-identity = { git = "https://tangled.org/@nonbinary.computer/jacquard", features = ["dns"] } 16 + jacquard-derive = { git = "https://tangled.org/@nonbinary.computer/jacquard" } 17 + jacquard-lexicon = { git = "https://tangled.org/@nonbinary.computer/jacquard" } 6 18 7 19 8 20 ··· 15 27 16 28 17 29 18 - 19 - 20 - 21 - 22 - 23 - 24 - 25 - 26 - 27 - 28 - 29 - 30 30 mime_guess = "2.0" 31 31 bytes = "1.10" 32 32 futures = "0.3.31" 33 + multihash = "0.19.3" 34 + multibase = "0.9" 35 + sha2 = "0.10" 36 + axum = "0.7" 37 + tower-http = { version = "0.5", features = ["fs", "compression-gzip"] } 38 + tower = "0.4" 39 + n0-future = "0.1" 40 + chrono = "0.4" 41 + url = "2.5"
+85
cli/src/blob_map.rs
··· 1 + use jacquard_common::types::blob::BlobRef; 2 + use jacquard_common::IntoStatic; 3 + use std::collections::HashMap; 4 + 5 + use crate::place_wisp::fs::{Directory, EntryNode}; 6 + 7 + /// Extract blob information from a directory tree 8 + /// Returns a map of file paths to their blob refs and CIDs 9 + /// 10 + /// This mirrors the TypeScript implementation in src/lib/wisp-utils.ts lines 275-302 11 + pub fn extract_blob_map( 12 + directory: &Directory, 13 + ) -> HashMap<String, (BlobRef<'static>, String)> { 14 + extract_blob_map_recursive(directory, String::new()) 15 + } 16 + 17 + fn extract_blob_map_recursive( 18 + directory: &Directory, 19 + current_path: String, 20 + ) -> HashMap<String, (BlobRef<'static>, String)> { 21 + let mut blob_map = HashMap::new(); 22 + 23 + for entry in &directory.entries { 24 + let full_path = if current_path.is_empty() { 25 + entry.name.to_string() 26 + } else { 27 + format!("{}/{}", current_path, entry.name) 28 + }; 29 + 30 + match &entry.node { 31 + EntryNode::File(file_node) => { 32 + // Extract CID from blob ref 33 + // BlobRef is an enum with Blob variant, which has a ref field (CidLink) 34 + let blob_ref = &file_node.blob; 35 + let cid_string = blob_ref.blob().r#ref.to_string(); 36 + 37 + // Store with full path (mirrors TypeScript implementation) 38 + blob_map.insert( 39 + full_path, 40 + (blob_ref.clone().into_static(), cid_string) 41 + ); 42 + } 43 + EntryNode::Directory(subdir) => { 44 + let sub_map = extract_blob_map_recursive(subdir, full_path); 45 + blob_map.extend(sub_map); 46 + } 47 + EntryNode::Unknown(_) => { 48 + // Skip unknown node types 49 + } 50 + } 51 + } 52 + 53 + blob_map 54 + } 55 + 56 + /// Normalize file path by removing base folder prefix 57 + /// Example: "cobblemon/index.html" -> "index.html" 58 + /// 59 + /// Note: This function is kept for reference but is no longer used in production code. 60 + /// The TypeScript server has a similar normalization (src/routes/wisp.ts line 291) to handle 61 + /// uploads that include a base folder prefix, but our CLI doesn't need this since we 62 + /// track full paths consistently. 63 + #[allow(dead_code)] 64 + pub fn normalize_path(path: &str) -> String { 65 + // Remove base folder prefix (everything before first /) 66 + if let Some(idx) = path.find('/') { 67 + path[idx + 1..].to_string() 68 + } else { 69 + path.to_string() 70 + } 71 + } 72 + 73 + #[cfg(test)] 74 + mod tests { 75 + use super::*; 76 + 77 + #[test] 78 + fn test_normalize_path() { 79 + assert_eq!(normalize_path("index.html"), "index.html"); 80 + assert_eq!(normalize_path("cobblemon/index.html"), "index.html"); 81 + assert_eq!(normalize_path("folder/subfolder/file.txt"), "subfolder/file.txt"); 82 + assert_eq!(normalize_path("a/b/c/d.txt"), "b/c/d.txt"); 83 + } 84 + } 85 +
+66
cli/src/cid.rs
··· 1 + use jacquard_common::types::cid::IpldCid; 2 + use sha2::{Digest, Sha256}; 3 + 4 + /// Compute CID (Content Identifier) for blob content 5 + /// Uses the same algorithm as AT Protocol: CIDv1 with raw codec (0x55) and SHA-256 6 + /// 7 + /// CRITICAL: This must be called on BASE64-ENCODED GZIPPED content, not just gzipped content 8 + /// 9 + /// Based on @atproto/common/src/ipld.ts sha256RawToCid implementation 10 + pub fn compute_cid(content: &[u8]) -> String { 11 + // Use node crypto to compute sha256 hash (same as AT Protocol) 12 + let hash = Sha256::digest(content); 13 + 14 + // Create multihash (code 0x12 = sha2-256) 15 + let multihash = multihash::Multihash::wrap(0x12, &hash) 16 + .expect("SHA-256 hash should always fit in multihash"); 17 + 18 + // Create CIDv1 with raw codec (0x55) 19 + let cid = IpldCid::new_v1(0x55, multihash); 20 + 21 + // Convert to base32 string representation 22 + cid.to_string_of_base(multibase::Base::Base32Lower) 23 + .unwrap_or_else(|_| cid.to_string()) 24 + } 25 + 26 + #[cfg(test)] 27 + mod tests { 28 + use super::*; 29 + use base64::Engine; 30 + 31 + #[test] 32 + fn test_compute_cid() { 33 + // Test with a simple string: "hello" 34 + let content = b"hello"; 35 + let cid = compute_cid(content); 36 + 37 + // CID should start with 'baf' for raw codec base32 38 + assert!(cid.starts_with("baf")); 39 + } 40 + 41 + #[test] 42 + fn test_compute_cid_base64_encoded() { 43 + // Simulate the actual use case: gzipped then base64 encoded 44 + use flate2::write::GzEncoder; 45 + use flate2::Compression; 46 + use std::io::Write; 47 + 48 + let original = b"hello world"; 49 + 50 + // Gzip compress 51 + let mut encoder = GzEncoder::new(Vec::new(), Compression::default()); 52 + encoder.write_all(original).unwrap(); 53 + let gzipped = encoder.finish().unwrap(); 54 + 55 + // Base64 encode the gzipped data 56 + let base64_bytes = base64::prelude::BASE64_STANDARD.encode(&gzipped).into_bytes(); 57 + 58 + // Compute CID on the base64 bytes 59 + let cid = compute_cid(&base64_bytes); 60 + 61 + // Should be a valid CID 62 + assert!(cid.starts_with("baf")); 63 + assert!(cid.len() > 10); 64 + } 65 + } 66 +
+253 -66
cli/src/main.rs
··· 1 1 mod builder_types; 2 2 mod place_wisp; 3 + mod cid; 4 + mod blob_map; 5 + mod metadata; 6 + mod download; 7 + mod pull; 8 + mod serve; 3 9 4 - use clap::Parser; 10 + use clap::{Parser, Subcommand}; 5 11 use jacquard::CowStr; 6 - use jacquard::client::{Agent, FileAuthStore, AgentSessionExt, MemoryCredentialSession}; 12 + use jacquard::client::{Agent, FileAuthStore, AgentSessionExt, MemoryCredentialSession, AgentSession}; 7 13 use jacquard::oauth::client::OAuthClient; 8 14 use jacquard::oauth::loopback::LoopbackConfig; 9 15 use jacquard::prelude::IdentityResolver; ··· 11 17 use jacquard_common::types::blob::MimeType; 12 18 use miette::IntoDiagnostic; 13 19 use std::path::{Path, PathBuf}; 20 + use std::collections::HashMap; 14 21 use flate2::Compression; 15 22 use flate2::write::GzEncoder; 16 23 use std::io::Write; 17 24 18 25 19 26 27 + use place_wisp::fs::*; 20 28 29 + #[derive(Parser, Debug)] 30 + #[command(author, version, about = "wisp.place CLI tool")] 31 + struct Args { 32 + #[command(subcommand)] 33 + command: Option<Commands>, 34 + 35 + // Deploy arguments (when no subcommand is specified) 36 + /// Handle (e.g., alice.bsky.social), DID, or PDS URL 37 + #[arg(global = true, conflicts_with = "command")] 38 + input: Option<CowStr<'static>>, 21 39 40 + /// Path to the directory containing your static site 41 + #[arg(short, long, global = true, conflicts_with = "command")] 42 + path: Option<PathBuf>, 22 43 44 + /// Site name (defaults to directory name) 45 + #[arg(short, long, global = true, conflicts_with = "command")] 46 + site: Option<String>, 23 47 48 + /// Path to auth store file 49 + #[arg(long, global = true, conflicts_with = "command")] 50 + store: Option<String>, 24 51 52 + /// App Password for authentication 53 + #[arg(long, global = true, conflicts_with = "command")] 54 + password: Option<CowStr<'static>>, 55 + } 25 56 57 + #[derive(Subcommand, Debug)] 58 + enum Commands { 59 + /// Deploy a static site to wisp.place (default command) 60 + Deploy { 61 + /// Handle (e.g., alice.bsky.social), DID, or PDS URL 62 + input: CowStr<'static>, 26 63 64 + /// Path to the directory containing your static site 65 + #[arg(short, long, default_value = ".")] 66 + path: PathBuf, 27 67 68 + /// Site name (defaults to directory name) 69 + #[arg(short, long)] 70 + site: Option<String>, 28 71 72 + /// Path to auth store file (will be created if missing, only used with OAuth) 73 + #[arg(long, default_value = "/tmp/wisp-oauth-session.json")] 74 + store: String, 29 75 76 + /// App Password for authentication (alternative to OAuth) 77 + #[arg(long)] 78 + password: Option<CowStr<'static>>, 79 + }, 80 + /// Pull a site from the PDS to a local directory 81 + Pull { 82 + /// Handle (e.g., alice.bsky.social) or DID 83 + input: CowStr<'static>, 30 84 85 + /// Site name (record key) 86 + #[arg(short, long)] 87 + site: String, 31 88 89 + /// Output directory for the downloaded site 90 + #[arg(short, long, default_value = ".")] 91 + output: PathBuf, 92 + }, 93 + /// Serve a site locally with real-time firehose updates 94 + Serve { 95 + /// Handle (e.g., alice.bsky.social) or DID 96 + input: CowStr<'static>, 32 97 98 + /// Site name (record key) 99 + #[arg(short, long)] 100 + site: String, 33 101 102 + /// Output directory for the site files 103 + #[arg(short, long, default_value = ".")] 104 + output: PathBuf, 34 105 106 + /// Port to serve on 107 + #[arg(short, long, default_value = "8080")] 108 + port: u16, 109 + }, 110 + } 35 111 112 + #[tokio::main] 113 + async fn main() -> miette::Result<()> { 114 + let args = Args::parse(); 36 115 116 + match args.command { 117 + Some(Commands::Deploy { input, path, site, store, password }) => { 118 + // Dispatch to appropriate authentication method 119 + if let Some(password) = password { 120 + run_with_app_password(input, password, path, site).await 121 + } else { 122 + run_with_oauth(input, store, path, site).await 123 + } 124 + } 125 + Some(Commands::Pull { input, site, output }) => { 126 + pull::pull_site(input, CowStr::from(site), output).await 127 + } 128 + Some(Commands::Serve { input, site, output, port }) => { 129 + serve::serve_site(input, CowStr::from(site), output, port).await 130 + } 131 + None => { 132 + // Legacy mode: if input is provided, assume deploy command 133 + if let Some(input) = args.input { 134 + let path = args.path.unwrap_or_else(|| PathBuf::from(".")); 135 + let store = args.store.unwrap_or_else(|| "/tmp/wisp-oauth-session.json".to_string()); 136 + 137 + // Dispatch to appropriate authentication method 138 + if let Some(password) = args.password { 139 + run_with_app_password(input, password, path, args.site).await 140 + } else { 141 + run_with_oauth(input, store, path, args.site).await 142 + } 143 + } else { 144 + // No command and no input, show help 145 + use clap::CommandFactory; 146 + Args::command().print_help().into_diagnostic()?; 147 + Ok(()) 148 + } 149 + } 150 + } 151 + } 37 152 38 153 39 154 ··· 86 201 87 202 88 203 89 - 90 - 91 - 92 - 93 - 94 - 95 - 96 - 97 - 98 - 99 - 100 - 101 - 102 - 103 - 104 - 105 - 106 - 107 - 108 204 println!("Deploying site '{}'...", site_name); 109 205 206 + // Try to fetch existing manifest for incremental updates 207 + let existing_blob_map: HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)> = { 208 + use jacquard_common::types::string::AtUri; 209 + 210 + // Get the DID for this session 211 + let session_info = agent.session_info().await; 212 + if let Some((did, _)) = session_info { 213 + // Construct the AT URI for the record 214 + let uri_string = format!("at://{}/place.wisp.fs/{}", did, site_name); 215 + if let Ok(uri) = AtUri::new(&uri_string) { 216 + match agent.get_record::<Fs>(&uri).await { 217 + Ok(response) => { 218 + match response.into_output() { 219 + Ok(record_output) => { 220 + let existing_manifest = record_output.value; 221 + let blob_map = blob_map::extract_blob_map(&existing_manifest.root); 222 + println!("Found existing manifest with {} files, checking for changes...", blob_map.len()); 223 + blob_map 224 + } 225 + Err(_) => { 226 + println!("No existing manifest found, uploading all files..."); 227 + HashMap::new() 228 + } 229 + } 230 + } 231 + Err(_) => { 232 + // Record doesn't exist yet - this is a new site 233 + println!("No existing manifest found, uploading all files..."); 234 + HashMap::new() 235 + } 236 + } 237 + } else { 238 + println!("No existing manifest found (invalid URI), uploading all files..."); 239 + HashMap::new() 240 + } 241 + } else { 242 + println!("No existing manifest found (could not get DID), uploading all files..."); 243 + HashMap::new() 244 + } 245 + }; 246 + 110 247 // Build directory tree 111 - let root_dir = build_directory(agent, &path).await?; 248 + let (root_dir, total_files, reused_count) = build_directory(agent, &path, &existing_blob_map, String::new()).await?; 249 + let uploaded_count = total_files - reused_count; 112 250 113 - // Count total files 114 - let file_count = count_files(&root_dir); 115 - 116 251 // Create the Fs record 117 252 let fs_record = Fs::new() 118 253 .site(CowStr::from(site_name.clone())) 119 254 .root(root_dir) 120 - .file_count(file_count as i64) 255 + .file_count(total_files as i64) 121 256 .created_at(Datetime::now()) 122 257 .build(); 123 258 ··· 132 267 .and_then(|s| s.split('/').next()) 133 268 .ok_or_else(|| miette::miette!("Failed to parse DID from URI"))?; 134 269 135 - println!("Deployed site '{}': {}", site_name, output.uri); 136 - println!("Available at: https://sites.wisp.place/{}/{}", did, site_name); 270 + println!("\n✓ Deployed site '{}': {}", site_name, output.uri); 271 + println!(" Total files: {} ({} reused, {} uploaded)", total_files, reused_count, uploaded_count); 272 + println!(" Available at: https://sites.wisp.place/{}/{}", did, site_name); 137 273 138 274 Ok(()) 139 275 } 140 276 141 - 277 + /// Recursively build a Directory from a filesystem path 278 + /// current_path is the path from the root of the site (e.g., "" for root, "config" for config dir) 142 279 fn build_directory<'a>( 143 280 agent: &'a Agent<impl jacquard::client::AgentSession + IdentityResolver + 'a>, 144 281 dir_path: &'a Path, 145 - ) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<Directory<'static>>> + 'a>> 282 + existing_blobs: &'a HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)>, 283 + current_path: String, 284 + ) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<(Directory<'static>, usize, usize)>> + 'a>> 146 285 { 147 286 Box::pin(async move { 148 287 // Collect all directory entries first ··· 167 306 168 307 169 308 309 + let metadata = entry.metadata().into_diagnostic()?; 170 310 171 - 172 - 173 - 174 - 175 - 176 - 311 + if metadata.is_file() { 312 + // Construct full path for this file (for blob map lookup) 313 + let full_path = if current_path.is_empty() { 314 + name_str.clone() 315 + } else { 316 + format!("{}/{}", current_path, name_str) 317 + }; 318 + file_tasks.push((name_str, path, full_path)); 319 + } else if metadata.is_dir() { 320 + dir_tasks.push((name_str, path)); 321 + } 177 322 } 178 323 179 324 // Process files concurrently with a limit of 5 180 - let file_entries: Vec<Entry> = stream::iter(file_tasks) 181 - .map(|(name, path)| async move { 182 - let file_node = process_file(agent, &path).await?; 183 - Ok::<_, miette::Report>(Entry::new() 325 + let file_results: Vec<(Entry<'static>, bool)> = stream::iter(file_tasks) 326 + .map(|(name, path, full_path)| async move { 327 + let (file_node, reused) = process_file(agent, &path, &full_path, existing_blobs).await?; 328 + let entry = Entry::new() 184 329 .name(CowStr::from(name)) 185 330 .node(EntryNode::File(Box::new(file_node))) 186 - .build()) 331 + .build(); 332 + Ok::<_, miette::Report>((entry, reused)) 187 333 }) 188 334 .buffer_unordered(5) 189 335 .collect::<Vec<_>>() 190 336 .await 191 337 .into_iter() 192 338 .collect::<miette::Result<Vec<_>>>()?; 339 + 340 + let mut file_entries = Vec::new(); 341 + let mut reused_count = 0; 342 + let mut total_files = 0; 343 + 344 + for (entry, reused) in file_results { 345 + file_entries.push(entry); 346 + total_files += 1; 347 + if reused { 348 + reused_count += 1; 349 + } 350 + } 193 351 194 352 // Process directories recursively (sequentially to avoid too much nesting) 195 353 let mut dir_entries = Vec::new(); 196 354 for (name, path) in dir_tasks { 197 - let subdir = build_directory(agent, &path).await?; 355 + // Construct full path for subdirectory 356 + let subdir_path = if current_path.is_empty() { 357 + name.clone() 358 + } else { 359 + format!("{}/{}", current_path, name) 360 + }; 361 + let (subdir, sub_total, sub_reused) = build_directory(agent, &path, existing_blobs, subdir_path).await?; 198 362 dir_entries.push(Entry::new() 199 363 .name(CowStr::from(name)) 200 364 .node(EntryNode::Directory(Box::new(subdir))) 201 365 .build()); 366 + total_files += sub_total; 367 + reused_count += sub_reused; 202 368 } 203 369 204 370 // Combine file and directory entries 205 371 let mut entries = file_entries; 206 372 entries.extend(dir_entries); 207 373 208 - Ok(Directory::new() 374 + let directory = Directory::new() 209 375 .r#type(CowStr::from("directory")) 210 376 .entries(entries) 211 - .build()) 377 + .build(); 378 + 379 + Ok((directory, total_files, reused_count)) 212 380 }) 213 381 } 214 382 215 - /// Process a single file: gzip -> base64 -> upload blob 383 + /// Process a single file: gzip -> base64 -> upload blob (or reuse existing) 384 + /// Returns (File, reused: bool) 385 + /// file_path_key is the full path from the site root (e.g., "config/file.json") for blob map lookup 216 386 async fn process_file( 217 387 agent: &Agent<impl jacquard::client::AgentSession + IdentityResolver>, 218 388 file_path: &Path, 219 - ) -> miette::Result<File<'static>> 389 + file_path_key: &str, 390 + existing_blobs: &HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)>, 391 + ) -> miette::Result<(File<'static>, bool)> 220 392 { 221 393 // Read file 222 394 let file_data = std::fs::read(file_path).into_diagnostic()?; ··· 234 406 // Base64 encode the gzipped data 235 407 let base64_bytes = base64::prelude::BASE64_STANDARD.encode(&gzipped).into_bytes(); 236 408 237 - // Upload blob as octet-stream 409 + // Compute CID for this file (CRITICAL: on base64-encoded gzipped content) 410 + let file_cid = cid::compute_cid(&base64_bytes); 411 + 412 + // Check if we have an existing blob with the same CID 413 + let existing_blob = existing_blobs.get(file_path_key); 414 + 415 + if let Some((existing_blob_ref, existing_cid)) = existing_blob { 416 + if existing_cid == &file_cid { 417 + // CIDs match - reuse existing blob 418 + println!(" ✓ Reusing blob for {} (CID: {})", file_path_key, file_cid); 419 + return Ok(( 420 + File::new() 421 + .r#type(CowStr::from("file")) 422 + .blob(existing_blob_ref.clone()) 423 + .encoding(CowStr::from("gzip")) 424 + .mime_type(CowStr::from(original_mime)) 425 + .base64(true) 426 + .build(), 427 + true 428 + )); 429 + } 430 + } 431 + 432 + // File is new or changed - upload it 433 + println!(" ↑ Uploading {} ({} bytes, CID: {})", file_path_key, base64_bytes.len(), file_cid); 238 434 let blob = agent.upload_blob( 239 435 base64_bytes, 240 436 MimeType::new_static("application/octet-stream"), 241 437 ).await?; 242 438 243 - Ok(File::new() 244 - .r#type(CowStr::from("file")) 245 - .blob(blob) 246 - .encoding(CowStr::from("gzip")) 247 - .mime_type(CowStr::from(original_mime)) 248 - .base64(true) 249 - .build()) 439 + Ok(( 440 + File::new() 441 + .r#type(CowStr::from("file")) 442 + .blob(blob) 443 + .encoding(CowStr::from("gzip")) 444 + .mime_type(CowStr::from(original_mime)) 445 + .base64(true) 446 + .build(), 447 + false 448 + )) 250 449 } 251 450 252 - /// Count total files in a directory tree 253 - fn count_files(dir: &Directory) -> usize { 254 - let mut count = 0; 255 - for entry in &dir.entries { 256 - match &entry.node { 257 - EntryNode::File(_) => count += 1, 258 - EntryNode::Directory(subdir) => count += count_files(subdir), 259 - _ => {} // Unknown variants 260 - } 261 - } 262 - count 263 - }
+71
cli/src/download.rs
··· 1 + use base64::Engine; 2 + use bytes::Bytes; 3 + use flate2::read::GzDecoder; 4 + use jacquard_common::types::blob::BlobRef; 5 + use miette::IntoDiagnostic; 6 + use std::io::Read; 7 + use url::Url; 8 + 9 + /// Download a blob from the PDS 10 + pub async fn download_blob(pds_url: &Url, blob_ref: &BlobRef<'_>, did: &str) -> miette::Result<Bytes> { 11 + // Extract CID from blob ref 12 + let cid = blob_ref.blob().r#ref.to_string(); 13 + 14 + // Construct blob download URL 15 + // The correct endpoint is: /xrpc/com.atproto.sync.getBlob?did={did}&cid={cid} 16 + let blob_url = pds_url 17 + .join(&format!("/xrpc/com.atproto.sync.getBlob?did={}&cid={}", did, cid)) 18 + .into_diagnostic()?; 19 + 20 + let client = reqwest::Client::new(); 21 + let response = client 22 + .get(blob_url) 23 + .send() 24 + .await 25 + .into_diagnostic()?; 26 + 27 + if !response.status().is_success() { 28 + return Err(miette::miette!( 29 + "Failed to download blob: {}", 30 + response.status() 31 + )); 32 + } 33 + 34 + let bytes = response.bytes().await.into_diagnostic()?; 35 + Ok(bytes) 36 + } 37 + 38 + /// Decompress and decode a blob (base64 + gzip) 39 + pub fn decompress_blob(data: &[u8], is_base64: bool, is_gzipped: bool) -> miette::Result<Vec<u8>> { 40 + let mut current_data = data.to_vec(); 41 + 42 + // First, decode base64 if needed 43 + if is_base64 { 44 + current_data = base64::prelude::BASE64_STANDARD 45 + .decode(&current_data) 46 + .into_diagnostic()?; 47 + } 48 + 49 + // Then, decompress gzip if needed 50 + if is_gzipped { 51 + let mut decoder = GzDecoder::new(&current_data[..]); 52 + let mut decompressed = Vec::new(); 53 + decoder.read_to_end(&mut decompressed).into_diagnostic()?; 54 + current_data = decompressed; 55 + } 56 + 57 + Ok(current_data) 58 + } 59 + 60 + /// Download and decompress a blob 61 + pub async fn download_and_decompress_blob( 62 + pds_url: &Url, 63 + blob_ref: &BlobRef<'_>, 64 + did: &str, 65 + is_base64: bool, 66 + is_gzipped: bool, 67 + ) -> miette::Result<Vec<u8>> { 68 + let data = download_blob(pds_url, blob_ref, did).await?; 69 + decompress_blob(&data, is_base64, is_gzipped) 70 + } 71 +
+46
cli/src/metadata.rs
··· 1 + use serde::{Deserialize, Serialize}; 2 + use std::collections::HashMap; 3 + use std::path::Path; 4 + use miette::IntoDiagnostic; 5 + 6 + /// Metadata tracking file CIDs for incremental updates 7 + #[derive(Debug, Clone, Serialize, Deserialize)] 8 + pub struct SiteMetadata { 9 + /// Record CID from the PDS 10 + pub record_cid: String, 11 + /// Map of file paths to their blob CIDs 12 + pub file_cids: HashMap<String, String>, 13 + /// Timestamp when the site was last synced 14 + pub last_sync: i64, 15 + } 16 + 17 + impl SiteMetadata { 18 + pub fn new(record_cid: String, file_cids: HashMap<String, String>) -> Self { 19 + Self { 20 + record_cid, 21 + file_cids, 22 + last_sync: chrono::Utc::now().timestamp(), 23 + } 24 + } 25 + 26 + /// Load metadata from a directory 27 + pub fn load(dir: &Path) -> miette::Result<Option<Self>> { 28 + let metadata_path = dir.join(".wisp-metadata.json"); 29 + if !metadata_path.exists() { 30 + return Ok(None); 31 + } 32 + 33 + let contents = std::fs::read_to_string(&metadata_path).into_diagnostic()?; 34 + let metadata: SiteMetadata = serde_json::from_str(&contents).into_diagnostic()?; 35 + Ok(Some(metadata)) 36 + } 37 + 38 + /// Save metadata to a directory 39 + pub fn save(&self, dir: &Path) -> miette::Result<()> { 40 + let metadata_path = dir.join(".wisp-metadata.json"); 41 + let contents = serde_json::to_string_pretty(self).into_diagnostic()?; 42 + std::fs::write(&metadata_path, contents).into_diagnostic()?; 43 + Ok(()) 44 + } 45 + } 46 +
+305
cli/src/pull.rs
··· 1 + use crate::blob_map; 2 + use crate::download; 3 + use crate::metadata::SiteMetadata; 4 + use crate::place_wisp::fs::*; 5 + use jacquard::CowStr; 6 + use jacquard::prelude::IdentityResolver; 7 + use jacquard_common::types::string::Did; 8 + use jacquard_common::xrpc::XrpcExt; 9 + use jacquard_identity::PublicResolver; 10 + use miette::IntoDiagnostic; 11 + use std::collections::HashMap; 12 + use std::path::{Path, PathBuf}; 13 + use url::Url; 14 + 15 + /// Pull a site from the PDS to a local directory 16 + pub async fn pull_site( 17 + input: CowStr<'static>, 18 + rkey: CowStr<'static>, 19 + output_dir: PathBuf, 20 + ) -> miette::Result<()> { 21 + println!("Pulling site {} from {}...", rkey, input); 22 + 23 + // Resolve handle to DID if needed 24 + let resolver = PublicResolver::default(); 25 + let did = if input.starts_with("did:") { 26 + Did::new(&input).into_diagnostic()? 27 + } else { 28 + // It's a handle, resolve it 29 + let handle = jacquard_common::types::string::Handle::new(&input).into_diagnostic()?; 30 + resolver.resolve_handle(&handle).await.into_diagnostic()? 31 + }; 32 + 33 + // Resolve PDS endpoint for the DID 34 + let pds_url = resolver.pds_for_did(&did).await.into_diagnostic()?; 35 + println!("Resolved PDS: {}", pds_url); 36 + 37 + // Fetch the place.wisp.fs record 38 + 39 + println!("Fetching record from PDS..."); 40 + let client = reqwest::Client::new(); 41 + 42 + // Use com.atproto.repo.getRecord 43 + use jacquard::api::com_atproto::repo::get_record::GetRecord; 44 + use jacquard_common::types::string::Rkey as RkeyType; 45 + let rkey_parsed = RkeyType::new(&rkey).into_diagnostic()?; 46 + 47 + use jacquard_common::types::ident::AtIdentifier; 48 + use jacquard_common::types::string::RecordKey; 49 + let request = GetRecord::new() 50 + .repo(AtIdentifier::Did(did.clone())) 51 + .collection(CowStr::from("place.wisp.fs")) 52 + .rkey(RecordKey::from(rkey_parsed)) 53 + .build(); 54 + 55 + let response = client 56 + .xrpc(pds_url.clone()) 57 + .send(&request) 58 + .await 59 + .into_diagnostic()?; 60 + 61 + let record_output = response.into_output().into_diagnostic()?; 62 + let record_cid = record_output.cid.as_ref().map(|c| c.to_string()).unwrap_or_default(); 63 + 64 + // Parse the record value as Fs 65 + use jacquard_common::types::value::from_data; 66 + let fs_record: Fs = from_data(&record_output.value).into_diagnostic()?; 67 + 68 + let file_count = fs_record.file_count.map(|c| c.to_string()).unwrap_or_else(|| "?".to_string()); 69 + println!("Found site '{}' with {} files", fs_record.site, file_count); 70 + 71 + // Load existing metadata for incremental updates 72 + let existing_metadata = SiteMetadata::load(&output_dir)?; 73 + let existing_file_cids = existing_metadata 74 + .as_ref() 75 + .map(|m| m.file_cids.clone()) 76 + .unwrap_or_default(); 77 + 78 + // Extract blob map from the new manifest 79 + let new_blob_map = blob_map::extract_blob_map(&fs_record.root); 80 + let new_file_cids: HashMap<String, String> = new_blob_map 81 + .iter() 82 + .map(|(path, (_blob_ref, cid))| (path.clone(), cid.clone())) 83 + .collect(); 84 + 85 + // Clean up any leftover temp directories from previous failed attempts 86 + let parent = output_dir.parent().unwrap_or_else(|| std::path::Path::new(".")); 87 + let output_name = output_dir.file_name().unwrap_or_else(|| std::ffi::OsStr::new("site")).to_string_lossy(); 88 + let temp_prefix = format!(".tmp-{}-", output_name); 89 + 90 + if let Ok(entries) = parent.read_dir() { 91 + for entry in entries.flatten() { 92 + let name = entry.file_name(); 93 + if name.to_string_lossy().starts_with(&temp_prefix) { 94 + let _ = std::fs::remove_dir_all(entry.path()); 95 + } 96 + } 97 + } 98 + 99 + // Check if we need to update (but only if output directory actually exists with files) 100 + if let Some(metadata) = &existing_metadata { 101 + if metadata.record_cid == record_cid { 102 + // Verify that the output directory actually exists and has content 103 + let has_content = output_dir.exists() && 104 + output_dir.read_dir() 105 + .map(|mut entries| entries.any(|e| { 106 + if let Ok(entry) = e { 107 + !entry.file_name().to_string_lossy().starts_with(".wisp-metadata") 108 + } else { 109 + false 110 + } 111 + })) 112 + .unwrap_or(false); 113 + 114 + if has_content { 115 + println!("Site is already up to date!"); 116 + return Ok(()); 117 + } 118 + } 119 + } 120 + 121 + // Create temporary directory for atomic update 122 + // Place temp dir in parent directory to avoid issues with non-existent output_dir 123 + let parent = output_dir.parent().unwrap_or_else(|| std::path::Path::new(".")); 124 + let temp_dir_name = format!( 125 + ".tmp-{}-{}", 126 + output_dir.file_name().unwrap_or_else(|| std::ffi::OsStr::new("site")).to_string_lossy(), 127 + chrono::Utc::now().timestamp() 128 + ); 129 + let temp_dir = parent.join(temp_dir_name); 130 + std::fs::create_dir_all(&temp_dir).into_diagnostic()?; 131 + 132 + println!("Downloading files..."); 133 + let mut downloaded = 0; 134 + let mut reused = 0; 135 + 136 + // Download files recursively 137 + let download_result = download_directory( 138 + &fs_record.root, 139 + &temp_dir, 140 + &pds_url, 141 + did.as_str(), 142 + &new_blob_map, 143 + &existing_file_cids, 144 + &output_dir, 145 + String::new(), 146 + &mut downloaded, 147 + &mut reused, 148 + ) 149 + .await; 150 + 151 + // If download failed, clean up temp directory 152 + if let Err(e) = download_result { 153 + let _ = std::fs::remove_dir_all(&temp_dir); 154 + return Err(e); 155 + } 156 + 157 + println!( 158 + "Downloaded {} files, reused {} files", 159 + downloaded, reused 160 + ); 161 + 162 + // Save metadata 163 + let metadata = SiteMetadata::new(record_cid, new_file_cids); 164 + metadata.save(&temp_dir)?; 165 + 166 + // Move files from temp to output directory 167 + let output_abs = std::fs::canonicalize(&output_dir).unwrap_or_else(|_| output_dir.clone()); 168 + let current_dir = std::env::current_dir().into_diagnostic()?; 169 + 170 + // Special handling for pulling to current directory 171 + if output_abs == current_dir { 172 + // Move files from temp to current directory 173 + for entry in std::fs::read_dir(&temp_dir).into_diagnostic()? { 174 + let entry = entry.into_diagnostic()?; 175 + let dest = current_dir.join(entry.file_name()); 176 + 177 + // Remove existing file/dir if it exists 178 + if dest.exists() { 179 + if dest.is_dir() { 180 + std::fs::remove_dir_all(&dest).into_diagnostic()?; 181 + } else { 182 + std::fs::remove_file(&dest).into_diagnostic()?; 183 + } 184 + } 185 + 186 + // Move from temp to current dir 187 + std::fs::rename(entry.path(), dest).into_diagnostic()?; 188 + } 189 + 190 + // Clean up temp directory 191 + std::fs::remove_dir_all(&temp_dir).into_diagnostic()?; 192 + } else { 193 + // If output directory exists and has content, remove it first 194 + if output_dir.exists() { 195 + std::fs::remove_dir_all(&output_dir).into_diagnostic()?; 196 + } 197 + 198 + // Ensure parent directory exists 199 + if let Some(parent) = output_dir.parent() { 200 + if !parent.as_os_str().is_empty() && !parent.exists() { 201 + std::fs::create_dir_all(parent).into_diagnostic()?; 202 + } 203 + } 204 + 205 + // Rename temp to final location 206 + match std::fs::rename(&temp_dir, &output_dir) { 207 + Ok(_) => {}, 208 + Err(e) => { 209 + // Clean up temp directory on failure 210 + let _ = std::fs::remove_dir_all(&temp_dir); 211 + return Err(miette::miette!("Failed to move temp directory: {}", e)); 212 + } 213 + } 214 + } 215 + 216 + println!("✓ Site pulled successfully to {}", output_dir.display()); 217 + 218 + Ok(()) 219 + } 220 + 221 + /// Recursively download a directory 222 + fn download_directory<'a>( 223 + dir: &'a Directory<'_>, 224 + output_dir: &'a Path, 225 + pds_url: &'a Url, 226 + did: &'a str, 227 + new_blob_map: &'a HashMap<String, (jacquard_common::types::blob::BlobRef<'static>, String)>, 228 + existing_file_cids: &'a HashMap<String, String>, 229 + existing_output_dir: &'a Path, 230 + path_prefix: String, 231 + downloaded: &'a mut usize, 232 + reused: &'a mut usize, 233 + ) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<()>> + Send + 'a>> { 234 + Box::pin(async move { 235 + for entry in &dir.entries { 236 + let entry_name = entry.name.as_str(); 237 + let current_path = if path_prefix.is_empty() { 238 + entry_name.to_string() 239 + } else { 240 + format!("{}/{}", path_prefix, entry_name) 241 + }; 242 + 243 + match &entry.node { 244 + EntryNode::File(file) => { 245 + let output_path = output_dir.join(entry_name); 246 + 247 + // Check if file CID matches existing 248 + if let Some((_blob_ref, new_cid)) = new_blob_map.get(&current_path) { 249 + if let Some(existing_cid) = existing_file_cids.get(&current_path) { 250 + if existing_cid == new_cid { 251 + // File unchanged, copy from existing directory 252 + let existing_path = existing_output_dir.join(&current_path); 253 + if existing_path.exists() { 254 + std::fs::copy(&existing_path, &output_path).into_diagnostic()?; 255 + *reused += 1; 256 + println!(" ✓ Reused {}", current_path); 257 + continue; 258 + } 259 + } 260 + } 261 + } 262 + 263 + // File is new or changed, download it 264 + println!(" ↓ Downloading {}", current_path); 265 + let data = download::download_and_decompress_blob( 266 + pds_url, 267 + &file.blob, 268 + did, 269 + file.base64.unwrap_or(false), 270 + file.encoding.as_ref().map(|e| e.as_str() == "gzip").unwrap_or(false), 271 + ) 272 + .await?; 273 + 274 + std::fs::write(&output_path, data).into_diagnostic()?; 275 + *downloaded += 1; 276 + } 277 + EntryNode::Directory(subdir) => { 278 + let subdir_path = output_dir.join(entry_name); 279 + std::fs::create_dir_all(&subdir_path).into_diagnostic()?; 280 + 281 + download_directory( 282 + subdir, 283 + &subdir_path, 284 + pds_url, 285 + did, 286 + new_blob_map, 287 + existing_file_cids, 288 + existing_output_dir, 289 + current_path, 290 + downloaded, 291 + reused, 292 + ) 293 + .await?; 294 + } 295 + EntryNode::Unknown(_) => { 296 + // Skip unknown node types 297 + println!(" ⚠ Skipping unknown node type for {}", current_path); 298 + } 299 + } 300 + } 301 + 302 + Ok(()) 303 + }) 304 + } 305 +
+202
cli/src/serve.rs
··· 1 + use crate::pull::pull_site; 2 + use axum::Router; 3 + use jacquard::CowStr; 4 + use jacquard_common::jetstream::{CommitOperation, JetstreamMessage, JetstreamParams}; 5 + use jacquard_common::types::string::Did; 6 + use jacquard_common::xrpc::{SubscriptionClient, TungsteniteSubscriptionClient}; 7 + use miette::IntoDiagnostic; 8 + use n0_future::StreamExt; 9 + use std::path::PathBuf; 10 + use std::sync::Arc; 11 + use tokio::sync::RwLock; 12 + use tower_http::compression::CompressionLayer; 13 + use tower_http::services::ServeDir; 14 + use url::Url; 15 + 16 + /// Shared state for the server 17 + #[derive(Clone)] 18 + struct ServerState { 19 + did: CowStr<'static>, 20 + rkey: CowStr<'static>, 21 + output_dir: PathBuf, 22 + last_cid: Arc<RwLock<Option<String>>>, 23 + } 24 + 25 + /// Serve a site locally with real-time firehose updates 26 + pub async fn serve_site( 27 + input: CowStr<'static>, 28 + rkey: CowStr<'static>, 29 + output_dir: PathBuf, 30 + port: u16, 31 + ) -> miette::Result<()> { 32 + println!("Serving site {} from {} on port {}...", rkey, input, port); 33 + 34 + // Resolve handle to DID if needed 35 + use jacquard_identity::PublicResolver; 36 + use jacquard::prelude::IdentityResolver; 37 + 38 + let resolver = PublicResolver::default(); 39 + let did = if input.starts_with("did:") { 40 + Did::new(&input).into_diagnostic()? 41 + } else { 42 + // It's a handle, resolve it 43 + let handle = jacquard_common::types::string::Handle::new(&input).into_diagnostic()?; 44 + resolver.resolve_handle(&handle).await.into_diagnostic()? 45 + }; 46 + 47 + println!("Resolved to DID: {}", did.as_str()); 48 + 49 + // Create output directory if it doesn't exist 50 + std::fs::create_dir_all(&output_dir).into_diagnostic()?; 51 + 52 + // Initial pull of the site 53 + println!("Performing initial pull..."); 54 + let did_str = CowStr::from(did.as_str().to_string()); 55 + pull_site(did_str.clone(), rkey.clone(), output_dir.clone()).await?; 56 + 57 + // Create shared state 58 + let state = ServerState { 59 + did: did_str.clone(), 60 + rkey: rkey.clone(), 61 + output_dir: output_dir.clone(), 62 + last_cid: Arc::new(RwLock::new(None)), 63 + }; 64 + 65 + // Start firehose listener in background 66 + let firehose_state = state.clone(); 67 + tokio::spawn(async move { 68 + if let Err(e) = watch_firehose(firehose_state).await { 69 + eprintln!("Firehose error: {}", e); 70 + } 71 + }); 72 + 73 + // Create HTTP server with gzip compression 74 + let app = Router::new() 75 + .fallback_service( 76 + ServeDir::new(&output_dir) 77 + .precompressed_gzip() 78 + ) 79 + .layer(CompressionLayer::new()) 80 + .with_state(state); 81 + 82 + let addr = format!("0.0.0.0:{}", port); 83 + let listener = tokio::net::TcpListener::bind(&addr) 84 + .await 85 + .into_diagnostic()?; 86 + 87 + println!("\n✓ Server running at http://localhost:{}", port); 88 + println!(" Watching for updates on the firehose...\n"); 89 + 90 + axum::serve(listener, app).await.into_diagnostic()?; 91 + 92 + Ok(()) 93 + } 94 + 95 + /// Watch the firehose for updates to the specific site 96 + fn watch_firehose(state: ServerState) -> std::pin::Pin<Box<dyn std::future::Future<Output = miette::Result<()>> + Send>> { 97 + Box::pin(async move { 98 + let jetstream_url = Url::parse("wss://jetstream1.us-east.fire.hose.cam") 99 + .into_diagnostic()?; 100 + 101 + println!("[Firehose] Connecting to Jetstream..."); 102 + 103 + // Create subscription client 104 + let client = TungsteniteSubscriptionClient::from_base_uri(jetstream_url); 105 + 106 + // Subscribe with no filters (we'll filter manually) 107 + // Jetstream doesn't support filtering by collection in the params builder 108 + let params = JetstreamParams::new().build(); 109 + 110 + let stream = client.subscribe(&params).await.into_diagnostic()?; 111 + println!("[Firehose] Connected! Watching for updates..."); 112 + 113 + // Convert to typed message stream 114 + let (_sink, mut messages) = stream.into_stream(); 115 + 116 + loop { 117 + match messages.next().await { 118 + Some(Ok(msg)) => { 119 + if let Err(e) = handle_firehose_message(&state, msg).await { 120 + eprintln!("[Firehose] Error handling message: {}", e); 121 + } 122 + } 123 + Some(Err(e)) => { 124 + eprintln!("[Firehose] Stream error: {}", e); 125 + // Try to reconnect after a delay 126 + tokio::time::sleep(tokio::time::Duration::from_secs(5)).await; 127 + return Box::pin(watch_firehose(state)).await; 128 + } 129 + None => { 130 + println!("[Firehose] Stream ended, reconnecting..."); 131 + tokio::time::sleep(tokio::time::Duration::from_secs(5)).await; 132 + return Box::pin(watch_firehose(state)).await; 133 + } 134 + } 135 + } 136 + }) 137 + } 138 + 139 + /// Handle a firehose message 140 + async fn handle_firehose_message( 141 + state: &ServerState, 142 + msg: JetstreamMessage<'_>, 143 + ) -> miette::Result<()> { 144 + match msg { 145 + JetstreamMessage::Commit { 146 + did, 147 + commit, 148 + .. 149 + } => { 150 + // Check if this is our site 151 + if did.as_str() == state.did.as_str() 152 + && commit.collection.as_str() == "place.wisp.fs" 153 + && commit.rkey.as_str() == state.rkey.as_str() 154 + { 155 + match commit.operation { 156 + CommitOperation::Create | CommitOperation::Update => { 157 + let new_cid = commit.cid.as_ref().map(|c| c.to_string()); 158 + 159 + // Check if CID changed 160 + let should_update = { 161 + let last_cid = state.last_cid.read().await; 162 + new_cid != *last_cid 163 + }; 164 + 165 + if should_update { 166 + println!("\n[Update] Detected change to site {} (CID: {:?})", state.rkey, new_cid); 167 + println!("[Update] Pulling latest version..."); 168 + 169 + // Pull the updated site 170 + match pull_site( 171 + state.did.clone(), 172 + state.rkey.clone(), 173 + state.output_dir.clone(), 174 + ) 175 + .await 176 + { 177 + Ok(_) => { 178 + // Update last CID 179 + let mut last_cid = state.last_cid.write().await; 180 + *last_cid = new_cid; 181 + println!("[Update] ✓ Site updated successfully!\n"); 182 + } 183 + Err(e) => { 184 + eprintln!("[Update] Failed to pull site: {}", e); 185 + } 186 + } 187 + } 188 + } 189 + CommitOperation::Delete => { 190 + println!("\n[Update] Site {} was deleted", state.rkey); 191 + } 192 + } 193 + } 194 + } 195 + _ => { 196 + // Ignore identity and account messages 197 + } 198 + } 199 + 200 + Ok(()) 201 + } 202 +
-3
.gitmodules
··· 1 - [submodule "cli/jacquard"] 2 - path = cli/jacquard 3 - url = https://tangled.org/@nonbinary.computer/jacquard
-1
cli/jacquard
··· 1 - Subproject commit d533482a61f540586b1eea620b8e9a01a59d5650
+1 -1
cli/Cargo.lock
··· 4913 4913 4914 4914 [[package]] 4915 4915 name = "wisp-cli" 4916 - version = "0.1.0" 4916 + version = "0.2.0" 4917 4917 dependencies = [ 4918 4918 "axum", 4919 4919 "base64 0.22.1",
+28 -1
crates.nix
··· 19 19 targets.x86_64-pc-windows-gnu.latest.rust-std 20 20 targets.x86_64-unknown-linux-gnu.latest.rust-std 21 21 targets.aarch64-apple-darwin.latest.rust-std 22 + targets.aarch64-unknown-linux-gnu.latest.rust-std 22 23 ]; 23 24 # configure crates 24 25 nci.crates."wisp-cli" = { ··· 26 27 dev.runTests = false; 27 28 release.runTests = false; 28 29 }; 29 - targets."x86_64-unknown-linux-gnu" = { 30 + targets."x86_64-unknown-linux-gnu" = let 31 + targetPkgs = pkgs.pkgsCross.gnu64; 32 + targetCC = targetPkgs.stdenv.cc; 33 + targetCargoEnvVarTarget = targetPkgs.stdenv.hostPlatform.rust.cargoEnvVarTarget; 34 + in rec { 30 35 default = true; 36 + depsDrvConfig.mkDerivation = { 37 + nativeBuildInputs = [targetCC]; 38 + }; 39 + depsDrvConfig.env = rec { 40 + TARGET_CC = "${targetCC.targetPrefix}cc"; 41 + "CARGO_TARGET_${targetCargoEnvVarTarget}_LINKER" = TARGET_CC; 42 + }; 43 + drvConfig = depsDrvConfig; 31 44 }; 32 45 targets."x86_64-pc-windows-gnu" = let 33 46 targetPkgs = pkgs.pkgsCross.mingwW64; ··· 58 71 }; 59 72 drvConfig = depsDrvConfig; 60 73 }; 74 + targets."aarch64-unknown-linux-gnu" = let 75 + targetPkgs = pkgs.pkgsCross.aarch64-multiplatform; 76 + targetCC = targetPkgs.stdenv.cc; 77 + targetCargoEnvVarTarget = targetPkgs.stdenv.hostPlatform.rust.cargoEnvVarTarget; 78 + in rec { 79 + depsDrvConfig.mkDerivation = { 80 + nativeBuildInputs = [targetCC]; 81 + }; 82 + depsDrvConfig.env = rec { 83 + TARGET_CC = "${targetCC.targetPrefix}cc"; 84 + "CARGO_TARGET_${targetCargoEnvVarTarget}_LINKER" = TARGET_CC; 85 + }; 86 + drvConfig = depsDrvConfig; 87 + }; 61 88 }; 62 89 }; 63 90 }
+17 -2
flake.nix
··· 26 26 ... 27 27 }: let 28 28 crateOutputs = config.nci.outputs."wisp-cli"; 29 + mkRenamedPackage = name: pkg: pkgs.runCommand name {} '' 30 + mkdir -p $out/bin 31 + cp ${pkg}/bin/wisp-cli $out/bin/${name} 32 + ''; 29 33 in { 30 34 devShells.default = crateOutputs.devShell; 31 35 packages.default = crateOutputs.packages.release; 32 - packages.wisp-cli-windows = crateOutputs.allTargets."x86_64-pc-windows-gnu".packages.release; 33 - packages.wisp-cli-darwin = crateOutputs.allTargets."aarch64-apple-darwin".packages.release; 36 + packages.wisp-cli-x86_64-linux = mkRenamedPackage "wisp-cli-x86_64-linux" crateOutputs.packages.release; 37 + packages.wisp-cli-aarch64-linux = mkRenamedPackage "wisp-cli-aarch64-linux" crateOutputs.allTargets."aarch64-unknown-linux-gnu".packages.release; 38 + packages.wisp-cli-x86_64-windows = mkRenamedPackage "wisp-cli-x86_64-windows.exe" crateOutputs.allTargets."x86_64-pc-windows-gnu".packages.release; 39 + packages.wisp-cli-aarch64-darwin = mkRenamedPackage "wisp-cli-aarch64-darwin" crateOutputs.allTargets."aarch64-apple-darwin".packages.release; 40 + packages.all = pkgs.symlinkJoin { 41 + name = "wisp-cli-all"; 42 + paths = [ 43 + config.packages.wisp-cli-x86_64-linux 44 + config.packages.wisp-cli-aarch64-linux 45 + config.packages.wisp-cli-x86_64-windows 46 + config.packages.wisp-cli-aarch64-darwin 47 + ]; 48 + }; 34 49 }; 35 50 }; 36 51 }

History

3 rounds 0 comments
sign up or login to add to the discussion
6 commits
expand
9a803381
init support for redirects file
f1f70b3b
Add support for existing blob reuse in deployment process
56b1ef45
dont normalize paths when comparing CIDs
38b1c4c6
add pull and serve to cli
436d7a06
remove jacquard submodule
122e18dd
update flake
expand 0 comments
closed without merging
2 commits
expand
f1f70b3b
Add support for existing blob reuse in deployment process
56b1ef45
dont normalize paths when comparing CIDs
expand 0 comments
2 commits
expand
9a803381
init support for redirects file
f1f70b3b
Add support for existing blob reuse in deployment process
expand 0 comments