* uncategorized notes

** sync
- each client keeps the full data set
- dexie sync and observable let us stream change sets
- we can publish the "latest" to all peers
- on first pull, if not the first client, we can request a dump out of band

*** rss feed data
- do we want to back up feed data?
  - conceptually, this should be refetchable
  - but feeds go away, and some will only show recent stories
  - so yes, we'll need this
  - but server side, we can dedupe
    - content-addressed server-side cache?
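
the dedupe idea above can be sketched in a few lines: key every fetched body by its sha256, and identical bodies collapse into one stored blob. this is a minimal sketch; ~storeContent~ and ~contentStore~ are illustrative names, not a real API.

```typescript
// sketch of a content-addressed store: blobs keyed by sha256 of their bytes
import { createHash } from "node:crypto";

const contentStore = new Map<string, Buffer>();

function hashContent(content: Buffer): string {
  return createHash("sha256").update(content).digest("hex");
}

// identical feed bodies fetched from different urls collapse to one blob
function storeContent(content: Buffer): string {
  const key = hashContent(content);
  if (!contentStore.has(key)) contentStore.set(key, content);
  return key;
}
```

a url→hash history table (as described later) can then point many fetches at the same blob.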

- server side does RSS pulling
  - can feeds be marked private, such that they won't be pulled through the proxy?
    - but then we require everything to be fetchable via cors
    - client-configured proxy settings?

*** peer connection
- on startup, check for current realm-id and key pair
- if not present, ask to login or start new
  - if login, run through the [[* pairing]] process
  - if start new, run through the [[* registration]] process
- use keypair to authenticate to server
  - response includes list of active peers to connect
- clients negotiate sync from there
- an identity is a keypair and a realm

- realm is a uuid
  - realm on the server is the socket connection for peer discovery
    - keeps a list of verified public keys
    - and manages the /current/ ~public-key->peer ids~ mapping
  - realm on the client side is the first piece of info required for sync
    - when connecting to the signalling server, you present a realm and a signed public key
    - server accepts/rejects based on signature and current verified keys

- a new keypair can create a realm

- a new keypair can double sign an invitation
  - invite = ~{ realm:, nonce:, not_before:, not_after:, authorizer: }~, signed with a verified key
  - exchanging an invite = ~{ invite: }~, signed with my key
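
the double-sign above could look like this with ed25519: a verified member signs the invite, and the joiner's new keypair signs an exchange that wraps it. a hedged sketch using node's ~sign~/~verify~; the field names follow the note, the object shapes and variable names are illustrative only.

```typescript
// sketch: double-signing an invite with ed25519 (node:crypto)
import { generateKeyPairSync, sign, verify } from "node:crypto";

// a verified realm member signs the invite
const inviter = generateKeyPairSync("ed25519");
const invite = { realm: "realm-uuid", nonce: "n1", not_before: 0, not_after: 9e12, authorizer: "inviter-fp" };
const inviteBytes = Buffer.from(JSON.stringify(invite));
const inviteSig = sign(null, inviteBytes, inviter.privateKey);

// the new keypair signs the exchange wrapping that invite
const joiner = generateKeyPairSync("ed25519");
const exchange = { invite, inviteSig: inviteSig.toString("base64") };
const exchangeBytes = Buffer.from(JSON.stringify(exchange));
const exchangeSig = sign(null, exchangeBytes, joiner.privateKey);

// server-side checks: both signatures must hold before the new key is added
const inviteOk = verify(null, inviteBytes, inviter.publicKey, inviteSig);
const exchangeOk = verify(null, exchangeBytes, joiner.publicKey, exchangeSig);
```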

- on startup
  - start stand-alone (no syncing required, usually the case on first run)
    - generate a keypair
  - want server backup?
    - sign a "setup" message with the new keypair and send to the server
    - server responds with a new realm that this keypair is already verified for
    - move along
  - exchange invite to sync to other devices
    - generate a keypair
    - sign the exchange message with the invite and send to the server
    - server verifies the invite
      - adds the new public key to the peer list and publishes downstream
    - move along

***** standalone
in this mode, there is no syncing. this is the most likely first-time run option.

- generate a keypair on startup, so we have a stable fingerprint in the future
- done

***** pairing
in this mode, there is syncing to a named realm, but not necessarily server resources consumed.
we don't need an email, since the server is just doing signalling and peer management.

- generate an invite from an existing verified peer
  - ~{ realm:, not_before:, not_after:, inviter: peer.public_key }~
  - sign that invitation with the existing verified peer's key

- standalone -> paired
  - get the invitation somehow (QR code?)
  - sign an invite exchange with the standalone's public key
  - send to server
    - server verifies the invite
    - adds the new public key to the peer list and publishes downstream

***** server backup
in this mode, there is syncing to a named realm by email.

goal of server backup mode is that we can go from email -> fully working client with the latest data, without needing any surviving clients to participate in the sync.

- generate a keypair on startup
- sign a registration message sent to the server
  - send a verification email
    - if the email/realm already exists, this is authorization
    - if not, it's email validation
  - server starts a realm and associates the public key
  - server acts as a peer for the realm, and stores private data

- since dexie is publishing change sets, we should be able to just store deltas
- but we'll need to store _all_ deltas, unless we're materializing on the server side too
  - should we use an IndexedDB shim so we can import/export from the server for a clean start?
  - how much materialization does the server need?

** summarized architecture design (may 28-29) :ai:claude:

key decisions and system design:

*** sync model
- device-specific records for playback state/queues to avoid conflicts
- content-addressed server cache with deduplication
- dual-JWT invitation flow for secure realm joining

*** data structures
- tag-based filtering system instead of rigid hierarchies
- regex patterns for episode title parsing and organization
- service worker caching with background download support

*** core schemas
**** client (dexie)
- Channel/ChannelEntry for RSS feeds and episodes
- PlayRecord/QueueItem scoped by deviceId
- FilterView for virtual feed organization

**** server (drizzle)
- ContentStore for deduplicated content by hash
- Realm/PeerConnection for sync authorization
- HttpCache with health tracking and TTL

*** push sync strategy
- revision-based sync (just send revision ranges in push notifications)
- background fetch API for large downloads where supported
- graceful degradation to reactive caching
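
the revision-range idea can be made concrete with a tiny decision function: the push carries only a range, and the client compares it against its own last-applied revision. the payload shape here is an assumption, not a finalized protocol.

```typescript
// sketch: deciding what to pull from a revision range in a push payload
interface RevisionPush {
  realmId: string;
  fromRevision: number; // lowest revision covered by this notification
  toRevision: number;   // newest revision on the server
}

// returns the next revision the client needs, or null if already caught up;
// fromRevision is ignored because a lagging client resumes from its own
// position regardless of what window the notification happens to cover
function nextRevisionToPull(localRevision: number, push: RevisionPush): number | null {
  if (localRevision >= push.toRevision) return null;
  return localRevision + 1;
}
```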

*** research todos :ai:claude:
**** sync and data management
***** DONE identity and signature management
***** TODO dexie sync capabilities vs rxdb for multi-device sync implementation
***** TODO webrtc p2p sync implementation patterns and reliability
***** TODO conflict resolution strategies for device-specific data in distributed sync
***** TODO content-addressed deduplication algorithms for rss/podcast content
**** client-side storage and caching
***** TODO opfs storage limits and cleanup strategies for client-side caching
***** TODO practical background fetch api limits and edge cases for podcast downloads
**** automation and intelligence
***** TODO llm-based regex generation for episode title parsing automation
***** TODO push notification subscription management and realm authentication
**** platform and browser capabilities
***** TODO browser audio api capabilities for podcast-specific features (speed, silence skip)
***** TODO progressive web app installation and platform-specific behaviors

** <2025-05-28 Wed>
getting everything set up

the biggest open question I have is what sort of privacy/encryption guarantee I need. I want the server to be able to do things like cache and store feed data long-term.

Is "if you want full privacy, self-host" valid?

*** possibilities

- fully PWA
  - CON: cors, which would require a proxy anyway
  - CON: audio analysis, llm-based stuff for categorization, etc. won't work
  - PRO: private as all get out
    - can still do WebRTC p2p sync for resiliency
    - can still do server backups if the sync stream is encrypted, but no compaction would be available
    - could do _explicit_ server backups as dump files

- self-hostable
  - PRO: can do bunches of private stuff on the server, because if you don't want me to see it, do it elsewhere
  - CON: hard for folks to use

*** sync conflict resolution design discussion :ai:claude:

discussed the sync architecture and dexie conflict handling:

*dexie syncable limitations*:
- logical clocks handle causally-related changes well
- basic timestamp-based conflict resolution for concurrent updates
- last-writer-wins for same-field conflicts
- no sophisticated CRDT or vector clock support

*solutions for podcast-specific conflicts*:

- play records: device-specific approach
  - store separate ~play_records~ per ~device_id~
  - each record: ~{ episode_id, device_id, position, completed, timestamp }~
  - UI handles conflict resolution with "continue from X device?" prompts
  - avoids arbitrary timestamp wins, gives users control
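
the "continue from X device?" prompt reduces to picking the most recently updated, not-yet-completed record from another device. a minimal sketch; ~resumeCandidate~ is a hypothetical helper, and the record shape follows the note above.

```typescript
// sketch: choosing which device's position to offer on resume
interface PlayRecord {
  episode_id: string;
  device_id: string;
  position: number;
  completed: boolean;
  timestamp: number;
}

// most recently updated, not-completed record from another device, if any
function resumeCandidate(records: PlayRecord[], thisDevice: string): PlayRecord | null {
  const foreign = records
    .filter((r) => r.device_id !== thisDevice && !r.completed)
    .sort((a, b) => b.timestamp - a.timestamp);
  return foreign[0] ?? null;
}
```

the UI then prompts with the candidate's device name and position rather than silently letting a timestamp win.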

- subscription trees
  - store ~parent_path~ as a single string field ("/Tech/Programming")
  - simpler than managing folder membership tables
  - conflicts still possible but contained to a single field
  - could store move operations as events for richer resolution

*other sync considerations*:
- settings/preferences: distinguish device-local vs global
- bulk operations: "mark all played" can create duplicate operations
- metadata updates: server RSS updates vs local renames
- temporal ordering: recently played lists, queue reordering
- storage limits: cleanup operations conflicting across devices
- feed state: refresh timestamps, error states

*approach*: prefer an "events not state" pattern and device-specific records where semantic conflicts are likely

*** data model brainstorm :ai:claude:

core entities designed with sync in mind:

**** ~Feed~ :: RSS/podcast subscription
- ~parent_path~ field for folder structure (e.g. ~/Tech/Programming~)
- ~is_private~ flag to skip server proxy
- ~refresh_interval~ for custom update frequencies

**** ~Episode~ :: individual podcast episodes
- standard RSS metadata (guid, title, description, media url)
- duration and file info for playback

**** ~PlayRecord~ :: device-specific playback state
- separate record per ~device_id~ to avoid timestamp conflicts
- position, completed status, playback speed
- UI can prompt "continue from X device?" for resolution

**** ~QueueItem~ :: device-specific episode queue
- ordered list with position field
- ~device_id~ scoped to avoid queue conflicts

**** ~Subscription~ :: feed membership settings
- can be global or device-specific
- auto-download preferences per device

**** ~Settings~ :: split global vs device-local
- theme, default speed = global
- download path, audio device = device-local

**** Event tables for complex operations:
- ~FeedMoveEvent~ for folder reorganization
- ~BulkMarkPlayedEvent~ for "mark all read" operations
- better conflict resolution than direct state updates

**** sync considerations
- device identity established on first run
- dexie syncable handles basic timestamp conflicts
- prefer device-scoped records for semantic conflicts
- event-driven pattern for bulk operations

*** schema evolution from previous iteration :ai:claude:

reviewed existing schema from tmp/feed.ts - a well-designed foundation:

**** keep from original
- Channel/ChannelEntry naming and structure
- ~refreshHP~ adaptive refresh system (much better than simple intervals)
- rich podcast metadata (people, tags, enclosure, podcast object)
- HTTP caching with etag/status tracking
- epoch millisecond timestamps
- ~hashId()~ approach for entry IDs

**** add for multi-device sync
- ~PlayState~ table (device-scoped position/completion)
- Subscription table (with ~parentPath~ for folders, device-scoped settings)
- ~QueueItem~ table (device-scoped episode queues)
- Device table (identity management)

**** migration considerations
- existing Channel/ChannelEntry can be preserved
- new tables are additive
- ~fetchAndUpsert~ method works well with server proxy architecture
- dexie sync vs rxdb - need to evaluate change tracking capabilities

*** content-addressed caching for offline resilience :ai:claude:

designed a caching system for when upstream feeds fail/disappear, building on the existing cache-schema.ts:

**** server-side schema evolution (drizzle sqlite):
- keep existing ~httpCacheTable~ design (health tracking, http headers, ttl)
- add ~contentHash~ field pointing to deduplicated content
- new ~contentStoreTable~: deduplicated blobs by sha256 hash
- new ~contentHistoryTable~: url -> contentHash timeline with isLatest flag
- reference counting for garbage collection

**** client-side OPFS storage
- ~/cache/content/{contentHash}.xml~ for raw feeds
- ~/cache/media/{contentHash}.mp3~ for podcast episodes
- ~LocalCacheEntry~ metadata tracks expiration and offline-only flags
- maintains last N versions per feed for historical access
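
"keep the last N versions" is a simple pure function over the version metadata: sort by fetch time, keep the newest N, and the rest are candidates for OPFS deletion. a sketch; the ~CachedVersion~ shape is illustrative.

```typescript
// sketch: retain only the newest N cached versions of one feed's content
interface CachedVersion {
  contentHash: string;
  cachedAt: number; // epoch millis
}

function pruneVersions(versions: CachedVersion[], keep: number): CachedVersion[] {
  return [...versions]
    .sort((a, b) => b.cachedAt - a.cachedAt) // newest first
    .slice(0, keep);
}
```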

**** fetch strategy & fallback
1. check local OPFS cache first (fastest)
2. try server proxy ~/api/feed?url={feedUrl}~ (deduplicated)
3. server checks ~contentHistory~, serves latest or fetches upstream
4. server returns ~{contentHash, content, cached: boolean}~
5. client stores with content hash as filename
6. emergency mode: serve stale content when upstream fails

- preserves existing health tracking and HTTP caching logic
- popular feeds cached once on server, many clients benefit
- bandwidth savings via content hash comparison
- historical feed state preservation (feeds disappear!)
- true offline operation after initial sync
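
the fallback order above is just a chain of lookups; sketching it with the i/o stubbed out as plain functions keeps the ordering visible. ~FeedSource~ and ~fetchFeed~ are illustrative names under the assumption that each layer returns the body or null.

```typescript
// sketch: local OPFS first, then server proxy, then stale emergency copy
interface FeedSource {
  localCache: (url: string) => string | null;  // step 1: OPFS hit
  serverProxy: (url: string) => string | null; // steps 2-4: /api/feed
  staleCache: (url: string) => string | null;  // step 6: emergency mode
}

function fetchFeed(url: string, src: FeedSource): string | null {
  return src.localCache(url) ?? src.serverProxy(url) ?? src.staleCache(url);
}
```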
297+298+** <2025-05-29 Thu> :ai:claude:
299+e2e encryption and invitation flow design
300+301+worked through the crypto and invitation architecture. key decisions:
302+303+*** keypair strategy
304+- use jwk format for interoperability (server stores public keys)
305+- ed25519 for signing, separate x25519 for encryption if needed
306+- zustand lazy initialization pattern: ~ensureKeypair()~ on first use
307+- store private jwk in persisted zustand state
308+309+*** invitation flow: dual-jwt approach
310+solved the chicken-and-egg problem of sharing encryption keys securely.
311+312+**** qr code contains two signed jwts:
313+1. invitation token: ~{iss: inviter_fingerprint, sub: invitation_id, purpose: "realm_invite"}~
314+2. encryption key token: ~{iss: inviter_fingerprint, ephemeral_private: base64_key, purpose: "ephemeral_key"}~
315+316+**** exchange process:
317+1. invitee posts jwt1 + their public keys to ~/invitations~
318+2. server verifies jwt1 signature against realm members
319+3. if valid: adds invitee to realm, returns ~{realm_id, realm_members, encrypted_realm_key}~
320+4. invitee verifies jwt2 signature against returned realm members
321+5. invitee extracts ephemeral private key, decrypts realm encryption key
322+323+**** security properties:
324+- server never has decryption capability (missing ephemeral private key)
325+- both jwts must be signed by verified realm member
326+- if first exchange fails, second jwt is cryptographically worthless
327+- atomic operation: identity added only if invitation valid
328+- built-in expiration and tamper detection via jwt standard
329+330+**** considered alternatives:
331+- raw ephemeral keys in qr: simpler but no authenticity
332+- ecdh key agreement: chicken-and-egg problem with public key exchange
333+- server escrow: good but missing authentication layer
334+- password-based: requires secure out-of-band sharing
335+336+the dual-jwt approach provides proper authenticated invitations while maintaining e2e encryption properties.

**** refined dual-jwt with ephemeral signing
simplified the approach by using the ephemeral key for the second jwt signature:

**setup**:
1. inviter generates an ephemeral keypair
2. encrypts the realm key with the ephemeral private key
3. posts to server: ~{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}~

**qr code contains**:
#+BEGIN_SRC json
// JWT 1: signed with inviter's realm signing key
{
  "realm_id": "uuid",
  "invitation_id": "uuid",
  "iss": "inviter_fingerprint"
}

// JWT 2: signed with ephemeral private key
{
  "ephemeral_private": "base64_key",
  "invitation_id": "uuid"
}
#+END_SRC

**exchange flow**:
1. submit jwt1 → server verifies against realm members → returns ~{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}~
2. verify jwt2 signature using ~ephemeral_public~ from the server response
3. extract ~ephemeral_private~ from jwt2, decrypt the realm key

**benefits over previous version**:
- no premature key disclosure (invitee keys shared via normal webrtc peering)
- self-contained verification (ephemeral public key verifies jwt2)
- cleaner separation of realm auth vs encryption key distribution
- simpler flow (no need to return the realm member list)

**crypto verification principle**: digital signatures work as sign-with-private/verify-with-public, while encryption works as encrypt-with-public/decrypt-with-private. jwt2 verification uses signature verification, not decryption.

**invitation flow diagram**:
#+BEGIN_SRC mermaid
sequenceDiagram
    participant I as Inviter
    participant S as Server
    participant E as Invitee

    Note over I: Generate ephemeral keypair
    I->>I: ephemeral_private, ephemeral_public

    Note over I: Encrypt realm key
    I->>I: encrypted_realm_key = encrypt(realm_key, ephemeral_private)

    I->>S: POST /invitations<br/>{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}
    S-->>I: OK

    Note over I: Create JWTs for QR code
    I->>I: jwt1 = sign({realm_id, invitation_id}, inviter_private)
    I->>I: jwt2 = sign({ephemeral_private, invitation_id}, ephemeral_private)

    Note over I,E: QR code contains [jwt1, jwt2]

    E->>S: POST /invitations/exchange<br/>{jwt1}
    Note over S: Verify jwt1 signature<br/>against realm members
    S-->>E: {invitation_id, realm_id, ephemeral_public, encrypted_realm_key}

    Note over E: Verify jwt2 signature<br/>using ephemeral_public
    E->>E: verify_signature(jwt2, ephemeral_public)

    Note over E: Extract key and decrypt
    E->>E: ephemeral_private = decode(jwt2)
    E->>E: realm_key = decrypt(encrypted_realm_key, ephemeral_private)

    Note over E: Now member of realm!
#+END_SRC

**** jwk keypair generation and validation :ai:claude:

discussed jwk vs raw crypto.subtle for keypair storage. since public keys need server storage for realm authorization, jwk is better for interoperability.

**keypair generation**:
#+BEGIN_SRC typescript
const keypair = (await crypto.subtle.generateKey(
  { name: "Ed25519" },
  true,
  ["sign", "verify"]
)) as CryptoKeyPair;

const publicJWK = await crypto.subtle.exportKey("jwk", keypair.publicKey);
const privateJWK = await crypto.subtle.exportKey("jwk", keypair.privateKey);

// JWK format:
// {
//   "kty": "OKP",
//   "crv": "Ed25519",
//   "x": "base64url-encoded-public-key",
//   "d": "base64url-encoded-private-key" // only in private JWK
// }
#+END_SRC

**client validation**:
#+BEGIN_SRC typescript
function isValidEd25519PublicJWK(jwk: any): boolean {
  return (
    typeof jwk === 'object' &&
    jwk !== null &&
    jwk.kty === 'OKP' &&
    jwk.crv === 'Ed25519' &&
    typeof jwk.x === 'string' &&
    jwk.x.length === 43 && // base64url Ed25519 public key length
    !jwk.d && // public key shouldn't have private component
    (!jwk.use || jwk.use === 'sig') // parens matter: without them, use === 'sig' would bypass all other checks
  );
}

async function validatePublicKey(publicJWK: JsonWebKey): Promise<CryptoKey | null> {
  try {
    if (!isValidEd25519PublicJWK(publicJWK)) return null;

    const key = await crypto.subtle.importKey(
      'jwk',
      publicJWK,
      { name: 'Ed25519' },
      false,
      ['verify']
    );

    return key;
  } catch {
    return null;
  }
}
#+END_SRC

**server validation (node.js)**:
#+BEGIN_SRC typescript
import { webcrypto } from 'node:crypto';

async function validateClientPublicKey(publicJWK: JsonWebKey): Promise<boolean> {
  try {
    if (!isValidEd25519PublicJWK(publicJWK)) return false;

    await webcrypto.subtle.importKey(
      'jwk',
      publicJWK,
      { name: 'Ed25519' },
      false,
      ['verify']
    );

    return true;
  } catch {
    return false;
  }
}
#+END_SRC

**authentication flow**:
#+BEGIN_SRC typescript
// client signs message
const authMessage = {
  realm: 'uuid-here',
  timestamp: Date.now(),
  action: 'join'
};

const signature = await crypto.subtle.sign(
  'Ed25519',
  privateKey,
  new TextEncoder().encode(JSON.stringify(authMessage))
);

// server verifies
async function verifyAuth(req: AuthRequest): Promise<boolean> {
  const publicKey = await webcrypto.subtle.importKey(
    'jwk',
    req.publicKey,
    { name: 'Ed25519' },
    false,
    ['verify']
  );

  const messageBytes = new TextEncoder().encode(JSON.stringify(req.message));
  const signatureBytes = new Uint8Array(req.signature);

  return await webcrypto.subtle.verify(
    'Ed25519',
    publicKey,
    signatureBytes,
    messageBytes
  );
}
#+END_SRC

**** proposed schemas :ai:claude:

***** client-side schema (dexie)

#+BEGIN_SRC typescript
// Core RSS/Podcast data (from your existing design)
interface Channel {
  id: string;
  feedUrl: string;
  htmlUrl?: string;
  imageUrl?: string;
  title?: string;
  description?: string;
  language?: string;
  people?: Record<string, string>;
  tags?: string[];

  // Refresh management
  refreshHP: number;
  nextRefreshAt?: number;
  lastRefreshAt?: number;
  lastRefreshStatus?: string;
  lastRefreshHttpStatus?: number;
  lastRefreshHttpEtag?: string;

  // Cache info
  contentHash?: string;
  lastFetchedAt?: number;
}

interface ChannelEntry {
  id: string;
  channelId: string;
  guid: string;
  title: string;
  linkUrl?: string;
  imageUrl?: string;
  snippet?: string;
  content?: string;

  enclosure?: {
    url: string;
    type?: string;
    length?: number;
  };

  podcast?: {
    explicit?: boolean;
    duration?: string;
    seasonNum?: number;
    episodeNum?: number;
    transcriptUrl?: string;
  };

  publishedAt?: number;
  fetchedAt?: number;
}

// Device-specific sync tables
interface PlayRecord {
  id: string;
  entryId: string;
  deviceId: string;
  position: number;
  duration?: number;
  completed: boolean;
  speed: number;
  updatedAt: number;
}

interface Subscription {
  id: string;
  channelId: string;
  deviceId?: string;
  parentPath: string; // "/Tech/Programming"
  autoDownload: boolean;
  downloadLimit?: number;
  isActive: boolean;
  createdAt: number;
  updatedAt: number;
}

interface QueueItem {
  id: string;
  entryId: string;
  deviceId: string;
  position: number;
  addedAt: number;
}

interface Device {
  id: string;
  name: string;
  platform: string;
  lastSeen: number;
}

// Local cache metadata
interface LocalCache {
  id: string;
  url: string;
  contentHash: string;
  filePath: string; // OPFS path
  cachedAt: number;
  expiresAt?: number;
  size: number;
  isOfflineOnly: boolean;
}

// Dexie schema
const db = new Dexie('SkypodDB');
db.version(1).stores({
  channels: '&id, feedUrl, contentHash',
  channelEntries: '&id, channelId, publishedAt',
  playRecords: '&id, [entryId+deviceId], deviceId, updatedAt',
  subscriptions: '&id, channelId, deviceId, parentPath',
  queueItems: '&id, entryId, deviceId, position',
  devices: '&id, lastSeen',
  localCache: '&id, url, contentHash, expiresAt'
});
#+END_SRC

***** server-side schema

#+BEGIN_SRC typescript
// Content-addressed cache
interface ContentStore {
  contentHash: string; // Primary key
  content: Buffer; // Raw feed content
  contentType: string;
  contentLength: number;
  firstSeenAt: number;
  referenceCount: number;
}

interface ContentHistory {
  id: string;
  url: string;
  contentHash: string;
  fetchedAt: number;
  isLatest: boolean;
}

// HTTP cache with health tracking (from your existing design)
interface HttpCache {
  key: string; // URL hash, primary key
  url: string;

  status: 'alive' | 'dead';
  lastFetchedAt: number;
  lastFetchError?: string;
  lastFetchErrorStreak: number;

  lastHttpStatus: number;
  lastHttpEtag?: string;
  lastHttpHeaders: Record<string, string>;
  expiresAt: number;
  expirationTtl: number;

  contentHash: string; // Points to ContentStore
}

// Sync/auth tables
interface Realm {
  id: string; // UUID
  createdAt: number;
  verifiedKeys: string[]; // Public key list
}

interface PeerConnection {
  id: string;
  realmId: string;
  publicKey: string;
  lastSeen: number;
  isOnline: boolean;
}

// Media cache for podcast episodes
interface MediaCache {
  contentHash: string; // Primary key
  originalUrl: string;
  mimeType: string;
  fileSize: number;
  content: Buffer;
  cachedAt: number;
  accessCount: number;
}
#+END_SRC

**** episode title parsing for sub-feed groupings :ai:claude:

*problem*: some podcast feeds contain multiple shows; we need hierarchical organization within a feed

*example*: "Apocalypse Players" podcast
- episode title: "A Term of Art 6 - Winston's Hollow"
- desired grouping: "Apocalypse Players > A Term of Art > 6 - Winston's Hollow"
- UI shows sub-shows within the main feed

***** approaches considered

1. *manual regex patterns* (short-term solution)
   - user provides regex with capture groups = tags
   - reliable, immediate, user-controlled
   - requires manual setup per feed

2. *LLM-generated regex* (automation goal)
   - analyze last 100 episode titles
   - generate regex pattern automatically
   - good balance of automation + reliability

3. *NER model training* (experimental)
   - train spacy model for episode title parsing
   - current prototype: 150 labelled examples, limited success
   - needs more training data to be viable

***** data model implications

- add regex pattern field to Channel/Feed
- store extracted groupings as hierarchical tags on ~ChannelEntry~
- maybe add grouping/series field to episodes

***** plan

*preference*: start with manual regex, evolve toward LLM automation

*implementation design*:
- if no title pattern: episodes are direct children of the feed
- title pattern = regex with named capture groups + path template

*example configuration*:
- regex: ~^(?<series>[^0-9]+)\s*(?<episode>\d+)\s*-\s*(?<title>.+)$~
- path template: ~{series} > Episode {episode} - {title}~
- result: "A Term of Art 6 - Winston's Hollow" → "A Term of Art > Episode 6 - Winston's Hollow"

*schema additions*:
#+BEGIN_SRC typescript
interface Channel {
  // ... existing fields
  titlePatterns?: Array<{
    name: string; // "Main Episodes", "Bonus Content", etc.
    regex: string; // named capture groups
    pathTemplate: string; // interpolation template
    priority: number; // order to try patterns (lower = first)
    isActive: boolean; // can disable without deleting
  }>;
  fallbackPath?: string; // template for unmatched episodes
}

interface ChannelEntry {
  // ... existing fields
  parsedPath?: string; // computed from titlePattern
  parsedGroups?: Record<string, string>; // captured groups
  matchedPatternName?: string; // which pattern was used
}
#+END_SRC

*pattern matching logic*:
1. try patterns in priority order (lower number = higher priority)
2. first matching pattern wins
3. if no patterns match, use the fallbackPath template (e.g., "Misc > {title}")
4. if no fallbackPath, the episode stays a direct child of the feed

*example multi-pattern setup*:
- Pattern 1: "Main Episodes" - ~^(?<series>[^0-9]+)\s*(?<episode>\d+)~ → ~{series} > Episode {episode}~
- Pattern 2: "Bonus Content" - ~^Bonus:\s*(?<title>.+)~ → ~Bonus > {title}~
- Fallback: ~Misc > {title}~
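
the matching logic above fits in one function: try active patterns by priority, interpolate the path template from the named capture groups, and fall back if nothing matches. a sketch; ~parseTitle~ is a hypothetical helper, and group values are trimmed on interpolation (the series capture keeps a trailing space otherwise).

```typescript
// sketch: priority-ordered title patterns with template interpolation
interface TitlePattern {
  name: string;
  regex: string;        // named capture groups
  pathTemplate: string; // "{series} > Episode {episode} - {title}"
  priority: number;     // lower = tried first
  isActive: boolean;
}

function parseTitle(title: string, patterns: TitlePattern[], fallbackPath?: string) {
  const ordered = patterns.filter((p) => p.isActive).sort((a, b) => a.priority - b.priority);
  for (const p of ordered) {
    const m = new RegExp(p.regex).exec(title);
    if (m?.groups) {
      // replace {name} placeholders with trimmed capture values
      const path = p.pathTemplate.replace(/\{(\w+)\}/g, (_, k) => (m.groups![k] ?? "").trim());
      return { path, matchedPatternName: p.name, groups: m.groups };
    }
  }
  if (fallbackPath) return { path: fallbackPath.replace("{title}", title), matchedPatternName: null, groups: {} };
  return null; // stays a direct child of the feed
}
```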

**** scoped tags and filter-based UI evolution :ai:claude:

*generalization*: move from rigid hierarchies to a tag-based filtering system

*tag scoping*:
- feed-level tags: "Tech", "Gaming", "D&D"
- episode-level tags: from regex captures like "series:CriticalRole", "campaign:2", "type:main"
- user tags: manual additions like "favorites", "todo"

*UI as tag filtering*:
- default view: all episodes grouped by feed
- filter by ~series:CriticalRole~ → shows only CR episodes across all feeds
- filter by ~type:bonus~ → shows bonus content from all podcasts
- combine filters: ~series:CriticalRole AND type:main~ → main CR episodes only
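
combined filters are AND semantics over an episode's tags, which evaluates in a few lines. a sketch with ~matchesFilters~ as an illustrative name; the operator set follows the FilterView shape used elsewhere in these notes.

```typescript
// sketch: AND-combining tag filters over one episode's tags
interface Tag { key: string; value: string; }
interface Filter { key: string; value: string; operator: "equals" | "contains" | "not"; }

function matchesFilters(tags: Tag[], filters: Filter[]): boolean {
  return filters.every((f) => {
    const hit = tags.some(
      (t) =>
        t.key === f.key &&
        (f.operator === "contains" ? t.value.includes(f.value) : t.value === f.value)
    );
    // "not" inverts an equality hit; the other operators require a hit
    return f.operator === "not" ? !hit : hit;
  });
}
```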

*benefits*:
- no rigid hierarchy - users create their own views
- regex patterns become automated episode taggers
- same filtering system works for search, organization, queues
- tags are syncable metadata, views are client-side

*schema evolution*:
#+BEGIN_SRC typescript
interface Tag {
  scope: 'feed' | 'episode' | 'user';
  key: string; // "series", "type", "campaign"
  value: string; // "CriticalRole", "bonus", "2"
}

interface ChannelEntry {
  // ... existing
  tags: Tag[]; // includes regex-generated + manual
}

interface FilterView {
  id: string;
  name: string;
  folderPath: string; // "/Channels/Critical Role"
  filters: Array<{
    key: string;
    value: string;
    operator: 'equals' | 'contains' | 'not';
  }>;
  isDefault: boolean;
  createdAt: number;
}
#+END_SRC

**** default UI construction and feed merging :ai:claude:

*auto-generated views on subscribe*:
- subscribe to "Critical Role" → creates ~/Channels/Critical Role~ folder
- default filter view: ~feed:CriticalRole~ (shows all episodes from that feed)
- user can customize, split into sub-views, or delete

*smart view suggestions*:
- after regex patterns generate tags, suggest splitting views
- "I noticed episodes with ~series:Campaign2~ and ~series:Campaign3~ - create separate views?"
- "Create view for ~type:bonus~ episodes?"

*view management UX*:
- right-click feed → "Split by series", "Split by type"
- drag episodes between views to create manual filters
- views can be nested: ~/Channels/Critical Role/Campaign 2/Main Episodes~

*feed merging for multi-source shows*:
problem: patreon feed + main show feed for the same podcast

#+BEGIN_EXAMPLE
/Channels/
  Critical Role/
    All Episodes    # merged view: feed:CriticalRole OR feed:CriticalRolePatreon
    Main Feed       # filter: feed:CriticalRole
    Patreon Feed    # filter: feed:CriticalRolePatreon
#+END_EXAMPLE

*deduplication strategy*:
- episodes matched by ~guid~ or similar content hash
- duplicate episodes get ~source:main,patreon~ tags
- UI shows single episode with source indicators
- user can choose preferred source for playback
- play state syncs across all sources of the same episode
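
the guid-based dedup above amounts to grouping entries by guid, keeping one canonical entry, and recording every source feed. a minimal sketch; ~dedupeByGuid~ and the slim ~Entry~ shape are illustrative.

```typescript
// sketch: merge entries from multiple feeds of the same show by guid
interface Entry { id: string; feedId: string; guid: string; }
interface Merged { canonical: Entry; sources: string[]; }

function dedupeByGuid(entries: Entry[]): Merged[] {
  const byGuid = new Map<string, Merged>();
  for (const e of entries) {
    const hit = byGuid.get(e.guid);
    if (hit) hit.sources.push(e.feedId); // duplicate: just record the extra source
    else byGuid.set(e.guid, { canonical: e, sources: [e.feedId] });
  }
  return [...byGuid.values()];
}
```

play state can then attach to the canonical entry's guid, so it syncs across all sources automatically.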

*feed relationship schema*:
#+BEGIN_SRC typescript
interface FeedGroup {
  id: string;
  name: string; // "Critical Role"
  feedIds: string[]; // [mainFeedId, patreonFeedId]
  mergeStrategy: 'guid' | 'title' | 'contentHash';
  defaultView: FilterView;
}

interface ChannelEntry {
  // ... existing
  duplicateOf?: string; // points to canonical episode ID
  sources: string[]; // feed IDs where this episode appears
}
#+END_SRC
894+895+**per-view settings and state**:
896+each filter view acts like a virtual feed with its own:
897+- unread counts (episodes matching filter that haven't been played)
898+- notification settings (notify for new episodes in this view)
899+- muted state (hide notifications, mark as read automatically)
900+- auto-download preferences (download episodes that match this filter)
901+- play queue integration (add new episodes to queue)
902+903+**use cases**:
904+- mute "Bonus Content" view but keep notifications for main episodes
905+- auto-download only "Campaign 2" episodes, skip everything else
906+- separate unread counts: "5 unread in Main Episodes, 2 in Bonus"
907+- queue only certain series automatically
908+909+**schema additions**:
910+#+BEGIN_SRC typescript
911+interface FilterView {
912+ // ... existing fields
913+ settings: {
914+ notificationsEnabled: boolean;
915+ isMuted: boolean;
916+ autoDownload: boolean;
917+ autoQueue: boolean;
918+ downloadLimit?: number; // max episodes to keep
919+ };
920+ state: {
921+ unreadCount: number;
922+ lastViewedAt?: number;
923+ isCollapsed: boolean; // in sidebar
924+ };
925+}
926+#+END_SRC
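A rough sketch of how the ~settings~ and ~state~ fields might drive behavior; the flat ~feedId~ matcher here stands in for the real filter logic, and both function names are hypothetical:

```typescript
// Per-view unread counting and notification gating, simplified.
interface ViewSettings {
  notificationsEnabled: boolean;
  isMuted: boolean;
}

interface EpisodeLite {
  feedId: string;
  played: boolean;
}

function unreadCount(feedId: string, episodes: EpisodeLite[]): number {
  return episodes.filter((e) => e.feedId === feedId && !e.played).length;
}

function shouldNotify(settings: ViewSettings): boolean {
  // muted views stay silent even when notifications are enabled
  return settings.notificationsEnabled && !settings.isMuted;
}
```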
927+928+*inheritance behavior*:
929+- new filter views inherit settings from parent feed/group
930+- user can override per-view
931+- "mute all Critical Role" vs "mute only bonus episodes"
932+933+**** client-side episode caching strategy :ai:claude:
934+935+*architecture*: service worker-based transparent caching
936+937+*flow*:
938+1. audio player requests ~/audio?url={episodeUrl}~
939+2. service worker intercepts request
940+3. if present in cache (with Range header support):
941+ - serve from cache
942+4. else:
943+ - let request continue to server (immediate playback)
944+ - simultaneously start background fetch of full audio file
945+ - when complete, broadcast "episode-cached" event
946+ - audio player catches event and reloads its source → now uses cached version
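Serving cached audio with Range support (step 3) needs a range parser. A minimal single-range sketch, assuming multipart ranges are deliberately unsupported:

```typescript
// Parse a "bytes=start-end" Range header against a known file size.
// Returns null for anything unsatisfiable.
function parseRange(
  header: string,
  size: number
): { start: number; end: number } | null {
  const m = /^bytes=(\d*)-(\d*)$/.exec(header);
  if (!m || (m[1] === "" && m[2] === "")) return null;
  if (m[1] === "") {
    // suffix form "bytes=-N": the last N bytes
    const n = Number(m[2]);
    if (n === 0) return null;
    return { start: Math.max(0, size - n), end: size - 1 };
  }
  const start = Number(m[1]);
  const end = m[2] === "" ? size - 1 : Math.min(Number(m[2]), size - 1);
  return start <= end && start < size ? { start, end } : null;
}
```

The service worker would slice the cached blob with the returned offsets and answer with a 206 and matching ~Content-Range~ header.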
947+948+**benefits**:
949+- no playback interruption (streaming starts immediately)
950+- seamless transition to cached version
951+- Range header support for seeking/scrubbing
952+- transparent to audio player implementation
953+954+*implementation considerations*:
955+- cache storage limits and cleanup policies
956+- partial download resumption if interrupted
957+- cache invalidation when episode URLs change
958+- offline playback support
959+- progress tracking for background downloads
960+961+**schema additions**:
962+#+BEGIN_SRC typescript
963+interface CachedEpisode {
964+ episodeId: string;
965+ originalUrl: string;
966+ cacheKey: string; // for cache API
967+ fileSize: number;
968+ cachedAt: number;
969+ lastAccessedAt: number;
970+ downloadProgress?: number; // 0-100 for in-progress downloads
971+}
972+#+END_SRC
973+974+**service worker events**:
975+- ~episode-cache-started~ - background download began
976+- ~episode-cache-progress~ - download progress update
977+- ~episode-cache-complete~ - ready to switch to cached version
978+- ~episode-cache-error~ - download failed, stay with streaming
979+980+**background sync for proactive downloads**:
981+982+**browser support reality**:
983+- Background Sync API: good support (Chrome/Edge, limited Safari)
984+- Periodic Background Sync: very limited (Chrome only, requires PWA install)
985+- Push notifications: good support, but requires user permission
986+987+**hybrid approach**:
988+1. **foreground sync** (reliable): when app is open, check for new episodes
989+2. **background sync** (opportunistic): register sync event when app closes
990+3. **push notifications** (fallback): server pushes "new episodes available"
991+4. **manual sync** (always works): pull-to-refresh, settings toggle
992+993+**implementation strategy**:
994+#+BEGIN_SRC typescript
995+// Register background sync when app becomes hidden
996+document.addEventListener('visibilitychange', () => {
997+ if (document.hidden && 'serviceWorker' in navigator) {
998+ navigator.serviceWorker.ready.then(registration => {
999+ return registration.sync.register('download-episodes');
1000+ });
1001+ }
1002+});
1003+1004+// Service worker handles sync event
1005+self.addEventListener('sync', event => {
1006+ if (event.tag === 'download-episodes') {
1007+ event.waitUntil(syncEpisodes());
1008+ }
1009+});
1010+#+END_SRC
1011+1012+**realistic expectations**:
1013+- iOS Safari: very limited background processing
1014+- Android Chrome: decent background sync support
1015+- Desktop: mostly works
1016+- battery/data saver modes: disabled by OS
1017+1018+**fallback strategy**: rely primarily on foreground sync + push notifications, treat background sync as nice-to-have enhancement
1019+1020+**push notification sync workflow**:
1021+1022+**server-side trigger**:
1023+1. server detects new episodes during RSS refresh
1024+2. check which users are subscribed to that feed
1025+3. send push notification with episode metadata payload
1026+4. notification wakes up service worker on client
1027+1028+**service worker notification handler**:
1029+#+BEGIN_SRC typescript
1030+self.addEventListener('push', event => {
1031+ const data = event.data?.json();
1032+
1033+ if (data?.type === 'new-episodes') {
1034+ event.waitUntil(
1035+ // Start background download of new episodes
1036+ downloadNewEpisodes(data.episodes)
1037+ .then(() => {
1038+ // Show notification to user
1039+ return self.registration.showNotification('New episodes available', {
1040+ body: `${data.episodes.length} new episodes downloaded`,
1041+ icon: '/icon-192.png',
1042+ badge: '/badge-72.png',
1043+ tag: 'new-episodes',
1044+ data: { episodeIds: data.episodes.map(e => e.id) }
1045+ });
1046+ })
1047+ );
1048+ }
1049+});
1050+1051+// Handle notification click
1052+self.addEventListener('notificationclick', event => {
1053+ event.notification.close();
1054+1055+ // Open app to specific episode or feed
1056+ event.waitUntil(
1057+ clients.openWindow(`/episodes/${event.notification.data.episodeIds[0]}`)
1058+ );
1059+});
1060+#+END_SRC
1061+1062+**server push logic**:
1063+- batch notifications (don't spam for every episode)
1064+- respect user notification preferences from FilterView settings
1065+- include episode metadata in payload to avoid round-trip
1066+- throttle notifications (max 1 per feed per hour?)
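The per-feed throttle could be as simple as the following; the one-hour window comes from the note above, and ~shouldPush~ is a hypothetical name:

```typescript
// Allow at most one push per feed per hour.
const lastNotified = new Map<string, number>();
const HOUR_MS = 60 * 60 * 1000;

function shouldPush(feedId: string, nowMillis: number): boolean {
  const last = lastNotified.get(feedId);
  if (last !== undefined && nowMillis - last < HOUR_MS) return false;
  lastNotified.set(feedId, nowMillis);
  return true;
}
```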
1067+1068+**user flow**:
1069+1. new episode published → server pushes notification
1070+2. service worker downloads episode in background
1071+3. user sees "New episodes downloaded" notification
1072+4. tap notification → opens app to new episode, ready to play offline
1073+1074+*benefits*:
1075+- true background downloading without user interaction
1076+- works even when app is closed
1077+- respects per-feed notification settings
1078+1079+**push payload size constraints**:
1080+- **limit**: ~4KB (4,096 bytes) across most services
1081+- **practical limit**: ~3KB to account for service overhead
1082+- **implications for episode metadata**:
1083+1084+#+BEGIN_SRC json
1085+{
1086+ "type": "new-episodes",
1087+ "episodes": [
1088+ {
1089+ "id": "ep123",
1090+ "channelId": "ch456",
1091+ "title": "Episode Title",
1092+ "url": "https://...",
1093+ "duration": 3600,
1094+ "size": 89432112
1095+ }
1096+ ]
1097+}
1098+#+END_SRC
1099+1100+**payload optimization strategies**:
1101+- minimal episode metadata in push (id, url, basic info)
1102+- batch multiple episodes in single notification
1103+- full episode details fetched after service worker wakes up
1104+- URL shortening for long episode URLs
1105+- compress JSON payload if needed
1106+1107+**alternative for large payloads**:
1108+- push notification contains only "new episodes available" signal
1109+- service worker makes API call to get full episode list
1110+- trade-off: requires network round-trip but unlimited data
1111+1112+**logical clock sync optimization**:
1113+1114+much simpler approach using sync revisions:
1115+1116+#+BEGIN_SRC json
1117+{
1118+ "type": "sync-available",
1119+ "fromRevision": 12345,
1120+ "toRevision": 12389,
1121+ "changeCount": 8
1122+}
1123+#+END_SRC
1124+1125+**service worker sync flow**:
1126+1. push notification wakes service worker with revision range
1127+2. service worker fetches ~/sync?from=12345&to=12389~
1128+3. server returns only changes in that range (episodes, feed updates, etc)
1129+4. service worker applies changes to local dexie store
1130+5. service worker queues background downloads for new episodes
1131+6. updates local revision to 12389
1132+1133+**benefits of revision-based approach**:
1134+- tiny push payload (just revision numbers)
1135+- server can efficiently return only changes in range
1136+- automatic deduplication (revision already applied = skip)
1137+- works for any sync data (episodes, feed metadata, user settings)
1138+- handles offline gaps gracefully (fetch missing revision ranges)
1139+1140+**sync API response**:
1141+#+BEGIN_SRC typescript
1142+interface SyncResponse {
1143+ fromRevision: number;
1144+ toRevision: number;
1145+ changes: Array<{
1146+ type: 'episode' | 'channel' | 'subscription';
1147+ operation: 'create' | 'update' | 'delete';
1148+ data: any;
1149+ revision: number;
1150+ }>;
1151+}
1152+#+END_SRC
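Applying a ~SyncResponse~ idempotently might look like this sketch: changes at or below the local revision are skipped, so re-delivered pushes are harmless. It assumes the server returns changes sorted by revision; the ~Map~ store stands in for the dexie tables:

```typescript
// Apply sync changes in revision order, skipping anything already applied.
interface SyncChange {
  type: string;
  operation: "create" | "update" | "delete";
  data: { id: string };
  revision: number;
}

function applyChanges(
  store: Map<string, unknown>,
  localRevision: number,
  changes: SyncChange[]
): number {
  let rev = localRevision;
  for (const c of changes) {
    if (c.revision <= rev) continue; // already applied: automatic deduplication
    if (c.operation === "delete") store.delete(c.data.id);
    else store.set(c.data.id, c.data);
    rev = c.revision;
  }
  return rev; // new local revision to persist
}
```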
1153+1154+**integration with episode downloads**:
1155+- service worker processes sync changes
1156+- identifies new episodes that match user's auto-download filters
1157+- queues those for background cache fetching
1158+- much more efficient than sending episode metadata in push payload
1159+1160+**service worker processing time constraints**:
1161+1162+**hard limits**:
1163+- **30 seconds idle timeout**: service worker terminates after 30s of inactivity
1164+- **5 minutes event processing**: single event/request must complete within 5 minutes
1165+- **30 seconds fetch timeout**: individual network requests timeout after 30s
1166+- **notification requirement**: push events MUST display notification before promise settles
1167+1168+**practical implications**:
1169+- sync API call (~/sync?from=X&to=Y~) must complete within 30s
1170+- large episode downloads must be queued, not started immediately in push handler
1171+- use ~event.waitUntil()~ to keep service worker alive during processing
1172+- break large operations into smaller chunks
1173+1174+**recommended push event flow**:
1175+#+BEGIN_SRC typescript
1176+self.addEventListener('push', event => {
1177+ const data = event.data?.json();
1178+1179+ event.waitUntil(
1180+ // Must complete within 5 minutes total
1181+ handlePushSync(data)
1182+ .then(() => {
1183+ // Required: show notification before promise settles
1184+ return self.registration.showNotification('Episodes synced');
1185+ })
1186+ );
1187+});
1188+1189+async function handlePushSync(data) {
1190+ // 1. Quick sync API call (< 30s)
1191+ const changes = await (await fetch(`/sync?from=${data.fromRevision}&to=${data.toRevision}`)).json();
1192+1193+ // 2. Apply changes to dexie store (fast, local)
1194+ await applyChangesToStore(changes);
1195+1196+ // 3. Queue episode downloads for later (don't start here)
1197+ await queueEpisodeDownloads(changes.newEpisodes);
1198+1199+ // Total time: < 5 minutes, preferably < 30s
1200+}
1201+#+END_SRC
1202+1203+*download strategy*: use push event for sync + queuing, separate background tasks for actual downloads
1204+1205+*background fetch API for large downloads*:
1206+1207+*progressive enhancement approach*:
1208+#+BEGIN_SRC typescript
1209+async function queueEpisodeDownloads(episodes) {
1210+ for (const episode of episodes) {
1211+ if ('serviceWorker' in navigator && 'BackgroundFetchManager' in self) {
1212+ // Chrome/Edge: use Background Fetch API for true background downloading
1213+ await navigator.serviceWorker.ready.then(registration => {
1214+ return registration.backgroundFetch.fetch(
1215+ `episode-${episode.id}`,
1216+ episode.url,
1217+ {
1218+ icons: [{ src: '/icon-256.png', sizes: '256x256', type: 'image/png' }],
1219+ title: `Downloading: ${episode.title}`,
1220+ downloadTotal: episode.fileSize
1221+ }
1222+ );
1223+ });
1224+ } else {
1225+ // Fallback: queue for reactive download (download while streaming)
1226+ await queueReactiveDownload(episode);
1227+ }
1228+ }
1229+}
1230+1231+// Handle background fetch completion
1232+self.addEventListener('backgroundfetchsuccess', event => {
1233+ if (event.registration.id.startsWith('episode-')) {
1234+ event.waitUntil(handleEpisodeDownloadComplete(event));
1235+ }
1236+});
1237+#+END_SRC
1238+1239+*browser support reality*:
1240+- *Chrome/Edge*: Background Fetch API supported
1241+- *Firefox/Safari*: not supported, fallback to reactive caching
1242+- *mobile*: varies by platform and browser
1243+1244+*benefits when available*:
1245+- true background downloading (survives app close, browser close)
1246+- built-in download progress UI
1247+- automatic retry on network failure
1248+- no service worker time limits during download
1249+1250+*graceful degradation*:
1251+- detect support, use when available
1252+- fallback to reactive caching (download while streaming)
1253+- user gets best experience possible on their platform
1254+1255+*** research todos :ai:claude:
1256+1257+high-level unanswered questions from architecture brainstorming:
1258+1259+**** sync and data management
1260+***** TODO dexie sync capabilities vs rxdb for multi-device sync implementation
1261+***** TODO webrtc p2p sync implementation patterns and reliability
1262+***** TODO conflict resolution strategies for device-specific data in distributed sync
1263+***** TODO content-addressed deduplication algorithms for rss/podcast content
1264+**** client-side storage and caching
1265+***** TODO opfs storage limits and cleanup strategies for client-side caching
1266+***** TODO practical background fetch api limits and edge cases for podcast downloads
1267+**** automation and intelligence
1268+***** TODO llm-based regex generation for episode title parsing automation
1269+***** TODO push notification subscription management and realm authentication
1270+**** platform and browser capabilities
1271+***** TODO browser audio api capabilities for podcast-specific features (speed, silence skip)
1272+***** TODO progressive web app installation and platform-specific behaviors
1273+1274+* webtorrent brainstorming 6/16 :ai:claude:
1275+1276+** WebTorrent + Event Log CRDT Architecture
1277+1278+*** Core Concept Split
1279+We identified two fundamentally different types of data that need different sync strategies:
1280+1281+**** 1. Dynamic Metadata (Event Log CRDT)
1282+- **Data**: Play state, scroll position, settings, subscriptions
1283+- **Characteristics**: Frequently changing, small, device-specific
1284+- **Solution**: Event log with Hybrid Logical Clocks (HLC)
1285+- **Sync**: Merkle tree efficient diff + P2P exchange via realm
1286+1287+**** 2. Static Content (WebTorrent)
1288+- **Data**: RSS feeds, podcast episodes (audio files)
1289+- **Characteristics**: Immutable, large, content-addressable
1290+- **Solution**: WebTorrent with infohash references
1291+- **Storage**: IndexedDB chunk store (idb-chunk-store npm package)
1292+1293+*** Event Log CRDT Design
1294+1295+**** Hybrid Logical Clock (HLC)
1296+Based on James Long's crdt-example-app implementation:
1297+#+BEGIN_SRC typescript
1298+interface HLC {
1299+ millis: number; // physical time
1300+ counter: number; // logical counter (0-65535)
1301+ node: string; // device identity ID
1302+}
1303+1304+interface SyncEvent {
1305+ timestamp: HLC;
1306+ type: 'subscribe' | 'unsubscribe' | 'markPlayed' | 'updatePosition' | ...
1307+ payload: any;
1308+}
1309+#+END_SRC
1310+1311+**Benefits**:
1312+- Causality preserved even with clock drift
1313+- Compact representation (vs full vector clocks)
1314+- Total ordering via (millis, counter, node) comparison
1315+- No merge conflicts - just union of events
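The HLC update rules from crdt-example-app, roughly; this is a simplified sketch (the real implementation also guards against excessive clock drift and counter overflow):

```typescript
// Hybrid Logical Clock: advance past both the wall clock and any observed timestamp.
interface HLC {
  millis: number;  // physical time
  counter: number; // logical counter
  node: string;    // device identity ID
}

// On a local event: bump past the wall clock, or increment the counter
// if physical time hasn't advanced.
function sendHLC(clock: HLC, nowMillis: number): HLC {
  const millis = Math.max(clock.millis, nowMillis);
  const counter = millis === clock.millis ? clock.counter + 1 : 0;
  return { millis, counter, node: clock.node };
}

// On receiving a remote event: merge with the remote timestamp too.
function recvHLC(clock: HLC, remote: HLC, nowMillis: number): HLC {
  const millis = Math.max(clock.millis, remote.millis, nowMillis);
  let counter = 0;
  if (millis === clock.millis && millis === remote.millis) {
    counter = Math.max(clock.counter, remote.counter) + 1;
  } else if (millis === clock.millis) {
    counter = clock.counter + 1;
  } else if (millis === remote.millis) {
    counter = remote.counter + 1;
  }
  return { millis, counter, node: clock.node };
}
```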
1316+1317+**** Merkle Tree Sync
1318+Efficient sync using merkle trees over time ranges:
1319+1320+#+BEGIN_SRC typescript
1321+interface RangeMerkleNode {
1322+ startTime: HLC;
1323+ endTime: HLC;
1324+ hash: string;
1325+ eventCount: number;
1326+}
1327+#+END_SRC
1328+1329+**Sync Protocol**:
1330+1. Exchange merkle roots
1331+2. If different, drill down to find divergent ranges
1332+3. Exchange only missing events
1333+4. Apply in HLC order
1334+1335+**Key insight**: No merge conflicts because events are immutable and ordered by HLC
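A toy version of the range diff: bucket events by time window, hash each bucket, and exchange only the buckets whose hashes differ. The string hash below is a non-cryptographic stand-in, and a real implementation would drill down a tree of ranges rather than compare one flat level:

```typescript
// Order-independent bucket hash (sorted before hashing so peers agree).
function hashBucket(eventIds: string[]): string {
  let h = 0;
  for (const id of [...eventIds].sort()) {
    for (let i = 0; i < id.length; i++) h = (h * 31 + id.charCodeAt(i)) >>> 0;
  }
  return h.toString(16);
}

// Compare my buckets against a peer's bucket hashes; return buckets to exchange.
function divergentBuckets(
  mine: Map<number, string[]>,
  peerHashes: Map<number, string>
): number[] {
  const out = new Set<number>();
  for (const [bucket, ids] of mine) {
    if (peerHashes.get(bucket) !== hashBucket(ids)) out.add(bucket);
  }
  for (const bucket of peerHashes.keys()) {
    if (!mine.has(bucket)) out.add(bucket); // peer has events I don't
  }
  return [...out].sort((x, y) => x - y);
}
```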
1336+1337+**** Progressive Compaction
1338+Use idle time to compact old events:
1339+- Recent (< 5 min): Individual events for active sync
1340+- Hourly chunks: After 5 minutes
1341+- Daily chunks: After 24 hours
1342+- Monthly chunks: After 30 days
1343+1344+Benefits:
1345+- Fast recent sync
1346+- Efficient storage of history
1347+- Old chunks can move to OPFS as blobs
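The tier thresholds above, as a sketch (~tierFor~ is a hypothetical helper; compaction itself would group events per tier and re-serialize them):

```typescript
// Map an event's age to the compaction tier described in the notes
// (5 min / 24 h / 30 days).
type Tier = "individual" | "hourly" | "daily" | "monthly";

function tierFor(eventMillis: number, nowMillis: number): Tier {
  const MIN = 60_000;
  const HOUR = 3_600_000;
  const DAY = 86_400_000;
  const age = nowMillis - eventMillis;
  if (age < 5 * MIN) return "individual";
  if (age < 24 * HOUR) return "hourly";
  if (age < 30 * DAY) return "daily";
  return "monthly";
}
```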
1348+1349+*** WebTorrent Integration
1350+1351+**** Content Flow
1352+1. **CORS-friendly feeds**:
1353+ - Browser fetches directly
1354+ - Creates torrent with original URL as webseed
1355+ - Broadcasts infohash to realm
1356+1357+2. **CORS-blocked feeds**:
1358+ - Server fetches and hashes
1359+ - Returns infohash (server doesn't store content)
1360+ - Client uses WebTorrent with original URL as webseed
1361+1362+**** Realm as Private Tracker
1363+- Realm members announce infohashes they have
1364+- No need for DHT or public trackers
1365+- Existing WebRTC signaling used for peer discovery
1366+- Private swarm for each realm
1367+1368+**** Storage via Chunk Store
869+Use ~idb-chunk-store~ (or similar) for persistence:
1370+- WebTorrent handles chunking/verification
1371+- IndexedDB provides persistence across sessions
1372+- Abstract-chunk-store interface allows swapping implementations
1373+1374+*** Bootstrap & History Sharing
1375+1376+**** History Snapshots as Torrents
1377+Serialize event history into content-addressed chunks:
1378+1379+#+BEGIN_SRC typescript
1380+interface HistorySnapshot {
1381+ period: "2024-05";
1382+ events: SyncEvent[];
1383+ merkleRoot: string;
1384+ deviceStates: Record<string, DeviceState>;
1385+}
1386+1387+// Share via WebTorrent
1388+const blob = await serializeSnapshot(events);
1389+const infohash = await createTorrent(blob);
1390+realm.broadcast({ type: "historySnapshot", period, infohash });
1391+#+END_SRC
1392+1393+**** Materialized State Snapshots
1394+Using dexie-export-import for database snapshots:
1395+1396+#+BEGIN_SRC typescript
1397+const dbBlob = await exportDB(db, {
1398+ tables: ['channels', 'channelEntries'],
1399+ filter: (table, value) => !isDeviceSpecific(table, value)
1400+});
1401+1402+const infohash = await createTorrent(dbBlob);
1403+#+END_SRC
1404+1405+**** New Device Bootstrap
1406+1. Download latest DB snapshot → Instant UI
1407+2. Download recent events → Apply updates
1408+3. Background: fetch historical event logs
1409+4. Result: Fast startup with complete history
1410+1411+*** Implementation Benefits
1412+1413+1. **Privacy**: No server sees listening history
1414+2. **Offline-first**: Everything works locally
1415+3. **Efficient sync**: Only exchange missing data
1416+4. **P2P content**: Reduce server bandwidth
1417+5. **Scalable**: Torrents for bulk data transfer
1418+6. **Verifiable**: Merkle trees ensure consistency
1419+1420+*** Next Steps
1421+- [ ] Implement HLC timestamps
1422+- [ ] Build merkle tree sync protocol
1423+- [ ] Integrate WebTorrent with realm signaling
1424+- [ ] Create history snapshot system
1425+- [ ] Test cross-device sync scenarios
1426+1427+** Additional Architecture Insights
1428+1429+*** Unified Infohash Approach
1430+Instead of having separate hashes for merkle tree and WebTorrent, use infohashes throughout:
1431+1432+**** Hierarchical Infohash Structure
1433+#+BEGIN_SRC typescript
1434+// Leaf level: individual files
1435+const episode1Hash = await createTorrent(episode1.mp3);
1436+const feedXmlHash = await createTorrent(feed.xml);
1437+1438+// Directory level: multi-file torrent
1439+const feedTorrent = await createTorrent({
1440+ name: 'example.com.rss',
1441+ files: [
1442+ { path: 'rss.xml', infohash: feedXmlHash },
1443+ { path: 'episode-1.mp3', infohash: episode1Hash }
1444+ ]
1445+});
1446+1447+// Root level: torrent of feed torrents
1448+const rootTorrent = await createTorrent({
1449+ name: 'feeds',
1450+ folders: [
1451+ { path: 'example.com.rss', infohash: feedTorrent.infoHash }
1452+ ]
1453+});
1454+#+END_SRC
1455+1456+Benefits:
1457+- Single hash type throughout system
1458+- Progressive loading (directory structure first, then files)
1459+- Natural deduplication
1460+- WebTorrent native sharing of folder structures
1461+1462+*** Long-term Event Log Scaling
1463+1464+**** Checkpoint + Delta Pattern
1465+For handling millions of events, use periodic checkpoints:
1466+1467+#+BEGIN_SRC typescript
1468+interface EventCheckpoint {
1469+ hlc: HLC;
1470+ stateSnapshot: {
1471+ subscriptions: Channel[];
1472+ playStates: PlayRecord[];
1473+ settings: Settings;
1474+ };
1475+ eventCount: number;
1476+ infohash: string; // torrent of this checkpoint
1477+}
1478+1479+// Every 10k events or monthly
1480+async function createCheckpoint(): Promise<EventCheckpoint> {
1481+ const currentHLC = getLatestEventHLC();
1482+1483+ // Export materialized state using dexie-export-import
1484+ const dbBlob = await exportDB(db, {
1485+ filter: (table, value) => {
1486+ return !['activeSyncs', 'tempData'].includes(table);
1487+ }
1488+ });
1489+1490+ const infohash = await createTorrent(dbBlob);
1491+ return { hlc: currentHLC, dbExport: dbBlob, infohash };
1492+}
1493+#+END_SRC
1494+1495+**** Bootstrap Flow with Checkpoints
1496+1. New device downloads latest checkpoint via WebTorrent
1497+2. Imports directly to IndexedDB: ~await importDB(checkpoint.blob)~
1498+3. Requests only recent events since checkpoint
1499+4. Applies recent events to catch up
1500+1501+Benefits:
1502+- Fast bootstrap (one checkpoint instead of million events)
1503+- No double materialization (IndexedDB is already materialized state)
1504+- P2P distribution of checkpoints
1505+- Clear version migration path
1506+1507+*** Sync State Management
1508+1509+**** Catching Up vs Live Events
1510+#+BEGIN_SRC typescript
1511+interface SyncState {
1512+ localHLC: HLC;
1513+ remoteHLC: HLC;
1514+ mode: 'catching-up' | 'live';
1515+}
1516+1517+// Separate handlers for historical vs live events
1518+async function replayHistoricalEvents(from: HLC, to: HLC) {
1519+ const events = await fetchEvents(from, to);
1520+1521+ // Process in batches without UI updates
1522+ await db.transaction('rw', db.tables, async () => {
1523+ for (const batch of chunks(events, 1000)) {
1524+ await Promise.all(batch.map(applyEventSilently));
1525+ }
1526+ });
1527+1528+ // One UI update at the end
1529+ notifyUI('Sync complete', { newEpisodes: 47 });
1530+}
1531+1532+function handleLiveEvent(event: SyncEvent) {
1533+ // Real-time event - update UI immediately
1534+ applyEvent(event);
1535+ if (event.type === 'newEpisode') {
1536+ showNotification(`New episode: ${event.title}`);
1537+ }
1538+}
1539+#+END_SRC
1540+1541+**** HLC Comparison for Ordering
1542+#+BEGIN_SRC typescript
1543+function compareHLC(a: HLC, b: HLC): number {
1544+ if (a.millis !== b.millis) return a.millis - b.millis;
1545+ if (a.counter !== b.counter) return a.counter - b.counter;
1546+ return a.node.localeCompare(b.node);
1547+}
1548+1549+// Determine if caught up
1550+function isCaughtUp(myHLC: HLC, peerHLC: HLC): boolean {
1551+ return compareHLC(myHLC, peerHLC) >= 0;
1552+}
1553+#+END_SRC
1554+1555+*** Handling Out-of-Order Events
1556+1557+**** Idempotent Reducers (No Replay Needed)
1558+Design reducers to handle events arriving out of order:
1559+1560+#+BEGIN_SRC typescript
1561+// HLC-aware reducer that handles out-of-order events
1562+function reducePlayPosition(state, event) {
1563+ if (event.type === 'updatePosition') {
1564+ const existing = state.positions[event.episodeId];
1565+ // Only update if this event is newer
1566+ if (!existing || compareHLC(event.hlc, existing.hlc) > 0) {
1567+ state.positions[event.episodeId] = {
1568+ position: event.position,
1569+ hlc: event.hlc // Track which event set this
1570+ };
1571+ }
1572+ }
1573+}
1574+#+END_SRC
1575+1576+**** Example: Offline Device Rejoining
1577+#+BEGIN_SRC typescript
1578+// Device A offline for a week, comes back with old events
1579+const deviceAEvents = [
1580+  { hlc: "1000:0:A", type: "markPlayed", episode: "ep1" },
1581+  { hlc: "1100:0:A", type: "updatePosition", episode: "ep1", position: 500 }
1582+];
1583+
1584+// Device B already has a newer event
1585+const deviceBEvents = [
1586+  { hlc: "1050:0:B", type: "updatePosition", episode: "ep1", position: 1000 }
1587+];
1588+
1589+// Smart reducer produces correct final state
1590+const finalState = {
1591+  ep1: {
1592+    played: true,          // from 1000:0:A
1593+    position: 1000,        // from 1050:0:B (newer HLC wins)
1594+    lastPositionHLC: "1050:0:B"
1595+  }
1596+};
1597+#+END_SRC
1598+1599+Key principles:
1600+- Store HLC with state changes
1601+- Use "last write wins" with HLC comparison
1602+- Make operations commutative when possible
1603+- No need for full replay when inserting old events
readme-devlog.org
···1#+PROPERTY: COOKIE_DATA recursive
2#+STARTUP: overview
34-most of this is old, I need to rework it
5-6-* design
7-8-** frontend (packages/app)
9-- http://localhost:7891
10-- proxies ~/api~ and ~/sync~ to the backend in development
11-- uses Dexie for local storage with sync plugin
12-- custom sync replication implementation using PeerJS through the signalling server
13-14-** backend (packages/server)
15-- http://localhost:7890
16-- serves ~/dist~ if the directory is present (see ~dist~ script)
17-- serves ~/api~ for RSS caching proxy
18- - file-based routing under the api directory
19-- serves ~/sync~ which is a ~peerjs~ signalling server
20-21-** sync
22-- each client keeps the full data set
23-- dexie sync and observable let us stream change sets
24-- we can publish the "latest" to all peers
25-- on first pull, if not the first client, we can request a dump out of band
26-27-*** rss feed data
28-- do we want to backup feed data?
29- - conceptually, this should be refetchable
30- - but feeds go away, and some will only show recent stories
31- - so yes, we'll need this
32- - but server side, we can dedupe
33- - content-addressed server-side cache?
34-35-- server side does RSS pulling
36- - can feeds be marked private, such that they won't be pulled through the proxy?
37- - but then we require everything to be fetchable via cors
38- - client configured proxy settings?
39-40-*** peer connection
41-- on startup, check for current realm-id and key pair
42-- if not present, ask to login or start new
43- - if login, run through the [[* pairing]] process
44- - if start new, run through the [[* registration]] process
45-- use keypair to authenticate to server
46- - response includes list of active peers to connect
47-- clients negotiate sync from there
48-- an identity is a keypair and a realm
49-50-- realm is uuid
51- - realm on the server is the socket connection for peer discovery
52- - keeps a list of verified public keys
53- - and manages the /current/ ~public-key->peer ids~ mapping
54- - realm on the client side is first piece of info required for sync
55- - when connecting to the signalling server, you present a realm, and a signed public key
56- - server accepts/rejects based on signature and current verified keys
57-58-- a new keypair can create a realm
59-60-- a new keypair can double sign an invitation
61- - invite = ~{ realm:, nonce:, not_before:, not_after:, authorizer: }~, signed with verified key
62- - exchanging an invite = ~{ invite: }~, signed with my key
63-64-- on startup
65- - start stand-alone (no syncing required, usually the case on first-run)
66- - generate a keypair
67- - want server backup?
68- - sign a "setup" message with new keypair and send to the server
69- - server responds with a new realm, that this keypair is already verified for
70- - move along
71- - exchange invite to sync to other devices
72- - generate a keypair
73- - sign the exchange message with the invite and send to the server
74- - server verifies the invite
75- - adds the new public key to the peer list and publishes downstream
76- - move along
77-78-***** standalone
79-in this mode, there is no syncing. this is the most likely first-time run option.
80-81-- generate a keypair on startup, so we have a stable fingerprint in the future
82-- done
83-84-***** pairing
85-in this mode, there is syncing to a named realm, but not necessarily server resources consumed
86-we don't need an email, since the server is just doing signalling and peer management
87-88-- generate an invite from an existing verified peer
89- - ~{ realm:, not_before:, not_after:, inviter: peer.public_key }~
90- - sign that invitation from the existing verified peer
91-92-- standalone -> paired
93- - get the invitation somehow (QR code?)
94- - sign an invite exchange with the standalone's public key
95- - send to server
96- - server verifies the invite
97- - adds the new public key to the peer list and publishes downstream
98-99-***** server backup
100-in this mode, there is syncing to a named realm by email.
101-102-goal of server backup mode is that we can go from email->fully working client with latest data without having to have any clients left around that could participate in the sync.
103-104-- generate a keypair on startup
105-- sign a registration message sent to the server
106- - send a verification email
107- - if email/realm already exists, this is authorization
108- - if not, it's email validation
109- - server starts a realm and associates the public key
110- - server acts as a peer for the realm, and stores private data
111-112-- since dexie is publishing change sets, we should be able to just store deltas
113-- but we'll need to store _all_ deltas, unless we're materializing on the server side too
114- - should we use an indexeddb shim so we can import/export from the server for a clean start?
115- - how much materialization does the server need?
116-117-* ai instructions
118-- when writing to the devlog, add tags to your entries specifying ~:ai:~ and what tool did it.
119-- false starts and prototypes are in ~./devlog/~
120-121-* notes and decision record [1/11]
122-** architecture design (may 28-29) :ai:claude:
124-detailed notes are in [[./devlog/may-29.org]]
125-key decisions and system design:
126-127-*** sync model
128-- device-specific records for playback state/queues to avoid conflicts
129-- content-addressed server cache with deduplication
130-- dual-JWT invitation flow for secure realm joining
131-132-*** data structures
133-- tag-based filtering system instead of rigid hierarchies
134-- regex patterns for episode title parsing and organization
135-- service worker caching with background download support
136-137-*** core schemas
138-**** client (dexie)
139-- Channel/ChannelEntry for RSS feeds and episodes
140-- PlayRecord/QueueItem scoped by deviceId
141-- FilterView for virtual feed organization
142-143-**** server (drizzle)
144-- ContentStore for deduplicated content by hash
145-- Realm/PeerConnection for sync authorization
146-- HttpCache with health tracking and TTL
147-148-*** push sync strategy
149-- revision-based sync (just send revision ranges in push notifications)
150-- background fetch API for large downloads where supported
151-- graceful degradation to reactive caching

*** research todos :ai:claude:

**** sync and data management
***** DONE identity and signature management
***** TODO dexie sync capabilities vs rxdb for multi-device sync implementation
***** TODO webrtc p2p sync implementation patterns and reliability
***** TODO conflict resolution strategies for device-specific data in distributed sync
***** TODO content-addressed deduplication algorithms for rss/podcast content
**** client-side storage and caching
***** TODO opfs storage limits and cleanup strategies for client-side caching
***** TODO practical background fetch api limits and edge cases for podcast downloads
**** automation and intelligence
***** TODO llm-based regex generation for episode title parsing automation
***** TODO push notification subscription management and realm authentication
**** platform and browser capabilities
***** TODO browser audio api capabilities for podcast-specific features (speed, silence skip)
***** TODO progressive web app installation and platform-specific behaviors

# Local Variables:
# org-hierarchical-todo-statistics: nil
# org-checkbox-hierarchical-statistics: nil
# End:

** <2025-05-28 Wed>
getting everything set up

the biggest open question I have is what sort of privacy/encryption guarantee I need. I want the server to be able to do things like cache and store feed data long-term.

Is "if you want full privacy, self-host" valid?

*** possibilities

- fully PWA
  - CON: cors, which would require a proxy anyway
  - CON: audio analysis, llm based stuff for categorization, etc. won't work
  - PRO: private as all get out
  - can still do WebRTC p2p sync for resiliency
  - can still do server backups, if sync stream is encrypted, but no compaction would be available
  - could do _explicit_ server backups as dump files

- self hostable
  - PRO: can do bunches of private stuff on the server, because if you don't want me to see it, do it elsewhere
  - CON: hard for folk to use

*** brainstorm :ai:claude:
**** sync conflict resolution design discussion :ai:claude:

discussed the sync architecture and dexie conflict handling:

*dexie syncable limitations*:
- logical clocks handle causally-related changes well
- basic timestamp-based conflict resolution for concurrent updates
- last-writer-wins for same field conflicts
- no sophisticated CRDT or vector clock support

*solutions for podcast-specific conflicts*:

- play records: device-specific approach
  - store separate ~play_records~ per ~device_id~
  - each record: ~{ episode_id, device_id, position, completed, timestamp }~
  - UI handles conflict resolution with "continue from X device?" prompts
  - avoids arbitrary timestamp wins, gives users control

- subscription trees
  - store ~parent_path~ as single string field ("/Tech/Programming")
  - simpler than managing folder membership tables
  - conflicts still possible but contained to single field
  - could store move operations as events for richer resolution

*other sync considerations*:
- settings/preferences: distinguish device-local vs global
- bulk operations: "mark all played" can create duplicate operations
- metadata updates: server RSS updates vs local renames
- temporal ordering: recently played lists, queue reordering
- storage limits: cleanup operations conflicting across devices
- feed state: refresh timestamps, error states

*approach*: prefer "events not state" pattern and device-specific records where semantic conflicts are likely
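
to make the "events not state" pattern concrete, a minimal sketch of replaying a bulk operation as an event (the field names here are assumptions; the point is that applying the event is idempotent, so devices converge regardless of delivery order):

#+BEGIN_SRC typescript
interface BulkMarkPlayedEvent {
  id: string;
  channelId: string;
  beforeTimestamp: number; // mark everything published at or before this
}

interface EpisodeState {
  entryId: string;
  channelId: string;
  publishedAt: number;
  completed: boolean;
}

// Replaying the event is idempotent: applying it once or twice yields the
// same state, so two devices that both receive it converge without conflict.
function applyBulkMarkPlayed(
  states: EpisodeState[],
  event: BulkMarkPlayedEvent
): EpisodeState[] {
  return states.map(s =>
    s.channelId === event.channelId && s.publishedAt <= event.beforeTimestamp
      ? { ...s, completed: true }
      : s
  );
}
#+END_SRC

syncing the small event record instead of N mutated rows also sidesteps the duplicate-operation problem from "mark all played".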

**** data model brainstorm :ai:claude:

core entities designed with sync in mind:

***** ~Feed~ :: RSS/podcast subscription
- ~parent_path~ field for folder structure (eg. ~/Tech/Programming~)
- ~is_private~ flag to skip server proxy
- ~refresh_interval~ for custom update frequencies

***** ~Episode~ :: individual podcast episodes
- standard RSS metadata (guid, title, description, media url)
- duration and file info for playback

***** ~PlayRecord~ :: device-specific playback state
- separate record per ~device_id~ to avoid timestamp conflicts
- position, completed status, playback speed
- UI can prompt "continue from X device?" for resolution

***** ~QueueItem~ :: device-specific episode queue
- ordered list with position field
- ~device_id~ scoped to avoid queue conflicts

***** ~Subscription~ :: feed membership settings
- can be global or device-specific
- auto-download preferences per device

***** ~Settings~ :: split global vs device-local
- theme, default speed = global
- download path, audio device = device-local

***** Event tables for complex operations:
- ~FeedMoveEvent~ for folder reorganization
- ~BulkMarkPlayedEvent~ for "mark all read" operations
- better conflict resolution than direct state updates

***** sync considerations
- device identity established on first run
- dexie syncable handles basic timestamp conflicts
- prefer device-scoped records for semantic conflicts
- event-driven pattern for bulk operations

**** schema evolution from previous iteration :ai:claude:

reviewed existing schema from tmp/feed.ts - well designed foundation:

***** keep from original
- Channel/ChannelEntry naming and structure
- ~refreshHP~ adaptive refresh system (much better than simple intervals)
- rich podcast metadata (people, tags, enclosure, podcast object)
- HTTP caching with etag/status tracking
- epoch millisecond timestamps
- ~hashId()~ approach for entry IDs

***** add for multi-device sync
- ~PlayState~ table (device-scoped position/completion)
- Subscription table (with ~parentPath~ for folders, device-scoped settings)
- ~QueueItem~ table (device-scoped episode queues)
- Device table (identity management)

***** migration considerations
- existing Channel/ChannelEntry can be preserved
- new tables are additive
- ~fetchAndUpsert~ method works well with server proxy architecture
- dexie sync vs rxdb - need to evaluate change tracking capabilities

**** content-addressed caching for offline resilience :ai:claude:

designed caching system for when upstream feeds fail/disappear, building on existing cache-schema.ts:

***** server-side schema evolution (drizzle sqlite):
- keep existing ~httpCacheTable~ design (health tracking, http headers, ttl)
- add ~contentHash~ field pointing to deduplicated content
- new ~contentStoreTable~: deduplicated blobs by sha256 hash
- new ~contentHistoryTable~: url -> contentHash timeline with isLatest flag
- reference counting for garbage collection

***** client-side OPFS storage
- ~/cache/content/{contentHash}.xml~ for raw feeds
- ~/cache/media/{contentHash}.mp3~ for podcast episodes
- ~LocalCacheEntry~ metadata tracks expiration and offline-only flags
- maintains last N versions per feed for historical access
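
a sketch of deriving those content-addressed file names with Web Crypto SHA-256 (the helper name is an assumption; the path layout follows the ~/cache/...~ convention above):

#+BEGIN_SRC typescript
// Derive the OPFS path for a blob from its own bytes: identical content
// always maps to the same file, which is what makes deduplication work.
async function contentCachePath(
  content: Uint8Array,
  kind: "content" | "media",
  ext: string
): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", content);
  const hash = Array.from(new Uint8Array(digest))
    .map(b => b.toString(16).padStart(2, "0"))
    .join("");
  return `/cache/${kind}/${hash}.${ext}`;
}
#+END_SRC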

***** fetch strategy & fallback
1. check local OPFS cache first (fastest)
2. try server proxy ~/api/feed?url={feedUrl}~ (deduplicated)
3. server checks ~contentHistory~, serves latest or fetches upstream
4. server returns ~{contentHash, content, cached: boolean}~
5. client stores with content hash as filename
6. emergency mode: serve stale content when upstream fails

- preserves existing health tracking and HTTP caching logic
- popular feeds cached once on server, many clients benefit
- bandwidth savings via content hash comparison
- historical feed state preservation (feeds disappear!)
- true offline operation after initial sync
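
the tiered lookup can be sketched as a generic fallback chain (the ~Fetcher~ abstraction is an assumption; in practice the tiers would be the OPFS cache, the server proxy, and stale/emergency content):

#+BEGIN_SRC typescript
type Fetcher = (url: string) => Promise<string | null>;

// Walk the tiers in order; the first one that yields content wins.
// A tier signals "miss" by returning null and "failure" by throwing --
// both fall through to the next tier.
async function fetchWithFallback(
  url: string,
  tiers: Fetcher[]
): Promise<string | null> {
  for (const tier of tiers) {
    try {
      const content = await tier(url);
      if (content !== null) return content;
    } catch {
      // e.g. network error talking to the proxy; try the next tier
    }
  }
  return null; // full miss: nothing cached anywhere, upstream dead
}
#+END_SRC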

** <2025-05-29 Thu> :ai:claude:
e2e encryption and invitation flow design

worked through the crypto and invitation architecture. key decisions:

*** keypair strategy
- use jwk format for interoperability (server stores public keys)
- ed25519 for signing, separate x25519 for encryption if needed
- zustand lazy initialization pattern: ~ensureKeypair()~ on first use
- store private jwk in persisted zustand state

*** invitation flow: dual-jwt approach
solved the chicken-and-egg problem of sharing encryption keys securely.

**** qr code contains two signed jwts:
1. invitation token: ~{iss: inviter_fingerprint, sub: invitation_id, purpose: "realm_invite"}~
2. encryption key token: ~{iss: inviter_fingerprint, ephemeral_private: base64_key, purpose: "ephemeral_key"}~

**** exchange process:
1. invitee posts jwt1 + their public keys to ~/invitations~
2. server verifies jwt1 signature against realm members
3. if valid: adds invitee to realm, returns ~{realm_id, realm_members, encrypted_realm_key}~
4. invitee verifies jwt2 signature against returned realm members
5. invitee extracts ephemeral private key, decrypts realm encryption key

**** security properties:
- server never has decryption capability (missing ephemeral private key)
- both jwts must be signed by verified realm member
- if first exchange fails, second jwt is cryptographically worthless
- atomic operation: identity added only if invitation valid
- built-in expiration and tamper detection via jwt standard

**** considered alternatives:
- raw ephemeral keys in qr: simpler but no authenticity
- ecdh key agreement: chicken-and-egg problem with public key exchange
- server escrow: good but missing authentication layer
- password-based: requires secure out-of-band sharing

the dual-jwt approach provides proper authenticated invitations while maintaining e2e encryption properties.

**** refined dual-jwt with ephemeral signing
simplified the approach by using the ephemeral key for the second jwt signature:

**setup**:
1. inviter generates ephemeral keypair
2. encrypts realm key with ephemeral private key
3. posts to server: ~{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}~

**qr code contains**:
#+BEGIN_SRC json
// JWT 1: signed with inviter's realm signing key
{
  "realm_id": "uuid",
  "invitation_id": "uuid",
  "iss": "inviter_fingerprint"
}

// JWT 2: signed with ephemeral private key
{
  "ephemeral_private": "base64_key",
  "invitation_id": "uuid"
}
#+END_SRC

**exchange flow**:
1. submit jwt1 → server verifies against realm members → returns ~{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}~
2. verify jwt2 signature using ~ephemeral_public~ from server response
3. extract ~ephemeral_private~ from jwt2, decrypt realm key

**benefits over previous version**:
- no premature key disclosure (invitee keys shared via normal webrtc peering)
- self-contained verification (ephemeral public key verifies jwt2)
- cleaner separation of realm auth vs encryption key distribution
- simpler flow (no need to return realm member list)

**crypto verification principle**: digital signatures work as sign-with-private/verify-with-public, while encryption works as encrypt-with-public/decrypt-with-private. jwt2 verification uses signature verification, not decryption.

**invitation flow diagram**:
#+BEGIN_SRC mermaid
sequenceDiagram
  participant I as Inviter
  participant S as Server
  participant E as Invitee

  Note over I: Generate ephemeral keypair
  I->>I: ephemeral_private, ephemeral_public

  Note over I: Encrypt realm key
  I->>I: encrypted_realm_key = encrypt(realm_key, ephemeral_private)

  I->>S: POST /invitations<br/>{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}
  S-->>I: OK

  Note over I: Create JWTs for QR code
  I->>I: jwt1 = sign({realm_id, invitation_id}, inviter_private)
  I->>I: jwt2 = sign({ephemeral_private, invitation_id}, ephemeral_private)

  Note over I,E: QR code contains [jwt1, jwt2]

  E->>S: POST /invitations/exchange<br/>{jwt1}
  Note over S: Verify jwt1 signature<br/>against realm members
  S-->>E: {invitation_id, realm_id, ephemeral_public, encrypted_realm_key}

  Note over E: Verify jwt2 signature<br/>using ephemeral_public
  E->>E: verify_signature(jwt2, ephemeral_public)

  Note over E: Extract key and decrypt
  E->>E: ephemeral_private = decode(jwt2)
  E->>E: realm_key = decrypt(encrypted_realm_key, ephemeral_private)

  Note over E: Now member of realm!
#+END_SRC

**** jwk keypair generation and validation :ai:claude:

discussed jwk vs raw crypto.subtle for keypair storage. since public keys need server storage for realm authorization, jwk is better for interoperability.

**keypair generation**:
#+BEGIN_SRC typescript
const keypair = await crypto.subtle.generateKey(
  { name: "Ed25519" },
  true,
  ["sign", "verify"]
);

const publicJWK = await crypto.subtle.exportKey("jwk", keypair.publicKey);
const privateJWK = await crypto.subtle.exportKey("jwk", keypair.privateKey);

// JWK format:
// {
//   "kty": "OKP",
//   "crv": "Ed25519",
//   "x": "base64url-encoded-public-key",
//   "d": "base64url-encoded-private-key" // only in private JWK
// }
#+END_SRC

**client validation**:
#+BEGIN_SRC typescript
function isValidEd25519PublicJWK(jwk: any): boolean {
  return (
    typeof jwk === 'object' &&
    jwk !== null &&
    jwk.kty === 'OKP' &&
    jwk.crv === 'Ed25519' &&
    typeof jwk.x === 'string' &&
    jwk.x.length === 43 && // base64url Ed25519 public key length
    !jwk.d && // public key shouldn't have private component
    (!jwk.use || jwk.use === 'sig') // parens matter: && binds tighter than ||
  );
}

async function validatePublicKey(publicJWK: JsonWebKey): Promise<CryptoKey | null> {
  try {
    if (!isValidEd25519PublicJWK(publicJWK)) return null;

    const key = await crypto.subtle.importKey(
      'jwk',
      publicJWK,
      { name: 'Ed25519' },
      false,
      ['verify']
    );

    return key;
  } catch {
    return null;
  }
}
#+END_SRC

**server validation (node.js)**:
#+BEGIN_SRC typescript
import { webcrypto } from 'node:crypto';

async function validateClientPublicKey(publicJWK: JsonWebKey): Promise<boolean> {
  try {
    if (!isValidEd25519PublicJWK(publicJWK)) return false;

    await webcrypto.subtle.importKey(
      'jwk',
      publicJWK,
      { name: 'Ed25519' },
      false,
      ['verify']
    );

    return true;
  } catch {
    return false;
  }
}
#+END_SRC

**authentication flow**:
#+BEGIN_SRC typescript
// client signs message
const authMessage = {
  realm: 'uuid-here',
  timestamp: Date.now(),
  action: 'join'
};

const signature = await crypto.subtle.sign(
  'Ed25519',
  privateKey,
  new TextEncoder().encode(JSON.stringify(authMessage))
);

// server verifies
// (AuthRequest shape implied by the usage below)
interface AuthRequest {
  publicKey: JsonWebKey;
  message: unknown;
  signature: number[];
}

async function verifyAuth(req: AuthRequest): Promise<boolean> {
  const publicKey = await webcrypto.subtle.importKey(
    'jwk',
    req.publicKey,
    { name: 'Ed25519' },
    false,
    ['verify']
  );

  const messageBytes = new TextEncoder().encode(JSON.stringify(req.message));
  const signatureBytes = new Uint8Array(req.signature);

  return await webcrypto.subtle.verify(
    'Ed25519',
    publicKey,
    signatureBytes,
    messageBytes
  );
}
#+END_SRC

**** proposed schemas :ai:claude:

***** client-side schema (dexie)

#+BEGIN_SRC typescript
// Core RSS/Podcast data (from your existing design)
interface Channel {
  id: string;
  feedUrl: string;
  htmlUrl?: string;
  imageUrl?: string;
  title?: string;
  description?: string;
  language?: string;
  people?: Record<string, string>;
  tags?: string[];

  // Refresh management
  refreshHP: number;
  nextRefreshAt?: number;
  lastRefreshAt?: number;
  lastRefreshStatus?: string;
  lastRefreshHttpStatus?: number;
  lastRefreshHttpEtag?: string;

  // Cache info
  contentHash?: string;
  lastFetchedAt?: number;
}

interface ChannelEntry {
  id: string;
  channelId: string;
  guid: string;
  title: string;
  linkUrl?: string;
  imageUrl?: string;
  snippet?: string;
  content?: string;

  enclosure?: {
    url: string;
    type?: string;
    length?: number;
  };

  podcast?: {
    explicit?: boolean;
    duration?: string;
    seasonNum?: number;
    episodeNum?: number;
    transcriptUrl?: string;
  };

  publishedAt?: number;
  fetchedAt?: number;
}

// Device-specific sync tables
interface PlayRecord {
  id: string;
  entryId: string;
  deviceId: string;
  position: number;
  duration?: number;
  completed: boolean;
  speed: number;
  updatedAt: number;
}

interface Subscription {
  id: string;
  channelId: string;
  deviceId?: string;
  parentPath: string; // "/Tech/Programming"
  autoDownload: boolean;
  downloadLimit?: number;
  isActive: boolean;
  createdAt: number;
  updatedAt: number;
}

interface QueueItem {
  id: string;
  entryId: string;
  deviceId: string;
  position: number;
  addedAt: number;
}

interface Device {
  id: string;
  name: string;
  platform: string;
  lastSeen: number;
}

// Local cache metadata
interface LocalCache {
  id: string;
  url: string;
  contentHash: string;
  filePath: string; // OPFS path
  cachedAt: number;
  expiresAt?: number;
  size: number;
  isOfflineOnly: boolean;
}

// Dexie schema
const db = new Dexie('SkypodDB');
db.version(1).stores({
  channels: '&id, feedUrl, contentHash',
  channelEntries: '&id, channelId, publishedAt',
  playRecords: '&id, [entryId+deviceId], deviceId, updatedAt',
  subscriptions: '&id, channelId, deviceId, parentPath',
  queueItems: '&id, entryId, deviceId, position',
  devices: '&id, lastSeen',
  localCache: '&id, url, contentHash, expiresAt'
});
#+END_SRC

***** server-side schema

#+BEGIN_SRC typescript
// Content-addressed cache
interface ContentStore {
  contentHash: string; // Primary key
  content: Buffer; // Raw feed content
  contentType: string;
  contentLength: number;
  firstSeenAt: number;
  referenceCount: number;
}

interface ContentHistory {
  id: string;
  url: string;
  contentHash: string;
  fetchedAt: number;
  isLatest: boolean;
}

// HTTP cache with health tracking (from your existing design)
interface HttpCache {
  key: string; // URL hash, primary key
  url: string;

  status: 'alive' | 'dead';
  lastFetchedAt: number;
  lastFetchError?: string;
  lastFetchErrorStreak: number;

  lastHttpStatus: number;
  lastHttpEtag?: string;
  lastHttpHeaders: Record<string, string>;
  expiresAt: number;
  expirationTtl: number;

  contentHash: string; // Points to ContentStore
}

// Sync/auth tables
interface Realm {
  id: string; // UUID
  createdAt: number;
  verifiedKeys: string[]; // Public key list
}

interface PeerConnection {
  id: string;
  realmId: string;
  publicKey: string;
  lastSeen: number;
  isOnline: boolean;
}

// Media cache for podcast episodes
interface MediaCache {
  contentHash: string; // Primary key
  originalUrl: string;
  mimeType: string;
  fileSize: number;
  content: Buffer;
  cachedAt: number;
  accessCount: number;
}
#+END_SRC

**** episode title parsing for sub-feed groupings :ai:claude:

*problem*: some podcast feeds contain multiple shows, need hierarchical organization within a feed

*example*: "Apocalypse Players" podcast
- episode title: "A Term of Art 6 - Winston's Hollow"
- desired grouping: "Apocalypse Players > A Term of Art > 6 - Winston's Hollow"
- UI shows sub-shows within the main feed

***** approaches considered

1. *manual regex patterns* (short-term solution)
   - user provides regex with capture groups = tags
   - reliable, immediate, user-controlled
   - requires manual setup per feed

2. *LLM-generated regex* (automation goal)
   - analyze last 100 episode titles
   - generate regex pattern automatically
   - good balance of automation + reliability

3. *NER model training* (experimental)
   - train spacy model for episode title parsing
   - current prototype: 150 labelled examples, limited success
   - needs more training data to be viable

***** data model implications

- add regex pattern field to Channel/Feed
- store extracted groupings as hierarchical tags on ~ChannelEntry~
- maybe add grouping/series field to episodes

***** plan

*preference*: start with manual regex, evolve toward LLM automation

*implementation design*:
- if no title pattern: episodes are direct children of the feed
- title pattern = regex with named capture groups + path template

*example configuration*:
- regex: ~^(?<series>[^0-9]+)\s*(?<episode>\d+)\s*-\s*(?<title>.+)$~
- path template: ~{series} > Episode {episode} - {title}~
- result: "A Term of Art 6 - Winston's Hollow" → "A Term of Art > Episode 6 - Winston's Hollow"

*schema additions*:
#+BEGIN_SRC typescript
interface Channel {
  // ... existing fields
  titlePatterns?: Array<{
    name: string; // "Main Episodes", "Bonus Content", etc.
    regex: string; // named capture groups
    pathTemplate: string; // interpolation template
    priority: number; // order to try patterns (lower = first)
    isActive: boolean; // can disable without deleting
  }>;
  fallbackPath?: string; // template for unmatched episodes
}

interface ChannelEntry {
  // ... existing fields
  parsedPath?: string; // computed from titlePattern
  parsedGroups?: Record<string, string>; // captured groups
  matchedPatternName?: string; // which pattern was used
}
#+END_SRC

*pattern matching logic*:
1. try patterns in priority order (lower number = higher priority)
2. first matching pattern wins
3. if no patterns match, use fallbackPath template (e.g., "Misc > {title}")
4. if no fallbackPath, episode stays direct child of feed

*example multi-pattern setup*:
- Pattern 1: "Main Episodes" - ~^(?<series>[^0-9]+)\s*(?<episode>\d+)~ → ~{series} > Episode {episode}~
- Pattern 2: "Bonus Content" - ~^Bonus:\s*(?<title>.+)~ → ~Bonus > {title}~
- Fallback: ~Misc > {title}~
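
a sketch of that matching logic against the ~titlePatterns~ schema above (the helper name and the ~{group}~ interpolation details are assumptions based on the path-template examples):

#+BEGIN_SRC typescript
interface TitlePattern {
  name: string;
  regex: string;        // named capture groups
  pathTemplate: string; // interpolation template
  priority: number;     // lower = tried first
  isActive: boolean;
}

// Try active patterns in priority order; first match wins.
// Returns the computed path plus which pattern matched, if any.
function parseEpisodeTitle(
  title: string,
  patterns: TitlePattern[],
  fallbackPath?: string
): { parsedPath?: string; matchedPatternName?: string } {
  const ordered = patterns
    .filter(p => p.isActive)
    .sort((a, b) => a.priority - b.priority);
  for (const p of ordered) {
    const m = new RegExp(p.regex).exec(title);
    if (m && m.groups) {
      const groups = m.groups;
      const parsedPath = p.pathTemplate.replace(
        /\{(\w+)\}/g,
        (_: string, key: string) => (groups[key] ?? "").trim()
      );
      return { parsedPath, matchedPatternName: p.name };
    }
  }
  if (fallbackPath) {
    return { parsedPath: fallbackPath.replace("{title}", title) };
  }
  return {}; // stays a direct child of the feed
}
#+END_SRC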

**** scoped tags and filter-based UI evolution :ai:claude:

*generalization*: move from rigid hierarchies to tag-based filtering system

*tag scoping*:
- feed-level tags: "Tech", "Gaming", "D&D"
- episode-level tags: from regex captures like "series:CriticalRole", "campaign:2", "type:main"
- user tags: manual additions like "favorites", "todo"

*UI as tag filtering*:
- default view: all episodes grouped by feed
- filter by ~series:CriticalRole~ → shows only CR episodes across all feeds
- filter by ~type:bonus~ → shows bonus content from all podcasts
- combine filters: ~series:CriticalRole AND type:main~ → main CR episodes only

*benefits*:
- no rigid hierarchy - users create their own views
- regex patterns become automated episode taggers
- same filtering system works for search, organization, queues
- tags are syncable metadata, views are client-side

*schema evolution*:
#+BEGIN_SRC typescript
interface Tag {
  scope: 'feed' | 'episode' | 'user';
  key: string; // "series", "type", "campaign"
  value: string; // "CriticalRole", "bonus", "2"
}

interface ChannelEntry {
  // ... existing
  tags: Tag[]; // includes regex-generated + manual
}

interface FilterView {
  id: string;
  name: string;
  folderPath: string; // "/Channels/Critical Role"
  filters: Array<{
    key: string;
    value: string;
    operator: 'equals' | 'contains' | 'not';
  }>;
  isDefault: boolean;
  createdAt: number;
}
#+END_SRC
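
a sketch of evaluating a view's ~filters~ against an entry's tags, with the AND semantics from the combined-filter example above (the helper itself is an assumption):

#+BEGIN_SRC typescript
type Tag = { scope: "feed" | "episode" | "user"; key: string; value: string };
type FilterClause = { key: string; value: string; operator: "equals" | "contains" | "not" };

// An entry belongs to a view when every clause holds against its tags.
function matchesView(tags: Tag[], filters: FilterClause[]): boolean {
  return filters.every(clause => {
    const values = tags.filter(t => t.key === clause.key).map(t => t.value);
    switch (clause.operator) {
      case "equals":
        return values.includes(clause.value);
      case "contains":
        return values.some(v => v.includes(clause.value));
      case "not":
        return !values.includes(clause.value);
    }
  });
}
#+END_SRC

because the check is pure, the same function can back sidebar views, search, and auto-queue rules.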

**** default UI construction and feed merging :ai:claude:

*auto-generated views on subscribe*:
- subscribe to "Critical Role" → creates ~/Channels/Critical Role~ folder
- default filter view: ~feed:CriticalRole~ (shows all episodes from that feed)
- user can customize, split into sub-views, or delete

*smart view suggestions*:
- after regex patterns generate tags, suggest splitting views
- "I noticed episodes with ~series:Campaign2~ and ~series:Campaign3~ - create separate views?"
- "Create view for ~type:bonus~ episodes?"

*view management UX*:
- right-click feed → "Split by series", "Split by type"
- drag episodes between views to create manual filters
- views can be nested: ~/Channels/Critical Role/Campaign 2/Main Episodes~

*feed merging for multi-source shows*:
problem: patreon feed + main show feed for same podcast

#+BEGIN_EXAMPLE
/Channels/
  Critical Role/
    All Episodes    # merged view: feed:CriticalRole OR feed:CriticalRolePatreon
    Main Feed       # filter: feed:CriticalRole
    Patreon Feed    # filter: feed:CriticalRolePatreon
#+END_EXAMPLE

*deduplication strategy*:
- episodes matched by ~guid~ or similar content hash
- duplicate episodes get ~source:main,patreon~ tags
- UI shows single episode with source indicators
- user can choose preferred source for playback
- play state syncs across all sources of same episode
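
a sketch of the guid-based matching (field names mirror the ~duplicateOf~/~sources~ schema; treating the first-seen entry as canonical is an assumption):

#+BEGIN_SRC typescript
interface EpisodeRow {
  id: string;
  guid: string;
  feedId: string;
  duplicateOf?: string;
  sources: string[];
}

// Collapse entries that share a guid: the first-seen row becomes canonical
// and accumulates the source feed ids; later rows point back at it.
function dedupeByGuid(entries: EpisodeRow[]): EpisodeRow[] {
  const canonicalByGuid = new Map<string, EpisodeRow>();
  const out: EpisodeRow[] = [];
  for (const e of entries) {
    const canonical = canonicalByGuid.get(e.guid);
    if (!canonical) {
      const row = { ...e, sources: [e.feedId] };
      canonicalByGuid.set(e.guid, row);
      out.push(row);
    } else {
      canonical.sources.push(e.feedId);
      out.push({ ...e, duplicateOf: canonical.id, sources: [e.feedId] });
    }
  }
  return out;
}
#+END_SRC

play state would then key off the canonical id, so progress syncs across every source of the same episode.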

*feed relationship schema*:
#+BEGIN_SRC typescript
interface FeedGroup {
  id: string;
  name: string; // "Critical Role"
  feedIds: string[]; // [mainFeedId, patreonFeedId]
  mergeStrategy: 'guid' | 'title' | 'contentHash';
  defaultView: FilterView;
}

interface ChannelEntry {
  // ... existing
  duplicateOf?: string; // points to canonical episode ID
  sources: string[]; // feed IDs where this episode appears
}
#+END_SRC

**per-view settings and state**:
each filter view acts like a virtual feed with its own:
- unread counts (episodes matching filter that haven't been played)
- notification settings (notify for new episodes in this view)
- muted state (hide notifications, mark as read automatically)
- auto-download preferences (download episodes that match this filter)
- play queue integration (add new episodes to queue)

**use cases**:
- mute "Bonus Content" view but keep notifications for main episodes
- auto-download only "Campaign 2" episodes, skip everything else
- separate unread counts: "5 unread in Main Episodes, 2 in Bonus"
- queue only certain series automatically

**schema additions**:
#+BEGIN_SRC typescript
interface FilterView {
  // ... existing fields
  settings: {
    notificationsEnabled: boolean;
    isMuted: boolean;
    autoDownload: boolean;
    autoQueue: boolean;
    downloadLimit?: number; // max episodes to keep
  };
  state: {
    unreadCount: number;
    lastViewedAt?: number;
    isCollapsed: boolean; // in sidebar
  };
}
#+END_SRC

*inheritance behavior*:
- new filter views inherit settings from parent feed/group
- user can override per-view
- "mute all Critical Role" vs "mute only bonus episodes"

**** client-side episode caching strategy :ai:claude:

*architecture*: service worker-based transparent caching

*flow*:
1. audio player requests ~/audio?url={episodeUrl}~
2. service worker intercepts request
3. if present in cache (with Range header support):
   - serve from cache
4. else:
   - let request continue to server (immediate playback)
   - simultaneously start background fetch of full audio file
   - when complete, broadcast "episode-cached" event
   - audio player catches event and restarts feed → now uses cached version

**benefits**:
- no playback interruption (streaming starts immediately)
- seamless transition to cached version
- Range header support for seeking/scrubbing
- transparent to audio player implementation
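
the Range handling in the cache-hit path reduces to slicing the cached bytes; a pure helper sketch (handles single and suffix ranges, ignores multipart ranges):

#+BEGIN_SRC typescript
// Slice a fully-cached body according to an HTTP Range header.
// Returns 206 + Content-Range for a valid range, 200 + full body otherwise.
function sliceRange(
  body: Uint8Array,
  rangeHeader: string | null
): { status: number; chunk: Uint8Array; contentRange?: string } {
  const m = rangeHeader && /^bytes=(\d*)-(\d*)$/.exec(rangeHeader);
  if (!m || (!m[1] && !m[2])) return { status: 200, chunk: body };
  let start: number;
  let end: number;
  if (!m[1]) {
    // suffix range, e.g. "bytes=-500" = last 500 bytes
    start = Math.max(0, body.length - parseInt(m[2], 10));
    end = body.length - 1;
  } else {
    start = parseInt(m[1], 10);
    end = m[2] ? Math.min(parseInt(m[2], 10), body.length - 1) : body.length - 1;
  }
  return {
    status: 206,
    chunk: body.slice(start, end + 1),
    contentRange: `bytes ${start}-${end}/${body.length}`,
  };
}
#+END_SRC

the service worker would wrap the returned chunk in a ~Response~ with the computed status and ~Content-Range~ header.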

*implementation considerations*:
- cache storage limits and cleanup policies
- partial download resumption if interrupted
- cache invalidation when episode URLs change
- offline playback support
- progress tracking for background downloads

**schema additions**:
#+BEGIN_SRC typescript
interface CachedEpisode {
  episodeId: string;
  originalUrl: string;
  cacheKey: string; // for cache API
  fileSize: number;
  cachedAt: number;
  lastAccessedAt: number;
  downloadProgress?: number; // 0-100 for in-progress downloads
}
#+END_SRC

**service worker events**:
- ~episode-cache-started~ - background download began
- ~episode-cache-progress~ - download progress update
- ~episode-cache-complete~ - ready to switch to cached version
- ~episode-cache-error~ - download failed, stay with streaming

**background sync for proactive downloads**:

**browser support reality**:
- Background Sync API: good support (Chrome/Edge, limited Safari)
- Periodic Background Sync: very limited (Chrome only, requires PWA install)
- Push notifications: good support, but requires user permission

**hybrid approach**:
1. **foreground sync** (reliable): when app is open, check for new episodes
2. **background sync** (opportunistic): register sync event when app closes
3. **push notifications** (fallback): server pushes "new episodes available"
4. **manual sync** (always works): pull-to-refresh, settings toggle

**implementation strategy**:
#+BEGIN_SRC typescript
// Register background sync when app becomes hidden
document.addEventListener('visibilitychange', () => {
  if (document.hidden && 'serviceWorker' in navigator) {
    navigator.serviceWorker.ready.then(registration => {
      return registration.sync.register('download-episodes');
    });
  }
});

// Service worker handles sync event
self.addEventListener('sync', event => {
  if (event.tag === 'download-episodes') {
    event.waitUntil(syncEpisodes());
  }
});
#+END_SRC

**realistic expectations**:
- iOS Safari: very limited background processing
- Android Chrome: decent background sync support
- Desktop: mostly works
- battery/data saver modes: disabled by OS

**fallback strategy**: rely primarily on foreground sync + push notifications, treat background sync as nice-to-have enhancement

**push notification sync workflow**:

**server-side trigger**:
1. server detects new episodes during RSS refresh
2. check which users are subscribed to that feed
3. send push notification with episode metadata payload
4. notification wakes up service worker on client

**service worker notification handler**:
#+BEGIN_SRC typescript
self.addEventListener('push', event => {
  const data = event.data?.json();

  if (data.type === 'new-episodes') {
    event.waitUntil(
      // Start background download of new episodes
      downloadNewEpisodes(data.episodes)
        .then(() => {
          // Show notification to user
          return self.registration.showNotification('New episodes available', {
            body: `${data.episodes.length} new episodes downloaded`,
            icon: '/icon-192.png',
            badge: '/badge-72.png',
            tag: 'new-episodes',
            data: { episodeIds: data.episodes.map(e => e.id) }
          });
        })
    );
  }
});

// Handle notification click
self.addEventListener('notificationclick', event => {
  event.notification.close();

  // Open app to specific episode or feed
  event.waitUntil(
    clients.openWindow(`/episodes/${event.notification.data.episodeIds[0]}`)
  );
});
#+END_SRC
1091-1092-**server push logic**:
1093-- batch notifications (don't spam for every episode)
1094-- respect user notification preferences from FilterView settings
1095-- include episode metadata in payload to avoid round-trip
1096-- throttle notifications (max 1 per feed per hour?)

**user flow**:
1. new episode published → server pushes notification
2. service worker downloads episode in background
3. user sees "New episodes downloaded" notification
4. tap notification → opens app to new episode, ready to play offline

*benefits*:
- true background downloading without user interaction
- works even when app is closed
- respects per-feed notification settings

**push payload size constraints**:
- **limit**: ~4KB (4,096 bytes) across most services
- **practical limit**: ~3KB to account for service overhead
- **implications for episode metadata**:

#+BEGIN_SRC json
{
  "type": "new-episodes",
  "episodes": [
    {
      "id": "ep123",
      "channelId": "ch456",
      "title": "Episode Title",
      "url": "https://...",
      "duration": 3600,
      "size": 89432112
    }
  ]
}
#+END_SRC

**payload optimization strategies**:
- minimal episode metadata in push (id, url, basic info)
- batch multiple episodes in single notification
- full episode details fetched after service worker wakes up
- URL shortening for long episode URLs
- compress JSON payload if needed

**alternative for large payloads**:
- push notification contains only "new episodes available" signal
- service worker makes API call to get full episode list
- trade-off: requires network round-trip but unlimited data

**logical clock sync optimization**:

much simpler approach using sync revisions:

#+BEGIN_SRC json
{
  "type": "sync-available",
  "fromRevision": 12345,
  "toRevision": 12389,
  "changeCount": 8
}
#+END_SRC

**service worker sync flow**:
1. push notification wakes service worker with revision range
2. service worker fetches ~/sync?from=12345&to=12389~
3. server returns only changes in that range (episodes, feed updates, etc)
4. service worker applies changes to local dexie store
5. service worker queues background downloads for new episodes
6. updates local revision to 12389

**benefits of revision-based approach**:
- tiny push payload (just revision numbers)
- server can efficiently return only changes in range
- automatic deduplication (revision already applied = skip)
- works for any sync data (episodes, feed metadata, user settings)
- handles offline gaps gracefully (fetch missing revision ranges)

**sync API response**:
#+BEGIN_SRC typescript
interface SyncResponse {
  fromRevision: number;
  toRevision: number;
  changes: Array<{
    type: 'episode' | 'channel' | 'subscription';
    operation: 'create' | 'update' | 'delete';
    data: any;
    revision: number;
  }>;
}
#+END_SRC
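
Applying a ~SyncResponse~ can be a pure fold that skips already-applied revisions and upserts/deletes rows; this sketch uses an in-memory map as a stand-in for the dexie tables, and assumes the server returns changes ordered by revision:

#+BEGIN_SRC typescript
interface Change {
  type: 'episode' | 'channel' | 'subscription';
  operation: 'create' | 'update' | 'delete';
  data: { id: string; [k: string]: unknown };
  revision: number;
}

// Sketch: fold a batch of changes into a local store, skipping anything
// at or below the locally applied revision (the dedup property above).
// Returns the new local revision cursor.
function applyChanges(
  store: Map<string, object>,
  localRevision: number,
  changes: Change[]
): number {
  for (const c of changes) {
    if (c.revision <= localRevision) continue; // already applied, skip
    const key = `${c.type}:${c.data.id}`;
    if (c.operation === 'delete') store.delete(key);
    else store.set(key, c.data); // create and update are both upserts
    localRevision = c.revision;
  }
  return localRevision;
}
#+END_SRC

The same shape handles offline gaps: fetch ~/sync?from=localRevision~ and fold whatever comes back.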

**integration with episode downloads**:
- service worker processes sync changes
- identifies new episodes that match user's auto-download filters
- queues those for background cache fetching
- much more efficient than sending episode metadata in push payload

**service worker processing time constraints**:

**hard limits**:
- **30 seconds idle timeout**: service worker terminates after 30s of inactivity
- **5 minutes event processing**: single event/request must complete within 5 minutes
- **30 seconds fetch timeout**: individual network requests timeout after 30s
- **notification requirement**: push events MUST display notification before promise settles

**practical implications**:
- sync API call (~/sync?from=X&to=Y~) must complete within 30s
- large episode downloads must be queued, not started immediately in push handler
- use ~event.waitUntil()~ to keep service worker alive during processing
- break large operations into smaller chunks

**recommended push event flow**:
#+BEGIN_SRC typescript
self.addEventListener('push', event => {
  const data = event.data?.json();

  event.waitUntil(
    // Must complete within 5 minutes total
    handlePushSync(data)
      .then(() => {
        // Required: show notification before promise settles
        return self.registration.showNotification('Episodes synced');
      })
  );
});

async function handlePushSync(data) {
  // 1. Quick sync API call (< 30s)
  const response = await fetch(`/sync?from=${data.fromRevision}&to=${data.toRevision}`);
  const changes = await response.json();

  // 2. Apply changes to dexie store (fast, local)
  await applyChangesToStore(changes);

  // 3. Queue episode downloads for later (don't start here)
  await queueEpisodeDownloads(changes.newEpisodes);

  // Total time: < 5 minutes, preferably < 30s
}
#+END_SRC

*download strategy*: use push event for sync + queuing, separate background tasks for actual downloads

*background fetch API for large downloads*:

*progressive enhancement approach*:
#+BEGIN_SRC typescript
async function queueEpisodeDownloads(episodes) {
  for (const episode of episodes) {
    if ('serviceWorker' in navigator && 'BackgroundFetchManager' in self) {
      // Chrome/Edge: use Background Fetch API for true background downloading
      const registration = await navigator.serviceWorker.ready;
      await registration.backgroundFetch.fetch(
        `episode-${episode.id}`,
        episode.url,
        {
          icons: [{ src: '/icon-256.png', sizes: '256x256', type: 'image/png' }],
          title: `Downloading: ${episode.title}`,
          downloadTotal: episode.fileSize
        }
      );
    } else {
      // Fallback: queue for reactive download (download while streaming)
      await queueReactiveDownload(episode);
    }
  }
}

// Handle background fetch completion (in the service worker)
self.addEventListener('backgroundfetchsuccess', event => {
  if (event.registration.id.startsWith('episode-')) {
    event.waitUntil(handleEpisodeDownloadComplete(event));
  }
});
#+END_SRC
1267-#+END_SRC
1268-1269-*browser support reality*:
1270-- *Chrome/Edge*: Background Fetch API supported
1271-- *Firefox/Safari*: not supported, fallback to reactive caching
1272-- *mobile*: varies by platform and browser
1273-1274-*benefits when available*:
1275-- true background downloading (survives app close, browser close)
1276-- built-in download progress UI
1277-- automatic retry on network failure
1278-- no service worker time limits during download
1279-1280-*graceful degradation*:
1281-- detect support, use when available
1282-- fallback to reactive caching (download while streaming)
1283-- user gets best experience possible on their platform

*** research todos :ai:claude:

high-level unanswered questions from architecture brainstorming:

**** sync and data management
***** TODO dexie sync capabilities vs rxdb for multi-device sync implementation
***** TODO webrtc p2p sync implementation patterns and reliability
***** TODO conflict resolution strategies for device-specific data in distributed sync
***** TODO content-addressed deduplication algorithms for rss/podcast content
**** client-side storage and caching
***** TODO opfs storage limits and cleanup strategies for client-side caching
***** TODO practical background fetch api limits and edge cases for podcast downloads
**** automation and intelligence
***** TODO llm-based regex generation for episode title parsing automation
***** TODO push notification subscription management and realm authentication
**** platform and browser capabilities
***** TODO browser audio api capabilities for podcast-specific features (speed, silence skip)
***** TODO progressive web app installation and platform-specific behaviors

# Local Variables:
# org-hierarchical-todo-statistics: nil
# org-checkbox-hierarchical-statistics: nil
# End:
#+PROPERTY: COOKIE_DATA recursive
#+STARTUP: overview

* concepts [0/4]

the skypod architecture is broken into pieces:

** p2p with realms

In order to sync device and playback state, we need to have devices communicate with each
other, which requires a signaling server, for peer discovery at the very least.

*** realm

A realm is a collection of known identities, verified with signed JWTs, where those
identities can communicate outside of the server's purview.

A realm is not publicly routable; to gain access, one must have the realm id and an
invitation to the realm from an already existing member; new realms are created on
demand with a random realm id supplied by the client.

**** realm server

A realm at the signalling server is only a collection of known identity public keys, and
the currently connected sockets.

It acts mostly as a smart router: socket connection, authentication, and directed and
realm-wide broadcast.

***** TODO webrtc requirements

- [ ] what needs to happen for webrtc? I think SDP messages will just flow like normal
  - check out whatever simple-peer does

**** realm client

In order to keep private data off of the server, the realm client takes on the additional
task of maintaining a shared encryption key for the realm, which can be used to encrypt
data going over broadcasts.

***** TODO key exchange protocol

*** identity

Identity in the realm system is just an id and a keypair.

The private key stays local to the installed device, and is used to send signed tokens
over the wire, either to the realm server to manage authentication, or over other channels
to other members in the realm.

The /public/ key is stored by all members of the realm, and by the server, in order to
perform signature validation (which is also authentication).

**** browser private key storage

There is no good way to store private keys in a browser, but there are less bad ways.

- private keys are ~CryptoKey~ objects, created with ~{ extractable: false }~
- native WebCrypto ~CryptoKey~ objects are structured-cloneable, which means they can be saved to indexeddb

At the end of the day, this is a podcast app.
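
A minimal sketch of the idea: generate an ECDSA keypair with ~extractable: false~, sign a payload, and verify with the public key. The algorithm choice is an assumption, not a settled decision; in the app the ~CryptoKey~ objects would be structured-cloned into IndexedDB, which is elided here:

#+BEGIN_SRC typescript
// Sketch: a non-extractable signing keypair via WebCrypto.
async function makeIdentity(): Promise<CryptoKeyPair> {
  return crypto.subtle.generateKey(
    { name: 'ECDSA', namedCurve: 'P-256' },
    false, // extractable: false -- the private key can sign but never export
    ['sign', 'verify']
  );
}

async function signMessage(keys: CryptoKeyPair, message: string): Promise<ArrayBuffer> {
  return crypto.subtle.sign(
    { name: 'ECDSA', hash: 'SHA-256' },
    keys.privateKey,
    new TextEncoder().encode(message)
  );
}

async function verifyMessage(keys: CryptoKeyPair, message: string, sig: ArrayBuffer): Promise<boolean> {
  return crypto.subtle.verify(
    { name: 'ECDSA', hash: 'SHA-256' },
    keys.publicKey,
    new TextEncoder().encode(message),
    sig
  );
}
#+END_SRC

The non-extractable private key is the whole point: even script injection can use the key, but cannot steal its bytes.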

***** TODO are there other ways to do this?

Could we use webauthn, or some way to use a yubikey to sign something?

*** sync

Once a device has authenticated to the realm server over a websocket connection, it can
send and broadcast any message it likes to the other online members of the realm, via the
websocket announcement channel.

This sets it up to be used as a signaling server for WebRTC, allowing realm devices to
communicate p2p without a dependency on the realm server outside of authentication and
signaling.
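
The "smart router" behaviour reduces to picking recipients per message; a sketch of the routing decision (the message shape and field names are assumptions):

#+BEGIN_SRC typescript
// Sketch: given the realm's currently connected socket ids, pick the
// recipients for a directed message or a realm-wide broadcast.
interface RealmMessage {
  from: string;   // sender's identity id
  to?: string;    // present for directed messages (e.g. SDP offers)
  payload: unknown;
}

function recipients(connected: Set<string>, msg: RealmMessage): string[] {
  if (msg.to !== undefined) {
    // directed: deliver only if that peer is currently connected
    return connected.has(msg.to) ? [msg.to] : [];
  }
  // broadcast: everyone in the realm except the sender
  return [...connected].filter(id => id !== msg.from);
}
#+END_SRC

WebRTC signaling then needs nothing extra: SDP offers/answers and ICE candidates are just directed messages.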

** feed proxy server

Due to ~CORS~, we'll need to help clients fetch the contents of feeds by running a caching
proxy server for various HTTP requests.

- help bypass ~CORS~ restrictions, so clients can access the content of the response
- cache feeds, especially with regards to running transformations
- perform transformations on responses:
  - text feeds: reader mode, detect reading time
  - podcast feeds: extract episode metadata, audio analysis for silence skips, etc
  - all feeds: extract title tags, etc.
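
The ~CORS~ part of the proxy is mostly header surgery; a sketch of the response-header mapping (the function name and default cache window are assumptions about the approach, not a spec):

#+BEGIN_SRC typescript
// Sketch: build response headers for the caching proxy. Drop the
// upstream's CORS and hop-by-hop headers, then set our own policy,
// since the proxy is now the CORS boundary.
function proxyHeaders(upstream: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(upstream)) {
    const n = name.toLowerCase();
    if (n.startsWith('access-control-') || n === 'connection') continue;
    out[n] = value;
  }
  out['access-control-allow-origin'] = '*';
  // default cache window if upstream didn't set one (assumption: 5 min)
  out['cache-control'] = out['cache-control'] ?? 'public, max-age=300';
  return out;
}
#+END_SRC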

*** TODO open question: is the client able to not use the proxy?

I'm not sure yet if we want the PWA to be able to pull feeds directly when the server
isn't present. It would be much easier to keep it around, but…

** feed management

With a solid p2p WebRTC connection, we can use something like ~dexie~ or ~rxdb~ to get a
synced document database that we use to manage feeds.

* flow

- user goes to https://skypod.accidental.cc
  - pwa runs, prompts to do full install for storage and offline
  - pwa is installed, sets up caches

- first run
  - identity is generated (id + keypair per device)
  - do you want to sync to an existing install?
    - if yes, go to invitee flow
    - otherwise, new realm is generated and registered
      - pubkey and id get stored in the realm, to make future sync easier

- subsequent runs
  - identity already exists, so we just go about our day

- invitee flow
  - already generated identity
  - qr code pops
    - scanned by inviter, see inviter flow
    - done button after
  - camera pops, scan inviter's QR codes
  - sends invitation+registration token to server
  - added to the realm
  - go to subsequent runs
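
The invitation+registration token check in the invitee flow can be sketched as a pure validity test; the invite field names follow the double-signed invite shape noted in the sync notes, and signature verification is elided:

#+BEGIN_SRC typescript
// Sketch: check an invite's validity window and that its authorizer
// is a verified realm key. Signature checks happen before this step.
interface Invite {
  realm: string;
  nonce: string;
  not_before: number; // epoch seconds
  not_after: number;
  authorizer: string; // public key id of the inviting member
}

function inviteValid(
  invite: Invite,
  realm: string,
  verifiedKeys: Set<string>,
  now: number
): boolean {
  return (
    invite.realm === realm &&
    verifiedKeys.has(invite.authorizer) &&
    now >= invite.not_before &&
    now <= invite.not_after
  );
}
#+END_SRC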
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

#+BEGIN_SRC diff
readme.org (+5 -5)
@@ -29,12 +29,12 @@
 - Common
   - ES2024 Javascript, running on in modern browsers or [[https://nodejs.org][Node v24]]
   - [[https://github.com/panva/jose][~jose~]] for cross-platform webcrypto and JWT management
-  - [[https://zod.dev/][Zod v4]] describes schema and builds transforamtion pipelines
+  - [[https://zod.dev/][Zod v4]] describes schema and builds transformation pipelines
 - Backend
   - [[https://expressjs.com/][Express]] and Node's ~stdlib~ for HTTP and WebSocket servers
 - Frontend
   - [[https://vite.dev/][Vite]] does FE builds
-  - [[https://react.dev][React]] + [[https://zustand.docs.pmnd.rs][Zustand]] for UI
+  - [[https://preactjs.com/][Preact]] + [[https://zustand.docs.pmnd.rs][Zustand]] for UI
 - Build & DX
   - [[https://github.com/google/wireit][Wireit]] does script dependencies and services
   - [[https://jsdoc.app/][JSDoc]], along with [[https://www.typescriptlang.org/docs/handbook/jsdoc-supported-types.html][Typescript's JSDoc support]] does typechecking
@@ -46,14 +46,14 @@
 - per-realm SQLite databases with node's native sqlite support
 - docker compose for deployment with self-hosted realm storage
 
-See [[./devlog.org]] for design and architecture thoughts.
+See [[./readme-devlog.org]] for design and architecture thoughts.
 
 ** Scripts
 
 All scripts can have ~--watch~ passed as an argument to have ~wireit~ rerun when inputs change.
 This is not useful for everything.
 
-- ~npm run dev~ :: alias for ~npm run start:dev --watch~
+- ~npm run dev~ :: alias for ~npm run start:dev~
 - ~npm run lint~ :: runs ~eslint~
 - ~npm run types~ :: runs ~tsc~ (no emitting, just typechecking)
 - ~npm run docs~ :: runs ~jsdoc~ to generate docs in ~./docs~
@@ -73,4 +73,4 @@
 This program is free software: you can redistribute it and/or modify it under the terms of
 the **Affero General Public License verson 3 or later** (AGPLv3+).
 
-Please see [[./license.txt]] for a copy of the full license.
+Please see [[./readme-license.txt]] for a copy of the full license.
#+END_SRC

#+BEGIN_SRC diff
src/client/page-app.spec.jsx (+2 -2)
@@ -17,8 +17,8 @@
 
   // Check the JSX structure without full rendering
   expect(component.type).toBe(Fragment)
-  expect(component.props.children).toHaveProperty('type', 'h1')
-  expect(component.props.children.props.children).toBe('whatever')
+  expect(component.props.children[0]).toHaveProperty('type', 'h1')
+  expect(component.props.children[0].props.children).toBe('whatever')
  })
 })
 
#+END_SRC