* uncategorized notes

** sync
- each client keeps the full data set
- dexie sync and observable let us stream change sets
- we can publish the "latest" to all peers
- on first pull, if not the first client, we can request a dump out of band

*** rss feed data
- do we want to back up feed data?
  - conceptually, this should be refetchable
  - but feeds go away, and some will only show recent stories
  - so yes, we'll need this
  - but server side, we can dedupe
  - content-addressed server-side cache?

- server side does RSS pulling
  - can feeds be marked private, such that they won't be pulled through the proxy?
    - but then we require everything to be fetchable via cors
  - client-configured proxy settings?

*** peer connection
- on startup, check for the current realm-id and key pair
- if not present, ask to log in or start new
  - if login, run through the [[* pairing]] process
  - if start new, run through the [[* registration]] process
- use the keypair to authenticate to the server
  - the response includes a list of active peers to connect to
- clients negotiate sync from there
- an identity is a keypair and a realm

- a realm is a uuid
  - a realm on the server is the socket connection for peer discovery
    - it keeps a list of verified public keys
    - and manages the /current/ ~public-key -> peer ids~ mapping
  - a realm on the client side is the first piece of info required for sync
    - when connecting to the signalling server, you present a realm and a signed public key
    - the server accepts/rejects based on the signature and the current verified keys

- a new keypair can create a realm

- a new keypair can double-sign an invitation
  - invite = ~{ realm:, nonce:, not_before:, not_after:, authorizer: }~, signed with a verified key
  - exchanging an invite = ~{ invite: }~, signed with my key
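
a quick sketch of those two payloads in typescript (field names come from the bullets above; the shapes and the ~isInviteCurrent~ helper are my assumptions, not an existing API):

```typescript
// Hypothetical shapes for the two signed payloads described above.
interface Invite {
  realm: string;       // realm uuid being joined
  nonce: string;       // prevents replay of the same invite
  not_before: number;  // validity window (epoch ms)
  not_after: number;
  authorizer: string;  // fingerprint of the verified key that signed this
}

interface InviteExchange {
  invite: Invite;      // the invite as received
  // the whole exchange is then signed with the *new* device's key
}

// Trivial validity check on the time window (assumption: epoch ms).
function isInviteCurrent(invite: Invite, now: number = Date.now()): boolean {
  return now >= invite.not_before && now <= invite.not_after;
}
```

the server would additionally verify both signatures before honoring the exchange.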

- on startup
  - start stand-alone (no syncing required; usually the case on first run)
    - generate a keypair
    - want server backup?
      - sign a "setup" message with the new keypair and send it to the server
      - the server responds with a new realm that this keypair is already verified for
    - move along
  - exchange an invite to sync with other devices
    - generate a keypair
    - sign the exchange message with the invite and send it to the server
    - the server verifies the invite
      - adds the new public key to the peer list and publishes it downstream
    - move along

***** standalone
in this mode, there is no syncing. this is the most likely first-time run option.

- generate a keypair on startup, so we have a stable fingerprint in the future
- done

***** pairing
in this mode, there is syncing to a named realm, but not necessarily any server resources consumed.
we don't need an email, since the server is just doing signalling and peer management.

- generate an invite from an existing verified peer
  - ~{ realm:, not_before:, not_after:, inviter: peer.public_key }~
  - sign that invitation with the existing verified peer's key

- standalone -> paired
  - get the invitation somehow (QR code?)
  - sign an invite exchange with the standalone's public key
  - send it to the server
  - the server verifies the invite
    - adds the new public key to the peer list and publishes it downstream

***** server backup
in this mode, there is syncing to a named realm by email.

the goal of server backup mode is that we can go from an email to a fully working client with the latest data, without needing any existing clients around to participate in the sync.

- generate a keypair on startup
- sign a registration message and send it to the server
  - the server sends a verification email
    - if the email/realm already exists, this is authorization
    - if not, it's email validation
  - the server starts a realm and associates the public key
  - the server acts as a peer for the realm, and stores private data

- since dexie is publishing change sets, we should be able to just store deltas
- but we'll need to store _all_ deltas, unless we're materializing on the server side too
  - should we use an indexeddb shim so we can import/export from the server for a clean start?
  - how much materialization does the server need?
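
a minimal sketch of what a server-side delta log could look like, assuming an append-only table keyed by a per-realm revision counter (all names here are hypothetical, not dexie-syncable's actual protocol):

```typescript
// Hypothetical append-only delta log per realm; compaction would
// periodically materialize state and drop older entries.
interface DeltaRecord {
  realmId: string;
  revision: number;   // monotonically increasing per realm
  sourcePeer: string; // public-key fingerprint of the sender
  changes: unknown;   // opaque change set from the client
  receivedAt: number; // epoch ms
}

// Clients ask for everything after their last known revision.
function deltasSince(log: DeltaRecord[], revision: number): DeltaRecord[] {
  return log.filter((d) => d.revision > revision);
}
```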

** summarized architecture design (may 28-29) :ai:claude:

key decisions and system design:

*** sync model
- device-specific records for playback state/queues to avoid conflicts
- content-addressed server cache with deduplication
- dual-JWT invitation flow for secure realm joining

*** data structures
- tag-based filtering system instead of rigid hierarchies
- regex patterns for episode title parsing and organization
- service worker caching with background download support

*** core schemas
**** client (dexie)
- Channel/ChannelEntry for RSS feeds and episodes
- PlayRecord/QueueItem scoped by deviceId
- FilterView for virtual feed organization

**** server (drizzle)
- ContentStore for deduplicated content by hash
- Realm/PeerConnection for sync authorization
- HttpCache with health tracking and TTL

*** push sync strategy
- revision-based sync (just send revision ranges in push notifications)
- background fetch API for large downloads where supported
- graceful degradation to reactive caching

*** research todos :ai:claude:
**** sync and data management
***** DONE identity and signature management
***** TODO dexie sync capabilities vs rxdb for multi-device sync implementation
***** TODO webrtc p2p sync implementation patterns and reliability
***** TODO conflict resolution strategies for device-specific data in distributed sync
***** TODO content-addressed deduplication algorithms for rss/podcast content
**** client-side storage and caching
***** TODO opfs storage limits and cleanup strategies for client-side caching
***** TODO practical background fetch api limits and edge cases for podcast downloads
**** automation and intelligence
***** TODO llm-based regex generation for episode title parsing automation
***** TODO push notification subscription management and realm authentication
**** platform and browser capabilities
***** TODO browser audio api capabilities for podcast-specific features (speed, silence skip)
***** TODO progressive web app installation and platform-specific behaviors

** <2025-05-28 Wed>
getting everything set up

the biggest open question I have is what sort of privacy/encryption guarantee I need. I
want the server to be able to do things like cache and store feed data long-term.

Is "if you want full privacy, self-host" valid?

*** possibilities

- fully PWA
  - CON: cors, which would require a proxy anyway
  - CON: audio analysis, llm-based stuff for categorization, etc. won't work
  - PRO: private as all get out
  - can still do WebRTC p2p sync for resiliency
  - can still do server backups, if the sync stream is encrypted, but no compaction would be available
  - could do _explicit_ server backups as dump files

- self-hostable
  - PRO: can do bunches of private stuff on the server, because if you don't want me to see it, do it elsewhere
  - CON: hard for folks to use

*** sync conflict resolution design discussion :ai:claude:

discussed the sync architecture and dexie conflict handling:

*dexie syncable limitations*:
- logical clocks handle causally-related changes well
- basic timestamp-based conflict resolution for concurrent updates
- last-writer-wins for same-field conflicts
- no sophisticated CRDT or vector clock support

*solutions for podcast-specific conflicts*:

- play records: device-specific approach
  - store separate ~play_records~ per ~device_id~
  - each record: ~{ episode_id, device_id, position, completed, timestamp }~
  - UI handles conflict resolution with "continue from X device?" prompts
  - avoids arbitrary timestamp wins, gives users control
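
a sketch of how that UI-side resolution could pick a candidate (the record shape follows the bullets above; the function name is an assumption):

```typescript
interface PlayRecordLite {
  episode_id: string;
  device_id: string;
  position: number;  // seconds into the episode
  completed: boolean;
  timestamp: number; // epoch ms of last update
}

// For one episode, find the most recently updated record on a *different*
// device; the UI can then prompt "continue from X device?" instead of
// silently letting an arbitrary timestamp win.
function remoteResumeCandidate(
  records: PlayRecordLite[],
  episodeId: string,
  localDeviceId: string
): PlayRecordLite | undefined {
  return records
    .filter((r) => r.episode_id === episodeId && r.device_id !== localDeviceId)
    .sort((a, b) => b.timestamp - a.timestamp)[0];
}
```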

- subscription trees
  - store ~parent_path~ as a single string field ("/Tech/Programming")
  - simpler than managing folder membership tables
  - conflicts still possible, but contained to a single field
  - could store move operations as events for richer resolution

*other sync considerations*:
- settings/preferences: distinguish device-local vs global
- bulk operations: "mark all played" can create duplicate operations
- metadata updates: server RSS updates vs local renames
- temporal ordering: recently played lists, queue reordering
- storage limits: cleanup operations conflicting across devices
- feed state: refresh timestamps, error states

*approach*: prefer an "events not state" pattern and device-specific records where semantic conflicts are likely

*** data model brainstorm :ai:claude:

core entities designed with sync in mind:

**** ~Feed~ :: RSS/podcast subscription
- ~parent_path~ field for folder structure (e.g. ~/Tech/Programming~)
- ~is_private~ flag to skip the server proxy
- ~refresh_interval~ for custom update frequencies

**** ~Episode~ :: individual podcast episodes
- standard RSS metadata (guid, title, description, media url)
- duration and file info for playback

**** ~PlayRecord~ :: device-specific playback state
- separate record per ~device_id~ to avoid timestamp conflicts
- position, completed status, playback speed
- UI can prompt "continue from X device?" for resolution

**** ~QueueItem~ :: device-specific episode queue
- ordered list with a position field
- ~device_id~ scoped to avoid queue conflicts

**** ~Subscription~ :: feed membership settings
- can be global or device-specific
- auto-download preferences per device

**** ~Settings~ :: split global vs device-local
- theme, default speed = global
- download path, audio device = device-local

**** Event tables for complex operations:
- ~FeedMoveEvent~ for folder reorganization
- ~BulkMarkPlayedEvent~ for "mark all read" operations
- better conflict resolution than direct state updates

**** sync considerations
- device identity established on first run
- dexie syncable handles basic timestamp conflicts
- prefer device-scoped records for semantic conflicts
- event-driven pattern for bulk operations

*** schema evolution from previous iteration :ai:claude:

reviewed the existing schema from tmp/feed.ts - a well-designed foundation:

**** keep from original
- Channel/ChannelEntry naming and structure
- ~refreshHP~ adaptive refresh system (much better than simple intervals)
- rich podcast metadata (people, tags, enclosure, podcast object)
- HTTP caching with etag/status tracking
- epoch millisecond timestamps
- ~hashId()~ approach for entry IDs

**** add for multi-device sync
- ~PlayState~ table (device-scoped position/completion)
- ~Subscription~ table (with ~parentPath~ for folders, device-scoped settings)
- ~QueueItem~ table (device-scoped episode queues)
- ~Device~ table (identity management)

**** migration considerations
- existing Channel/ChannelEntry can be preserved
- new tables are additive
- the ~fetchAndUpsert~ method works well with the server proxy architecture
- dexie sync vs rxdb - need to evaluate change tracking capabilities

*** content-addressed caching for offline resilience :ai:claude:

designed a caching system for when upstream feeds fail/disappear, building on the existing cache-schema.ts:

**** server-side schema evolution (drizzle sqlite):
- keep the existing ~httpCacheTable~ design (health tracking, http headers, ttl)
- add a ~contentHash~ field pointing to deduplicated content
- new ~contentStoreTable~: deduplicated blobs by sha256 hash
- new ~contentHistoryTable~: url -> contentHash timeline with an isLatest flag
- reference counting for garbage collection

**** client-side OPFS storage
- ~/cache/content/{contentHash}.xml~ for raw feeds
- ~/cache/media/{contentHash}.mp3~ for podcast episodes
- ~LocalCacheEntry~ metadata tracks expiration and offline-only flags
- maintains the last N versions per feed for historical access

**** fetch strategy & fallback
1. check the local OPFS cache first (fastest)
2. try the server proxy ~/api/feed?url={feedUrl}~ (deduplicated)
3. the server checks ~contentHistory~, serves the latest or fetches upstream
4. the server returns ~{contentHash, content, cached: boolean}~
5. the client stores the result with the content hash as the filename
6. emergency mode: serve stale content when upstream fails

- preserves the existing health tracking and HTTP caching logic
- popular feeds are cached once on the server; many clients benefit
- bandwidth savings via content hash comparison
- historical feed state preservation (feeds disappear!)
- true offline operation after the initial sync
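
the fallback order above, sketched with each source injected as an async function so the OPFS cache, the server proxy, and the stale-content path can be swapped in (the ~FeedResult~ shape and function names are assumptions):

```typescript
interface FeedResult {
  contentHash: string;
  content: string;
  cached: boolean;
}

type Source = (url: string) => Promise<FeedResult | null>;

// 1. local OPFS cache, 2-5. server proxy (which handles contentHistory
// and upstream fetching), 6. emergency mode: stale local content.
async function fetchFeed(
  url: string,
  localCache: Source,
  serverProxy: Source,
  staleCache: Source
): Promise<FeedResult | null> {
  return (
    (await localCache(url)) ??
    (await serverProxy(url).catch(() => null)) ?? // proxy down ≠ fatal
    (await staleCache(url))
  );
}
```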

** <2025-05-29 Thu> :ai:claude:
e2e encryption and invitation flow design

worked through the crypto and invitation architecture. key decisions:

*** keypair strategy
- use jwk format for interoperability (the server stores public keys)
- ed25519 for signing, separate x25519 for encryption if needed
- zustand lazy initialization pattern: ~ensureKeypair()~ on first use
- store the private jwk in persisted zustand state

*** invitation flow: dual-jwt approach
solved the chicken-and-egg problem of sharing encryption keys securely.

**** qr code contains two signed jwts:
1. invitation token: ~{iss: inviter_fingerprint, sub: invitation_id, purpose: "realm_invite"}~
2. encryption key token: ~{iss: inviter_fingerprint, ephemeral_private: base64_key, purpose: "ephemeral_key"}~

**** exchange process:
1. invitee posts jwt1 + their public keys to ~/invitations~
2. server verifies the jwt1 signature against realm members
3. if valid: adds the invitee to the realm, returns ~{realm_id, realm_members, encrypted_realm_key}~
4. invitee verifies the jwt2 signature against the returned realm members
5. invitee extracts the ephemeral private key and decrypts the realm encryption key

**** security properties:
- the server never has decryption capability (it's missing the ephemeral private key)
- both jwts must be signed by a verified realm member
- if the first exchange fails, the second jwt is cryptographically worthless
- atomic operation: the identity is added only if the invitation is valid
- built-in expiration and tamper detection via the jwt standard

**** considered alternatives:
- raw ephemeral keys in the qr: simpler, but no authenticity
- ecdh key agreement: chicken-and-egg problem with public key exchange
- server escrow: good, but missing an authentication layer
- password-based: requires secure out-of-band sharing

the dual-jwt approach provides proper authenticated invitations while maintaining e2e encryption properties.

**** refined dual-jwt with ephemeral signing
simplified the approach by using the ephemeral key for the second jwt's signature:

**setup**:
1. inviter generates an ephemeral keypair
2. encrypts the realm key with the ephemeral public key (so only the holder of the ephemeral private key can decrypt it)
3. posts to the server: ~{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}~

**qr code contains**:
#+BEGIN_SRC json
// JWT 1: signed with inviter's realm signing key
{
  "realm_id": "uuid",
  "invitation_id": "uuid",
  "iss": "inviter_fingerprint"
}

// JWT 2: signed with ephemeral private key
{
  "ephemeral_private": "base64_key",
  "invitation_id": "uuid"
}
#+END_SRC

**exchange flow**:
1. submit jwt1 → server verifies against realm members → returns ~{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}~
2. verify the jwt2 signature using ~ephemeral_public~ from the server response
3. extract ~ephemeral_private~ from jwt2, decrypt the realm key

**benefits over previous version**:
- no premature key disclosure (invitee keys are shared via normal webrtc peering)
- self-contained verification (the ephemeral public key verifies jwt2)
- cleaner separation of realm auth vs encryption key distribution
- simpler flow (no need to return the realm member list)

**crypto verification principle**: digital signatures work as sign-with-private/verify-with-public, while encryption works as encrypt-with-public/decrypt-with-private. jwt2 verification uses signature verification, not decryption.

**invitation flow diagram**:
#+BEGIN_SRC mermaid
sequenceDiagram
    participant I as Inviter
    participant S as Server
    participant E as Invitee

    Note over I: Generate ephemeral keypair
    I->>I: ephemeral_private, ephemeral_public

    Note over I: Encrypt realm key
    I->>I: encrypted_realm_key = encrypt(realm_key, ephemeral_public)

    I->>S: POST /invitations<br/>{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}
    S-->>I: OK

    Note over I: Create JWTs for QR code
    I->>I: jwt1 = sign({realm_id, invitation_id}, inviter_private)
    I->>I: jwt2 = sign({ephemeral_private, invitation_id}, ephemeral_private)

    Note over I,E: QR code contains [jwt1, jwt2]

    E->>S: POST /invitations/exchange<br/>{jwt1}
    Note over S: Verify jwt1 signature<br/>against realm members
    S-->>E: {invitation_id, realm_id, ephemeral_public, encrypted_realm_key}

    Note over E: Verify jwt2 signature<br/>using ephemeral_public
    E->>E: verify_signature(jwt2, ephemeral_public)

    Note over E: Extract key and decrypt
    E->>E: ephemeral_private = decode(jwt2)
    E->>E: realm_key = decrypt(encrypted_realm_key, ephemeral_private)

    Note over E: Now member of realm!
#+END_SRC

**** jwk keypair generation and validation :ai:claude:

discussed jwk vs raw crypto.subtle for keypair storage. since public keys need server storage for realm authorization, jwk is better for interoperability.

**keypair generation**:
#+BEGIN_SRC typescript
const keypair = await crypto.subtle.generateKey(
  { name: "Ed25519" },
  true,
  ["sign", "verify"]
);

const publicJWK = await crypto.subtle.exportKey("jwk", keypair.publicKey);
const privateJWK = await crypto.subtle.exportKey("jwk", keypair.privateKey);

// JWK format:
// {
//   "kty": "OKP",
//   "crv": "Ed25519",
//   "x": "base64url-encoded-public-key",
//   "d": "base64url-encoded-private-key" // only in the private JWK
// }
#+END_SRC

**client validation**:
#+BEGIN_SRC typescript
function isValidEd25519PublicJWK(jwk: any): boolean {
  return (
    typeof jwk === 'object' &&
    jwk !== null &&
    jwk.kty === 'OKP' &&
    jwk.crv === 'Ed25519' &&
    typeof jwk.x === 'string' &&
    jwk.x.length === 43 &&          // base64url Ed25519 public key length
    !jwk.d &&                       // a public key shouldn't have a private component
    (!jwk.use || jwk.use === 'sig') // parenthesized: || binds looser than &&
  );
}

async function validatePublicKey(publicJWK: JsonWebKey): Promise<CryptoKey | null> {
  try {
    if (!isValidEd25519PublicJWK(publicJWK)) return null;

    const key = await crypto.subtle.importKey(
      'jwk',
      publicJWK,
      { name: 'Ed25519' },
      false,
      ['verify']
    );

    return key;
  } catch {
    return null;
  }
}
#+END_SRC

**server validation (node.js)**:
#+BEGIN_SRC typescript
import { webcrypto } from 'node:crypto';

async function validateClientPublicKey(publicJWK: JsonWebKey): Promise<boolean> {
  try {
    if (!isValidEd25519PublicJWK(publicJWK)) return false;

    await webcrypto.subtle.importKey(
      'jwk',
      publicJWK,
      { name: 'Ed25519' },
      false,
      ['verify']
    );

    return true;
  } catch {
    return false;
  }
}
#+END_SRC

**authentication flow**:
#+BEGIN_SRC typescript
// client signs a message
const authMessage = {
  realm: 'uuid-here',
  timestamp: Date.now(),
  action: 'join'
};

const signature = await crypto.subtle.sign(
  'Ed25519',
  privateKey,
  new TextEncoder().encode(JSON.stringify(authMessage))
);

// server verifies
async function verifyAuth(req: AuthRequest): Promise<boolean> {
  const publicKey = await webcrypto.subtle.importKey(
    'jwk',
    req.publicKey,
    { name: 'Ed25519' },
    false,
    ['verify']
  );

  const messageBytes = new TextEncoder().encode(JSON.stringify(req.message));
  const signatureBytes = new Uint8Array(req.signature);

  return await webcrypto.subtle.verify(
    'Ed25519',
    publicKey,
    signatureBytes,
    messageBytes
  );
}
#+END_SRC

**** proposed schemas :ai:claude:

***** client-side schema (dexie)

#+BEGIN_SRC typescript
// Core RSS/Podcast data (from your existing design)
interface Channel {
  id: string;
  feedUrl: string;
  htmlUrl?: string;
  imageUrl?: string;
  title?: string;
  description?: string;
  language?: string;
  people?: Record<string, string>;
  tags?: string[];

  // Refresh management
  refreshHP: number;
  nextRefreshAt?: number;
  lastRefreshAt?: number;
  lastRefreshStatus?: string;
  lastRefreshHttpStatus?: number;
  lastRefreshHttpEtag?: string;

  // Cache info
  contentHash?: string;
  lastFetchedAt?: number;
}

interface ChannelEntry {
  id: string;
  channelId: string;
  guid: string;
  title: string;
  linkUrl?: string;
  imageUrl?: string;
  snippet?: string;
  content?: string;

  enclosure?: {
    url: string;
    type?: string;
    length?: number;
  };

  podcast?: {
    explicit?: boolean;
    duration?: string;
    seasonNum?: number;
    episodeNum?: number;
    transcriptUrl?: string;
  };

  publishedAt?: number;
  fetchedAt?: number;
}

// Device-specific sync tables
interface PlayRecord {
  id: string;
  entryId: string;
  deviceId: string;
  position: number;
  duration?: number;
  completed: boolean;
  speed: number;
  updatedAt: number;
}

interface Subscription {
  id: string;
  channelId: string;
  deviceId?: string;
  parentPath: string; // "/Tech/Programming"
  autoDownload: boolean;
  downloadLimit?: number;
  isActive: boolean;
  createdAt: number;
  updatedAt: number;
}

interface QueueItem {
  id: string;
  entryId: string;
  deviceId: string;
  position: number;
  addedAt: number;
}

interface Device {
  id: string;
  name: string;
  platform: string;
  lastSeen: number;
}

// Local cache metadata
interface LocalCache {
  id: string;
  url: string;
  contentHash: string;
  filePath: string; // OPFS path
  cachedAt: number;
  expiresAt?: number;
  size: number;
  isOfflineOnly: boolean;
}

// Dexie schema
const db = new Dexie('SkypodDB');
db.version(1).stores({
  channels: '&id, feedUrl, contentHash',
  channelEntries: '&id, channelId, publishedAt',
  playRecords: '&id, [entryId+deviceId], deviceId, updatedAt',
  subscriptions: '&id, channelId, deviceId, parentPath',
  queueItems: '&id, entryId, deviceId, position',
  devices: '&id, lastSeen',
  localCache: '&id, url, contentHash, expiresAt'
});
#+END_SRC

***** server-side schema

#+BEGIN_SRC typescript
// Content-addressed cache
interface ContentStore {
  contentHash: string; // Primary key
  content: Buffer;     // Raw feed content
  contentType: string;
  contentLength: number;
  firstSeenAt: number;
  referenceCount: number;
}

interface ContentHistory {
  id: string;
  url: string;
  contentHash: string;
  fetchedAt: number;
  isLatest: boolean;
}

// HTTP cache with health tracking (from your existing design)
interface HttpCache {
  key: string; // URL hash, primary key
  url: string;

  status: 'alive' | 'dead';
  lastFetchedAt: number;
  lastFetchError?: string;
  lastFetchErrorStreak: number;

  lastHttpStatus: number;
  lastHttpEtag?: string;
  lastHttpHeaders: Record<string, string>;
  expiresAt: number;
  expirationTtl: number;

  contentHash: string; // Points to ContentStore
}

// Sync/auth tables
interface Realm {
  id: string; // UUID
  createdAt: number;
  verifiedKeys: string[]; // Public key list
}

interface PeerConnection {
  id: string;
  realmId: string;
  publicKey: string;
  lastSeen: number;
  isOnline: boolean;
}

// Media cache for podcast episodes
interface MediaCache {
  contentHash: string; // Primary key
  originalUrl: string;
  mimeType: string;
  fileSize: number;
  content: Buffer;
  cachedAt: number;
  accessCount: number;
}
#+END_SRC

**** episode title parsing for sub-feed groupings :ai:claude:

*problem*: some podcast feeds contain multiple shows, and need hierarchical organization within a feed

*example*: "Apocalypse Players" podcast
- episode title: "A Term of Art 6 - Winston's Hollow"
- desired grouping: "Apocalypse Players > A Term of Art > 6 - Winston's Hollow"
- UI shows sub-shows within the main feed

***** approaches considered

1. *manual regex patterns* (short-term solution)
   - user provides a regex with capture groups = tags
   - reliable, immediate, user-controlled
   - requires manual setup per feed

2. *LLM-generated regex* (automation goal)
   - analyze the last 100 episode titles
   - generate a regex pattern automatically
   - good balance of automation + reliability

3. *NER model training* (experimental)
   - train a spacy model for episode title parsing
   - current prototype: 150 labelled examples, limited success
   - needs more training data to be viable

***** data model implications

- add a regex pattern field to Channel/Feed
- store extracted groupings as hierarchical tags on ~ChannelEntry~
- maybe add a grouping/series field to episodes

***** plan

*preference*: start with manual regex, evolve toward LLM automation

*implementation design*:
- if no title pattern: episodes are direct children of the feed
- title pattern = regex with named capture groups + path template

*example configuration*:
- regex: ~^(?<series>[^0-9]+)\s*(?<episode>\d+)\s*-\s*(?<title>.+)$~
- path template: ~{series} > Episode {episode} - {title}~
- result: "A Term of Art 6 - Winston's Hollow" → "A Term of Art > Episode 6 - Winston's Hollow"

*schema additions*:
#+BEGIN_SRC typescript
interface Channel {
  // ... existing fields
  titlePatterns?: Array<{
    name: string;         // "Main Episodes", "Bonus Content", etc.
    regex: string;        // named capture groups
    pathTemplate: string; // interpolation template
    priority: number;     // order to try patterns (lower = first)
    isActive: boolean;    // can disable without deleting
  }>;
  fallbackPath?: string;  // template for unmatched episodes
}

interface ChannelEntry {
  // ... existing fields
  parsedPath?: string;                   // computed from titlePattern
  parsedGroups?: Record<string, string>; // captured groups
  matchedPatternName?: string;           // which pattern was used
}
#+END_SRC

*pattern matching logic*:
1. try patterns in priority order (lower number = higher priority)
2. first matching pattern wins
3. if no patterns match, use the fallbackPath template (e.g., "Misc > {title}")
4. if no fallbackPath, the episode stays a direct child of the feed

*example multi-pattern setup*:
- Pattern 1: "Main Episodes" - ~^(?<series>[^0-9]+)\s*(?<episode>\d+)~ → ~{series} > Episode {episode}~
- Pattern 2: "Bonus Content" - ~^Bonus:\s*(?<title>.+)~ → ~Bonus > {title}~
- Fallback: ~Misc > {title}~
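
the matching logic above as a sketch (the pattern shape mirrors the schema additions; ~parseTitle~ and ~interpolate~ are hypothetical helpers, and trimming captured groups is an assumption to absorb greedy-capture whitespace):

```typescript
interface TitlePattern {
  name: string;         // "Main Episodes", "Bonus Content", ...
  regex: string;        // regex source with named capture groups
  pathTemplate: string; // e.g. "{series} > Episode {episode} - {title}"
  priority: number;     // lower = tried first
  isActive: boolean;    // patterns can be disabled without deleting
}

// Fill a "{name}" template from captured groups, trimming stray whitespace.
function interpolate(template: string, groups: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_, k: string) => (groups[k] ?? "").trim());
}

// Try active patterns in priority order; first match wins; else fall back.
function parseTitle(
  title: string,
  patterns: TitlePattern[],
  fallbackPath?: string
): { path?: string; matchedPatternName?: string; groups?: Record<string, string> } {
  const ordered = patterns.filter((p) => p.isActive).sort((a, b) => a.priority - b.priority);
  for (const p of ordered) {
    const m = title.match(new RegExp(p.regex));
    if (m?.groups) {
      return {
        path: interpolate(p.pathTemplate, m.groups),
        matchedPatternName: p.name,
        groups: { ...m.groups },
      };
    }
  }
  // no pattern matched: use the fallback template, else stay a direct child
  return fallbackPath ? { path: interpolate(fallbackPath, { title }) } : {};
}
```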

**** scoped tags and filter-based UI evolution :ai:claude:

*generalization*: move from rigid hierarchies to a tag-based filtering system

*tag scoping*:
- feed-level tags: "Tech", "Gaming", "D&D"
- episode-level tags: from regex captures like "series:CriticalRole", "campaign:2", "type:main"
- user tags: manual additions like "favorites", "todo"

*UI as tag filtering*:
- default view: all episodes grouped by feed
- filter by ~series:CriticalRole~ → shows only CR episodes across all feeds
- filter by ~type:bonus~ → shows bonus content from all podcasts
- combine filters: ~series:CriticalRole AND type:main~ → main CR episodes only

*benefits*:
- no rigid hierarchy - users create their own views
- regex patterns become automated episode taggers
- the same filtering system works for search, organization, queues
- tags are syncable metadata, views are client-side
815815+816816+*schema evolution*:
817817+#+BEGIN_SRC typescript
818818+interface Tag {
819819+ scope: 'feed' | 'episode' | 'user';
820820+ key: string; // "series", "type", "campaign"
821821+ value: string; // "CriticalRole", "bonus", "2"
822822+}
823823+824824+interface ChannelEntry {
825825+ // ... existing
826826+ tags: Tag[]; // includes regex-generated + manual
827827+}
828828+829829+interface FilterView {
830830+ id: string;
831831+ name: string;
832832+ folderPath: string; // "/Channels/Critical Role"
833833+ filters: Array<{
834834+ key: string;
835835+ value: string;
836836+ operator: 'equals' | 'contains' | 'not';
837837+ }>;
838838+ isDefault: boolean;
839839+ createdAt: number;
840840+}
841841+#+END_SRC
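evaluating a view's filters against an entry's tags could look like this — AND semantics across filters, as in the combined-filter example above (shapes simplified from the schema):

#+BEGIN_SRC typescript
interface TagLike { key: string; value: string }
interface FilterLike {
  key: string;
  value: string;
  operator: "equals" | "contains" | "not";
}

// An entry matches a view when every filter is satisfied; "not" means
// no tag with that key/value may be present.
function matchesView(tags: TagLike[], filters: FilterLike[]): boolean {
  return filters.every((f) => {
    const hit = tags.some(
      (t) =>
        t.key === f.key &&
        (f.operator === "contains"
          ? t.value.includes(f.value)
          : t.value === f.value)
    );
    return f.operator === "not" ? !hit : hit;
  });
}
#+END_SRC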
842842+843843+**** default UI construction and feed merging :ai:claude:
844844+845845+*auto-generated views on subscribe*:
846846+- subscribe to "Critical Role" → creates ~/Channels/Critical Role~ folder
847847+- default filter view: ~feed:CriticalRole~ (shows all episodes from that feed)
848848+- user can customize, split into sub-views, or delete
849849+850850+*smart view suggestions*:
851851+- after regex patterns generate tags, suggest splitting views
852852+- "I noticed episodes with ~series:Campaign2~ and ~series:Campaign3~ - create separate views?"
853853+- "Create view for ~type:bonus~ episodes?"
854854+855855+*view management UX*:
856856+- right-click feed → "Split by series", "Split by type"
857857+- drag episodes between views to create manual filters
858858+- views can be nested: ~/Channels/Critical Role/Campaign 2/Main Episodes~
859859+860860+*feed merging for multi-source shows*:
861861+problem: patreon feed + main show feed for same podcast
862862+863863+#+BEGIN_EXAMPLE
864864+/Channels/
865865+ Critical Role/
866866+ All Episodes # merged view: feed:CriticalRole OR feed:CriticalRolePatreon
867867+ Main Feed # filter: feed:CriticalRole
868868+ Patreon Feed # filter: feed:CriticalRolePatreon
869869+#+END_EXAMPLE
870870+871871+*deduplication strategy*:
872872+- episodes matched by ~guid~ or similar content hash
873873+- duplicate episodes get ~source:main,patreon~ tags
874874+- UI shows single episode with source indicators
875875+- user can choose preferred source for playback
876876+- play state syncs across all sources of same episode
877877+878878+*feed relationship schema*:
879879+#+BEGIN_SRC typescript
880880+interface FeedGroup {
881881+ id: string;
882882+ name: string; // "Critical Role"
883883+ feedIds: string[]; // [mainFeedId, patreonFeedId]
884884+ mergeStrategy: 'guid' | 'title' | 'contentHash';
885885+ defaultView: FilterView;
886886+}
887887+888888+interface ChannelEntry {
889889+ // ... existing
890890+ duplicateOf?: string; // points to canonical episode ID
891891+ sources: string[]; // feed IDs where this episode appears
892892+}
893893+#+END_SRC
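the dedupe pass might look roughly like this — group entries across a FeedGroup's feeds by guid, keep the first as canonical, and record the other sources (a sketch; field names assumed):

#+BEGIN_SRC typescript
interface EntryLike {
  id: string;
  guid: string;
  feedId: string;
  duplicateOf?: string;
  sources: string[];
}

// First-seen entry per guid becomes canonical; later ones point at it
// via duplicateOf, and the canonical entry accumulates every source
// feed id so the UI can show source indicators.
function dedupeByGuid(entries: EntryLike[]): EntryLike[] {
  const canonical = new Map<string, EntryLike>();
  return entries.map((e) => {
    const canon = canonical.get(e.guid);
    if (!canon) {
      const first = { ...e, sources: [e.feedId] };
      canonical.set(e.guid, first);
      return first;
    }
    if (!canon.sources.includes(e.feedId)) canon.sources.push(e.feedId);
    return { ...e, duplicateOf: canon.id, sources: [e.feedId] };
  });
}
#+END_SRC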
894894+895895+**per-view settings and state**:
896896+each filter view acts like a virtual feed with its own:
897897+- unread counts (episodes matching filter that haven't been played)
898898+- notification settings (notify for new episodes in this view)
899899+- muted state (hide notifications, mark as read automatically)
900900+- auto-download preferences (download episodes that match this filter)
901901+- play queue integration (add new episodes to queue)
902902+903903+**use cases**:
904904+- mute "Bonus Content" view but keep notifications for main episodes
905905+- auto-download only "Campaign 2" episodes, skip everything else
906906+- separate unread counts: "5 unread in Main Episodes, 2 in Bonus"
907907+- queue only certain series automatically
908908+909909+**schema additions**:
910910+#+BEGIN_SRC typescript
911911+interface FilterView {
912912+ // ... existing fields
913913+ settings: {
914914+ notificationsEnabled: boolean;
915915+ isMuted: boolean;
916916+ autoDownload: boolean;
917917+ autoQueue: boolean;
918918+ downloadLimit?: number; // max episodes to keep
919919+ };
920920+ state: {
921921+ unreadCount: number;
922922+ lastViewedAt?: number;
923923+ isCollapsed: boolean; // in sidebar
924924+ };
925925+}
926926+#+END_SRC
927927+928928+*inheritance behavior*:
929929+- new filter views inherit settings from parent feed/group
930930+- user can override per-view
931931+- "mute all Critical Role" vs "mute only bonus episodes"
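the merge itself can be a shallow spread — parent feed/group defaults, overridden field-by-field by whatever the user set on the view (~ViewSettings~ mirrors the settings object above):

#+BEGIN_SRC typescript
interface ViewSettings {
  notificationsEnabled: boolean;
  isMuted: boolean;
  autoDownload: boolean;
  autoQueue: boolean;
  downloadLimit?: number;
}

// Effective settings = parent defaults + per-view overrides.
function effectiveSettings(
  parent: ViewSettings,
  overrides: Partial<ViewSettings>
): ViewSettings {
  return { ...parent, ...overrides };
}
#+END_SRC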
932932+933933+**** client-side episode caching strategy :ai:claude:
934934+935935+*architecture*: service worker-based transparent caching
936936+937937+*flow*:
938938+1. audio player requests ~/audio?url={episodeUrl}~
939939+2. service worker intercepts request
940940+3. if present in cache (with Range header support):
941941+ - serve from cache
942942+4. else:
943943+ - let request continue to server (immediate playback)
944944+ - simultaneously start background fetch of full audio file
945945+ - when complete, broadcast "episode-cached" event
946946+   - audio player catches the event and reloads its source → now plays from cache
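the "Range header support" in step 3 is the fiddly part of serving from cache; a sketch of parsing a single-range header so a cached copy can honor seeks (real code would also handle multi-range requests and more edge cases):

#+BEGIN_SRC typescript
// Parse a single-range Range header against a known file size.
// Returns [start, end] inclusive, or null if unusable.
function parseRange(header: string, size: number): [number, number] | null {
  const m = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (!m || (m[1] === "" && m[2] === "")) return null;
  if (m[1] === "") {
    const n = Number(m[2]); // suffix form: last n bytes
    return n > 0 ? [Math.max(0, size - n), size - 1] : null;
  }
  const start = Number(m[1]);
  const end = m[2] === "" ? size - 1 : Math.min(Number(m[2]), size - 1);
  return start < size && start <= end ? [start, end] : null;
}
#+END_SRC

the service worker would slice the cached response body to this range and reply with 206 + ~Content-Range~.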
947947+948948+**benefits**:
949949+- no playback interruption (streaming starts immediately)
950950+- seamless transition to cached version
951951+- Range header support for seeking/scrubbing
952952+- transparent to audio player implementation
953953+954954+*implementation considerations*:
955955+- cache storage limits and cleanup policies
956956+- partial download resumption if interrupted
957957+- cache invalidation when episode URLs change
958958+- offline playback support
959959+- progress tracking for background downloads
960960+961961+**schema additions**:
962962+#+BEGIN_SRC typescript
963963+interface CachedEpisode {
964964+ episodeId: string;
965965+ originalUrl: string;
966966+ cacheKey: string; // for cache API
967967+ fileSize: number;
968968+ cachedAt: number;
969969+ lastAccessedAt: number;
970970+ downloadProgress?: number; // 0-100 for in-progress downloads
971971+}
972972+#+END_SRC
973973+974974+**service worker events**:
975975+- ~episode-cache-started~ - background download began
976976+- ~episode-cache-progress~ - download progress update
977977+- ~episode-cache-complete~ - ready to switch to cached version
978978+- ~episode-cache-error~ - download failed, stay with streaming
979979+980980+**background sync for proactive downloads**:
981981+982982+**browser support reality**:
983983+- Background Sync API: Chromium only (Chrome/Edge); not supported in Safari or Firefox
984984+- Periodic Background Sync: very limited (Chrome only, requires PWA install)
985985+- Push notifications: good support, but requires user permission
986986+987987+**hybrid approach**:
988988+1. **foreground sync** (reliable): when app is open, check for new episodes
989989+2. **background sync** (opportunistic): register sync event when app closes
990990+3. **push notifications** (fallback): server pushes "new episodes available"
991991+4. **manual sync** (always works): pull-to-refresh, settings toggle
992992+993993+**implementation strategy**:
994994+#+BEGIN_SRC typescript
995995+// Register background sync when app becomes hidden
996996+document.addEventListener('visibilitychange', () => {
997997+ if (document.hidden && 'serviceWorker' in navigator) {
998998+ navigator.serviceWorker.ready.then(registration => {
999999+ return registration.sync.register('download-episodes');
10001000+ });
10011001+ }
10021002+});
10031003+10041004+// Service worker handles sync event
10051005+self.addEventListener('sync', event => {
10061006+ if (event.tag === 'download-episodes') {
10071007+ event.waitUntil(syncEpisodes());
10081008+ }
10091009+});
10101010+#+END_SRC
10111011+10121012+**realistic expectations**:
10131013+- iOS Safari: very limited background processing
10141014+- Android Chrome: decent background sync support
10151015+- Desktop: mostly works
10161016+- battery/data saver modes: disabled by OS
10171017+10181018+**fallback strategy**: rely primarily on foreground sync + push notifications, treat background sync as nice-to-have enhancement
10191019+10201020+**push notification sync workflow**:
10211021+10221022+**server-side trigger**:
10231023+1. server detects new episodes during RSS refresh
10241024+2. check which users are subscribed to that feed
10251025+3. send push notification with episode metadata payload
10261026+4. notification wakes up service worker on client
10271027+10281028+**service worker notification handler**:
10291029+#+BEGIN_SRC typescript
10301030+self.addEventListener('push', event => {
10311031+ const data = event.data?.json();
10321032+10331033+ if (data.type === 'new-episodes') {
10341034+ event.waitUntil(
10351035+ // Start background download of new episodes
10361036+ downloadNewEpisodes(data.episodes)
10371037+ .then(() => {
10381038+ // Show notification to user
10391039+ return self.registration.showNotification('New episodes available', {
10401040+ body: `${data.episodes.length} new episodes downloaded`,
10411041+ icon: '/icon-192.png',
10421042+ badge: '/badge-72.png',
10431043+ tag: 'new-episodes',
10441044+ data: { episodeIds: data.episodes.map(e => e.id) }
10451045+ });
10461046+ })
10471047+ );
10481048+ }
10491049+});
10501050+10511051+// Handle notification click
10521052+self.addEventListener('notificationclick', event => {
10531053+ event.notification.close();
10541054+10551055+ // Open app to specific episode or feed
10561056+ event.waitUntil(
10571057+ clients.openWindow(`/episodes/${event.notification.data.episodeIds[0]}`)
10581058+ );
10591059+});
10601060+#+END_SRC
10611061+10621062+**server push logic**:
10631063+- batch notifications (don't spam for every episode)
10641064+- respect user notification preferences from FilterView settings
10651065+- include episode metadata in payload to avoid round-trip
10661066+- throttle notifications (max 1 per feed per hour?)
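the per-feed throttle could be a last-sent map — in-memory here for illustration; a real server would persist it and batch the held-back episodes into the next allowed push:

#+BEGIN_SRC typescript
const HOUR_MS = 60 * 60 * 1000;
const lastPushAt = new Map<string, number>();

// Allow at most one push per feed per hour; record the send time when
// a push is allowed through.
function shouldPush(feedId: string, now: number): boolean {
  const prev = lastPushAt.get(feedId);
  if (prev !== undefined && now - prev < HOUR_MS) return false;
  lastPushAt.set(feedId, now);
  return true;
}
#+END_SRC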
10671067+10681068+**user flow**:
10691069+1. new episode published → server pushes notification
10701070+2. service worker downloads episode in background
10711071+3. user sees "New episodes downloaded" notification
10721072+4. tap notification → opens app to new episode, ready to play offline
10731073+10741074+*benefits*:
10751075+- true background downloading without user interaction
10761076+- works even when app is closed
10771077+- respects per-feed notification settings
10781078+10791079+**push payload size constraints**:
10801080+- **limit**: ~4KB (4,096 bytes) across most services
10811081+- **practical limit**: ~3KB to account for service overhead
10821082+- **implications for episode metadata**:
10831083+10841084+#+BEGIN_SRC json
10851085+{
10861086+ "type": "new-episodes",
10871087+ "episodes": [
10881088+ {
10891089+ "id": "ep123",
10901090+ "channelId": "ch456",
10911091+ "title": "Episode Title",
10921092+ "url": "https://...",
10931093+ "duration": 3600,
10941094+ "size": 89432112
10951095+ }
10961096+ ]
10971097+}
10981098+#+END_SRC
10991099+11001100+**payload optimization strategies**:
11011101+- minimal episode metadata in push (id, url, basic info)
11021102+- batch multiple episodes in single notification
11031103+- full episode details fetched after service worker wakes up
11041104+- URL shortening for long episode URLs
11051105+- compress JSON payload if needed
11061106+11071107+**alternative for large payloads**:
11081108+- push notification contains only "new episodes available" signal
11091109+- service worker makes API call to get full episode list
11101110+- trade-off: requires network round-trip but unlimited data
11111111+11121112+**logical clock sync optimization**:
11131113+11141114+much simpler approach using sync revisions:
11151115+11161116+#+BEGIN_SRC json
11171117+{
11181118+ "type": "sync-available",
11191119+ "fromRevision": 12345,
11201120+ "toRevision": 12389,
11211121+ "changeCount": 8
11221122+}
11231123+#+END_SRC
11241124+11251125+**service worker sync flow**:
11261126+1. push notification wakes service worker with revision range
11271127+2. service worker fetches ~/sync?from=12345&to=12389~
11281128+3. server returns only changes in that range (episodes, feed updates, etc)
11291129+4. service worker applies changes to local dexie store
11301130+5. service worker queues background downloads for new episodes
11311131+6. updates local revision to 12389
11321132+11331133+**benefits of revision-based approach**:
11341134+- tiny push payload (just revision numbers)
11351135+- server can efficiently return only changes in range
11361136+- automatic deduplication (revision already applied = skip)
11371137+- works for any sync data (episodes, feed metadata, user settings)
11381138+- handles offline gaps gracefully (fetch missing revision ranges)
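gap handling falls out naturally if the client always fetches from the revision it actually holds rather than the push's ~fromRevision~ — a sketch:

#+BEGIN_SRC typescript
interface FetchRange { from: number; to: number }

// Given our local revision and a push announcing changes up to `to`:
// nothing to do if caught up, otherwise fetch everything from our own
// revision (which also covers any offline gap before the push's range).
function rangeToFetch(localRevision: number, to: number): FetchRange | null {
  return localRevision >= to ? null : { from: localRevision, to };
}
#+END_SRC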
11391139+11401140+**sync API response**:
11411141+#+BEGIN_SRC typescript
11421142+interface SyncResponse {
11431143+ fromRevision: number;
11441144+ toRevision: number;
11451145+ changes: Array<{
11461146+ type: 'episode' | 'channel' | 'subscription';
11471147+ operation: 'create' | 'update' | 'delete';
11481148+ data: any;
11491149+ revision: number;
11501150+ }>;
11511151+}
11521152+#+END_SRC
11531153+11541154+**integration with episode downloads**:
11551155+- service worker processes sync changes
11561156+- identifies new episodes that match user's auto-download filters
11571157+- queues those for background cache fetching
11581158+- much more efficient than sending episode metadata in push payload
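selecting downloads from a sync batch might look like this — tag and filter shapes are simplified assumptions, not the real schema:

#+BEGIN_SRC typescript
interface SyncChange {
  type: "episode" | "channel" | "subscription";
  operation: "create" | "update" | "delete";
  data: { id: string; tags: Record<string, string> };
}
interface AutoDownloadView {
  autoDownload: boolean;
  filters: { key: string; value: string }[];
}

// Pick ids of newly created episodes whose tags satisfy at least one
// auto-download view (all of that view's filters must match).
function episodesToDownload(
  changes: SyncChange[],
  views: AutoDownloadView[]
): string[] {
  const active = views.filter((v) => v.autoDownload);
  return changes
    .filter((c) => c.type === "episode" && c.operation === "create")
    .filter((c) =>
      active.some((v) => v.filters.every((f) => c.data.tags[f.key] === f.value))
    )
    .map((c) => c.data.id);
}
#+END_SRC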
11591159+11601160+**service worker processing time constraints**:
11611161+11621162+**hard limits**:
11631163+- **30 seconds idle timeout**: service worker terminates after 30s of inactivity
11641164+- **5 minutes event processing**: single event/request must complete within 5 minutes
11651165+- **30 seconds fetch timeout**: individual network requests time out after 30s
11661166+- **notification requirement**: push events MUST display notification before promise settles
11671167+11681168+**practical implications**:
11691169+- sync API call (~/sync?from=X&to=Y~) must complete within 30s
11701170+- large episode downloads must be queued, not started immediately in push handler
11711171+- use ~event.waitUntil()~ to keep service worker alive during processing
11721172+- break large operations into smaller chunks
11731173+11741174+**recommended push event flow**:
11751175+#+BEGIN_SRC typescript
11761176+self.addEventListener('push', event => {
11771177+ const data = event.data?.json();
11781178+11791179+ event.waitUntil(
11801180+ // Must complete within 5 minutes total
11811181+ handlePushSync(data)
11821182+ .then(() => {
11831183+ // Required: show notification before promise settles
11841184+ return self.registration.showNotification('Episodes synced');
11851185+ })
11861186+ );
11871187+});
11881188+11891189+async function handlePushSync(data) {
11901190+ // 1. Quick sync API call (< 30s)
11911191+  const changes = await (await fetch(`/sync?from=${data.fromRevision}&to=${data.toRevision}`)).json();
11921192+11931193+ // 2. Apply changes to dexie store (fast, local)
11941194+ await applyChangesToStore(changes);
11951195+11961196+ // 3. Queue episode downloads for later (don't start here)
11971197+ await queueEpisodeDownloads(changes.newEpisodes);
11981198+11991199+ // Total time: < 5 minutes, preferably < 30s
12001200+}
12011201+#+END_SRC
12021202+12031203+*download strategy*: use push event for sync + queuing, separate background tasks for actual downloads
12041204+12051205+*background fetch API for large downloads*:
12061206+12071207+*progressive enhancement approach*:
12081208+#+BEGIN_SRC typescript
12091209+async function queueEpisodeDownloads(episodes) {
12101210+ for (const episode of episodes) {
12111211+    if ('serviceWorker' in navigator && 'BackgroundFetchManager' in window) {
12121212+ // Chrome/Edge: use Background Fetch API for true background downloading
12131213+ await navigator.serviceWorker.ready.then(registration => {
12141214+ return registration.backgroundFetch.fetch(
12151215+          `episode-${episode.id}`,
12161216+ episode.url,
12171217+ {
12181218+ icons: [{ src: '/icon-256.png', sizes: '256x256', type: 'image/png' }],
12191219+            title: `Downloading: ${episode.title}`,
12201220+ downloadTotal: episode.fileSize
12211221+ }
12221222+ );
12231223+ });
12241224+ } else {
12251225+ // Fallback: queue for reactive download (download while streaming)
12261226+ await queueReactiveDownload(episode);
12271227+ }
12281228+ }
12291229+}
12301230+12311231+// Handle background fetch completion
12321232+self.addEventListener('backgroundfetchsuccess', event => {
12331233+  if (event.registration.id.startsWith('episode-')) {
12341234+ event.waitUntil(handleEpisodeDownloadComplete(event));
12351235+ }
12361236+});
12371237+#+END_SRC
12381238+12391239+*browser support reality*:
12401240+- *Chrome/Edge*: Background Fetch API supported
12411241+- *Firefox/Safari*: not supported, fallback to reactive caching
12421242+- *mobile*: varies by platform and browser
12431243+12441244+*benefits when available*:
12451245+- true background downloading (survives app close, browser close)
12461246+- built-in download progress UI
12471247+- automatic retry on network failure
12481248+- no service worker time limits during download
12491249+12501250+*graceful degradation*:
12511251+- detect support, use when available
12521252+- fallback to reactive caching (download while streaming)
12531253+- user gets best experience possible on their platform
12541254+12551255+*** research todos :ai:claude:
12561256+12571257+high-level unanswered questions from architecture brainstorming:
12581258+12591259+**** sync and data management
12601260+***** TODO dexie sync capabilities vs rxdb for multi-device sync implementation
12611261+***** TODO webrtc p2p sync implementation patterns and reliability
12621262+***** TODO conflict resolution strategies for device-specific data in distributed sync
12631263+***** TODO content-addressed deduplication algorithms for rss/podcast content
12641264+**** client-side storage and caching
12651265+***** TODO opfs storage limits and cleanup strategies for client-side caching
12661266+***** TODO practical background fetch api limits and edge cases for podcast downloads
12671267+**** automation and intelligence
12681268+***** TODO llm-based regex generation for episode title parsing automation
12691269+***** TODO push notification subscription management and realm authentication
12701270+**** platform and browser capabilities
12711271+***** TODO browser audio api capabilities for podcast-specific features (speed, silence skip)
12721272+***** TODO progressive web app installation and platform-specific behaviors
12731273+12741274+* webtorrent brainstorming 6/16 :ai:claude:
12751275+12761276+** WebTorrent + Event Log CRDT Architecture
12771277+12781278+*** Core Concept Split
12791279+We identified two fundamentally different types of data that need different sync strategies:
12801280+12811281+**** 1. Dynamic Metadata (Event Log CRDT)
12821282+- **Data**: Play state, scroll position, settings, subscriptions
12831283+- **Characteristics**: Frequently changing, small, device-specific
12841284+- **Solution**: Event log with Hybrid Logical Clocks (HLC)
12851285+- **Sync**: Merkle tree efficient diff + P2P exchange via realm
12861286+12871287+**** 2. Static Content (WebTorrent)
12881288+- **Data**: RSS feeds, podcast episodes (audio files)
12891289+- **Characteristics**: Immutable, large, content-addressable
12901290+- **Solution**: WebTorrent with infohash references
12911291+- **Storage**: IndexedDB chunk store (idb-chunk-store npm package)
12921292+12931293+*** Event Log CRDT Design
12941294+12951295+**** Hybrid Logical Clock (HLC)
12961296+Based on James Long's crdt-example-app implementation:
12971297+#+BEGIN_SRC typescript
12981298+interface HLC {
12991299+ millis: number; // physical time
13001300+ counter: number; // logical counter (0-65535)
13011301+ node: string; // device identity ID
13021302+}
13031303+13041304+interface SyncEvent {
13051305+ timestamp: HLC;
13061306+ type: 'subscribe' | 'unsubscribe' | 'markPlayed' | 'updatePosition' | ...
13071307+ payload: any;
13081308+}
13091309+#+END_SRC
13101310+13111311+**Benefits**:
13121312+- Causality preserved even with clock drift
13131313+- Compact representation (vs full vector clocks)
13141314+- Total ordering via (millis, counter, node) comparison
13151315+- No merge conflicts - just union of events
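the two clock-advance rules (local event vs. incoming event) are worth spelling out; this follows the standard HLC recipe and is a sketch, not lifted verbatim from crdt-example-app:

#+BEGIN_SRC typescript
interface HLC {
  millis: number;  // physical time
  counter: number; // logical counter
  node: string;    // device identity id
}

// Local event: take the wall clock if it moved forward, otherwise
// increment the counter so the timestamp still advances.
function hlcTick(local: HLC, wallClockMs: number): HLC {
  if (wallClockMs > local.millis)
    return { millis: wallClockMs, counter: 0, node: local.node };
  return { ...local, counter: local.counter + 1 };
}

// Incoming event: max of all three clocks, advancing the counter so
// the result orders strictly after both inputs.
function hlcReceive(local: HLC, remote: HLC, wallClockMs: number): HLC {
  const millis = Math.max(wallClockMs, local.millis, remote.millis);
  let counter = 0;
  if (millis === local.millis && millis === remote.millis)
    counter = Math.max(local.counter, remote.counter) + 1;
  else if (millis === local.millis) counter = local.counter + 1;
  else if (millis === remote.millis) counter = remote.counter + 1;
  return { millis, counter, node: local.node };
}
#+END_SRC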
13161316+13171317+**** Merkle Tree Sync
13181318+Efficient sync using merkle trees over time ranges:
13191319+13201320+#+BEGIN_SRC typescript
13211321+interface RangeMerkleNode {
13221322+ startTime: HLC;
13231323+ endTime: HLC;
13241324+ hash: string;
13251325+ eventCount: number;
13261326+}
13271327+#+END_SRC
13281328+13291329+**Sync Protocol**:
13301330+1. Exchange merkle roots
13311331+2. If different, drill down to find divergent ranges
13321332+3. Exchange only missing events
13331333+4. Apply in HLC order
13341334+13351335+**Key insight**: No merge conflicts because events are immutable and ordered by HLC
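a toy version of the drill-down — flat hourly buckets instead of a real tree, and FNV-1a standing in for a proper hash — just to show the shape of "compare hashes, return divergent ranges":

#+BEGIN_SRC typescript
// FNV-1a: a cheap stand-in hash for this sketch only.
function fnv1a(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Compare per-bucket hashes of sorted event ids; return the buckets
// whose contents differ — only those ranges need event exchange.
function divergentBuckets(
  mine: Map<number, string[]>,   // hour bucket -> sorted event ids
  theirs: Map<number, string[]>
): number[] {
  const buckets = new Set([...mine.keys(), ...theirs.keys()]);
  const out: number[] = [];
  for (const b of buckets) {
    const a = fnv1a((mine.get(b) ?? []).join(","));
    const c = fnv1a((theirs.get(b) ?? []).join(","));
    if (a !== c) out.push(b);
  }
  return out.sort((x, y) => x - y);
}
#+END_SRC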
13361336+13371337+**** Progressive Compaction
13381338+Use idle time to compact old events:
13391339+- Recent (< 5 min): Individual events for active sync
13401340+- Hourly chunks: After 5 minutes
13411341+- Daily chunks: After 24 hours
13421342+- Monthly chunks: After 30 days
13431343+13441344+Benefits:
13451345+- Fast recent sync
13461346+- Efficient storage of history
13471347+- Old chunks can move to OPFS as blobs
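the tiering is a straight age lookup (thresholds from the list above):

#+BEGIN_SRC typescript
type CompactionTier = "individual" | "hourly" | "daily" | "monthly";

const MINUTE_MS = 60_000;
const HOUR_MS = 60 * MINUTE_MS;
const DAY_MS = 24 * HOUR_MS;

// Which granularity an event belongs in, by age.
function compactionTier(ageMs: number): CompactionTier {
  if (ageMs < 5 * MINUTE_MS) return "individual";
  if (ageMs < DAY_MS) return "hourly";
  if (ageMs < 30 * DAY_MS) return "daily";
  return "monthly";
}
#+END_SRC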
13481348+13491349+*** WebTorrent Integration
13501350+13511351+**** Content Flow
13521352+1. **CORS-friendly feeds**:
13531353+ - Browser fetches directly
13541354+ - Creates torrent with original URL as webseed
13551355+ - Broadcasts infohash to realm
13561356+13571357+2. **CORS-blocked feeds**:
13581358+ - Server fetches and hashes
13591359+ - Returns infohash (server doesn't store content)
13601360+ - Client uses WebTorrent with original URL as webseed
13611361+13621362+**** Realm as Private Tracker
13631363+- Realm members announce infohashes they have
13641364+- No need for DHT or public trackers
13651365+- Existing WebRTC signaling used for peer discovery
13661366+- Private swarm for each realm
13671367+13681368+**** Storage via Chunk Store
13691369+Use `idb-chunk-store` (or similar) for persistence:
13701370+- WebTorrent handles chunking/verification
13711371+- IndexedDB provides persistence across sessions
13721372+- Abstract-chunk-store interface allows swapping implementations
13731373+13741374+*** Bootstrap & History Sharing
13751375+13761376+**** History Snapshots as Torrents
13771377+Serialize event history into content-addressed chunks:
13781378+13791379+#+BEGIN_SRC typescript
13801380+interface HistorySnapshot {
13811381+ period: "2024-05";
13821382+ events: SyncEvent[];
13831383+ merkleRoot: string;
13841384+ deviceStates: Record<string, DeviceState>;
13851385+}
13861386+13871387+// Share via WebTorrent
13881388+const blob = await serializeSnapshot(events);
13891389+const infohash = await createTorrent(blob);
13901390+realm.broadcast({ type: "historySnapshot", period, infohash });
13911391+#+END_SRC
13921392+13931393+**** Materialized State Snapshots
13941394+Using dexie-export-import for database snapshots:
13951395+13961396+#+BEGIN_SRC typescript
13971397+const dbBlob = await exportDB(db, {
13981398+ tables: ['channels', 'channelEntries'],
13991399+ filter: (table, value) => !isDeviceSpecific(table, value)
14001400+});
14011401+14021402+const infohash = await createTorrent(dbBlob);
14031403+#+END_SRC
14041404+14051405+**** New Device Bootstrap
14061406+1. Download latest DB snapshot → Instant UI
14071407+2. Download recent events → Apply updates
14081408+3. Background: fetch historical event logs
14091409+4. Result: Fast startup with complete history
14101410+14111411+*** Implementation Benefits
14121412+14131413+1. **Privacy**: No server sees listening history
14141414+2. **Offline-first**: Everything works locally
14151415+3. **Efficient sync**: Only exchange missing data
14161416+4. **P2P content**: Reduce server bandwidth
14171417+5. **Scalable**: Torrents for bulk data transfer
14181418+6. **Verifiable**: Merkle trees ensure consistency
14191419+14201420+*** Next Steps
14211421+- [ ] Implement HLC timestamps
14221422+- [ ] Build merkle tree sync protocol
14231423+- [ ] Integrate WebTorrent with realm signaling
14241424+- [ ] Create history snapshot system
14251425+- [ ] Test cross-device sync scenarios
14261426+14271427+** Additional Architecture Insights
14281428+14291429+*** Unified Infohash Approach
14301430+Instead of having separate hashes for merkle tree and WebTorrent, use infohashes throughout:
14311431+14321432+**** Hierarchical Infohash Structure
14331433+#+BEGIN_SRC typescript
14341434+// Leaf level: individual files
14351435+const episode1Hash = await createTorrent(episode1.mp3);
14361436+const feedXmlHash = await createTorrent(feed.xml);
14371437+14381438+// Directory level: multi-file torrent
14391439+const feedTorrent = await createTorrent({
14401440+ name: 'example.com.rss',
14411441+ files: [
14421442+ { path: 'rss.xml', infohash: feedXmlHash },
14431443+ { path: 'episode-1.mp3', infohash: episode1Hash }
14441444+ ]
14451445+});
14461446+14471447+// Root level: torrent of feed torrents
14481448+const rootTorrent = await createTorrent({
14491449+ name: 'feeds',
14501450+ folders: [
14511451+ { path: 'example.com.rss', infohash: feedTorrent.infoHash }
14521452+ ]
14531453+});
14541454+#+END_SRC
14551455+14561456+Benefits:
14571457+- Single hash type throughout system
14581458+- Progressive loading (directory structure first, then files)
14591459+- Natural deduplication
14601460+- WebTorrent native sharing of folder structures
14611461+14621462+*** Long-term Event Log Scaling
14631463+14641464+**** Checkpoint + Delta Pattern
14651465+For handling millions of events, use periodic checkpoints:
14661466+14671467+#+BEGIN_SRC typescript
14681468+interface EventCheckpoint {
14691469+ hlc: HLC;
14701470+ stateSnapshot: {
14711471+ subscriptions: Channel[];
14721472+ playStates: PlayRecord[];
14731473+ settings: Settings;
14741474+ };
14751475+ eventCount: number;
14761476+ infohash: string; // torrent of this checkpoint
14771477+}
14781478+14791479+// Every 10k events or monthly
14801480+async function createCheckpoint(): Promise<EventCheckpoint> {
14811481+ const currentHLC = getLatestEventHLC();
14821482+14831483+ // Export materialized state using dexie-export-import
14841484+ const dbBlob = await exportDB(db, {
14851485+ filter: (table, value) => {
14861486+ return !['activeSyncs', 'tempData'].includes(table);
14871487+ }
14881488+ });
14891489+14901490+ const infohash = await createTorrent(dbBlob);
14911491+ return { hlc: currentHLC, dbExport: dbBlob, infohash };
14921492+}
14931493+#+END_SRC
14941494+14951495+**** Bootstrap Flow with Checkpoints
14961496+1. New device downloads latest checkpoint via WebTorrent
14971497+2. Imports directly to IndexedDB: `await importDB(checkpoint.blob)`
14981498+3. Requests only recent events since checkpoint
14991499+4. Applies recent events to catch up
15001500+15011501+Benefits:
15021502+- Fast bootstrap (one checkpoint instead of million events)
15031503+- No double materialization (IndexedDB is already materialized state)
15041504+- P2P distribution of checkpoints
15051505+- Clear version migration path
15061506+15071507+*** Sync State Management
15081508+15091509+**** Catching Up vs Live Events
15101510+#+BEGIN_SRC typescript
15111511+interface SyncState {
15121512+ localHLC: HLC;
15131513+ remoteHLC: HLC;
15141514+ mode: 'catching-up' | 'live';
15151515+}
15161516+15171517+// Separate handlers for historical vs live events
15181518+async function replayHistoricalEvents(from: HLC, to: HLC) {
15191519+ const events = await fetchEvents(from, to);
15201520+15211521+ // Process in batches without UI updates
15221522+ await db.transaction('rw', db.tables, async () => {
15231523+ for (const batch of chunks(events, 1000)) {
15241524+ await Promise.all(batch.map(applyEventSilently));
15251525+ }
15261526+ });
15271527+15281528+ // One UI update at the end
15291529+ notifyUI('Sync complete', { newEpisodes: 47 });
15301530+}
15311531+15321532+function handleLiveEvent(event: SyncEvent) {
15331533+ // Real-time event - update UI immediately
15341534+ applyEvent(event);
15351535+ if (event.type === 'newEpisode') {
15361536+ showNotification(`New episode: ${event.title}`);
15371537+ }
15381538+}
15391539+#+END_SRC
15401540+15411541+**** HLC Comparison for Ordering
15421542+#+BEGIN_SRC typescript
15431543+function compareHLC(a: HLC, b: HLC): number {
15441544+ if (a.millis !== b.millis) return a.millis - b.millis;
15451545+ if (a.counter !== b.counter) return a.counter - b.counter;
15461546+ return a.node.localeCompare(b.node);
15471547+}
15481548+15491549+// Determine if caught up
15501550+function isCaughtUp(myHLC: HLC, peerHLC: HLC): boolean {
15511551+ return compareHLC(myHLC, peerHLC) >= 0;
15521552+}
15531553+#+END_SRC
15541554+15551555+*** Handling Out-of-Order Events
15561556+15571557+**** Idempotent Reducers (No Replay Needed)
15581558+Design reducers to handle events arriving out of order:
15591559+15601560+#+BEGIN_SRC typescript
15611561+// HLC-aware reducer that handles out-of-order events
15621562+function reducePlayPosition(state, event) {
15631563+ if (event.type === 'updatePosition') {
15641564+ const existing = state.positions[event.episodeId];
15651565+ // Only update if this event is newer
15661566+ if (!existing || compareHLC(event.hlc, existing.hlc) > 0) {
15671567+ state.positions[event.episodeId] = {
15681568+ position: event.position,
15691569+ hlc: event.hlc // Track which event set this
15701570+ };
15711571+ }
15721572+ }
15731573+}
15741574+#+END_SRC
15751575+15761576+**** Example: Offline Device Rejoining
15771577+#+BEGIN_SRC typescript
15781578+// Device A offline for a week, comes back with old events
15791579+Device A: [
15801580+ { hlc: "1000:0:A", type: "markPlayed", episode: "ep1" },
15811581+ { hlc: "1100:0:A", type: "updatePosition", episode: "ep1", position: 500 }
15821582+]
15831583+15841584+// Device B already has newer event
15851585+Device B: [
15861586+ { hlc: "1050:0:B", type: "updatePosition", episode: "ep1", position: 1000 }
15871587+]
15881588+15891589+// Smart reducer produces correct final state
15901590+finalState = {
15911591+ "ep1": {
15921592+ played: true, // from 1000:0:A
15931593+ position: 1000, // from 1050:0:B (newer HLC wins)
15941594+ lastPositionHLC: "1050:0:B"
15951595+ }
15961596+}
15971597+#+END_SRC
15981598+15991599+Key principles:
16001600+- Store HLC with state changes
16011601+- Use "last write wins" with HLC comparison
16021602+- Make operations commutative when possible
16031603+- No need for full replay when inserting old events
+81-1265
readme-devlog.org
···11#+PROPERTY: COOKIE_DATA recursive
22#+STARTUP: overview
3344-most of this is old, I need to rework it
55-66-* design
77-88-** frontend (packages/app)
99-- http://localhost:7891
1010-- proxies ~/api~ and ~/sync~ to the backend in development
1111-- uses Dexie for local storage with sync plugin
1212-- custom sync replication implementation using PeerJS through the signalling server
1313-1414-** backend (packages/server)
1515-- http://localhost:7890
1616-- serves ~/dist~ if the directory is present (see ~dist~ script)
1717-- serves ~/api~ for RSS caching proxy
1818- - file-based routing under the api directory
1919-- serves ~/sync~ which is a ~peerjs~ signalling server
2020-2121-** sync
2222-- each client keeps the full data set
2323-- dexie sync and observable let us stream change sets
2424-- we can publish the "latest" to all peers
2525-- on first pull, if not the first client, we can request a dump out of band
2626-2727-*** rss feed data
2828-- do we want to backup feed data?
2929- - conceptually, this should be refetchable
3030- - but feeds go away, and some will only show recent stories
3131- - so yes, we'll need this
3232- - but server side, we can dedupe
3333- - content-addressed server-side cache?
3434-3535-- server side does RSS pulling
3636- - can feeds be marked private, such that they won't be pulled through the proxy?
3737- - but then we require everything to be fetchable via cors
3838- - client configured proxy settings?
3939-4040-*** peer connection
4141-- on startup, check for current realm-id and key pair
4242-- if not present, ask to login or start new
4343- - if login, run through the [[* pairing]] process
4444- - if start new, run through the [[* registration]] process
4545-- use keypair to authenticate to server
4646- - response includes list of active peers to connect
4747-- clients negotiate sync from there
4848-- an identity is a keypair and a realm
4949-5050-- realm is uuid
5151- - realm on the server is the socket connection for peer discovery
5252- - keeps a list of verified public keys
5353- - and manages the /current/ ~public-key->peer ids~ mapping
5454- - realm on the client side is first piece of info required for sync
5555- - when connecting to the signalling server, you present a realm, and a signed public key
5656- - server accepts/rejects based on signature and current verified keys
5757-5858-- a new keypair can create a realm
5959-6060-- a new keypair can double sign an invitation
6161- - invite = ~{ realm:, nonce:, not_before:, not_after:, authorizer: }~, signed with verified key
6262- - exchanging an invite = ~{ invite: }~, signed with my key
6363-6464-- on startup
6565- - start stand-alone (no syncing required, usually the case on first-run)
6666- - generate a keypair
6767- - want server backup?
6868- - sign a "setup" message with new keypair and send to the server
6969- - server responds with a new realm, that this keypair is already verified for
7070- - move along
7171- - exchange invite to sync to other devices
7272- - generate a keypair
7373- - sign the exchange message with the invite and send to the server
7474- - server verifies the invite
7575- - adds the new public key to the peer list and publishes downstream
7676- - move along
7777-7878-***** standalone
7979-in this mode, there is no syncing. this is the most likely first-time run option.
8080-8181-- generate a keypair on startup, so we have a stable fingerprint in the future
8282-- done
8383-8484-***** pairing
8585-in this mode, there is syncing to a named realm, but not necessarily any server resources consumed
8686-we don't need an email, since the server is just doing signalling and peer management
8787-8888-- generate an invite from an existing verified peer
8989- - ~{ realm:, not_before:, not_after:, inviter: peer.public_key }~
9090- - sign that invitation from the existing verified peer
9191-9292-- standalone -> paired
9393- - get the invitation somehow (QR code?)
9494- - sign an invite exchange with the standalone's public key
9595- - send to server
9696- - server verifies the invite
9797- - adds the new public key to the peer list and publishes downstream
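The invite check the server runs before admitting a new key can be sketched in a few lines. This is a hypothetical illustration, not the real implementation; the field names follow the invite shape from the notes (~{ realm:, nonce:, not_before:, not_after:, authorizer: }~), and signature verification is assumed to happen separately.

```typescript
// Hedged sketch: the time-window check the server might run on an invite
// before verifying signatures and adding the new public key. All function
// names here are illustrative.
interface Invite {
  realm: string;       // realm uuid
  nonce: string;       // single-use id so an invite can't be replayed
  not_before: number;  // epoch ms
  not_after: number;   // epoch ms
  authorizer: string;  // fingerprint of the verified key that signed this
}

// Returns a reason string on failure, or null when the window is valid.
function checkInviteWindow(invite: Invite, now: number = Date.now()): string | null {
  if (now < invite.not_before) return "invite not yet valid";
  if (now > invite.not_after) return "invite expired";
  return null;
}

const invite: Invite = {
  realm: "realm-uuid",
  nonce: "nonce-1",
  not_before: 1000,
  not_after: 2000,
  authorizer: "inviter-fingerprint",
};
```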
9898-9999-***** server backup
100100-in this mode, there is syncing to a named realm by email.
101101-102102-goal of server backup mode: go from email -> fully working client with the latest data, without needing any surviving clients around that could participate in the sync.
103103-104104-- generate a keypair on startup
105105-- sign a registration message sent to the server
106106- - send a verification email
107107- - if email/realm already exists, this is authorization
108108- - if not, it's email validation
109109- - server starts a realm and associates the public key
110110- - server acts as a peer for the realm, and stores private data
111111-112112-- since dexie is publishing change sets, we should be able to just store deltas
113113-- but we'll need to store _all_ deltas, unless we're materializing on the server side too
114114- - should we use an indexeddb shim so we can import/export from the server for clean start?
115115- - how much materialization does the server need?
116116-117117-* ai instructions
118118-- when writing to the devlog, add tags to your entries specifying ~:ai:~ and what tool did it.
119119-- false starts and prototypes are in ~./devlog/~
120120-121121-* notes and decision record [1/11]
122122-** architecture design (may 28-29) :ai:claude:
124124-detailed notes are in [[./devlog/may-29.org]]
125125-key decisions and system design:
126126-127127-*** sync model
128128-- device-specific records for playback state/queues to avoid conflicts
129129-- content-addressed server cache with deduplication
130130-- dual-JWT invitation flow for secure realm joining
131131-132132-*** data structures
133133-- tag-based filtering system instead of rigid hierarchies
134134-- regex patterns for episode title parsing and organization
135135-- service worker caching with background download support
136136-137137-*** core schemas
138138-**** client (dexie)
139139-- Channel/ChannelEntry for RSS feeds and episodes
140140-- PlayRecord/QueueItem scoped by deviceId
141141-- FilterView for virtual feed organization
142142-143143-**** server (drizzle)
144144-- ContentStore for deduplicated content by hash
145145-- Realm/PeerConnection for sync authorization
146146-- HttpCache with health tracking and TTL
147147-148148-*** push sync strategy
149149-- revision-based sync (just send revision ranges in push notifications)
150150-- background fetch API for large downloads where supported
151151-- graceful degradation to reactive caching
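The revision-range idea above can be sketched as pure log arithmetic: the push notification carries only the latest revision, and a peer that is behind requests the deltas after its last-seen revision. The record shape and function name here are illustrative assumptions, not Dexie's actual change-tracking API.

```typescript
// Hedged sketch of revision-based sync: the sender filters its change log
// down to whatever the peer has not yet seen, in revision order.
interface ChangeRecord { rev: number; table: string; payload: unknown; }

// Given the local log and the peer's last-seen revision, pick the deltas to send.
function deltasSince(log: ChangeRecord[], peerRev: number): ChangeRecord[] {
  return log.filter((c) => c.rev > peerRev).sort((a, b) => a.rev - b.rev);
}

const log: ChangeRecord[] = [
  { rev: 1, table: "channels", payload: {} },
  { rev: 3, table: "playRecords", payload: {} },
  { rev: 2, table: "queueItems", payload: {} },
];
```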
152152-153153-*** research todos :ai:claude:
154154-155155-**** sync and data management
156156-***** DONE identity and signature management
157157-***** TODO dexie sync capabilities vs rxdb for multi-device sync implementation
158158-***** TODO webrtc p2p sync implementation patterns and reliability
159159-***** TODO conflict resolution strategies for device-specific data in distributed sync
160160-***** TODO content-addressed deduplication algorithms for rss/podcast content
161161-**** client-side storage and caching
162162-***** TODO opfs storage limits and cleanup strategies for client-side caching
163163-***** TODO practical background fetch api limits and edge cases for podcast downloads
164164-**** automation and intelligence
165165-***** TODO llm-based regex generation for episode title parsing automation
166166-***** TODO push notification subscription management and realm authentication
167167-**** platform and browser capabilities
168168-***** TODO browser audio api capabilities for podcast-specific features (speed, silence skip)
169169-***** TODO progressive web app installation and platform-specific behaviors
170170-171171-# Local Variables:
172172-# org-hierarchical-todo-statistics: nil
173173-# org-checkbox-hierarchical-statistics: nil
174174-# End:
175175-176176-** <2025-05-28 Wed>
177177-getting everything setup
178178-179179-the biggest open question I have is what sort of privacy/encryption guarantee I need. I want the server to be able to do things like cache and store feed data long-term.
180180-181181-Is "if you want full privacy, self-host" valid?
182182-183183-*** possibilities
184184-185185-- fully PWA
186186- - CON: cors, which would require a proxy anyway
187187- - CON: audio analysis, llm based stuff for categorization, etc. won't work
188188- - PRO: private as all get out
189189- - can still do WebRTC p2p sync for resiliency
190190- - can still do server backups, if sync stream is encrypted, but no compaction would be available
191191- - could do _explicit_ server backups as dump files
192192-193193-- self hostable
194194- - PRO: can do bunches of private stuff on the server, because if you don't want me to see it, do it elsewhere
195195- - CON: hard for folk to use
196196-197197-*** brainstorm :ai:claude:
198198-**** sync conflict resolution design discussion :ai:claude:
199199-200200-discussed the sync architecture and dexie conflict handling:
201201-202202-*dexie syncable limitations*:
203203-- logical clocks handle causally-related changes well
204204-- basic timestamp-based conflict resolution for concurrent updates
205205-- last-writer-wins for same field conflicts
206206-- no sophisticated CRDT or vector clock support
207207-208208-*solutions for podcast-specific conflicts*:
209209-210210-- play records: device-specific approach
211211- - store separate ~play_records~ per ~device_id~
212212- - each record: ~{ episode_id, device_id, position, completed, timestamp }~
213213- - UI handles conflict resolution with "continue from X device?" prompts
214214- - avoids arbitrary timestamp wins, gives users control
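The device-specific approach means there is no write conflict at sync time; the only decision left is which record the resume prompt should prefer. A minimal sketch, assuming the record shape from the notes (function name is illustrative):

```typescript
// Hedged sketch: one play record per device, so records never overwrite
// each other; the UI picks the most recently updated one and can offer
// "continue from X device?".
interface PlayRecord {
  episode_id: string;
  device_id: string;
  position: number;   // seconds
  completed: boolean;
  timestamp: number;  // epoch ms
}

// Latest record wins for the resume prompt.
function latestPlayRecord(records: PlayRecord[]): PlayRecord | undefined {
  return records.reduce<PlayRecord | undefined>(
    (best, r) => (best === undefined || r.timestamp > best.timestamp ? r : best),
    undefined
  );
}

const records: PlayRecord[] = [
  { episode_id: "e1", device_id: "phone", position: 120, completed: false, timestamp: 200 },
  { episode_id: "e1", device_id: "laptop", position: 90, completed: false, timestamp: 100 },
];
```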
215215-216216-- subscription trees
217217- - store ~parent_path~ as single string field ("/Tech/Programming")
218218- - simpler than managing folder membership tables
219219- - conflicts still possible but contained to single field
220220- - could store move operations as events for richer resolution
221221-222222-*other sync considerations*:
223223-- settings/preferences: distinguish device-local vs global
224224-- bulk operations: "mark all played" can create duplicate operations
225225-- metadata updates: server RSS updates vs local renames
226226-- temporal ordering: recently played lists, queue reordering
227227-- storage limits: cleanup operations conflicting across devices
228228-- feed state: refresh timestamps, error states
229229-230230-*approach*: prefer "events not state" pattern and device-specific records where semantic conflicts are likely
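The "events not state" preference can be sketched for the bulk-mark-played case: replaying the same event twice (or receiving it from two devices in either order) converges to the same state, which is exactly why events beat direct state merges here. The event and state shapes are illustrative assumptions.

```typescript
// Hedged sketch: a bulk "mark all played" operation stored as an event and
// applied idempotently, so duplicate operations from multiple devices are
// harmless.
interface BulkMarkPlayedEvent { channelId: string; before: number; at: number; }
interface EpisodeState { channelId: string; publishedAt: number; played: boolean; }

// Mark every episode of the channel published before the cutoff as played.
function applyEvent(episodes: EpisodeState[], ev: BulkMarkPlayedEvent): EpisodeState[] {
  return episodes.map((e) =>
    e.channelId === ev.channelId && e.publishedAt < ev.before ? { ...e, played: true } : e
  );
}

const eps: EpisodeState[] = [
  { channelId: "c1", publishedAt: 10, played: false },
  { channelId: "c1", publishedAt: 99, played: false },
];
const ev: BulkMarkPlayedEvent = { channelId: "c1", before: 50, at: 123 };
```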
231231-232232-**** data model brainstorm :ai:claude:
233233-234234-core entities designed with sync in mind:
235235-236236-***** ~Feed~ :: RSS/podcast subscription
237237-- ~parent_path~ field for folder structure (eg. ~/Tech/Programming~)
238238-- ~is_private~ flag to skip server proxy
239239-- ~refresh_interval~ for custom update frequencies
240240-241241-***** ~Episode~ :: individual podcast episodes
242242-- standard RSS metadata (guid, title, description, media url)
243243-- duration and file info for playback
244244-245245-***** ~PlayRecord~ :: device-specific playback state
246246-- separate record per ~device_id~ to avoid timestamp conflicts
247247-- position, completed status, playback speed
248248-- UI can prompt "continue from X device?" for resolution
249249-250250-***** ~QueueItem~ :: device-specific episode queue
251251-- ordered list with position field
252252-- ~device_id~ scoped to avoid queue conflicts
253253-254254-***** ~Subscription~ :: feed membership settings
255255-- can be global or device-specific
256256-- auto-download preferences per device
257257-258258-***** ~Settings~ :: split global vs device-local
259259-- theme, default speed = global
260260-- download path, audio device = device-local
261261-262262-***** Event tables for complex operations:
263263-- ~FeedMoveEvent~ for folder reorganization
264264-- ~BulkMarkPlayedEvent~ for "mark all read" operations
265265-- better conflict resolution than direct state updates
266266-267267-***** sync considerations
268268-- device identity established on first run
269269-- dexie syncable handles basic timestamp conflicts
270270-- prefer device-scoped records for semantic conflicts
271271-- event-driven pattern for bulk operations
272272-273273-**** schema evolution from previous iteration :ai:claude:
274274-275275-reviewed existing schema from tmp/feed.ts - well designed foundation:
276276-277277-***** keep from original
278278-- Channel/ChannelEntry naming and structure
279279-- ~refreshHP~ adaptive refresh system (much better than simple intervals)
280280-- rich podcast metadata (people, tags, enclosure, podcast object)
281281-- HTTP caching with etag/status tracking
282282-- epoch millisecond timestamps
283283-- ~hashId()~ approach for entry IDs
284284-285285-***** add for multi-device sync
286286-- ~PlayState~ table (device-scoped position/completion)
287287-- Subscription table (with ~parentPath~ for folders, device-scoped settings)
288288-- ~QueueItem~ table (device-scoped episode queues)
289289-- Device table (identity management)
290290-291291-***** migration considerations
292292-- existing Channel/ChannelEntry can be preserved
293293-- new tables are additive
294294-- ~fetchAndUpsert~ method works well with server proxy architecture
295295-- dexie sync vs rxdb - need to evaluate change tracking capabilities
296296-297297-**** content-addressed caching for offline resilience :ai:claude:
298298-299299-designed caching system for when upstream feeds fail/disappear, building on existing cache-schema.ts:
300300-301301-***** server-side schema evolution (drizzle sqlite):
302302-- keep existing ~httpCacheTable~ design (health tracking, http headers, ttl)
303303-- add ~contentHash~ field pointing to deduplicated content
304304-- new ~contentStoreTable~: deduplicated blobs by sha256 hash
305305-- new ~contentHistoryTable~: url -> contentHash timeline with isLatest flag
306306-- reference counting for garbage collection
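The dedup-by-hash idea is small enough to sketch: content is keyed by its sha256, so identical bytes fetched from two urls (or twice from one) share a single stored blob, and the reference count drives garbage collection. The in-memory store here is a stand-in for the drizzle table; names are illustrative.

```typescript
// Hedged sketch of the content-addressed store: put returns the hash,
// duplicate content only bumps the refcount.
import { createHash } from "node:crypto";

interface StoredContent { content: string; referenceCount: number; }

const contentStore = new Map<string, StoredContent>();

function putContent(content: string): string {
  const hash = createHash("sha256").update(content).digest("hex");
  const existing = contentStore.get(hash);
  if (existing) {
    existing.referenceCount += 1; // dedupe hit
  } else {
    contentStore.set(hash, { content, referenceCount: 1 });
  }
  return hash;
}
```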
307307-308308-***** client-side OPFS storage
309309-- ~/cache/content/{contentHash}.xml~ for raw feeds
310310-- ~/cache/media/{contentHash}.mp3~ for podcast episodes
311311-- ~LocalCacheEntry~ metadata tracks expiration and offline-only flags
312312-- maintains last N versions per feed for historical access
313313-314314-***** fetch strategy & fallback
315315-1. check local OPFS cache first (fastest)
316316-2. try server proxy ~/api/feed?url={feedUrl}~ (deduplicated)
317317-3. server checks ~contentHistory~, serves latest or fetches upstream
318318-4. server returns ~{contentHash, content, cached: boolean}~
319319-5. client stores with content hash as filename
320320-6. emergency mode: serve stale content when upstream fails
321321-322322-- preserves existing health tracking and HTTP caching logic
323323-- popular feeds cached once on server, many clients benefit
324324-- bandwidth savings via content hash comparison
325325-- historical feed state preservation (feeds disappear!)
326326-- true offline operation after initial sync
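The fallback chain above can be sketched with the three sources injected, so the control flow is visible without OPFS or a real proxy. Everything here is illustrative; the real client would also store fresh content under its content hash.

```typescript
// Hedged sketch of the fetch strategy: local cache first, then server
// proxy, then stale content as the emergency mode.
interface FeedSources {
  local: (url: string) => Promise<string | null>;  // OPFS cache
  proxy: (url: string) => Promise<string | null>;  // /api/feed?url=...
  stale: (url: string) => Promise<string | null>;  // last known version
}

async function fetchFeed(url: string, src: FeedSources): Promise<string> {
  const cached = await src.local(url);
  if (cached !== null) return cached;
  const fresh = await src.proxy(url);
  if (fresh !== null) return fresh;          // client would store by content hash here
  const emergency = await src.stale(url);
  if (emergency !== null) return emergency;  // upstream failed: serve stale
  throw new Error(`feed unavailable: ${url}`);
}
```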
327327-328328-** <2025-05-29 Thu> :ai:claude:
329329-e2e encryption and invitation flow design
330330-331331-worked through the crypto and invitation architecture. key decisions:
332332-333333-*** keypair strategy
334334-- use jwk format for interoperability (server stores public keys)
335335-- ed25519 for signing, separate x25519 for encryption if needed
336336-- zustand lazy initialization pattern: ~ensureKeypair()~ on first use
337337-- store private jwk in persisted zustand state
338338-339339-*** invitation flow: dual-jwt approach
340340-solved the chicken-and-egg problem of sharing encryption keys securely.
341341-342342-**** qr code contains two signed jwts:
343343-1. invitation token: ~{iss: inviter_fingerprint, sub: invitation_id, purpose: "realm_invite"}~
344344-2. encryption key token: ~{iss: inviter_fingerprint, ephemeral_private: base64_key, purpose: "ephemeral_key"}~
345345-346346-**** exchange process:
347347-1. invitee posts jwt1 + their public keys to ~/invitations~
348348-2. server verifies jwt1 signature against realm members
349349-3. if valid: adds invitee to realm, returns ~{realm_id, realm_members, encrypted_realm_key}~
350350-4. invitee verifies jwt2 signature against returned realm members
351351-5. invitee extracts ephemeral private key, decrypts realm encryption key
352352-353353-**** security properties:
354354-- server never has decryption capability (missing ephemeral private key)
355355-- both jwts must be signed by verified realm member
356356-- if first exchange fails, second jwt is cryptographically worthless
357357-- atomic operation: identity added only if invitation valid
358358-- built-in expiration and tamper detection via jwt standard
359359-360360-**** considered alternatives:
361361-- raw ephemeral keys in qr: simpler but no authenticity
362362-- ecdh key agreement: chicken-and-egg problem with public key exchange
363363-- server escrow: good but missing authentication layer
364364-- password-based: requires secure out-of-band sharing
365365-366366-the dual-jwt approach provides proper authenticated invitations while maintaining e2e encryption properties.
367367-368368-**** refined dual-jwt with ephemeral signing
369369-simplified the approach by using ephemeral key for second jwt signature:
370370-371371-**setup**:
372372-1. inviter generates ephemeral keypair
373373-2. encrypts realm key with ephemeral private key
374374-3. posts to server: ~{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}~
375375-376376-**qr code contains**:
377377-#+BEGIN_SRC json
378378-// JWT 1: signed with inviter's realm signing key
379379-{
380380- "realm_id": "uuid",
381381- "invitation_id": "uuid",
382382- "iss": "inviter_fingerprint"
383383-}
384384-385385-// JWT 2: signed with ephemeral private key
386386-{
387387- "ephemeral_private": "base64_key",
388388- "invitation_id": "uuid"
389389-}
390390-#+END_SRC
391391-392392-**exchange flow**:
393393-1. submit jwt1 → server verifies against realm members → returns ~{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}~
394394-2. verify jwt2 signature using ~ephemeral_public~ from server response
395395-3. extract ~ephemeral_private~ from jwt2, decrypt realm key
396396-397397-**benefits over previous version**:
398398-- no premature key disclosure (invitee keys shared via normal webrtc peering)
399399-- self-contained verification (ephemeral public key verifies jwt2)
400400-- cleaner separation of realm auth vs encryption key distribution
401401-- simpler flow (no need to return realm member list)
402402-403403-**crypto verification principle**: digital signatures work as sign-with-private/verify-with-public, while encryption works as encrypt-with-public/decrypt-with-private. jwt2 verification uses signature verification, not decryption.
404404-405405-**invitation flow diagram**:
406406-#+BEGIN_SRC mermaid
407407-sequenceDiagram
408408- participant I as Inviter
409409- participant S as Server
410410- participant E as Invitee
411411-412412- Note over I: Generate ephemeral keypair
413413- I->>I: ephemeral_private, ephemeral_public
414414-415415- Note over I: Encrypt realm key
416416- I->>I: encrypted_realm_key = encrypt(realm_key, ephemeral_private)
417417-418418- I->>S: POST /invitations<br/>{invitation_id, realm_id, ephemeral_public, encrypted_realm_key}
419419- S-->>I: OK
420420-421421- Note over I: Create JWTs for QR code
422422- I->>I: jwt1 = sign({realm_id, invitation_id}, inviter_private)
423423- I->>I: jwt2 = sign({ephemeral_private, invitation_id}, ephemeral_private)
424424-425425- Note over I,E: QR code contains [jwt1, jwt2]
426426-427427- E->>S: POST /invitations/exchange<br/>{jwt1}
428428- Note over S: Verify jwt1 signature<br/>against realm members
429429- S-->>E: {invitation_id, realm_id, ephemeral_public, encrypted_realm_key}
430430-431431- Note over E: Verify jwt2 signature<br/>using ephemeral_public
432432- E->>E: verify_signature(jwt2, ephemeral_public)
433433-434434- Note over E: Extract key and decrypt
435435- E->>E: ephemeral_private = decode(jwt2)
436436- E->>E: realm_key = decrypt(encrypted_realm_key, ephemeral_private)
437437-438438- Note over E: Now member of realm!
439439-#+END_SRC
440440-441441-**** jwk keypair generation and validation :ai:claude:
442442-443443-discussed jwk vs raw crypto.subtle for keypair storage. since public keys need server storage for realm authorization, jwk is better for interoperability.
444444-445445-**keypair generation**:
446446-#+BEGIN_SRC typescript
447447-const keypair = await crypto.subtle.generateKey(
448448- { name: "Ed25519" },
449449- true,
450450- ["sign", "verify"]
451451-);
452452-453453-const publicJWK = await crypto.subtle.exportKey("jwk", keypair.publicKey);
454454-const privateJWK = await crypto.subtle.exportKey("jwk", keypair.privateKey);
455455-456456-// JWK format:
457457-{
458458- "kty": "OKP",
459459- "crv": "Ed25519",
460460- "x": "base64url-encoded-public-key",
461461- "d": "base64url-encoded-private-key" // only in private JWK
462462-}
463463-#+END_SRC
464464-465465-**client validation**:
466466-#+BEGIN_SRC typescript
467467-function isValidEd25519PublicJWK(jwk: any): boolean {
468468- return (
469469- typeof jwk === 'object' &&
470470- jwk.kty === 'OKP' &&
471471- jwk.crv === 'Ed25519' &&
472472- typeof jwk.x === 'string' &&
473473- jwk.x.length === 43 && // base64url Ed25519 public key length
474474- !jwk.d && // public key shouldn't have private component
475475-    (!jwk.use || jwk.use === 'sig') // parenthesized: without parens, `use === 'sig'` would bypass all prior checks
476476- );
477477-}
478478-479479-async function validatePublicKey(publicJWK: JsonWebKey): Promise<CryptoKey | null> {
480480- try {
481481- if (!isValidEd25519PublicJWK(publicJWK)) return null;
482482-483483- const key = await crypto.subtle.importKey(
484484- 'jwk',
485485- publicJWK,
486486- { name: 'Ed25519' },
487487- false,
488488- ['verify']
489489- );
490490-491491- return key;
492492- } catch {
493493- return null;
494494- }
495495-}
496496-#+END_SRC
497497-498498-**server validation (node.js)**:
499499-#+BEGIN_SRC typescript
500500-import { webcrypto } from 'node:crypto';
501501-502502-async function validateClientPublicKey(publicJWK: JsonWebKey): Promise<boolean> {
503503- try {
504504- if (!isValidEd25519PublicJWK(publicJWK)) return false;
505505-506506- await webcrypto.subtle.importKey(
507507- 'jwk',
508508- publicJWK,
509509- { name: 'Ed25519' },
510510- false,
511511- ['verify']
512512- );
513513-514514- return true;
515515- } catch {
516516- return false;
517517- }
518518-}
519519-#+END_SRC
520520-521521-**authentication flow**:
522522-#+BEGIN_SRC typescript
523523-// client signs message
524524-const authMessage = {
525525- realm: 'uuid-here',
526526- timestamp: Date.now(),
527527- action: 'join'
528528-};
529529-530530-const signature = await crypto.subtle.sign(
531531- 'Ed25519',
532532- privateKey,
533533- new TextEncoder().encode(JSON.stringify(authMessage))
534534-);
535535-536536-// server verifies
537537-async function verifyAuth(req: AuthRequest): Promise<boolean> {
538538- const publicKey = await webcrypto.subtle.importKey(
539539- 'jwk',
540540- req.publicKey,
541541- { name: 'Ed25519' },
542542- false,
543543- ['verify']
544544- );
545545-546546- const messageBytes = new TextEncoder().encode(JSON.stringify(req.message));
547547- const signatureBytes = new Uint8Array(req.signature);
548548-549549- return await webcrypto.subtle.verify(
550550- 'Ed25519',
551551- publicKey,
552552- signatureBytes,
553553- messageBytes
554554- );
555555-}
556556-#+END_SRC
557557-558558-**** proposed schemas :ai:claude:
559559-560560-***** client-side schema (dexie)
561561-562562-#+BEGIN_SRC typescript
563563-// Core RSS/Podcast data (from your existing design)
564564-interface Channel {
565565- id: string;
566566- feedUrl: string;
567567- htmlUrl?: string;
568568- imageUrl?: string;
569569- title?: string;
570570- description?: string;
571571- language?: string;
572572- people?: Record<string, string>;
573573- tags?: string[];
574574-575575- // Refresh management
576576- refreshHP: number;
577577- nextRefreshAt?: number;
578578- lastRefreshAt?: number;
579579- lastRefreshStatus?: string;
580580- lastRefreshHttpStatus?: number;
581581- lastRefreshHttpEtag?: string;
582582-583583- // Cache info
584584- contentHash?: string;
585585- lastFetchedAt?: number;
586586-}
587587-588588-interface ChannelEntry {
589589- id: string;
590590- channelId: string;
591591- guid: string;
592592- title: string;
593593- linkUrl?: string;
594594- imageUrl?: string;
595595- snippet?: string;
596596- content?: string;
597597-598598- enclosure?: {
599599- url: string;
600600- type?: string;
601601- length?: number;
602602- };
603603-604604- podcast?: {
605605- explicit?: boolean;
606606- duration?: string;
607607- seasonNum?: number;
608608- episodeNum?: number;
609609- transcriptUrl?: string;
610610- };
611611-612612- publishedAt?: number;
613613- fetchedAt?: number;
614614-}
615615-616616-// Device-specific sync tables
617617-interface PlayRecord {
618618- id: string;
619619- entryId: string;
620620- deviceId: string;
621621- position: number;
622622- duration?: number;
623623- completed: boolean;
624624- speed: number;
625625- updatedAt: number;
626626-}
627627-628628-interface Subscription {
629629- id: string;
630630- channelId: string;
631631- deviceId?: string;
632632- parentPath: string; // "/Tech/Programming"
633633- autoDownload: boolean;
634634- downloadLimit?: number;
635635- isActive: boolean;
636636- createdAt: number;
637637- updatedAt: number;
638638-}
639639-640640-interface QueueItem {
641641- id: string;
642642- entryId: string;
643643- deviceId: string;
644644- position: number;
645645- addedAt: number;
646646-}
647647-648648-interface Device {
649649- id: string;
650650- name: string;
651651- platform: string;
652652- lastSeen: number;
653653-}
654654-655655-// Local cache metadata
656656-interface LocalCache {
657657- id: string;
658658- url: string;
659659- contentHash: string;
660660- filePath: string; // OPFS path
661661- cachedAt: number;
662662- expiresAt?: number;
663663- size: number;
664664- isOfflineOnly: boolean;
665665-}
666666-667667-// Dexie schema
668668-const db = new Dexie('SkypodDB');
669669-db.version(1).stores({
670670- channels: '&id, feedUrl, contentHash',
671671- channelEntries: '&id, channelId, publishedAt',
672672- playRecords: '&id, [entryId+deviceId], deviceId, updatedAt',
673673- subscriptions: '&id, channelId, deviceId, parentPath',
674674- queueItems: '&id, entryId, deviceId, position',
675675- devices: '&id, lastSeen',
676676- localCache: '&id, url, contentHash, expiresAt'
677677-});
678678-#+END_SRC
44+* concepts [0/4]
6795680680-***** server-side schema
66+the skypod architecture is broken into pieces:
6817682682-#+BEGIN_SRC typescript
683683-// Content-addressed cache
684684-interface ContentStore {
685685- contentHash: string; // Primary key
686686- content: Buffer; // Raw feed content
687687- contentType: string;
688688- contentLength: number;
689689- firstSeenAt: number;
690690- referenceCount: number;
691691-}
88+** p2p with realms
6929693693-interface ContentHistory {
694694- id: string;
695695- url: string;
696696- contentHash: string;
697697- fetchedAt: number;
698698- isLatest: boolean;
699699-}
1010+In order to sync device and playback state, we need to have devices communicate with each
1111+other, which requires a signalling server for peer discovery at the very least.
70012701701-// HTTP cache with health tracking (from your existing design)
702702-interface HttpCache {
703703- key: string; // URL hash, primary key
704704- url: string;
1313+*** realm
70514706706- status: 'alive' | 'dead';
707707- lastFetchedAt: number;
708708- lastFetchError?: string;
709709- lastFetchErrorStreak: number;
1515+A realm is a collection of known identities, verified with signed JWTs, where those
1616+identities can communicate outside of the server's purview.
71017711711- lastHttpStatus: number;
712712- lastHttpEtag?: string;
713713- lastHttpHeaders: Record<string, string>;
714714- expiresAt: number;
715715- expirationTtl: number;
1818+A realm is not publicly routable; to gain access, one must have the realm id and an
1919+invitation to the realm from an already existing member; new realms are created on
2020+demand with a random realm id supplied by the client.
71621717717- contentHash: string; // Points to ContentStore
718718-}
2222+**** realm server
71923720720-// Sync/auth tables
721721-interface Realm {
722722- id: string; // UUID
723723- createdAt: number;
724724- verifiedKeys: string[]; // Public key list
725725-}
2424+A realm at the signalling server is only a collection of known identity public keys, and
2525+the currently connected sockets.
72626727727-interface PeerConnection {
728728- id: string;
729729- realmId: string;
730730- publicKey: string;
731731- lastSeen: number;
732732- isOnline: boolean;
733733-}
2727+It acts mostly as a smart router: socket connection, authentication, and directed and
2828+realm-wide broadcast.
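The smart-router role is small enough to sketch: the realm holds a map from key fingerprint to live socket, with directed sends (e.g. webrtc SDP to one peer) and realm-wide broadcast. The send callback stands in for a websocket; class and method names are illustrative.

```typescript
// Hedged sketch of the realm server's routing: connected peers keyed by
// fingerprint, directed delivery, and broadcast that skips the sender.
type Send = (msg: string) => void;

class RealmRouter {
  private peers = new Map<string, Send>(); // fingerprint -> socket send

  connect(fingerprint: string, send: Send): void { this.peers.set(fingerprint, send); }
  disconnect(fingerprint: string): void { this.peers.delete(fingerprint); }

  // directed message, e.g. a webrtc SDP offer to one peer
  sendTo(fingerprint: string, msg: string): boolean {
    const send = this.peers.get(fingerprint);
    if (!send) return false;
    send(msg);
    return true;
  }

  // realm-wide broadcast, excluding the sender; returns delivery count
  broadcast(from: string, msg: string): number {
    let delivered = 0;
    this.peers.forEach((send, fp) => {
      if (fp !== from) { send(msg); delivered += 1; }
    });
    return delivered;
  }
}
```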
73429735735-// Media cache for podcast episodes
736736-interface MediaCache {
737737- contentHash: string; // Primary key
738738- originalUrl: string;
739739- mimeType: string;
740740- fileSize: number;
741741- content: Buffer;
742742- cachedAt: number;
743743- accessCount: number;
744744-}
745745-#+END_SRC
3030+***** TODO webrtc requirements
74631747747-**** episode title parsing for sub-feed groupings :ai:claude:
3232+- [ ] what needs to happen for webrtc? I think SDP messages will just flow like normal
3333+ - checkout whatever simplepeer does
74834749749-*problem*: some podcast feeds contain multiple shows, need hierarchical organization within a feed
3535+**** realm client
75036751751-*example*: "Apocalypse Players" podcast
752752-- episode title: "A Term of Art 6 - Winston's Hollow"
753753-- desired grouping: "Apocalypse Players > A Term of Art > 6 - Winston's Hollow"
754754-- UI shows sub-shows within the main feed
3737+In order to keep private data off of the server, the realm client takes on the additional
3838+task of maintaining a shared encryption key for the realm, which can be used to encrypt
3939+data going over broadcasts.
75540756756-***** approaches considered
4141+***** TODO key exchange protocol
75742758758-1. *manual regex patterns* (short-term solution)
759759- - user provides regex with capture groups = tags
760760- - reliable, immediate, user-controlled
761761- - requires manual setup per feed
4343+*** identity
76244763763-2. *LLM-generated regex* (automation goal)
764764- - analyze last 100 episode titles
765765- - generate regex pattern automatically
766766- - good balance of automation + reliability
4545+Identity in the realm system is just an id and keypair.
76746768768-3. *NER model training* (experimental)
769769- - train spacy model for episode title parsing
770770- - current prototype: 150 labelled examples, limited success
771771- - needs more training data to be viable
4747+The private key stays local to the installed device, and is used to send signed tokens
4848+over the wire, either to the realm server to manage authentication, or over other channels
4949+to other members in the realm.
77250773773-***** data model implications
5151+The /public/ key is stored by all members of the realm, and the server, in order to
5252+perform signature validation (which is also authentication).
77453775775-- add regex pattern field to Channel/Feed
776776-- store extracted groupings as hierarchical tags on ~ChannelEntry~
777777-- maybe add grouping/series field to episodes
5454+**** browser private key storage
77855779779-***** plan
5656+There is no good way to store private keys in a browser, but there are less bad ways.
78057781781-*preference*: start with manual regex, evolve toward LLM automation
5858+- private keys are ~CryptoKey~ objects, generated with ~extractable: false~
5959+- WebCrypto native ~CryptoKey~ objects are structured-cloneable, which means they can be saved to indexeddb
78260783783-*implementation design*:
784784-- if no title pattern: episodes are direct children of the feed
785785-- title pattern = regex with named capture groups + path template
6161+At the end of the day this is a podcast app.
78662787787-*example configuration*:
788788-- regex: ~^(?<series>[^0-9]+)\s*(?<episode>\d+)\s*-\s*(?<title>.+)$~
789789-- path template: ~{series} > Episode {episode} - {title}~
790790-- result: "A Term of Art 6 - Winston's Hollow" → "A Term of Art > Episode 6 - Winston's Hollow"
6363+***** TODO are there other ways to do this?
79164792792-*schema additions*:
793793-#+BEGIN_SRC typescript
794794-interface Channel {
795795- // ... existing fields
796796- titlePatterns?: Array<{
797797- name: string; // "Main Episodes", "Bonus Content", etc.
798798- regex: string; // named capture groups
799799- pathTemplate: string; // interpolation template
800800- priority: number; // order to try patterns (lower = first)
801801- isActive: boolean; // can disable without deleting
802802- }>;
803803- fallbackPath?: string; // template for unmatched episodes
804804-}
6565+Could we use webauthn, or some way to use a yubikey to sign something?
80566806806-interface ChannelEntry {
807807- // ... existing fields
808808- parsedPath?: string; // computed from titlePattern
809809- parsedGroups?: Record<string, string>; // captured groups
810810- matchedPatternName?: string; // which pattern was used
811811-}
812812-#+END_SRC
6767+*** sync
81368814814-*pattern matching logic*:
815815-1. try patterns in priority order (lower number = higher priority)
816816-2. first matching pattern wins
817817-3. if no patterns match, use fallbackPath template (e.g., "Misc > {title}")
818818-4. if no fallbackPath, episode stays direct child of feed
6969+Once a device has authenticated to the realm server over a websocket connection, it can
7070+send and broadcast any message it likes to the other online devices in the realm, via the
7171+websocket announcement channel.
81972
820820-*example multi-pattern setup*:
821821-- Pattern 1: "Main Episodes" - ~^(?<series>[^0-9]+)\s*(?<episode>\d+)~ → ~{series} > Episode {episode}~
822822-- Pattern 2: "Bonus Content" - ~^Bonus:\s*(?<title>.+)~ → ~Bonus > {title}~
823823-- Fallback: ~Misc > {title}~
7373+This sets it up to be used as a signaling server for WebRTC, allowing realm devices to
7474+communicate p2p without a dependency on the realm server outside of authentication and
7575+signaling.
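Roughly the envelope I have in mind for the announcement channel (every name here is an assumption, not a spec — the server never interprets payloads, it only relays):

#+BEGIN_SRC typescript
// Hypothetical shape of announcement-channel traffic.
type RealmMessage =
  | { kind: 'announce'; from: string }                        // peer came online
  | { kind: 'offer'; from: string; to: string; sdp: string }  // WebRTC offer
  | { kind: 'answer'; from: string; to: string; sdp: string } // WebRTC answer
  | { kind: 'ice'; from: string; to: string; candidate: string };

// Broadcast when there's no addressee, otherwise deliver to exactly one peer.
function route(msg: RealmMessage, onlinePeers: string[]): string[] {
  if ('to' in msg) return onlinePeers.filter((p) => p === msg.to);
  return onlinePeers.filter((p) => p !== msg.from);
}
#+END_SRC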
82476
825825-**** scoped tags and filter-based UI evolution :ai:claude:
7777+** feed proxy server
82678
827827-*generalization*: move from rigid hierarchies to tag-based filtering system
7979+Due to ~CORS~, we'll need to help clients fetch the contents of feeds by running a caching
8080+proxy server for various HTTP requests.
82881
829829-*tag scoping*:
830830-- feed-level tags: "Tech", "Gaming", "D&D"
831831-- episode-level tags: from regex captures like "series:CriticalRole", "campaign:2", "type:main"
832832-- user tags: manual additions like "favorites", "todo"
833833-
834834-*UI as tag filtering*:
835835-- default view: all episodes grouped by feed
836836-- filter by ~series:CriticalRole~ → shows only CR episodes across all feeds
837837-- filter by ~type:bonus~ → shows bonus content from all podcasts
838838-- combine filters: ~series:CriticalRole AND type:main~ → main CR episodes only
839839-
840840-*benefits*:
841841-- no rigid hierarchy - users create their own views
842842-- regex patterns become automated episode taggers
843843-- same filtering system works for search, organization, queues
844844-- tags are syncable metadata, views are client-side
845845-
846846-*schema evolution*:
847847-#+BEGIN_SRC typescript
848848-interface Tag {
849849- scope: 'feed' | 'episode' | 'user';
850850- key: string; // "series", "type", "campaign"
851851- value: string; // "CriticalRole", "bonus", "2"
852852-}
853853-
854854-interface ChannelEntry {
855855- // ... existing
856856- tags: Tag[]; // includes regex-generated + manual
857857-}
858858-
859859-interface FilterView {
860860- id: string;
861861- name: string;
862862- folderPath: string; // "/Channels/Critical Role"
863863- filters: Array<{
864864- key: string;
865865- value: string;
866866- operator: 'equals' | 'contains' | 'not';
867867- }>;
868868- isDefault: boolean;
869869- createdAt: number;
870870-}
871871-#+END_SRC
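Filter evaluation against an episode's tags could look like this (AND semantics, matching the combined-filter example; ~matchesFilters~ is a hypothetical helper):

#+BEGIN_SRC typescript
// Minimal local copies of the schema sketched above.
interface Tag { scope: 'feed' | 'episode' | 'user'; key: string; value: string }
interface Filter { key: string; value: string; operator: 'equals' | 'contains' | 'not' }

// AND semantics across filters; 'not' inverts a single key/value match.
function matchesFilters(tags: Tag[], filters: Filter[]): boolean {
  return filters.every((f) => {
    const hit = tags.some((t) => t.key === f.key &&
      (f.operator === 'contains' ? t.value.includes(f.value) : t.value === f.value));
    return f.operator === 'not' ? !hit : hit;
  });
}
#+END_SRC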
872872-
873873-**** default UI construction and feed merging :ai:claude:
874874-
875875-*auto-generated views on subscribe*:
876876-- subscribe to "Critical Role" → creates ~/Channels/Critical Role~ folder
877877-- default filter view: ~feed:CriticalRole~ (shows all episodes from that feed)
878878-- user can customize, split into sub-views, or delete
879879-
880880-*smart view suggestions*:
881881-- after regex patterns generate tags, suggest splitting views
882882-- "I noticed episodes with ~series:Campaign2~ and ~series:Campaign3~ - create separate views?"
883883-- "Create view for ~type:bonus~ episodes?"
884884-
885885-*view management UX*:
886886-- right-click feed → "Split by series", "Split by type"
887887-- drag episodes between views to create manual filters
888888-- views can be nested: ~/Channels/Critical Role/Campaign 2/Main Episodes~
889889-
890890-*feed merging for multi-source shows*:
891891-problem: patreon feed + main show feed for same podcast
892892-
893893-#+BEGIN_EXAMPLE
894894-/Channels/
895895- Critical Role/
896896- All Episodes # merged view: feed:CriticalRole OR feed:CriticalRolePatreon
897897- Main Feed # filter: feed:CriticalRole
898898- Patreon Feed # filter: feed:CriticalRolePatreon
899899-#+END_EXAMPLE
900900-
901901-*deduplication strategy*:
902902-- episodes matched by ~guid~ or similar content hash
903903-- duplicate episodes get ~source:main,patreon~ tags
904904-- UI shows single episode with source indicators
905905-- user can choose preferred source for playback
906906-- play state syncs across all sources of same episode
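The guid-based merge is a one-pass fold (hypothetical helper; ~contentHash~ matching would slot in the same way):

#+BEGIN_SRC typescript
// Episodes from different feeds that share a guid collapse into one canonical
// entry; the extra feeds just become additional source indicators.
interface Entry { id: string; guid: string; feedId: string }

function dedupe(entries: Entry[]): Map<string, { canonical: Entry; sources: string[] }> {
  const byGuid = new Map<string, { canonical: Entry; sources: string[] }>();
  for (const e of entries) {
    const existing = byGuid.get(e.guid);
    if (existing) existing.sources.push(e.feedId); // duplicate → one more source
    else byGuid.set(e.guid, { canonical: e, sources: [e.feedId] });
  }
  return byGuid;
}
#+END_SRC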
8282+- help bypass ~CORS~ restrictions, so clients can access the content of the response
8383+- cache feeds, especially with regard to running transformations
8484+- perform transformations on responses:
8585+ - text feeds: reader mode, detect reading time
8686+ - podcast feeds: extract episode metadata, audio analysis for silence skips, etc
8787+ - all feeds: extract title tags, etc.
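Sketch of the content-addressed cache key plus the CORS headers the proxy would add (names and shapes are assumptions, not the real implementation):

#+BEGIN_SRC typescript
import { createHash } from 'node:crypto';

// Content-addressed cache key: identical bodies dedupe no matter which realm
// (or which URL path) fetched them. Namespacing by host is an assumption.
function cacheKey(url: string, body: string): string {
  const digest = createHash('sha256').update(body).digest('hex');
  return `${new URL(url).host}/${digest}`;
}

// Headers the proxy adds so the PWA can actually read the proxied response.
function corsHeaders(origin: string): Record<string, string> {
  return {
    'Access-Control-Allow-Origin': origin,
    'Access-Control-Expose-Headers': 'Content-Type, ETag',
    Vary: 'Origin',
  };
}
#+END_SRC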
90788
908908-*feed relationship schema*:
909909-#+BEGIN_SRC typescript
910910-interface FeedGroup {
911911- id: string;
912912- name: string; // "Critical Role"
913913- feedIds: string[]; // [mainFeedId, patreonFeedId]
914914- mergeStrategy: 'guid' | 'title' | 'contentHash';
915915- defaultView: FilterView;
916916-}
8989+*** TODO open question: is the client able to not use the proxy?
91790
918918-interface ChannelEntry {
919919- // ... existing
920920- duplicateOf?: string; // points to canonical episode ID
921921- sources: string[]; // feed IDs where this episode appears
922922-}
923923-#+END_SRC
9191+I'm not sure yet if we want the PWA to be able to pull feeds directly when the server
9292+isn't present. It would be much easier to require the proxy, but that makes it a hard dependency.
92493
925925-**per-view settings and state**:
926926-each filter view acts like a virtual feed with its own:
927927-- unread counts (episodes matching filter that haven't been played)
928928-- notification settings (notify for new episodes in this view)
929929-- muted state (hide notifications, mark as read automatically)
930930-- auto-download preferences (download episodes that match this filter)
931931-- play queue integration (add new episodes to queue)
9494+** feed management
93295
933933-**use cases**:
934934-- mute "Bonus Content" view but keep notifications for main episodes
935935-- auto-download only "Campaign 2" episodes, skip everything else
936936-- separate unread counts: "5 unread in Main Episodes, 2 in Bonus"
937937-- queue only certain series automatically
9696+With a solid p2p WebRTC connection, we can use something like ~dexie~ or ~rxdb~ to get a
9797+synced document database that we use to manage feeds.
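A sketch of the change-set shape we'd stream between peers (this is not dexie's actual API — field names here are assumptions):

#+BEGIN_SRC typescript
// Revisions make application idempotent and order-insensitive, which is what
// lets any peer replay any other peer's stream safely.
interface Change { table: string; key: string; revision: number; data: unknown | null }

function applyChanges(state: Map<string, unknown>, sinceRev: number, changes: Change[]): number {
  let latest = sinceRev;
  for (const c of [...changes].sort((a, b) => a.revision - b.revision)) {
    if (c.revision <= sinceRev) continue; // already applied on this device — skip
    const id = `${c.table}/${c.key}`;
    if (c.data === null) state.delete(id); // null data = tombstone
    else state.set(id, c.data);
    latest = c.revision;
  }
  return latest; // becomes the device's new local revision
}
#+END_SRC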
93898
939939-**schema additions**:
940940-#+BEGIN_SRC typescript
941941-interface FilterView {
942942- // ... existing fields
943943- settings: {
944944- notificationsEnabled: boolean;
945945- isMuted: boolean;
946946- autoDownload: boolean;
947947- autoQueue: boolean;
948948- downloadLimit?: number; // max episodes to keep
949949- };
950950- state: {
951951- unreadCount: number;
952952- lastViewedAt?: number;
953953- isCollapsed: boolean; // in sidebar
954954- };
955955-}
956956-#+END_SRC
9999+* flow
957100
958958-*inheritance behavior*:
959959-- new filter views inherit settings from parent feed/group
960960-- user can override per-view
961961-- "mute all Critical Role" vs "mute only bonus episodes"
101101+- user goes to https://skypod.accidental.cc
102102+ - pwa runs, prompts to do full install for storage and offline
103103+ - pwa is installed, sets up caches
962104
963963-**** client-side episode caching strategy :ai:claude:
105105+- first run
106106+ - identity is generated (id + keypair per device)
107107+ - do you want to sync to an existing install?
108108+ - if yes, go to invitee flow
109109+ - otherwise, new realm is generated and registered
110110+ - pubkey and id get stored in the realm, to make future sync easier
964111
965965-*architecture*: service worker-based transparent caching
112112+- subsequent runs
113113+ - identity already exists, so we just go about our day
966114
967967-*flow*:
968968-1. audio player requests ~/audio?url={episodeUrl}~
969969-2. service worker intercepts request
970970-3. if present in cache (with Range header support):
971971- - serve from cache
972972-4. else:
973973- - let request continue to server (immediate playback)
974974- - simultaneously start background fetch of full audio file
975975- - when complete, broadcast "episode-cached" event
976976- - audio player catches event and restarts feed → now uses cached version
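Step 3's Range support is the fiddly part; a sketch of the header parsing the cache would need before serving 206 responses (suffix and open-ended ranges both occur when players seek):

#+BEGIN_SRC typescript
// Parse "bytes=start-end", "bytes=start-", and "bytes=-suffixLength".
function parseRange(header: string | null, size: number): { start: number; end: number } | null {
  const m = header?.match(/^bytes=(\d*)-(\d*)$/);
  if (!m || (!m[1] && !m[2])) return null;
  const start = m[1] ? Number(m[1]) : size - Number(m[2]); // suffix: last N bytes
  const end = m[1] && m[2] ? Number(m[2]) : size - 1;      // open end: to EOF
  return start >= 0 && start <= end && end < size ? { start, end } : null;
}
#+END_SRC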
977977-
978978-**benefits**:
979979-- no playback interruption (streaming starts immediately)
980980-- seamless transition to cached version
981981-- Range header support for seeking/scrubbing
982982-- transparent to audio player implementation
983983-
984984-*implementation considerations*:
985985-- cache storage limits and cleanup policies
986986-- partial download resumption if interrupted
987987-- cache invalidation when episode URLs change
988988-- offline playback support
989989-- progress tracking for background downloads
990990-
991991-**schema additions**:
992992-#+BEGIN_SRC typescript
993993-interface CachedEpisode {
994994- episodeId: string;
995995- originalUrl: string;
996996- cacheKey: string; // for cache API
997997- fileSize: number;
998998- cachedAt: number;
999999- lastAccessedAt: number;
10001000- downloadProgress?: number; // 0-100 for in-progress downloads
10011001-}
10021002-#+END_SRC
10031003-
10041004-**service worker events**:
10051005-- ~episode-cache-started~ - background download began
10061006-- ~episode-cache-progress~ - download progress update
10071007-- ~episode-cache-complete~ - ready to switch to cached version
10081008-- ~episode-cache-error~ - download failed, stay with streaming
10091009-
10101010-**background sync for proactive downloads**:
10111011-
10121012-**browser support reality**:
10131013-- Background Sync API: good support (Chrome/Edge, limited Safari)
10141014-- Periodic Background Sync: very limited (Chrome only, requires PWA install)
10151015-- Push notifications: good support, but requires user permission
10161016-
10171017-**hybrid approach**:
10181018-1. **foreground sync** (reliable): when app is open, check for new episodes
10191019-2. **background sync** (opportunistic): register sync event when app closes
10201020-3. **push notifications** (fallback): server pushes "new episodes available"
10211021-4. **manual sync** (always works): pull-to-refresh, settings toggle
10221022-
10231023-**implementation strategy**:
10241024-#+BEGIN_SRC typescript
10251025-// Register background sync when app becomes hidden
10261026-document.addEventListener('visibilitychange', () => {
10271027- if (document.hidden && 'serviceWorker' in navigator) {
10281028- navigator.serviceWorker.ready.then(registration => {
10291029- return registration.sync.register('download-episodes');
10301030- });
10311031- }
10321032-});
10331033-
10341034-// Service worker handles sync event
10351035-self.addEventListener('sync', event => {
10361036- if (event.tag === 'download-episodes') {
10371037- event.waitUntil(syncEpisodes());
10381038- }
10391039-});
10401040-#+END_SRC
10411041-
10421042-**realistic expectations**:
10431043-- iOS Safari: very limited background processing
10441044-- Android Chrome: decent background sync support
10451045-- Desktop: mostly works
10461046-- battery/data saver modes: disabled by OS
10471047-
10481048-**fallback strategy**: rely primarily on foreground sync + push notifications, treat background sync as nice-to-have enhancement
10491049-
10501050-**push notification sync workflow**:
10511051-
10521052-**server-side trigger**:
10531053-1. server detects new episodes during RSS refresh
10541054-2. check which users are subscribed to that feed
10551055-3. send push notification with episode metadata payload
10561056-4. notification wakes up service worker on client
10571057-
10581058-**service worker notification handler**:
10591059-#+BEGIN_SRC typescript
10601060-self.addEventListener('push', event => {
10611061- const data = event.data?.json();
10621062-10631063- if (data.type === 'new-episodes') {
10641064- event.waitUntil(
10651065- // Start background download of new episodes
10661066- downloadNewEpisodes(data.episodes)
10671067- .then(() => {
10681068- // Show notification to user
10691069- return self.registration.showNotification('New episodes available', {
10701070- body: `${data.episodes.length} new episodes downloaded`,
10711071- icon: '/icon-192.png',
10721072- badge: '/badge-72.png',
10731073- tag: 'new-episodes',
10741074- data: { episodeIds: data.episodes.map(e => e.id) }
10751075- });
10761076- })
10771077- );
10781078- }
10791079-});
10801080-
10811081-// Handle notification click
10821082-self.addEventListener('notificationclick', event => {
10831083- event.notification.close();
10841084-
10851085- // Open app to specific episode or feed
10861086- event.waitUntil(
10871087- clients.openWindow(`/episodes/${event.notification.data.episodeIds[0]}`)
10881088- );
10891089-});
10901090-#+END_SRC
10911091-
10921092-**server push logic**:
10931093-- batch notifications (don't spam for every episode)
10941094-- respect user notification preferences from FilterView settings
10951095-- include episode metadata in payload to avoid round-trip
10961096-- throttle notifications (max 1 per feed per hour?)
10971097-
10981098-**user flow**:
10991099-1. new episode published → server pushes notification
11001100-2. service worker downloads episode in background
11011101-3. user sees "New episodes downloaded" notification
11021102-4. tap notification → opens app to new episode, ready to play offline
11031103-
11041104-*benefits*:
11051105-- true background downloading without user interaction
11061106-- works even when app is closed
11071107-- respects per-feed notification settings
11081108-
11091109-**push payload size constraints**:
11101110-- **limit**: ~4KB (4,096 bytes) across most services
11111111-- **practical limit**: ~3KB to account for service overhead
11121112-- **implications for episode metadata**:
11131113-
11141114-#+BEGIN_SRC json
11151115-{
11161116- "type": "new-episodes",
11171117- "episodes": [
11181118- {
11191119- "id": "ep123",
11201120- "channelId": "ch456",
11211121- "title": "Episode Title",
11221122- "url": "https://...",
11231123- "duration": 3600,
11241124- "size": 89432112
11251125- }
11261126- ]
11271127-}
11281128-#+END_SRC
11291129-
11301130-**payload optimization strategies**:
11311131-- minimal episode metadata in push (id, url, basic info)
11321132-- batch multiple episodes in single notification
11331133-- full episode details fetched after service worker wakes up
11341134-- URL shortening for long episode URLs
11351135-- compress JSON payload if needed
11361136-
11371137-**alternative for large payloads**:
11381138-- push notification contains only "new episodes available" signal
11391139-- service worker makes API call to get full episode list
11401140-- trade-off: requires network round-trip but unlimited data
11411141-
11421142-**logical clock sync optimization**:
11431143-
11441144-much simpler approach using sync revisions:
11451145-
11461146-#+BEGIN_SRC json
11471147-{
11481148- "type": "sync-available",
11491149- "fromRevision": 12345,
11501150- "toRevision": 12389,
11511151- "changeCount": 8
11521152-}
11531153-#+END_SRC
11541154-11551155-**service worker sync flow**:
11561156-1. push notification wakes service worker with revision range
11571157-2. service worker fetches ~/sync?from=12345&to=12389~
11581158-3. server returns only changes in that range (episodes, feed updates, etc)
11591159-4. service worker applies changes to local dexie store
11601160-5. service worker queues background downloads for new episodes
11611161-6. updates local revision to 12389
11621162-
11631163-**benefits of revision-based approach**:
11641164-- tiny push payload (just revision numbers)
11651165-- server can efficiently return only changes in range
11661166-- automatic deduplication (revision already applied = skip)
11671167-- works for any sync data (episodes, feed metadata, user settings)
11681168-- handles offline gaps gracefully (fetch missing revision ranges)
11691169-
11701170-**sync API response**:
11711171-#+BEGIN_SRC typescript
11721172-interface SyncResponse {
11731173- fromRevision: number;
11741174- toRevision: number;
11751175- changes: Array<{
11761176- type: 'episode' | 'channel' | 'subscription';
11771177- operation: 'create' | 'update' | 'delete';
11781178- data: any;
11791179- revision: number;
11801180- }>;
11811181-}
11821182-#+END_SRC
11831183-
11841184-**integration with episode downloads**:
11851185-- service worker processes sync changes
11861186-- identifies new episodes that match user's auto-download filters
11871187-- queues those for background cache fetching
11881188-- much more efficient than sending episode metadata in push payload
11891189-
11901190-**service worker processing time constraints**:
11911191-
11921192-**hard limits**:
11931193-- **30 seconds idle timeout**: service worker terminates after 30s of inactivity
11941194-- **5 minutes event processing**: single event/request must complete within 5 minutes
11951195-- **30 seconds fetch timeout**: individual network requests timeout after 30s
11961196-- **notification requirement**: push events MUST display notification before promise settles
11971197-
11981198-**practical implications**:
11991199-- sync API call (~/sync?from=X&to=Y~) must complete within 30s
12001200-- large episode downloads must be queued, not started immediately in push handler
12011201-- use ~event.waitUntil()~ to keep service worker alive during processing
12021202-- break large operations into smaller chunks
12031203-
12041204-**recommended push event flow**:
12051205-#+BEGIN_SRC typescript
12061206-self.addEventListener('push', event => {
12071207- const data = event.data?.json();
12081208-12091209- event.waitUntil(
12101210- // Must complete within 5 minutes total
12111211- handlePushSync(data)
12121212- .then(() => {
12131213- // Required: show notification before promise settles
12141214- return self.registration.showNotification('Episodes synced');
12151215- })
12161216- );
12171217-});
12181218-
12191219-async function handlePushSync(data) {
12201220- // 1. Quick sync API call (< 30s)
12211221-  const changes = await fetch(`/sync?from=${data.fromRevision}&to=${data.toRevision}`).then((r) => r.json());
12221222-
12231223- // 2. Apply changes to dexie store (fast, local)
12241224- await applyChangesToStore(changes);
12251225-
12261226- // 3. Queue episode downloads for later (don't start here)
12271227- await queueEpisodeDownloads(changes.newEpisodes);
12281228-
12291229- // Total time: < 5 minutes, preferably < 30s
12301230-}
12311231-#+END_SRC
12321232-
12331233-*download strategy*: use push event for sync + queuing, separate background tasks for actual downloads
12341234-
12351235-*background fetch API for large downloads*:
12361236-
12371237-*progressive enhancement approach*:
12381238-#+BEGIN_SRC typescript
12391239-async function queueEpisodeDownloads(episodes) {
12401240- for (const episode of episodes) {
12411241- if ('serviceWorker' in navigator && 'BackgroundFetchManager' in self) {
12421242- // Chrome/Edge: use Background Fetch API for true background downloading
12431243- await navigator.serviceWorker.ready.then(registration => {
12441244- return registration.backgroundFetch.fetch(
12451245- `episode-${episode.id}`,
12461246- episode.url,
12471247- {
12481248- icons: [{ src: '/icon-256.png', sizes: '256x256', type: 'image/png' }],
12491249- title: `Downloading: ${episode.title}`,
12501250- downloadTotal: episode.fileSize
12511251- }
12521252- );
12531253- });
12541254- } else {
12551255- // Fallback: queue for reactive download (download while streaming)
12561256- await queueReactiveDownload(episode);
12571257- }
12581258- }
12591259-}
12601260-
12611261-// Handle background fetch completion
12621262-self.addEventListener('backgroundfetchsuccess', event => {
12631263- if (event.tag.startsWith('episode-')) {
12641264- event.waitUntil(handleEpisodeDownloadComplete(event));
12651265- }
12661266-});
12671267-#+END_SRC
12681268-
12691269-*browser support reality*:
12701270-- *Chrome/Edge*: Background Fetch API supported
12711271-- *Firefox/Safari*: not supported, fallback to reactive caching
12721272-- *mobile*: varies by platform and browser
12731273-
12741274-*benefits when available*:
12751275-- true background downloading (survives app close, browser close)
12761276-- built-in download progress UI
12771277-- automatic retry on network failure
12781278-- no service worker time limits during download
12791279-
12801280-*graceful degradation*:
12811281-- detect support, use when available
12821282-- fallback to reactive caching (download while streaming)
12831283-- user gets best experience possible on their platform
12841284-
12851285-*** research todos :ai:claude:
12861286-
12871287-high-level unanswered questions from architecture brainstorming:
12881288-
12891289-**** sync and data management
12901290-***** TODO dexie sync capabilities vs rxdb for multi-device sync implementation
12911291-***** TODO webrtc p2p sync implementation patterns and reliability
12921292-***** TODO conflict resolution strategies for device-specific data in distributed sync
12931293-***** TODO content-addressed deduplication algorithms for rss/podcast content
12941294-**** client-side storage and caching
12951295-***** TODO opfs storage limits and cleanup strategies for client-side caching
12961296-***** TODO practical background fetch api limits and edge cases for podcast downloads
12971297-**** automation and intelligence
12981298-***** TODO llm-based regex generation for episode title parsing automation
12991299-***** TODO push notification subscription management and realm authentication
13001300-**** platform and browser capabilities
13011301-***** TODO browser audio api capabilities for podcast-specific features (speed, silence skip)
13021302-***** TODO progressive web app installation and platform-specific behaviors
13031303-
13041304-# Local Variables:
13051305-# org-hierarchical-todo-statistics: nil
13061306-# org-checkbox-hierarchical-statistics: nil
13071307-# End:
115115+- invitee flow
116116+ - already generated identity
117117+ - qr code pops
118118+ - scanned by inviter, see inviter flow
119119+ - done button after
120120+ - camera pops, scan inviter's QR codes
121121+ - sends invitation+registration token to server
122122+ - added to the realm
123123+ - go to subsequent runs
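Sketch of the double-signed exchange the invitee flow rides on (field names come from the notes above; the wire encoding — JSON plus raw ECDSA P-256 signatures — is an assumption):

#+BEGIN_SRC typescript
const params = { name: 'ECDSA', namedCurve: 'P-256' };
const alg = { name: 'ECDSA', hash: 'SHA-256' };
const enc = (o: unknown) => new TextEncoder().encode(JSON.stringify(o));

const inviter = await crypto.subtle.generateKey(params, false, ['sign', 'verify']);
const invitee = await crypto.subtle.generateKey(params, false, ['sign', 'verify']);

// 1. a key already verified in the realm signs the invite
const invite = {
  realm: 'realm-uuid', nonce: 'n1',
  not_before: Date.now(), not_after: Date.now() + 600_000,
  authorizer: 'inviter-key-id',
};
const inviteSig = await crypto.subtle.sign(alg, inviter.privateKey, enc(invite));

// 2. the invitee wraps it and counter-signs with its own fresh key
const exchange = { invite, inviteSig: [...new Uint8Array(inviteSig)] };
const exchangeSig = await crypto.subtle.sign(alg, invitee.privateKey, enc(exchange));

// 3. the realm server verifies both signatures (plus the validity window)
const inviteOk = await crypto.subtle.verify(alg, inviter.publicKey, inviteSig, enc(invite));
const exchangeOk = await crypto.subtle.verify(alg, invitee.publicKey, exchangeSig, enc(exchange));
#+END_SRC

If both checks pass, the invitee's public key gets added to the realm's verified-key list.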
+5-5
readme.org
···2929- Common
3030 - ES2024 JavaScript, running in modern browsers or [[https://nodejs.org][Node v24]]
3131 - [[https://github.com/panva/jose][~jose~]] for cross-platform webcrypto and JWT management
3232- - [[https://zod.dev/][Zod v4]] describes schema and builds transforamtion pipelines
3232+ - [[https://zod.dev/][Zod v4]] describes schema and builds transformation pipelines
3333- Backend
3434 - [[https://expressjs.com/][Express]] and Node's ~stdlib~ for HTTP and WebSocket servers
3535- Frontend
3636 - [[https://vite.dev/][Vite]] does FE builds
3737- - [[https://react.dev][React]] + [[https://zustand.docs.pmnd.rs][Zustand]] for UI
3737+ - [[https://preactjs.com/][Preact]] + [[https://zustand.docs.pmnd.rs][Zustand]] for UI
3838- Build & DX
3939 - [[https://github.com/google/wireit][Wireit]] does script dependencies and services
4040 - [[https://jsdoc.app/][JSDoc]], along with [[https://www.typescriptlang.org/docs/handbook/jsdoc-supported-types.html][Typescript's JSDoc support]] does typechecking
···4646- per-realm SQLite databases with node's native sqlite support
4747- docker compose for deployment with self-hosted realm storage
4848
4949-See [[./devlog.org]] for design and architecture thoughts.
4949+See [[./readme-devlog.org]] for design and architecture thoughts.
5050
5151** Scripts
5252
5353All scripts can have ~--watch~ passed as an argument to have ~wireit~ rerun when inputs change.
5454This is not useful for everything.
5555
5656-- ~npm run dev~ :: alias for ~npm run start:dev --watch~
5656+- ~npm run dev~ :: alias for ~npm run start:dev~
5757- ~npm run lint~ :: runs ~eslint~
5858- ~npm run types~ :: runs ~tsc~ (no emitting, just typechecking)
5959- ~npm run docs~ :: runs ~jsdoc~ to generate docs in ~./docs~
···7373This program is free software: you can redistribute it and/or modify it under the terms of
7474the **Affero General Public License version 3 or later** (AGPLv3+).
7575
7676-Please see [[./license.txt]] for a copy of the full license.
7676+Please see [[./readme-license.txt]] for a copy of the full license.
+2-2
src/client/page-app.spec.jsx
···1717
1818 // Check the JSX structure without full rendering
1919 expect(component.type).toBe(Fragment)
2020- expect(component.props.children).toHaveProperty('type', 'h1')
2121- expect(component.props.children.props.children).toBe('whatever')
2020+ expect(component.props.children[0]).toHaveProperty('type', 'h1')
2121+ expect(component.props.children[0].props.children).toBe('whatever')
2222 })
2323})
2424