#+PROPERTY: COOKIE_DATA recursive
#+STARTUP: overview

* concepts [0/5]

the skypod architecture is broken into pieces:

** p2p with realms

In order to sync device and playback state, we need to have devices communicate with each other, which requires a signaling server, for peer discovery at the very least.

*** realm

A realm is a collection of known identities, verified with signed JWTs, where those identities can communicate outside of the server's purview.

A realm is not publicly routable; to gain access, one must have the realm id and an invitation to the realm from an already existing member; new realms are created on demand with a random realmid supplied by the client.

**** realm server

A realm at the signaling server is only a collection of known identity public keys, and the currently connected sockets. It acts mostly as a smart router, handling socket connections, authentication, and directed and realm-wide broadcasts.

**** realm client

The realm client manages a full-mesh p2p network with other connected peers in the realm, and exposes the combined set of incoming messages as a stream that the client can consume however it likes.

***** realm encryption

In order to keep private data off of the server, the realm client takes on the additional task of maintaining a shared encryption key for the realm, which can be used to encrypt data going over broadcasts.

****** TODO key exchange protocol

*** identity

Identity in the realm system is just an id and keypair.

The private key stays local to the installed device, and is used to send signed tokens over the wire, either to the realm server to manage authentication, or over other channels to other members in the realm.

The /public/ key is stored by all members of the realm, and the server, in order to perform signature validation (which is also authentication).

**** browser private key storage

There is no good way to store private keys in a browser, but there are less bad ways.

- private keys are ~CryptoKey~ objects, with ~{ extractable: false }~
- WebCrypto native ~CryptoKey~ objects are structured-cloneable, which means they can be saved to IndexedDB

At the end of the day this is a podcast app.

***** TODO are there other ways to do this?

Could we use webauthn, or some way to use a yubikey to sign something?

*** connection flow

#+begin_src mermaid :file docs/readme/connection-flow.png
sequenceDiagram
    participant Server as Server
    actor Client as Client
    participant PeerA as Existing Peer A
    participant PeerB as Existing Peer B

    Client<<->>Server: WebSocket Connection

    Note over Client, Server: Authentication Phase (3 second timeout)

    alt Registration (new realm)
        Client->>Server: JWT: preauth.register {pubkey}
        Server->>Server: validate JWT signature
        Server->>Server: ensureRegisteredRealm(realmid, identid, pubkey)
    else Authentication (existing realm)
        Client->>Server: JWT: preauth.authn {}
        Server->>Server: get publickey for issuer
        Server->>Server: validate JWT signature
    else Invitation Exchange
        Client->>Server: JWT: preauth.exchange {inviteJwt, pubkey}
        Server->>Server: get publickey for issuer
        Server->>Server: validate invitation JWT signature & nonce
        Server->>Server: admitToRealm(realmid, identid, pubkey)
        Server->>Server: identity admitted
    end

    Note over PeerB, Server: Authenticated! Exchange Peer Identities

    par do
        Server->>Client: preauth.authn response {peers[], identities{}}
    and
        Server--)PeerA: realm.rtc.peer-joined {identid, pubkey}
    and
        Server--)PeerB: realm.rtc.peer-joined {identid, pubkey}
    end

    Note over Client, PeerB: WebRTC Connections (initiated by Client)

    par do per connected peer
        Client->>+Server: realm.rtc.signal {signed_jwt, localid, remoteid}
        Server->>PeerA: realm.rtc.signal {signed_jwt, localid, remoteid}
        PeerA->>PeerA: verify JWT signature (pubkey from `peer-joined`)
        PeerA->>Server: realm.rtc.signal {signed_answer_jwt, localid, remoteid}
        Server->>-Client: realm.rtc.signal {signed_answer_jwt, localid, remoteid}
        Client->>Client: verify JWT signature (pubkey from `authn` response)

        Note over Client, PeerA: Direct P2P Connection Established
        Client->>PeerA: Direct WebRTC Connection

        loop every 30s
            PeerA<<->>Client: realm.rtc.ping/pong
            PeerA<<->>Client: application messages
        end
    end

    Note over PeerB, Server: Message Flow (Operational)
    Client->>PeerA: Application Data (p2p)
    Client->>PeerB: Application Data (p2p)
    Client->>Server: realm.broadcast {payload, recipients}
    Server->>PeerB: realm.broadcast {payload}

    Note over PeerB, Server: Connection Persistence & Error Handling

    alt Socket Error/Close
        Client->>Client: Connection destroyed
        Server->>PeerA: realm.rtc.peer-left {identid}
        Server->>PeerB: realm.rtc.peer-left {identid}
        PeerA->>PeerA: Disconnect peer
        PeerB->>PeerB: Disconnect peer
    else Peer Leaves
        PeerA->>Server: Connection closes
        Server->>Client: realm.rtc.peer-left {peerA_identid}
        Server->>PeerB: realm.rtc.peer-left {peerA_identid}
        Client->>Client: Disconnect from PeerA
    end
#+end_src

#+RESULTS:
[[file:docs/readme/connection-flow.png]]

*** message format

A message consists of:

- ~typ~ :: the message type, see below
- ~msg~ :: a message key, used for discrimination on the payload
- ~seq?~ :: an (optional) sequence number, which allows for request/reply semantics
- ~dat?~ :: a payload, the schema for which depends on ~typ~ and ~msg~ (possibly omitted)

See ~#common/protocol~ for the messages that have been defined.

**** ~typ~

- ~evt~ :: events - one-way notifications
- ~req~ :: requests - request-response pattern with sequence matching
- ~res~ :: responses - correlated responses to requests by ~seq~
- ~err~ :: errors - structured errors with HTTP-style status codes

**** examples

#+begin_example
{ "typ": "req",
  "msg": "preauth.register",
  "seq": 1234,
  "dat": { "pubkey": { ... } } }
#+end_example

#+begin_example
{ "typ": "req",
  "msg": "preauth.authn",
  "seq": 1234 }
#+end_example

#+begin_example
{ "typ": "res",
  "msg": "preauth.authn",
  "seq": 1234,
  "dat": { "identities": { ... }, "peers": { ... } } }
#+end_example

*** message authentication

A client authenticates to the system and to peers by exchanging signed JWTs as the first messages in the messaging protocol, both when authenticating to the realm and when exchanging WebRTC signaling messages.

**** TODO webrtc encryption?

The websocket is encrypted by virtue of ~wss~ and the server's cert, but I'm not sure if the traffic over webrtc is encrypted; if it's not, we should figure that out.

** feed proxy server

Due to ~CORS~, we'll need to help clients fetch the contents of feeds by running a caching proxy server for various HTTP requests.

- help bypass ~CORS~ restrictions, so clients can access the content of the response
- cache feeds, especially with regards to running transformations
- perform transformations on responses:
  - text feeds: reader mode, detect reading time
  - podcast feeds: extract episode metadata, audio analysis for silence skips, etc.
  - all feeds: extract title tags, etc.

*** TODO open question: is the client able to not use the proxy?

I'm not sure yet if we want the PWA to be able to pull feeds directly when the server isn't present. It would be much easier to keep it around, but

** feed management

With a solid p2p WebRTC connection, we can use something like ~dexie~ or ~rxdb~ to get a synced document database that we use to manage feeds.

* flow

- user goes to https://skypod.accidental.cc
  - pwa runs, prompts to do full install for storage and offline
  - pwa is installed, sets up caches
- first run
  - identity is generated (id + keypair per device)
  - do you want to sync to an existing install?
    - if yes, go to invitee flow
    - otherwise, new realm is generated and registered
      - pubkey and id get stored in the realm, to make future sync easier
- subsequent runs
  - identity already exists, so we just go about our day
- invitee flow
  - already generated identity
  - qr code pops
    - scanned by inviter, see inviter flow
    - done button after
  - camera pops, scan inviter's QR codes
  - sends invitation+registration token to server
    - added to the realm
  - go to subsequent runs

* WebRTC Full-Mesh Implementation Plan

** Overview

Implement a full-mesh WebRTC system where every client in a realm establishes direct peer-to-peer connections with all other clients. The existing WebSocket infrastructure serves as the signaling channel, leveraging the newly refactored broadcast protocol.
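
To make the full-mesh idea concrete, here is a minimal sketch of what "connect to all other clients" implies for a single client and for the realm as a whole. The function names ~meshTargets~ and ~meshLinkCount~ and the example identity ids are hypothetical and do not come from the skypod codebase; the only assumption is that identity ids are plain strings.

#+begin_src javascript
// Hypothetical helpers illustrating full-mesh connectivity (not skypod code).

/**
 * Return the identity ids this client should hold a connection to:
 * every known realm member except itself, with duplicates removed.
 */
function meshTargets(selfId, memberIds) {
  return [...new Set(memberIds)].filter((id) => id !== selfId);
}

/** Number of distinct peer-to-peer links in a full mesh of n clients. */
function meshLinkCount(n) {
  return (n * (n - 1)) / 2;
}

const members = ["alice-laptop", "alice-phone", "alice-tablet"];
console.log(meshTargets("alice-phone", members)); // the two other devices
console.log(meshLinkCount(members.length)); // 3
#+end_src

The link count grows quadratically with realm size, which is the reason mesh scalability is worth measuring once the implementation exists.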

** Architecture Components

*** Protocol Extensions (src/common/protocol/messages-rtc.js)

Create new WebRTC message schemas:

- rtc.offer - SDP offer with connectionId
- rtc.answer - SDP answer with connectionId
- rtc.ice-candidate - ICE candidate exchange
- rtc.peer-state - Connection state updates
- rtc.request-connection - Initiate connection with polite flag
- rtc.peer-joined - Server message when peer joins (includes member list)
- rtc.peer-left - Server message when peer leaves

*** Server-Side Updates

**** broadcastToRealm Function Enhancement

Update the function signature to:

- Take complete messages instead of payloads
- Add skipSelf flag (default true)
- For rtc.peer-joined, set skipSelf=false to include sender

**** Handler Updates (handler-realm.js)

- On client join: Broadcast rtc.peer-joined to ALL members (including self)
- On client leave: Broadcast rtc.peer-left to remaining members
- Existing broadcast mechanism carries WebRTC signaling without modification

*** WebRTC Utilities (src/common/webrtc.js)

Core utilities for WebRTC:

- RTC_CONFIG with STUN servers
- DATA_CHANNEL_CONFIG for reliable messaging
- PerfectNegotiation class for glare-free negotiation
- ConnectionHealthMonitor for ping/pong health checks

*** Client WebRTC Manager (src/client/webrtc-manager.js)

Main orchestrator that:

- Manages all peer connections
- Handles incoming RTC messages
- Routes signaling between peers
- Emits events for UI updates
- Provides public API for sending messages

*** Client Peer Connection (src/client/peer-connection.js)

Individual peer connection handler:

- RTCPeerConnection lifecycle management
- Perfect negotiation implementation
- Data channel setup and messaging
- Health monitoring with ping/pong
- Automatic reconnection with exponential backoff
- Connection state tracking

*** UI Components

**** PeerList Component

- Shows all realm members
- Connection status indicators
- Real-time state updates

**** MessageInterface Component

- Send messages via WebRTC or server broadcast
- Display incoming messages
- Mode selection (P2P vs server relay)

** Connection Flow

*** Initial Join

1. Client authenticates via WebSocket
2. Server sends realm.status
3. Server broadcasts rtc.peer-joined to ALL members
4. Client sees own join message with member list
5. Client initializes WebRTCManager
6. Client connects to all existing members

*** Peer-to-Peer Connection

1. Initiator creates RTCPeerConnection (polite=true)
2. Creates data channel, triggering negotiation
3. Sends offer via realm.broadcast to target peer
4. Target creates RTCPeerConnection (polite=false)
5. Exchanges answer and ICE candidates
6. Data channel opens, health monitoring starts

*** Reconnection

1. Health monitor detects issues or connection drops
2. Exponential backoff timer starts
3. New connection attempt with fresh connectionId
4. ICE restart or full renegotiation

** Key Design Decisions

*** Perfect Negotiation Pattern

Prevents glare when both peers try to connect simultaneously by using polite/impolite roles.

*** Health Monitoring

Proactive ping/pong messages detect connection issues before the browser's own connection-state events report them, enabling faster recovery.

*** Connection ID Tracking

Each connection attempt has a unique ID to ensure offer/answer pairs match during concurrent connections.

*** Leveraging Existing Infrastructure

WebRTC signaling is just another payload type in the existing broadcast system - no new server complexity.

** Implementation Order

*** Phase 1: Core Infrastructure

1. Create protocol message schemas
2. Update broadcastToRealm function
3. Add peer join/leave broadcasts
4. Create WebRTC utilities module

*** Phase 2: Client Connection Management

1. Implement WebRTCManager
2. Create PeerConnection class
3. Add perfect negotiation
4. Implement health monitoring

*** Phase 3: UI Integration

1. Update main app to initialize WebRTC
2. Create PeerList component
3. Add MessageInterface with dual modes
4. Style connection indicators

*** Phase 4: Robustness

1. Add reconnection logic
2. Implement ICE restart
3. Handle edge cases
4. Add comprehensive error handling

** Testing Strategy

*** Unit Tests

- Perfect negotiation scenarios
- Health monitoring logic
- Message routing

*** Integration Tests

- Full connection flow with mocks
- Signaling message flow
- State management

*** E2E Tests

- Real browser testing
- Network condition simulation
- Multi-peer scenarios

*** Load Tests

- Mesh scalability limits
- Message throughput
- Connection stability
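
The reconnection steps in this plan call for an exponential backoff timer, and that part lends itself to a small pure function that unit tests can exercise without a browser. The sketch below is only an illustration: the name ~backoffDelay~, the 500 ms base, the 30 s cap, and the jitter range are all assumptions, not values taken from the plan.

#+begin_src javascript
// Hypothetical reconnection backoff; base delay, cap, and jitter are assumed values.

/**
 * Delay in milliseconds before reconnection attempt `attempt` (0-based).
 * The delay doubles on each attempt up to `maxMs`, then is scaled by a
 * jitter factor between 0.5 and 1.0 so that peers which dropped together
 * do not all retry at the same instant.
 */
function backoffDelay(attempt, { baseMs = 500, maxMs = 30_000, random = Math.random } = {}) {
  const exponential = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.round(exponential * (0.5 + random() / 2));
}

// Injecting a fixed random source makes the function deterministic for tests.
const noJitter = { random: () => 1 };
console.log([0, 1, 2, 3].map((n) => backoffDelay(n, noJitter))); // 500, 1000, 2000, 4000
console.log(backoffDelay(20, noJitter)); // 30000 (capped)
#+end_src

Passing the random source as a parameter is a design choice made here for testability; a real implementation could just as well call ~Math.random~ directly and test only the bounds.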