# SWIM Protocol Library - Usage Guide

This library provides a production-ready implementation of the SWIM (Scalable Weakly-consistent Infection-style Process Group Membership) protocol in OCaml 5. It handles cluster membership, failure detection, and messaging.

## Key Features

- **Membership**: Automatic discovery and failure detection.
- **Gossip**: Efficient state propagation (Alive/Suspect/Dead).
- **Messaging**:
  - **Broadcast**: Eventual consistency (gossip-based) for cluster-wide updates.
  - **Direct Send**: High-throughput point-to-point UDP messaging.
- **Security**: AES-256-GCM encryption.
- **Zero-Copy**: Optimized buffer management for high performance.

## Getting Started

### 1. Define Configuration

Start with `default_config` and customize as needed.

```ocaml
open Swim.Types

let config = {
  default_config with
  bind_port = 7946;
  node_name = Some "node-1";
  (* Placeholder key: exactly 32 bytes for AES-256. Replace with your own. *)
  secret_key = "0123456789abcdef0123456789abcdef";
  encryption_enabled = true;
}
```

### 2. Create and Start a Cluster Node

Use `Cluster.create` within an Eio switch.

```ocaml
module Cluster = Swim.Cluster

let () =
  Eio_main.run @@ fun env ->
  Eio.Switch.run @@ fun sw ->
  (* Create environment wrapper *)
  let env_wrap = { stdenv = env; sw } in
  match Cluster.create ~sw ~env:env_wrap ~config with
  | Error `Invalid_key -> failwith "Invalid secret key"
  | Ok cluster ->
    (* Start background daemons (protocol loop, UDP receiver, TCP listener) *)
    Cluster.start cluster;
    Printf.printf "Node started!\n%!";
    (* Keep running *)
    Eio.Fiber.await_cancel ()
```

### 3. Joining a Cluster

To join an existing cluster, you need the address of at least one seed node.

```ocaml
let seed_nodes = ["192.168.1.10:7946"] in
match Cluster.join cluster ~seed_nodes with
| Ok () -> Printf.printf "Joined cluster successfully\n"
| Error `No_seeds_reachable -> Printf.printf "Failed to join cluster\n"
```

## Messaging

### Broadcast (Gossip)

Use `broadcast` to send data to **all** nodes. Broadcasts travel over the gossip protocol, piggybacking on membership messages. This is bandwidth-efficient, but delivery has higher latency and is only eventually consistent.

**Best for:** Configuration updates, low-frequency state sync.

```ocaml
Cluster.broadcast cluster ~topic:"config-update" ~payload:"{\"version\": 2}"
```

### Direct Send (Point-to-Point)

Use `send` to send a message directly to a specific node via UDP. This is high-throughput and low-latency.

**Best for:** RPC, high-volume data transfer, direct coordination.

```ocaml
(* Send by Node ID *)
let target_node_id = node_id_of_string "node-2" in
Cluster.send cluster ~target:target_node_id ~topic:"ping" ~payload:"pong";

(* Send by Address (if Node ID unknown) *)
let addr = `Udp (Eio.Net.Ipaddr.of_raw "\192\168\001\010", 7946) in
Cluster.send_to_addr cluster ~addr ~topic:"alert" ~payload:"alert-data"
```

### Handling Messages

Register a callback to handle incoming messages (both broadcast and direct).

```ocaml
Cluster.on_message cluster (fun sender topic payload ->
  Printf.printf "Received '%s' from %s: %s\n"
    topic (node_id_to_string sender.id) payload
)
```

## Membership Events

Listen for node lifecycle events.

```ocaml
Eio.Fiber.fork ~sw (fun () ->
  let stream = Cluster.events cluster in
  while true do
    match Eio.Stream.take stream with
    | Join node -> Printf.printf "Node joined: %s\n" (node_id_to_string node.id)
    | Leave node -> Printf.printf "Node left: %s\n" (node_id_to_string node.id)
    | Suspect_event node -> Printf.printf "Node suspected: %s\n" (node_id_to_string node.id)
    | Alive_event node -> Printf.printf "Node alive again: %s\n" (node_id_to_string node.id)
    | Update _ -> ()
  done
)
```

## Configuration Options

| Field | Default | Description |
|-------|---------|-------------|
| `bind_addr` | "0.0.0.0" | Interface to bind UDP/TCP listeners. |
| `bind_port` | 7946 | Port for SWIM protocol. |
| `protocol_interval` | 1.0 | Seconds between probe rounds. Lower values give faster failure detection at the cost of more bandwidth. |
| `probe_timeout` | 0.5 | Seconds to wait for Ack. |
| `indirect_checks` | 3 | Number of peers to ask for indirect probes. |
| `udp_buffer_size` | 1400 | Max UDP packet size (MTU). |
| `secret_key` | (zeros) | 32-byte key for AES-256-GCM. |
| `max_gossip_queue_depth` | 5000 | Max items in broadcast queue before dropping oldest (prevents unbounded memory growth). |

## Performance Tips

1. **Buffer Pool**: The library uses zero-copy buffer pools. Ensure `send_buffer_count` and `recv_buffer_count` are sufficient for your load (default 16).
2. **Gossip Limit**: If broadcasting aggressively, `max_gossip_queue_depth` protects memory but may drop messages. Use direct send (`Cluster.send`) for high-volume traffic.
3. **Eio**: Run within an Eio domain/switch. The library is designed for OCaml 5 multicore.
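
To make the drop-oldest behaviour behind `max_gossip_queue_depth` concrete, here is a minimal standalone sketch of a bounded queue that evicts its oldest entry when full. It uses only the OCaml standard library and is not the library's actual implementation: the names `bounded_queue`, `create`, `push`, and `to_list` are hypothetical and chosen for illustration only.

```ocaml
(* Standalone sketch of a drop-oldest bounded queue.
   Assumption: this only mirrors the documented behaviour of
   max_gossip_queue_depth; it is not code from the library. *)

type 'a bounded_queue = {
  max_depth : int;        (* plays the role of max_gossip_queue_depth *)
  items : 'a Queue.t;     (* oldest item sits at the front *)
  mutable dropped : int;  (* number of items evicted so far *)
}

let create max_depth =
  if max_depth <= 0 then invalid_arg "create: max_depth must be positive";
  { max_depth; items = Queue.create (); dropped = 0 }

(* Enqueue [x]; if the queue is already full, evict the oldest item first. *)
let push q x =
  if Queue.length q.items >= q.max_depth then begin
    ignore (Queue.pop q.items);
    q.dropped <- q.dropped + 1
  end;
  Queue.push x q.items

(* Snapshot of the queue contents, oldest first. *)
let to_list q = List.of_seq (Queue.to_seq q.items)

let () =
  let q = create 3 in
  List.iter (push q) [ "v1"; "v2"; "v3"; "v4"; "v5" ];
  (* "v1" and "v2" were evicted to make room for "v4" and "v5". *)
  Printf.printf "kept: %s, dropped: %d\n"
    (String.concat "," (to_list q)) q.dropped
  (* prints "kept: v3,v4,v5, dropped: 2" *)
```

The sketch shows why aggressive broadcasting can lose updates: once the queue is at capacity, every new item silently replaces the oldest pending one. If each message must arrive, a direct send is the safer choice.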