# SWIM Protocol Library - Usage Guide

This library provides a production-ready implementation of the SWIM (Scalable Weakly-consistent Infection-style Process Group Membership) protocol in OCaml 5. It handles cluster membership, failure detection, and messaging.

## Key Features

- **Membership**: Automatic discovery and failure detection.
- **Gossip**: Efficient state propagation (Alive/Suspect/Dead).
- **Messaging**:
  - **Broadcast**: Eventual consistency (gossip-based) for cluster-wide updates.
  - **Direct Send**: High-throughput point-to-point UDP messaging.
- **Security**: AES-256-GCM encryption.
- **Zero-Copy**: Optimized buffer management for high performance.

## Getting Started

### 1. Define Configuration

Start with `default_config` and customize as needed. The secret key must be exactly 32 bytes long.

```ocaml
open Swim.Types

let config = {
  default_config with
  bind_port = 7946;
  node_name = Some "node-1";
  (* Placeholder key: exactly 32 bytes, as AES-256 requires. Replace it. *)
  secret_key = "0123456789abcdef0123456789abcdef";
  encryption_enabled = true;
}
```
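
Since AES-256 needs exactly 32 bytes of key material, it can help to check the key length before building the config. The helper below is a minimal sketch and not part of the library; `validate_key` is a hypothetical name chosen for illustration.

```ocaml
(* Hypothetical helper, not part of the Swim API: check that a key has the
   32 bytes AES-256 requires before it goes into the config. *)
let validate_key (key : string) : (string, string) result =
  let len = String.length key in
  if len = 32 then Ok key
  else Error (Printf.sprintf "secret_key must be 32 bytes, got %d" len)

let () =
  match validate_key "0123456789abcdef0123456789abcdef" with
  | Ok _ -> print_endline "key ok"
  | Error msg -> prerr_endline msg
```

Returning a `result` keeps the check in line with the way `Cluster.create` reports `Invalid_key`.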

### 2. Create and Start a Cluster Node

Use `Cluster.create` within an Eio switch.

```ocaml
module Cluster = Swim.Cluster

let () =
  Eio_main.run @@ fun env ->
  Eio.Switch.run @@ fun sw ->

  (* Create environment wrapper *)
  let env_wrap = { stdenv = env; sw } in

  match Cluster.create ~sw ~env:env_wrap ~config with
  | Error `Invalid_key -> failwith "Invalid secret key"
  | Ok cluster ->
    (* Start background daemons (protocol loop, UDP receiver, TCP listener) *)
    Cluster.start cluster;

    Printf.printf "Node started!\n%!";

    (* Keep running *)
    Eio.Fiber.await_cancel ()
```

### 3. Joining a Cluster

To join an existing cluster, you need the address of at least one seed node.

```ocaml
let seed_nodes = ["192.168.1.10:7946"] in
match Cluster.join cluster ~seed_nodes with
| Ok () -> Printf.printf "Joined cluster successfully\n"
| Error `No_seeds_reachable -> Printf.printf "Failed to join cluster\n"
```
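
Seed nodes are written as `host:port` strings, so a malformed entry is easy to catch before calling `Cluster.join`. The sketch below uses a hypothetical `parse_seed` helper that is not part of the library; the library's own address handling (for example IPv6 literals or DNS names) may differ.

```ocaml
(* Hypothetical helper: split "host:port" at the last ':' and validate the
   port range. IPv6 literals are not handled in this sketch. *)
let parse_seed (s : string) : (string * int, string) result =
  match String.rindex_opt s ':' with
  | None -> Error ("missing port in seed address: " ^ s)
  | Some i ->
    let host = String.sub s 0 i in
    let port_str = String.sub s (i + 1) (String.length s - i - 1) in
    (match int_of_string_opt port_str with
     | Some port when host <> "" && port > 0 && port < 65536 -> Ok (host, port)
     | _ -> Error ("invalid seed address: " ^ s))
```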

## Messaging

### Broadcast (Gossip)
Use `broadcast` to send data to **all** nodes. This uses the gossip protocol (piggybacking on membership messages). It is bandwidth-efficient but has higher latency and is eventually consistent.

**Best for:** Configuration updates, low-frequency state sync.

```ocaml
Cluster.broadcast cluster
  ~topic:"config-update"
  ~payload:"{\"version\": 2}"
```

### Direct Send (Point-to-Point)
Use `send` to send a message directly to a specific node via UDP. This is high-throughput and low-latency.

**Best for:** RPC, high-volume data transfer, direct coordination.

```ocaml
(* Send by Node ID *)
let target_node_id = node_id_of_string "node-2" in
Cluster.send cluster
  ~target:target_node_id
  ~topic:"ping"
  ~payload:"pong"

(* Send by Address (if Node ID unknown) *)
let addr = `Udp (Eio.Net.Ipaddr.of_raw "\192\168\001\010", 7946) in
Cluster.send_to_addr cluster
  ~addr
  ~topic:"alert"
  ~payload:"alert-data"
```
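
Direct sends go over UDP, and the configuration table lists a `udp_buffer_size` of 1400, so it seems reasonable to check message size before sending. The sketch below assumes that a message must fit in one packet; `fits_in_datagram` is a hypothetical helper, and `overhead` stands in for header and encryption bytes that this guide does not specify.

```ocaml
(* Hypothetical check, assuming one message per UDP packet.
   [overhead] is the caller's estimate of header and encryption bytes;
   the real figure is not documented in this guide. *)
let fits_in_datagram ?(udp_buffer_size = 1400) ~overhead ~topic ~payload () =
  String.length topic + String.length payload + overhead <= udp_buffer_size
```

A caller could run this before `Cluster.send` and split or reject payloads that do not fit.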

### Handling Messages
Register a callback to handle incoming messages (both broadcast and direct).

```ocaml
Cluster.on_message cluster (fun sender topic payload ->
  Printf.printf "Received '%s' from %s: %s\n"
    topic
    (node_id_to_string sender.id)
    payload
)
```
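
Because one callback receives every topic, applications often route by topic name. The following is a minimal sketch of such a router using only the standard library; `handlers`, `register`, and `dispatch` are hypothetical names, not part of the Swim API.

```ocaml
(* Hypothetical topic router: maps a topic string to a payload handler. *)
let handlers : (string, string -> unit) Hashtbl.t = Hashtbl.create 8

(* Install or replace the handler for a topic. *)
let register topic handler = Hashtbl.replace handlers topic handler

(* Returns true if a handler ran, false if the topic is unknown. *)
let dispatch topic payload =
  match Hashtbl.find_opt handlers topic with
  | Some handler -> handler payload; true
  | None -> false
```

Inside the `on_message` callback, the body could then be reduced to a call such as `ignore (dispatch topic payload)`.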

## Membership Events

Listen for node lifecycle events.

```ocaml
Eio.Fiber.fork ~sw (fun () ->
  let stream = Cluster.events cluster in
  while true do
    match Eio.Stream.take stream with
    | Join node -> Printf.printf "Node joined: %s\n" (node_id_to_string node.id)
    | Leave node -> Printf.printf "Node left: %s\n" (node_id_to_string node.id)
    | Suspect_event node -> Printf.printf "Node suspected: %s\n" (node_id_to_string node.id)
    | Alive_event node -> Printf.printf "Node alive again: %s\n" (node_id_to_string node.id)
    | Update _ -> ()
  done
)
```
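
One common use of these events is to maintain a local view of which nodes are present. The sketch below uses a stand-in `event` type that carries only node ID strings, not the library's own type, and omits `Update`. Keeping suspected nodes in the set is a design choice of this example, not documented library behavior.

```ocaml
module S = Set.Make (String)

(* Stand-in for the library's event type, reduced to node ID strings. *)
type event =
  | Join of string
  | Leave of string
  | Suspect_event of string
  | Alive_event of string

(* Fold one event into the set of members currently listed. *)
let apply live = function
  | Join id | Alive_event id -> S.add id live
  | Leave id -> S.remove id live
  | Suspect_event _ -> live (* suspected nodes stay listed until they leave *)
```

A sequence of events can be replayed with `List.fold_left apply S.empty events`.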

## Configuration Options

| Field | Default | Description |
|-------|---------|-------------|
| `bind_addr` | "0.0.0.0" | Interface to bind UDP/TCP listeners. |
| `bind_port` | 7946 | Port for SWIM protocol. |
| `protocol_interval` | 1.0 | Seconds between probe rounds. Lower = faster failure detection, higher bandwidth. |
| `probe_timeout` | 0.5 | Seconds to wait for Ack. |
| `indirect_checks` | 3 | Number of peers to ask for indirect probes. |
| `udp_buffer_size` | 1400 | Max UDP packet size in bytes (sized to fit under a typical 1500-byte MTU). |
| `secret_key` | (zeros) | 32-byte key for AES-256-GCM. |
| `max_gossip_queue_depth` | 5000 | Max items in broadcast queue before dropping oldest (prevents leaks). |
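
The timing fields interact: in SWIM, the wait for an Ack has to finish within a probe round, so `probe_timeout` is normally shorter than `protocol_interval`. The sketch below checks that relationship on a local stand-in record. The `timing` type and `check_timing` function are hypothetical and are not the library's config type or validation logic.

```ocaml
(* Local stand-in for the timing fields in the table; not the library's record. *)
type timing = {
  protocol_interval : float; (* seconds between probe rounds *)
  probe_timeout : float;     (* seconds to wait for an Ack *)
  indirect_checks : int;     (* peers asked for indirect probes *)
}

(* Reject settings where the Ack wait cannot fit inside one probe round. *)
let check_timing t =
  if t.probe_timeout <= 0.0 then Error "probe_timeout must be positive"
  else if t.probe_timeout >= t.protocol_interval then
    Error "probe_timeout should be shorter than protocol_interval"
  else if t.indirect_checks < 0 then Error "indirect_checks must not be negative"
  else Ok t
```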

## Performance Tips

1. **Buffer Pool**: The library uses zero-copy buffer pools. Ensure `send_buffer_count` and `recv_buffer_count` are sufficient for your load (default 16).
2. **Gossip Limit**: If you broadcast aggressively, `max_gossip_queue_depth` protects memory but may drop messages. Use direct send (`Cluster.send`) for high-volume traffic.
3. **Eio**: Run within an Eio domain/switch. The library is designed for OCaml 5 multicore.
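
The drop-oldest behavior described for `max_gossip_queue_depth` can be illustrated with a bounded FIFO. This is a sketch of the general technique using the standard `Queue` module, not the library's internal queue; `push_bounded` is a hypothetical name.

```ocaml
(* Sketch of a bounded FIFO that evicts the oldest entry when full.
   Assumes max_depth >= 1. Returns true if an old item was dropped. *)
let push_bounded ~max_depth (q : 'a Queue.t) (item : 'a) : bool =
  let dropped = Queue.length q >= max_depth in
  if dropped then ignore (Queue.take_opt q);
  Queue.add item q;
  dropped
```

The boolean result lets a caller count drops and decide when to move traffic to direct sends.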