[DEPRECATED] Go implementation of plcbundle

better docs

+2548 -596
+2 -1
.gitignore
··· 1 1 .DS_Store 2 - /plcbundle 2 + /plcbundle 3 + plc_bundles.json
+58 -588
README.md
··· 27 27 ``` 28 28 29 29 > ⚠️ **Preview Version - Do Not Use in Production!** 30 - > 31 - > This project and plcbundle specification is currently unstable and under heavy development. Things can break at any time. Bundle hashes or data formats may change. **Do not** use this for production systems. Please wait for the **`1.0`** release. 32 30 33 31 plcbundle archives AT Protocol's [DID PLC Directory](https://plc.directory/) operations into immutable, cryptographically-chained bundles of 10,000 operations. Each bundle is hashed (SHA-256), compressed (zstd), and linked to the previous bundle, creating a verifiable chain of DID operations. 34 32 35 - This repository contains a reference library and a CLI tool written in Go language. 33 + * 📄 [Technical Specification](./docs/specification.md) 34 + * 📚 [Library Documentation](./docs/library.md) 35 + * 💻 [CLI Guide](./docs/cli.md) 36 + * 📰 [Announcement Article](https://leaflet.pub/feb982b4-64cb-4549-9d25-d7e68cecb11a) 36 37 37 - The technical specification for the plcbundle V1 format, index, and creation process can be found in the [specification document](./SPECIFICATION.md). 38 + ## What is `plcbundle`? 38 39 39 - * [Article "Introducing plcbundle: A Transparent and Verifiable Way to Sync the AT Protocol's PLC Directory"](https://leaflet.pub/feb982b4-64cb-4549-9d25-d7e68cecb11a) 40 - * [Reference implementations in TypeScript, Python, Ruby](https://tangled.org/@atscan.net/plcbundle-js/blob/main/plcbundle.ts) 40 + plcbundle solves the problem of synchronizing and archiving PLC directory operations by: 41 41 42 - ## Features 42 + - **Bundling**: Groups 10,000 operations into compressed, immutable files 43 + - **Chaining**: Each bundle is cryptographically linked to the previous one 44 + - **Verifiable**: SHA-256 hashes ensure data integrity throughout the chain 45 + - **Efficient**: Zstandard compression with ~5x compression ratios 43 46 44 - - 📦 **Bundle Management**: Automatically organize PLC operations into compressed bundles (10,000 operations each) 45 - - 🔄 **Transparent Sync**: Fetch and cache PLC operations with automatic deduplication 46 - - 🗜️ **Efficient Storage**: Zstandard compression with configurable levels 47 - - ✅ **Integrity**: SHA-256 hash verification and blockchain-like chain validation 48 - - 🔍 **Indexing**: Fast bundle lookup and gap detection 49 - - 📊 **Export**: Query operations by time range 47 + ## Quick Start 50 48 51 - ## Installation 52 - 53 - ```bash 54 - go get tangled.org/atscan.net/plcbundle 55 - ``` 56 - 57 - For the CLI tool: 58 - 59 - ```bash 60 - go install tangled.org/atscan.net/plcbundle/cmd/plcbundle@latest 61 - ``` 62 - 63 - ## Quick Start (Library) 49 + ### As a Library 64 50 65 51 ```go 66 - package main 52 + import plcbundle "tangled.org/atscan.net/plcbundle" 67 53 68 - import ( 69 - "context" 70 - "log" 71 - "time" 72 - 73 - plcbundle "tangled.org/atscan.net/plcbundle" 74 - ) 75 - 76 - func main() { 77 - // Create a bundle manager 78 - mgr, err := plcbundle.New("./plc_data", "https://plc.directory") 79 - if err != nil { 80 - log.Fatal(err) 81 - } 82 - defer mgr.Close() 83 - 84 - // Fetch latest bundles 85 - ctx := context.Background() 86 - bundle, err := mgr.FetchNext(ctx) 87 - if err != nil { 88 - log.Fatal(err) 89 - } 90 - 91 - log.Printf("Fetched bundle %d with %d operations", 92 - bundle.BundleNumber, len(bundle.Operations)) 93 - } 94 - ``` 95 - 96 - ## Library Usage 97 - 98 - ### 1. Basic Setup 99 - 100 - ```go 101 - import ( 102 - "context" 103 - "plcbundle tangled.org/atscan.net/plcbundle" 104 - ) 105 - 106 - // Create manager with defaults 107 - mgr, err := plcbundle.New("./bundles", "https://plc.directory") 108 - if err != nil { 109 - log.Fatal(err) 110 - } 54 + mgr, _ := plcbundle.New("./plc_data", "https://plc.directory") 111 55 defer mgr.Close() 112 - ``` 113 - 114 - ### 2. Custom Configuration 115 56 116 - ```go 117 - import ( 118 - "tangled.org/atscan.net/plcbundle/bundle" 119 - "tangled.org/atscan.net/plcbundle/plc" 120 - ) 121 - 122 - // Custom config 123 - config := &bundle.Config{ 124 - BundleDir: "./my_bundles", 125 - CompressionLevel: bundle.CompressionBest, 126 - VerifyOnLoad: true, 127 - Logger: myCustomLogger, 128 - } 129 - 130 - // Custom PLC client with rate limiting 131 - plcClient := plc.NewClient("https://plc.directory", 132 - plc.WithRateLimit(60, time.Minute), // 60 req/min 133 - plc.WithTimeout(30*time.Second), 134 - ) 135 - 136 - mgr, err := bundle.NewManager(config, plcClient) 137 - ``` 138 - 139 - ### 3. Transparent Synchronization (Main Use Case) 140 - 141 - This is the primary pattern for keeping your local PLC mirror up-to-date: 142 - 143 - ```go 144 - package main 145 - 146 - import ( 147 - "context" 148 - "log" 149 - "time" 150 - 151 - plcbundle "tangled.org/atscan.net/plcbundle" 152 - ) 153 - 154 - type PLCSync struct { 155 - mgr *plcbundle.BundleManager 156 - ctx context.Context 157 - cancel context.CancelFunc 158 - } 159 - 160 - func NewPLCSync(bundleDir string) (*PLCSync, error) { 161 - mgr, err := plcbundle.New(bundleDir, "https://plc.directory") 162 - if err != nil { 163 - return nil, err 164 - } 165 - 166 - ctx, cancel := context.WithCancel(context.Background()) 167 - 168 - sync := &PLCSync{ 169 - mgr: mgr, 170 - ctx: ctx, 171 - cancel: cancel, 172 - } 173 - 174 - return sync, nil 175 - } 176 - 177 - func (s *PLCSync) Start(interval time.Duration) { 178 - ticker := time.NewTicker(interval) 179 - defer ticker.Stop() 180 - 181 - log.Println("Starting PLC synchronization...") 182 - 183 - for { 184 - select { 185 - case <-ticker.C: 186 - if err := s.Update(); err != nil { 187 - log.Printf("Update error: %v", err) 188 - } 189 - case <-s.ctx.Done(): 190 - return 191 - } 192 - } 193 - } 194 - 195 - func (s *PLCSync) Update() error { 196 - log.Println("Checking for new bundles...") 197 - 198 - for { 199 - bundle, err := s.mgr.FetchNext(s.ctx) 200 - if err != nil { 201 - // Check if we're caught up 202 - if isEndOfData(err) { 203 - log.Println("✓ Up to date!") 204 - return nil 205 - } 206 - return err 207 - } 208 - 209 - log.Printf("✓ Fetched bundle %06d (%d ops, %d DIDs)", 210 - bundle.BundleNumber, 211 - len(bundle.Operations), 212 - bundle.DIDCount) 213 - } 214 - } 215 - 216 - func (s *PLCSync) Stop() { 217 - s.cancel() 218 - s.mgr.Close() 219 - } 220 - 221 - func isEndOfData(err error) bool { 222 - return err != nil && 223 - (strings.Contains(err.Error(), "insufficient operations") || 224 - strings.Contains(err.Error(), "caught up")) 225 - } 226 - 227 - // Usage 228 - func main() { 229 - sync, err := NewPLCSync("./plc_bundles") 230 - if err != nil { 231 - log.Fatal(err) 232 - } 233 - defer sync.Stop() 234 - 235 - // Update every 5 minutes 236 - sync.Start(5 * time.Minute) 237 - } 238 - ``` 239 - 240 - ### 4. Getting Bundles 241 - 242 - ```go 243 - ctx := context.Background() 244 - 245 - // Get all bundles 246 - index := mgr.GetIndex() 247 - bundles := index.GetBundles() 248 - 249 - for _, meta := range bundles { 250 - log.Printf("Bundle %06d: %d ops, %s to %s", 251 - meta.BundleNumber, 252 - meta.OperationCount, 253 - meta.StartTime.Format(time.RFC3339), 254 - meta.EndTime.Format(time.RFC3339)) 255 - } 256 - 257 - // Load specific bundle 258 - bundle, err := mgr.Load(ctx, 1) 259 - if err != nil { 260 - log.Fatal(err) 261 - } 262 - 263 - log.Printf("Loaded %d operations", len(bundle.Operations)) 264 - ``` 265 - 266 - ### 5. Getting Operations from Bundles 267 - 268 - ```go 269 - // Load a bundle and iterate operations 270 - bundle, err := mgr.Load(ctx, 1) 271 - if err != nil { 272 - log.Fatal(err) 273 - } 274 - 275 - for _, op := range bundle.Operations { 276 - log.Printf("DID: %s, CID: %s, Time: %s", 277 - op.DID, 278 - op.CID, 279 - op.CreatedAt.Format(time.RFC3339)) 280 - 281 - // Access operation data 282 - if opType, ok := op.Operation["type"].(string); ok { 283 - log.Printf(" Type: %s", opType) 284 - } 285 - } 286 - ``` 287 - 288 - ### 6. Export Operations by Time Range 289 - 290 - ```go 291 - // Export operations after a specific time 292 - afterTime := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC) 293 - operations, err := mgr.Export(ctx, afterTime, 5000) 294 - if err != nil { 295 - log.Fatal(err) 296 - } 297 - 298 - log.Printf("Exported %d operations", len(operations)) 299 - 300 - // Process operations 301 - for _, op := range operations { 302 - // Your processing logic 303 - processOperation(op) 304 - } 305 - ``` 306 - 307 - ### 7. Periodic Update Pattern 308 - 309 - ```go 310 - // Simple periodic updater 311 - func runPeriodicUpdate(mgr *plcbundle.BundleManager, interval time.Duration) { 312 - ticker := time.NewTicker(interval) 313 - defer ticker.Stop() 314 - 315 - for range ticker.C { 316 - ctx := context.Background() 317 - 318 - // Try to fetch next bundle 319 - bundle, err := mgr.FetchNext(ctx) 320 - if err != nil { 321 - if strings.Contains(err.Error(), "insufficient operations") { 322 - log.Println("Caught up!") 323 - continue 324 - } 325 - log.Printf("Error: %v", err) 326 - continue 327 - } 328 - 329 - log.Printf("New bundle %d: %d operations", 330 - bundle.BundleNumber, 331 - len(bundle.Operations)) 332 - 333 - // Process new operations 334 - for _, op := range bundle.Operations { 335 - handleOperation(op) 336 - } 337 - } 338 - } 339 - 340 - // Usage 341 - go runPeriodicUpdate(mgr, 10*time.Minute) 342 - ``` 343 - 344 - ### 8. Verify Integrity 345 - 346 - ```go 347 - // Verify specific bundle 348 - result, err := mgr.Verify(ctx, 1) 349 - if err != nil { 350 - log.Fatal(err) 351 - } 352 - 353 - if result.Valid { 354 - log.Println("✓ Bundle is valid") 355 - } else { 356 - log.Printf("✗ Invalid: %v", result.Error) 357 - } 358 - 359 - // Verify entire chain 360 - chainResult, err := mgr.VerifyChain(ctx) 361 - if err != nil { 362 - log.Fatal(err) 363 - } 364 - 365 - if chainResult.Valid { 366 - log.Printf("✓ Chain verified: %d bundles", chainResult.ChainLength) 367 - } else { 368 - log.Printf("✗ Chain broken at bundle %d: %s", 369 - chainResult.BrokenAt, 370 - chainResult.Error) 371 - } 57 + bundle, _ := mgr.FetchNext(context.Background()) 58 + // Process bundle.Operations 372 59 ``` 373 60 374 - ### 9. Scan Directory (Re-index) 375 - 376 - ```go 377 - // Scan directory and rebuild index from existing bundles 378 - result, err := mgr.Scan() 379 - if err != nil { 380 - log.Fatal(err) 381 - } 382 - 383 - log.Printf("Scanned %d bundles", result.BundleCount) 384 - if len(result.MissingGaps) > 0 { 385 - log.Printf("Warning: Missing bundles: %v", result.MissingGaps) 386 - } 387 - ``` 388 - 389 - ### 10. Complete Example: Background Sync Service 390 - 391 - ```go 392 - package main 393 - 394 - import ( 395 - "context" 396 - "log" 397 - "os" 398 - "os/signal" 399 - "syscall" 400 - "time" 401 - 402 - plcbundle "tangled.org/atscan.net/plcbundle" 403 - ) 404 - 405 - type PLCService struct { 406 - mgr *plcbundle.BundleManager 407 - updateCh chan struct{} 408 - stopCh chan struct{} 409 - } 410 - 411 - func NewPLCService(bundleDir string) (*PLCService, error) { 412 - mgr, err := plcbundle.New(bundleDir, "https://plc.directory") 413 - if err != nil { 414 - return nil, err 415 - } 416 - 417 - return &PLCService{ 418 - mgr: mgr, 419 - updateCh: make(chan struct{}, 1), 420 - stopCh: make(chan struct{}), 421 - }, nil 422 - } 423 - 424 - func (s *PLCService) Start() { 425 - log.Println("Starting PLC service...") 426 - 427 - // Initial scan 428 - if _, err := s.mgr.Scan(); err != nil { 429 - log.Printf("Scan warning: %v", err) 430 - } 431 - 432 - // Start update loop 433 - go s.updateLoop() 434 - 435 - // Periodic trigger 436 - go s.periodicTrigger(5 * time.Minute) 437 - } 61 + [See full library documentation →](./docs/library.md) 438 62 439 - func (s *PLCService) updateLoop() { 440 - for { 441 - select { 442 - case <-s.updateCh: 443 - s.fetchNewBundles() 444 - case <-s.stopCh: 445 - return 446 - } 447 - } 448 - } 63 + ### As a CLI Tool 449 64 450 - func (s *PLCService) periodicTrigger(interval time.Duration) { 451 - ticker := time.NewTicker(interval) 452 - defer ticker.Stop() 453 - 454 - for { 455 - select { 456 - case <-ticker.C: 457 - s.TriggerUpdate() 458 - case <-s.stopCh: 459 - return 460 - } 461 - } 462 - } 65 + ```bash 66 + # Install 67 + go install tangled.org/atscan.net/plcbundle/cmd/plcbundle@latest 463 68 464 - func (s *PLCService) TriggerUpdate() { 465 - select { 466 - case s.updateCh <- struct{}{}: 467 - default: 468 - // Update already in progress 469 - } 470 - } 471 - 472 - func (s *PLCService) fetchNewBundles() { 473 - ctx := context.Background() 474 - fetched := 0 475 - 476 - for { 477 - bundle, err := s.mgr.FetchNext(ctx) 478 - if err != nil { 479 - if isEndOfData(err) { 480 - if fetched > 0 { 481 - log.Printf("✓ Fetched %d new bundles", fetched) 482 - } 483 - return 484 - } 485 - log.Printf("Fetch error: %v", err) 486 - return 487 - } 488 - 489 - fetched++ 490 - log.Printf("Bundle %06d: %d operations", 491 - bundle.BundleNumber, 492 - len(bundle.Operations)) 493 - } 494 - } 495 - 496 - func (s *PLCService) GetBundles() []*plcbundle.BundleMetadata { 497 - return s.mgr.GetIndex().GetBundles() 498 - } 499 - 500 - func (s *PLCService) GetOperations(bundleNum int) ([]plcbundle.PLCOperation, error) { 501 - ctx := context.Background() 502 - bundle, err := s.mgr.Load(ctx, bundleNum) 503 - if err != nil { 504 - return nil, err 505 - } 506 - return bundle.Operations, nil 507 - } 508 - 509 - func (s *PLCService) Stop() { 510 - close(s.stopCh) 511 - s.mgr.Close() 512 - } 513 - 514 - func main() { 515 - service, err := NewPLCService("./plc_data") 516 - if err != nil { 517 - log.Fatal(err) 518 - } 519 - 520 - service.Start() 521 - 522 - // Wait for interrupt 523 - sigCh := make(chan os.Signal, 1) 524 - signal.Notify(sigCh, os.Interrupt, syscall.SIGTERM) 525 - <-sigCh 526 - 527 - log.Println("Shutting down...") 528 - service.Stop() 529 - } 530 - ``` 531 - 532 - ## CLI Tool Usage 533 - 534 - ### Fetch bundles 535 - 536 - ```bash 537 - # Fetch next bundle 69 + # Fetch bundles 538 70 plcbundle fetch 539 71 540 - # Fetch specific number of bundles 541 - plcbundle fetch -count 10 542 - 543 - # Fetch all available bundles 544 - plcbundle fetch -count 0 545 - ``` 546 - 547 - ### Scan directory 72 + # Clone from remote 73 + plcbundle clone https://plc.example.com 548 74 549 - ```bash 550 - # Scan and rebuild index 551 - plcbundle scan 552 - ``` 553 - 554 - ### Verify integrity 555 - 556 - ```bash 557 - # Verify specific bundle 558 - plcbundle verify -bundle 1 559 - 560 - # Verify entire chain 75 + # Verify integrity 561 76 plcbundle verify 562 - 563 - # Verbose output 564 - plcbundle verify -v 565 77 ``` 566 78 567 - ### Show information 79 + [See full CLI reference →](./docs/cli.md) 568 80 569 - ```bash 570 - # General info 571 - plcbundle info 81 + ## Key Features 572 82 573 - # Specific bundle info 574 - plcbundle info -bundle 1 575 - ``` 83 + - 📦 Automatic bundle management (10,000 operations each) 84 + - 🔄 Transparent synchronization with PLC directory 85 + - 🗜️ Efficient zstd compression 86 + - ✅ Cryptographic verification (SHA-256 + chain validation) 87 + - 🔍 Fast indexing and gap detection 88 + - 🌐 HTTP server for hosting bundles 89 + - 🔌 WebSocket streaming support 576 90 577 - ### Export operations 91 + ## Installation 578 92 579 93 ```bash 580 - # Export operations to stdout (JSONL) 581 - plcbundle export -count 1000 > operations.jsonl 94 + # Library 95 + go get tangled.org/atscan.net/plcbundle 582 96 583 - # Export after specific time 584 - plcbundle export -after "2024-01-01T00:00:00Z" -count 5000 97 + # CLI tool 98 + go install tangled.org/atscan.net/plcbundle/cmd/plcbundle@latest 585 99 ``` 586 100 587 - ### Backfill 101 + ## Use Cases 588 102 589 - ```bash 590 - # Fetch all bundles and stream to stdout 591 - plcbundle backfill > all_operations.jsonl 592 - 593 - # Start from specific bundle 594 - plcbundle backfill -start 100 -end 200 595 - ``` 596 - 597 - ## API Reference 598 - 599 - ### Types 600 - 601 - ```go 602 - type BundleManager struct { ... } 603 - type Bundle struct { 604 - BundleNumber int 605 - StartTime time.Time 606 - EndTime time.Time 607 - Operations []PLCOperation 608 - DIDCount int 609 - Hash string 610 - // ... 611 - } 612 - 613 - type PLCOperation struct { 614 - DID string 615 - Operation map[string]interface{} 616 - CID string 617 - CreatedAt time.Time 618 - RawJSON []byte 619 - } 620 - ``` 621 - 622 - ### Methods 623 - 624 - ```go 625 - // Create 626 - New(bundleDir, plcURL string) (*BundleManager, error) 103 + - **Archiving**: Create verifiable backups of PLC operations 104 + - **Mirroring**: Host your own PLC directory mirror 105 + - **Research**: Analyze historical DID operations 106 + - **Compliance**: Maintain tamper-evident audit trails 627 107 628 - // Sync 629 - FetchNext(ctx) (*Bundle, error) 630 - Export(ctx, afterTime, count) ([]PLCOperation, error) 108 + ## Security Model 631 109 632 - // Query 633 - Load(ctx, bundleNumber) (*Bundle, error) 634 - GetIndex() *Index 635 - GetInfo() map[string]interface{} 110 + Bundles are cryptographically chained but require external verification: 111 + - ✅ Verify against original PLC directory 112 + - ✅ Compare with multiple independent mirrors 113 + - ✅ Check published root and head hashes 114 + - ✅ Anyone can reproduce bundles from PLC directory 636 115 637 - // Verify 638 - Verify(ctx, bundleNumber) (*VerificationResult, error) 639 - VerifyChain(ctx) (*ChainVerificationResult, error) 116 + ## Reference Implementations 640 117 641 - // Manage 642 - Scan() (*DirectoryScanResult, error) 643 - Close() 644 - ``` 118 + - [TypeScript, Python, Ruby](https://tangled.org/@atscan.net/plcbundle-js/blob/main/plcbundle.ts) 645 119 646 - ## Configuration 120 + ## Documentation 647 121 648 - ```go 649 - type Config struct { 650 - BundleDir string // Storage directory 651 - CompressionLevel CompressionLevel // Compression level 652 - VerifyOnLoad bool // Verify hashes when loading 653 - Logger Logger // Custom logger 654 - } 655 - ``` 122 + - [Library Guide](./docs/library.md) - Comprehensive API documentation 123 + - [CLI Guide](./docs/cli.md) - Command-line tool usage 124 + - [Specification](./docs/specification.md) - Technical format specification 125 + <!--- [Examples](./docs/examples/) - Common patterns and recipes--> 656 126 657 127 ## License 658 128 ··· 660 130 661 131 ## Contributing 662 132 663 - Contributions welcome! Please open an issue or PR. 133 + Contributions welcome! Please open an issue or PR.
SECURITY docs/security.md
SPECIFICATION.md docs/specification.md
+858
docs/cli.md
··· 1 + # CLI Guide 2 + 3 + A practical guide to using the `plcbundle` command-line tool. 4 + 5 + ## Table of Contents 6 + 7 + - [Getting Started](#getting-started) 8 + - [Basic Workflows](#basic-workflows) 9 + - [Advanced Usage](#advanced-usage) 10 + - [Best Practices](#best-practices) 11 + - [Troubleshooting](#troubleshooting) 12 + 13 + --- 14 + 15 + ## Getting Started 16 + 17 + ### Installation 18 + 19 + ```bash 20 + go install tangled.org/atscan.net/plcbundle/cmd/plcbundle@latest 21 + ``` 22 + 23 + Verify it's installed: 24 + ```bash 25 + plcbundle version 26 + # plcbundle version dev 27 + ``` 28 + 29 + ### Your First Bundle 30 + 31 + Let's fetch your first bundle from the PLC directory: 32 + 33 + ```bash 34 + # Create a directory for your bundles 35 + mkdir my-plc-archive 36 + cd my-plc-archive 37 + 38 + # Fetch one bundle 39 + plcbundle fetch 40 + ``` 41 + 42 + You'll see output like: 43 + ``` 44 + Working in: /Users/you/my-plc-archive 45 + Starting from bundle 000001 46 + Fetching all available bundles... 47 + 48 + Preparing bundle 000001 (mempool: 0 ops)... 49 + Fetching more operations (have 0/10000)... 50 + Fetch #1: requesting 1000 operations (mempool: 0) 51 + Added 1000 new operations (mempool now: 1000) 52 + 53 + ... (continues fetching) ... 54 + 55 + ✓ Bundle 000001 ready (10000 ops, mempool: 0 remaining) 56 + ✓ Saved bundle 000001 (10000 operations, 8543 DIDs) 57 + 58 + ✓ Fetch complete: 1 bundles retrieved 59 + ``` 60 + 61 + **What just happened?** 62 + 63 + plcbundle created two files: 64 + - `000001.jsonl.zst` - Your first bundle (10,000 PLC operations, compressed) 65 + - `index.json` - Index tracking bundle metadata and hashes 66 + 67 + ### Understanding the Files 68 + 69 + **Bundle files** (`000001.jsonl.zst`): 70 + - Contain exactly 10,000 operations each 71 + - Compressed with zstd (~5x compression) 72 + - Named with 6-digit zero-padding 73 + - Immutable once created 74 + 75 + **Index file** (`index.json`): 76 + - Maps bundle numbers to metadata 77 + - Contains cryptographic hashes 78 + - Tracks the bundle chain 79 + - Updated when bundles are added 80 + 81 + **Mempool files** (`plc_mempool_*.jsonl`): 82 + - Temporary staging for operations 83 + - Auto-managed by plcbundle 84 + - Safe to ignore (they clean up automatically) 85 + 86 + ### Check What You Have 87 + 88 + ```bash 89 + plcbundle info 90 + ``` 91 + 92 + You'll see a summary: 93 + ``` 94 + ═══════════════════════════════════════════════════════════ 95 + PLC Bundle Repository Overview 96 + ═══════════════════════════════════════════════════════════ 97 + 98 + 📊 Summary 99 + Bundles: 1 100 + Range: 000001 → 000001 101 + Compressed: 505 KB 102 + Operations: 10,000 records 103 + 104 + 🔐 Chain Hashes 105 + Root (bundle 000001): 106 + 8f4e3a2d9c1b7f5e3a2d9c1b7f5e3a2d9c1b7f5e3a2d9c1b7f5e3a2d9c1b7f5e 107 + ``` 108 + 109 + --- 110 + 111 + ## Basic Workflows 112 + 113 + ### Workflow 1: Building a Complete Archive 114 + 115 + **Goal:** Download all historical PLC bundles to create a complete archive. 116 + 117 + ```bash 118 + # Fetch all available bundles 119 + plcbundle fetch -count 0 120 + ``` 121 + 122 + The `-count 0` means "fetch everything available". This will: 123 + - Start from bundle 1 (or continue from where you left off) 124 + - Keep fetching until caught up 125 + - Create bundles of 10,000 operations each 126 + - Stop when no more complete bundles can be formed 127 + 128 + **Tip:** This can take a while. You can interrupt with Ctrl+C and resume later - just run the same command again. 129 + 130 + ### Workflow 2: Keeping Up-to-Date 131 + 132 + **Goal:** Regularly sync new operations as they arrive. 133 + 134 + ```bash 135 + # Run periodically (cron, systemd timer, etc.) 136 + plcbundle fetch 137 + ``` 138 + 139 + Or use a simple script: 140 + ```bash 141 + #!/bin/bash 142 + # sync-plc.sh 143 + cd /path/to/plc_data 144 + plcbundle fetch 145 + ``` 146 + 147 + Run it every 5-10 minutes: 148 + ```bash 149 + # Crontab 150 + */10 * * * * /path/to/sync-plc.sh 151 + ``` 152 + 153 + **What happens when there aren't enough operations?** 154 + 155 + If fewer than 10,000 new operations exist, they're stored in the mempool: 156 + ```bash 157 + plcbundle mempool 158 + # Mempool Status: 159 + # Operations: 3,482 160 + # Progress: 34.8% (3482/10000) 161 + # Need 6,518 more operations 162 + ``` 163 + 164 + The bundle will be created automatically once 10,000 operations arrive. 165 + 166 + ### Workflow 3: Cloning from Another Server 167 + 168 + **Goal:** Quickly sync bundles from an existing plcbundle server instead of fetching from PLC. 169 + 170 + **Why?** Much faster! Downloading pre-made bundles is faster than fetching and bundling operations yourself. 171 + 172 + ```bash 173 + # Clone from a public mirror 174 + plcbundle clone https://plc.example.com 175 + ``` 176 + 177 + With progress tracking: 178 + ``` 179 + Cloning from: https://plc.example.com 180 + Remote has 8,547 bundles 181 + Downloading 8,547 bundles (4.2 GB) 182 + 183 + [████████░░░░░░] 68.2% | 5,829/8,547 | 42.5/s | 21.3 MB/s | ETA: 1m 4s 184 + ``` 185 + 186 + **Resume after interruption:** 187 + 188 + Press Ctrl+C and the download stops gracefully: 189 + ``` 190 + ⚠️ Interrupt received! Finishing current downloads and saving progress... 191 + 192 + ⚠️ Download interrupted by user 193 + 194 + Results: 195 + Downloaded: 5,829 196 + Total size: 2.9 GB 197 + 198 + ✓ Progress saved. Re-run the clone command to resume. 199 + ``` 200 + 201 + Just run the same command again: 202 + ```bash 203 + plcbundle clone https://plc.example.com 204 + # Skips 5,829 existing bundles, downloads the rest 205 + ``` 206 + 207 + **Speed it up with more workers:** 208 + ```bash 209 + plcbundle clone https://plc.example.com -workers 16 210 + ``` 211 + 212 + ### Workflow 4: Verifying Your Archive 213 + 214 + **Goal:** Ensure your bundles are intact and unmodified. 215 + 216 + ```bash 217 + # Verify the entire chain 218 + plcbundle verify 219 + ``` 220 + 221 + Output: 222 + ``` 223 + Verifying chain of 100 bundles... 224 + [ 10%] Verified 10/100 bundles... 225 + [ 50%] Verified 50/100 bundles... 226 + [100%] Verified 100/100 bundles... 227 + 228 + ✓ Chain is valid (100 bundles verified) 229 + Chain head: 8f4e3a2d9c1b7f5e... 230 + ``` 231 + 232 + **What's being verified?** 233 + 234 + 1. **File integrity**: Each bundle's compressed data matches its hash 235 + 2. **Chain integrity**: Each bundle correctly links to the previous one 236 + 3. **Complete chain**: No broken links from bundle 1 to the last 237 + 238 + **Verify just one bundle:** 239 + ```bash 240 + plcbundle verify -bundle 42 241 + # ✓ Bundle 000042 is valid 242 + ``` 243 + 244 + ### Workflow 5: Sharing Your Archive 245 + 246 + **Goal:** Run an HTTP server so others can clone your bundles. 247 + 248 + ```bash 249 + plcbundle serve 250 + ``` 251 + 252 + Output: 253 + ``` 254 + Starting plcbundle HTTP server... 255 + Directory: /Users/you/plc_data 256 + Listening: http://127.0.0.1:8080 257 + Sync mode: disabled 258 + Bundles available: 100 259 + 260 + Press Ctrl+C to stop 261 + ``` 262 + 263 + Now others can clone from you: 264 + ```bash 265 + # From another machine 266 + plcbundle clone http://your-server:8080 267 + ``` 268 + 269 + **Serve on a different port:** 270 + ```bash 271 + plcbundle serve -port 9000 -host 0.0.0.0 272 + # Listening on all interfaces, port 9000 273 + ``` 274 + 275 + **Auto-sync while serving:** 276 + ```bash 277 + plcbundle serve -sync -sync-interval 5m 278 + # Automatically fetches new bundles every 5 minutes 279 + ``` 280 + 281 + --- 282 + 283 + ## Advanced Usage 284 + 285 + ### Working with Mempool 286 + 287 + The mempool is a staging area for operations waiting to form a complete bundle. 288 + 289 + **Check mempool status:** 290 + ```bash 291 + plcbundle mempool 292 + ``` 293 + 294 + Output: 295 + ``` 296 + Mempool Status: 297 + Target bundle: 000101 298 + Operations: 7,234 299 + Can create bundle: false (need 10000) 300 + Progress: 72.3% (7234/10000) 301 + [████████████████████████████░░░░░░░░░░░░] 302 + First operation: 2024-12-01 10:23:45 303 + Last operation: 2024-12-15 15:47:23 304 + ``` 305 + 306 + **Export mempool operations:** 307 + ```bash 308 + plcbundle mempool -export > recent_operations.jsonl 309 + # Exported 7,234 operations from mempool 310 + ``` 311 + 312 + **Clear mempool (use with caution):** 313 + ```bash 314 + plcbundle mempool -clear 315 + # ⚠ This will clear 7,234 operations from the mempool. 316 + # Are you sure? [y/N]: y 317 + # ✓ Mempool cleared 318 + ``` 319 + 320 + **When to clear mempool:** 321 + - After corrupted operations 322 + - When restarting from scratch 323 + - During development/testing 324 + 325 + **Validate mempool chronology:** 326 + ```bash 327 + plcbundle mempool -validate 328 + # ✓ Mempool validation passed 329 + ``` 330 + 331 + ### Exporting Operations 332 + 333 + **Goal:** Extract operations from bundles for analysis or processing. 334 + 335 + **Export recent operations:** 336 + ```bash 337 + plcbundle export -count 5000 > operations.jsonl 338 + # Exported 5000 operations 339 + ``` 340 + 341 + **Export operations after a specific time:** 342 + ```bash 343 + plcbundle export -after "2024-01-01T00:00:00Z" -count 10000 > jan_2024.jsonl 344 + ``` 345 + 346 + **Stream all operations (backfill):** 347 + ```bash 348 + plcbundle backfill > all_operations.jsonl 349 + # This streams ALL operations from all bundles 350 + ``` 351 + 352 + **Backfill specific range:** 353 + ```bash 354 + plcbundle backfill -start 1 -end 100 > first_100_bundles.jsonl 355 + ``` 356 + 357 + ### Rebuilding the Index 358 + 359 + **Goal:** Regenerate `index.json` from bundle files. 360 + 361 + **When you need this:** 362 + - Downloaded bundle files manually 363 + - Corrupted index file 364 + - Migrated from another system 365 + - After specification updates (during preview) 366 + 367 + ```bash 368 + plcbundle rebuild 369 + ``` 370 + 371 + Output: 372 + ``` 373 + Rebuilding index from: /Users/you/plc_data 374 + Using 8 workers 375 + Found 100 bundle files 376 + 377 + Processing bundles: 378 + [████████████████████████████████████████] 100% | 100/100 | 25.0/s 379 + 380 + ✓ Index rebuilt in 4.2s 381 + Total bundles: 100 382 + Compressed size: 50.5 MB 383 + Uncompressed size: 252.3 MB 384 + Average speed: 23.8 bundles/sec 385 + 386 + ✓ Chain verified: All bundles linked correctly 387 + ``` 388 + 389 + **Speed it up:** 390 + ```bash 391 + plcbundle rebuild -workers 16 392 + ``` 393 + 394 + ### Comparing with Remote 395 + 396 + **Goal:** Check differences between your local archive and a remote server. 397 + 398 + ```bash 399 + plcbundle compare https://plc.example.com 400 + ``` 401 + 402 + Output: 403 + ``` 404 + Comparison Results 405 + ══════════════════ 406 + 407 + Summary 408 + ─────── 409 + Local bundles: 95 410 + Target bundles: 100 411 + Common bundles: 95 412 + Missing bundles: 5 ⚠️ 413 + Hash mismatches: 0 ✓ 414 + 415 + Missing Bundles (in target but not local) 416 + ────────────────────────────────────────── 417 + 000096 418 + 000097 419 + 000098 420 + 000099 421 + 000100 422 + 423 + ✗ Indexes have differences 424 + ``` 425 + 426 + **Auto-fetch missing bundles:** 427 + ```bash 428 + plcbundle compare https://plc.example.com --fetch-missing 429 + ``` 430 + 431 + This will: 432 + 1. Compare indexes 433 + 2. Show differences 434 + 3. Download missing bundles 435 + 4. Update your index 436 + 437 + ### Serving with WebSocket Streaming 438 + 439 + **Goal:** Provide real-time streaming of operations via WebSocket. 440 + 441 + ```bash 442 + plcbundle serve -sync -websocket 443 + ``` 444 + 445 + Output: 446 + ``` 447 + Starting plcbundle HTTP server... 448 + Listening: http://127.0.0.1:8080 449 + Sync mode: ENABLED 450 + WebSocket: ENABLED (ws://127.0.0.1:8080/ws) 451 + ``` 452 + 453 + **Connect with websocat:** 454 + ```bash 455 + # Stream all operations from the beginning 456 + websocat ws://localhost:8080/ws 457 + 458 + # Stream from cursor 100,000 459 + websocat 'ws://localhost:8080/ws?cursor=100000' 460 + ``` 461 + 462 + **What's a cursor?** 463 + 464 + Cursor = `(bundle_number * 10000) + position_in_bundle` 465 + 466 + Example: 467 + - Bundle 1, position 0 = cursor 0 468 + - Bundle 1, position 500 = cursor 500 469 + - Bundle 10, position 0 = cursor 90,000 470 + - Bundle 10, position 2,345 = cursor 92,345 471 + 472 + **Stream and process:** 473 + ```bash 474 + websocat ws://localhost:8080/ws | jq 'select(.did | startswith("did:plc:"))' 475 + # Filter operations in real-time with jq 476 + ``` 477 + 478 + --- 479 + 480 + ## Best Practices 481 + 482 + ### 1. Regular Verification 483 + 484 + Verify your archive periodically: 485 + ```bash 486 + # Weekly cron job 487 + 0 0 * * 0 cd /path/to/plc_data && plcbundle verify 488 + ``` 489 + 490 + ### 2. Backup Strategy 491 + 492 + **Back up the index frequently:** 493 + ```bash 494 + cp index.json index.json.backup 495 + ``` 496 + 497 + **Why?** The index is small but critical. Bundles can be rebuilt, but the index tracks everything. 498 + 499 + **Back up bundles incrementally:** 500 + ```bash 501 + # Rsync to backup server 502 + rsync -av --progress *.jsonl.zst backup-server:/plc_archive/ 503 + ``` 504 + 505 + ### 3. Storage Planning 506 + 507 + **Size estimates:** 508 + - ~500 KB per bundle (compressed) 509 + - 10,000 operations per bundle 510 + - ~50 bytes per operation (compressed) 511 + 512 + **Current PLC size** (check with): 513 + ```bash 514 + curl https://plc.directory/export?count=1 | jq -r '.createdAt' 515 + # Then estimate: ~10,000 bundles as of 2025 = ~5 GB 516 + ``` 517 + 518 + ### 4. Monitoring 519 + 520 + **Create a health check script:** 521 + ```bash 522 + #!/bin/bash 523 + # health-check.sh 524 + 525 + cd /path/to/plc_data 526 + 527 + # Check last bundle age 528 + LAST_BUNDLE=$(plcbundle info | grep "Last Op" | cut -d: -f2-) 529 + AGE=$(plcbundle info | grep "Age" | cut -d: -f2-) 530 + 531 + echo "Last operation: $LAST_BUNDLE" 532 + echo "Age: $AGE" 533 + 534 + # Verify chain 535 + if ! plcbundle verify -bundle $(plcbundle info | grep "Last bundle" | awk '{print $3}'); then 536 + echo "ERROR: Verification failed!" 537 + exit 1 538 + fi 539 + 540 + echo "Status: OK" 541 + ``` 542 + 543 + ### 5. Network Efficiency 544 + 545 + **Use clone instead of fetch when possible:** 546 + ```bash 547 + # Slower: Fetch from PLC directory 548 + plcbundle fetch 549 + 550 + # Faster: Clone from mirror 551 + plcbundle clone https://plc-mirror.example.com 552 + ``` 553 + 554 + **Rate limit considerations:** 555 + 556 + The PLC directory rate limits API requests. plcbundle handles this automatically with: 557 + - Exponential backoff 558 + - Automatic retry 559 + - ~90 requests/minute limit 560 + 561 + ### 6. Disk Space Management 562 + 563 + **Check sizes:** 564 + ```bash 565 + plcbundle info | grep -E "(Compressed|Bundles)" 566 + # Bundles: 100 567 + # Compressed: 50.5 MB 568 + ``` 569 + 570 + **Clean old mempools (safe):** 571 + ```bash 572 + # Old mempools might linger if process was killed 573 + rm plc_mempool_*.jsonl 574 + plcbundle fetch # Will recreate if needed 575 + ``` 576 + 577 + --- 578 + 579 + ## Troubleshooting 580 + 581 + ### Bundle Count Mismatch 582 + 583 + **Problem:** 584 + ``` 585 + Found 100 bundle files but index only has 95 entries 586 + ``` 587 + 588 + **Solution:** 589 + ```bash 590 + plcbundle rebuild 591 + ``` 592 + 593 + This rescans all bundles and rebuilds the index. 594 + 595 + --- 596 + 597 + ### Hash Verification Failed 598 + 599 + **Problem:** 600 + ``` 601 + ✗ Bundle 000042 hash verification failed 602 + Expected hash: a1b2c3d4... 603 + Actual hash: f6e5d4c3... 604 + ``` 605 + 606 + **Possible causes:** 607 + 1. Corrupted file during download 608 + 2. Bundle was modified 609 + 3. Downloaded from untrusted source 610 + 611 + **Solution:** 612 + 613 + If you have a trusted source: 614 + ```bash 615 + # Re-download specific bundle 616 + plcbundle clone https://trusted-mirror.com -workers 1 617 + # (It will skip existing and re-download corrupt ones) 618 + ``` 619 + 620 + Or fetch fresh: 621 + ```bash 622 + # Delete corrupted bundle 623 + rm 000042.jsonl.zst 624 + 625 + # Rebuild index (marks it as missing) 626 + plcbundle rebuild 627 + 628 + # Fetch it again 629 + plcbundle fetch 630 + ``` 631 + 632 + --- 633 + 634 + ### Chain Broken Error 635 + 636 + **Problem:** 637 + ``` 638 + ✗ Chain broken at bundle 000050 639 + Expected parent: a1b2c3d4... 640 + Actual parent: f6e5d4c3... 641 + ``` 642 + 643 + **Meaning:** Bundle 50's parent hash doesn't match bundle 49's hash. 644 + 645 + **Solution:** 646 + 647 + This means bundles came from incompatible sources. You need a consistent chain: 648 + 649 + ```bash 650 + # Start fresh or clone from one trusted source 651 + rm *.jsonl.zst index.json 652 + plcbundle clone https://trusted-mirror.com 653 + ``` 654 + 655 + --- 656 + 657 + ### Out of Disk Space 658 + 659 + **Problem:** 660 + ``` 661 + Error saving bundle: no space left on device 662 + ``` 663 + 664 + **Solution:** 665 + 666 + Check space: 667 + ```bash 668 + df -h . 669 + ``` 670 + 671 + Free up space or move to larger disk: 672 + ```bash 673 + # Move to new location 674 + mv /old/path/* /new/large/disk/ 675 + cd /new/large/disk/ 676 + plcbundle info # Verify it works 677 + ``` 678 + 679 + --- 680 + 681 + ### Fetch Stuck / No Progress 682 + 683 + **Problem:** `plcbundle fetch` runs but doesn't create bundles. 684 + 685 + **Check mempool:** 686 + ```bash 687 + plcbundle mempool 688 + # Operations: 3,482 689 + # Need 6,518 more operations 690 + ``` 691 + 692 + **Meaning:** Not enough operations yet for a complete bundle. 693 + 694 + **Solutions:** 695 + 696 + 1. **Wait** - More operations will arrive 697 + 2. **Check PLC connectivity:** 698 + ```bash 699 + curl https://plc.directory/export?count=1 700 + ``` 701 + 702 + 3. **Check rate limits:** 703 + ```bash 704 + # Look for 429 errors in output 705 + plcbundle fetch -count 1 706 + ``` 707 + 708 + --- 709 + 710 + ### Port Already in Use 711 + 712 + **Problem:** 713 + ``` 714 + Server error: listen tcp :8080: bind: address already in use 715 + ``` 716 + 717 + **Solution:** 718 + 719 + Use different port: 720 + ```bash 721 + plcbundle serve -port 9000 722 + ``` 723 + 724 + Or find what's using port 8080: 725 + ```bash 726 + # macOS/Linux 727 + lsof -i :8080 728 + 729 + # Kill it if needed 730 + kill <PID> 731 + ``` 732 + 733 + --- 734 + 735 + ### WebSocket Connection Drops 736 + 737 + **Problem:** WebSocket streaming stops after a few minutes. 738 + 739 + **Causes:** 740 + - Reverse proxy timeout 741 + - Network timeout 742 + - Client timeout 743 + 744 + **Solutions:** 745 + 746 + 1. **Increase client timeout** (if using websocat): 747 + ```bash 748 + websocat -t ws://localhost:8080/ws 749 + ``` 750 + 751 + 2. **Configure reverse proxy** (nginx): 752 + ```nginx 753 + location /ws { 754 + proxy_pass http://localhost:8080; 755 + proxy_http_version 1.1; 756 + proxy_set_header Upgrade $http_upgrade; 757 + proxy_set_header Connection "upgrade"; 758 + proxy_read_timeout 86400; # 24 hours 759 + } 760 + ``` 761 + 762 + --- 763 + 764 + ### Memory Usage High 765 + 766 + **Problem:** `plcbundle` using lots of RAM during rebuild. 767 + 768 + **Cause:** Large bundles being processed simultaneously. 769 + 770 + **Solution:** 771 + 772 + Reduce workers: 773 + ```bash 774 + plcbundle rebuild -workers 2 775 + ``` 776 + 777 + Or increase system limits: 778 + ```bash 779 + # Check current limits 780 + ulimit -a 781 + 782 + # Increase if needed (Linux) 783 + ulimit -v 4000000 # 4GB virtual memory 784 + ``` 785 + 786 + --- 787 + 788 + ### Can't Clone from Remote 789 + 790 + **Problem:** 791 + ``` 792 + Error loading remote index: failed to download: connection refused 793 + ``` 794 + 795 + **Checklist:** 796 + 797 + 1. **Is URL correct?** 798 + ```bash 799 + curl https://remote-server.com/index.json 800 + ``` 801 + 802 + 2. **Is server running?** 803 + ```bash 804 + # On remote server 805 + plcbundle serve 806 + ``` 807 + 808 + 3. **Firewall blocking?** 809 + ```bash 810 + # Test connectivity 811 + telnet remote-server.com 8080 812 + ``` 813 + 814 + 4. **HTTPS certificate issues?** 815 + ```bash 816 + # Test with curl 817 + curl -v https://remote-server.com/index.json 818 + ``` 819 + 820 + --- 821 + 822 + ## Quick Reference 823 + 824 + ```bash 825 + # Sync 826 + plcbundle fetch # Fetch next bundle 827 + plcbundle fetch -count 0 # Fetch all available 828 + plcbundle clone <url> # Clone from remote 829 + 830 + # Manage 831 + plcbundle info # Show repository info 832 + plcbundle info -bundle 42 # Show specific bundle 833 + plcbundle rebuild # Rebuild index 834 + plcbundle verify # Verify chain 835 + 836 + # Export 837 + plcbundle export -count 1000 # Export operations 838 + plcbundle backfill > all.jsonl # Export everything 839 + plcbundle mempool -export > mem.jsonl # Export mempool 840 + 841 + # Serve 842 + plcbundle serve # Basic HTTP server 843 + plcbundle serve -sync -websocket # Full-featured server 844 + 845 + # Utilities 846 + plcbundle compare <url> # Compare with remote 847 + plcbundle mempool # Check mempool status 848 + plcbundle version # Show version 849 + ``` 850 + 851 + --- 852 + 853 + ## Getting Help 854 + 855 + **Command help:** 856 + ```bash 857 + plcbundle fetch -h 858 + ```
+1630
docs/library.md
··· 1 + # Library Guide 2 + 3 + A practical guide to using plcbundle as a Go library in your applications. 4 + 5 + ## Table of Contents 6 + 7 + - [Getting Started](#getting-started) 8 + - [Core Concepts](#core-concepts) 9 + - [Common Patterns](#common-patterns) 10 + - [Building Applications](#building-applications) 11 + - [Advanced Usage](#advanced-usage) 12 + - [Best Practices](#best-practices) 13 + - [API Reference](#api-reference) 14 + 15 + --- 16 + 17 + ## Getting Started 18 + 19 + ### Installation 20 + 21 + ```bash 22 + go get tangled.org/atscan.net/plcbundle 23 + ``` 24 + 25 + ### Your First Program 26 + 27 + Create a simple program to fetch and display bundle information: 28 + 29 + ```go 30 + package main 31 + 32 + import ( 33 + "context" 34 + "log" 35 + 36 + plcbundle "tangled.org/atscan.net/plcbundle" 37 + ) 38 + 39 + func main() { 40 + // Create a manager 41 + mgr, err := plcbundle.New("./plc_data", "https://plc.directory") 42 + if err != nil { 43 + log.Fatal(err) 44 + } 45 + defer mgr.Close() 46 + 47 + // Get repository info 48 + info := mgr.GetInfo() 49 + log.Printf("Bundle directory: %s", info["bundle_dir"]) 50 + 51 + // Get index stats 52 + index := mgr.GetIndex() 53 + stats := index.GetStats() 54 + log.Printf("Total bundles: %d", stats["bundle_count"]) 55 + } 56 + ``` 57 + 58 + Run it: 59 + ```bash 60 + go run main.go 61 + # 2025/01/15 10:30:00 Bundle directory: ./plc_data 62 + # 2025/01/15 10:30:00 Total bundles: 0 63 + ``` 64 + 65 + ### Fetching Your First Bundle 66 + 67 + Let's fetch a bundle from the PLC directory: 68 + 69 + ```go 70 + package main 71 + 72 + import ( 73 + "context" 74 + "log" 75 + 76 + plcbundle "tangled.org/atscan.net/plcbundle" 77 + ) 78 + 79 + func main() { 80 + mgr, err := plcbundle.New("./plc_data", "https://plc.directory") 81 + if err != nil { 82 + log.Fatal(err) 83 + } 84 + defer mgr.Close() 85 + 86 + ctx := context.Background() 87 + 88 + // Fetch next bundle 89 + log.Println("Fetching bundle...") 90 + bundle, err := mgr.FetchNext(ctx) 91 + if err != nil { 92 + log.Fatal(err) 93 + } 94 + 95 + log.Printf("✓ Fetched bundle %d", bundle.BundleNumber) 96 + log.Printf(" Operations: %d", len(bundle.Operations)) 97 + log.Printf(" Unique DIDs: %d", bundle.DIDCount) 98 + log.Printf(" Time range: %s to %s", 99 + bundle.StartTime.Format("2006-01-02"), 100 + bundle.EndTime.Format("2006-01-02")) 101 + } 102 + ``` 103 + 104 + **What's happening here?** 105 + 106 + 1. `plcbundle.New()` creates a manager that handles all bundle operations 107 + 2. `FetchNext()` automatically: 108 + - Fetches operations from PLC directory 109 + - Creates a bundle when 10,000 operations are collected 110 + - Saves the bundle to disk 111 + - Updates the index 112 + - Returns the bundle object 113 + 114 + ### Reading Bundles 115 + 116 + Once you have bundles, you can load and read them: 117 + 118 + ```go 119 + package main 120 + 121 + import ( 122 + "context" 123 + "log" 124 + 125 + plcbundle "tangled.org/atscan.net/plcbundle" 126 + ) 127 + 128 + func main() { 129 + mgr, err := plcbundle.New("./plc_data", "") 130 + if err != nil { 131 + log.Fatal(err) 132 + } 133 + defer mgr.Close() 134 + 135 + ctx := context.Background() 136 + 137 + // Load bundle 1 138 + bundle, err := mgr.Load(ctx, 1) 139 + if err != nil { 140 + log.Fatal(err) 141 + } 142 + 143 + log.Printf("Bundle %d loaded", bundle.BundleNumber) 144 + 145 + // Iterate through operations 146 + for i, op := range bundle.Operations { 147 + if i >= 5 { 148 + break // Just show first 5 149 + } 150 + log.Printf("%d. DID: %s, CID: %s", i+1, op.DID, op.CID) 151 + } 152 + } 153 + ``` 154 + 155 + --- 156 + 157 + ## Core Concepts 158 + 159 + ### The Manager 160 + 161 + The `Manager` is your main entry point. It handles: 162 + - Bundle storage and retrieval 163 + - Index management 164 + - PLC directory synchronization 165 + - Verification 166 + - Mempool management 167 + 168 + **Creating a manager:** 169 + 170 + ```go 171 + // Simple creation 172 + mgr, err := plcbundle.New("./bundles", "https://plc.directory") 173 + 174 + // Custom configuration 175 + config := plcbundle.DefaultConfig("./bundles") 176 + config.VerifyOnLoad = true 177 + config.AutoRebuild = true 178 + 179 + plcClient := plcbundle.NewPLCClient("https://plc.directory") 180 + mgr, err := plcbundle.NewManager(config, plcClient) 181 + ``` 182 + 183 + ### Bundles 184 + 185 + A bundle contains exactly 10,000 operations: 186 + 187 + ```go 188 + type Bundle struct { 189 + BundleNumber int // Sequential number (1, 2, 3...) 190 + StartTime time.Time // First operation timestamp 191 + EndTime time.Time // Last operation timestamp 192 + Operations []plc.PLCOperation // The 10,000 operations 193 + DIDCount int // Unique DIDs in bundle 194 + Hash string // Chain hash (includes history) 195 + ContentHash string // This bundle's content hash 196 + Parent string // Previous bundle's chain hash 197 + CompressedSize int64 // File size on disk 198 + UncompressedSize int64 // Original JSONL size 199 + } 200 + ``` 201 + 202 + ### The Index 203 + 204 + The index tracks all bundles and their metadata: 205 + 206 + ```go 207 + index := mgr.GetIndex() 208 + 209 + // Get all bundles 210 + bundles := index.GetBundles() 211 + for _, meta := range bundles { 212 + log.Printf("Bundle %d: %s to %s", 213 + meta.BundleNumber, 214 + meta.StartTime.Format("2006-01-02"), 215 + meta.EndTime.Format("2006-01-02")) 216 + } 217 + 218 + // Get specific bundle metadata 219 + meta, err := index.GetBundle(42) 220 + 221 + // Get last bundle 222 + lastBundle := index.GetLastBundle() 223 + ``` 224 + 225 + ### Operations 226 + 227 + Each operation represents a DID PLC directory event: 228 + 229 + ```go 230 + type PLCOperation struct { 231 + DID string // The DID (did:plc:...) 232 + Operation map[string]interface{} // The operation data 233 + CID string // Content identifier 234 + Nullified interface{} // nil, false, or CID string 235 + CreatedAt time.Time // When it was created 236 + RawJSON []byte // Original JSON bytes 237 + } 238 + 239 + // Check if operation was nullified 240 + if op.IsNullified() { 241 + log.Printf("Operation %s was nullified by %s", op.CID, op.GetNullifyingCID()) 242 + } 243 + ``` 244 + 245 + --- 246 + 247 + ## Common Patterns 248 + 249 + ### Pattern 1: Transparent Sync Service 250 + 251 + **Goal:** Keep a local PLC mirror continuously synchronized. 252 + 253 + This is the most common use case - maintaining an up-to-date copy of the PLC directory. 254 + 255 + ```go 256 + package main 257 + 258 + import ( 259 + "context" 260 + "log" 261 + "os" 262 + "os/signal" 263 + "syscall" 264 + "time" 265 + 266 + plcbundle "tangled.org/atscan.net/plcbundle" 267 + ) 268 + 269 + type SyncService struct { 270 + mgr *plcbundle.Manager 271 + interval time.Duration 272 + stop chan struct{} 273 + } 274 + 275 + func NewSyncService(bundleDir string, interval time.Duration) (*SyncService, error) { 276 + mgr, err := plcbundle.New(bundleDir, "https://plc.directory") 277 + if err != nil { 278 + return nil, err 279 + } 280 + 281 + return &SyncService{ 282 + mgr: mgr, 283 + interval: interval, 284 + stop: make(chan struct{}), 285 + }, nil 286 + } 287 + 288 + func (s *SyncService) Start() { 289 + log.Println("Starting sync service...") 290 + 291 + // Initial sync 292 + s.sync() 293 + 294 + // Periodic sync 295 + ticker := time.NewTicker(s.interval) 296 + defer ticker.Stop() 297 + 298 + for { 299 + select { 300 + case <-ticker.C: 301 + s.sync() 302 + case <-s.stop: 303 + log.Println("Sync service stopped") 304 + return 305 + } 306 + } 307 + } 308 + 309 + func (s *SyncService) sync() { 310 + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute) 311 + defer cancel() 312 + 313 + log.Println("Checking for new bundles...") 314 + 315 + fetched := 0 316 + for { 317 + bundle, err := s.mgr.FetchNext(ctx) 318 + if err != nil { 319 + if isInsufficientOps(err) { 320 + if fetched > 0 { 321 + log.Printf("✓ Synced %d new bundles", fetched) 322 + } else { 323 + log.Println("✓ Up to date") 324 + } 325 + return 326 + } 327 + log.Printf("Error: %v", err) 328 + return 329 + } 330 + 331 + fetched++ 332 + log.Printf("✓ Fetched bundle %d (%d ops, %d DIDs)", 333 + bundle.BundleNumber, len(bundle.Operations), bundle.DIDCount) 334 + } 335 + } 336 + 337 + func (s *SyncService) Stop() { 338 + close(s.stop) 339 + s.mgr.Close() 340 + } 341 + 342 + func isInsufficientOps(err error) bool { 343 + return err != nil && 344 + (strings.Contains(err.Error(), "insufficient operations") || 345 + strings.Contains(err.Error(), "no more available")) 346 + } 347 + 348 + func main() { 349 + service, err := NewSyncService("./plc_data", 5*time.Minute) 350 + if err != nil { 351 + log.Fatal(err) 352 + } 353 + 354 + // Start service in background 355 + go service.Start() 356 + 357 + // Wait for interrupt 358 + sigChan := make(chan os.Signal, 1) 359 + signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM) 360 + <-sigChan 361 + 362 + log.Println("Shutting down...") 363 + service.Stop() 364 + } 365 + ``` 366 + 367 + **Usage:** 368 + ```bash 369 + go run main.go 370 + # Starting sync service... 371 + # Checking for new bundles... 372 + # ✓ Fetched bundle 8548 (10000 ops, 8234 DIDs) 373 + # ✓ Fetched bundle 8549 (10000 ops, 8156 DIDs) 374 + # ✓ Up to date 375 + # ... (repeats every 5 minutes) 376 + ``` 377 + 378 + ### Pattern 2: Reading and Processing Operations 379 + 380 + **Goal:** Process all historical operations for analysis. 381 + 382 + ```go 383 + package main 384 + 385 + import ( 386 + "context" 387 + "log" 388 + 389 + plcbundle "tangled.org/atscan.net/plcbundle" 390 + ) 391 + 392 + type OperationProcessor struct { 393 + mgr *plcbundle.Manager 394 + } 395 + 396 + func NewOperationProcessor(bundleDir string) (*OperationProcessor, error) { 397 + mgr, err := plcbundle.New(bundleDir, "") 398 + if err != nil { 399 + return nil, err 400 + } 401 + 402 + return &OperationProcessor{mgr: mgr}, nil 403 + } 404 + 405 + func (p *OperationProcessor) ProcessAll() error { 406 + ctx := context.Background() 407 + 408 + index := p.mgr.GetIndex() 409 + bundles := index.GetBundles() 410 + 411 + log.Printf("Processing %d bundles...", len(bundles)) 412 + 413 + totalOps := 0 414 + uniqueDIDs := make(map[string]bool) 415 + 416 + for _, meta := range bundles { 417 + // Load bundle 418 + bundle, err := p.mgr.Load(ctx, meta.BundleNumber) 419 + if err != nil { 420 + return err 421 + } 422 + 423 + // Process operations 424 + for _, op := range bundle.Operations { 425 + totalOps++ 426 + uniqueDIDs[op.DID] = true 427 + 428 + // Your processing logic here 429 + p.processOperation(op) 430 + } 431 + 432 + if meta.BundleNumber % 100 == 0 { 433 + log.Printf("Processed bundle %d...", meta.BundleNumber) 434 + } 435 + } 436 + 437 + log.Printf("✓ Processed %d operations from %d unique DIDs", 438 + totalOps, len(uniqueDIDs)) 439 + 440 + return nil 441 + } 442 + 443 + func (p *OperationProcessor) processOperation(op plcbundle.PLCOperation) { 444 + // Example: Extract PDS endpoints 445 + if services, ok := op.Operation["services"].(map[string]interface{}); ok { 446 + if pds, ok := services["atproto_pds"].(map[string]interface{}); ok { 447 + if endpoint, ok := pds["endpoint"].(string); ok { 448 + log.Printf("DID %s uses PDS: %s", op.DID, endpoint) 449 + } 450 + } 451 + } 452 + } 453 + 454 + func main() { 455 + processor, err := NewOperationProcessor("./plc_data") 456 + if err != nil { 457 + log.Fatal(err) 458 + } 459 + 460 + if err := processor.ProcessAll(); err != nil { 461 + log.Fatal(err) 462 + } 463 + } 464 + ``` 465 + 466 + ### Pattern 3: Time-Based Queries 467 + 468 + **Goal:** Export operations from a specific time period. 469 + 470 + ```go 471 + package main 472 + 473 + import ( 474 + "context" 475 + "encoding/json" 476 + "log" 477 + "os" 478 + "time" 479 + 480 + plcbundle "tangled.org/atscan.net/plcbundle" 481 + ) 482 + 483 + func exportOperationsSince(bundleDir string, since time.Time, limit int) error { 484 + mgr, err := plcbundle.New(bundleDir, "") 485 + if err != nil { 486 + return err 487 + } 488 + defer mgr.Close() 489 + 490 + ctx := context.Background() 491 + 492 + // Export operations after timestamp 493 + ops, err := mgr.Export(ctx, since, limit) 494 + if err != nil { 495 + return err 496 + } 497 + 498 + log.Printf("Exporting %d operations...", len(ops)) 499 + 500 + // Write as JSONL to stdout 501 + encoder := json.NewEncoder(os.Stdout) 502 + for _, op := range ops { 503 + if err := encoder.Encode(op); err != nil { 504 + return err 505 + } 506 + } 507 + 508 + return nil 509 + } 510 + 511 + func main() { 512 + // Export operations from the last 7 days 513 + since := time.Now().AddDate(0, 0, -7) 514 + 515 + if err := exportOperationsSince("./plc_data", since, 50000); err != nil { 516 + log.Fatal(err) 517 + } 518 + } 519 + ``` 520 + 521 + **Output to file:** 522 + ```bash 523 + go run main.go > last_7_days.jsonl 524 + ``` 525 + 526 + ### Pattern 4: Verification Service 527 + 528 + **Goal:** Periodically verify bundle integrity. 529 + 530 + ```go 531 + package main 532 + 533 + import ( 534 + "context" 535 + "log" 536 + "time" 537 + 538 + plcbundle "tangled.org/atscan.net/plcbundle" 539 + ) 540 + 541 + type VerificationService struct { 542 + mgr *plcbundle.Manager 543 + interval time.Duration 544 + } 545 + 546 + func NewVerificationService(bundleDir string, interval time.Duration) (*VerificationService, error) { 547 + mgr, err := plcbundle.New(bundleDir, "") 548 + if err != nil { 549 + return nil, err 550 + } 551 + 552 + return &VerificationService{ 553 + mgr: mgr, 554 + interval: interval, 555 + }, nil 556 + } 557 + 558 + func (v *VerificationService) Start() { 559 + ticker := time.NewTicker(v.interval) 560 + defer ticker.Stop() 561 + 562 + // Verify immediately on start 563 + v.verify() 564 + 565 + for range ticker.C { 566 + v.verify() 567 + } 568 + } 569 + 570 + func (v *VerificationService) verify() { 571 + ctx, cancel := context.WithTimeout(context.Background(), 30*time.Minute) 572 + defer cancel() 573 + 574 + log.Println("Starting chain verification...") 575 + start := time.Now() 576 + 577 + result, err := v.mgr.VerifyChain(ctx) 578 + if err != nil { 579 + log.Printf("❌ Verification error: %v", err) 580 + return 581 + } 582 + 583 + elapsed := time.Since(start) 584 + 585 + if result.Valid { 586 + log.Printf("✅ Chain verified: %d bundles, took %s", 587 + result.ChainLength, elapsed.Round(time.Second)) 588 + 589 + // Get head hash 590 + index := v.mgr.GetIndex() 591 + if last := index.GetLastBundle(); last != nil { 592 + log.Printf(" Head hash: %s...", last.Hash[:16]) 593 + } 594 + } else { 595 + log.Printf("❌ Chain broken at bundle %d: %s", 596 + result.BrokenAt, result.Error) 597 + 598 + // Alert or take action 599 + v.handleBrokenChain(result) 600 + } 601 + } 602 + 603 + func (v *VerificationService) handleBrokenChain(result *plcbundle.ChainVerificationResult) { 604 + // Send alert, trigger re-sync, etc. 605 + log.Printf("⚠️ ALERT: Chain integrity compromised!") 606 + // TODO: Implement your alerting logic 607 + } 608 + 609 + func main() { 610 + service, err := NewVerificationService("./plc_data", 24*time.Hour) 611 + if err != nil { 612 + log.Fatal(err) 613 + } 614 + 615 + log.Println("Verification service started (daily checks)") 616 + service.Start() 617 + } 618 + ``` 619 + 620 + ### Pattern 5: Custom HTTP API 621 + 622 + **Goal:** Build a custom API on top of your bundle archive. 623 + 624 + ```go 625 + package main 626 + 627 + import ( 628 + "encoding/json" 629 + "log" 630 + "net/http" 631 + "strconv" 632 + 633 + plcbundle "tangled.org/atscan.net/plcbundle" 634 + ) 635 + 636 + type API struct { 637 + mgr *plcbundle.Manager 638 + } 639 + 640 + func NewAPI(bundleDir string) (*API, error) { 641 + mgr, err := plcbundle.New(bundleDir, "") 642 + if err != nil { 643 + return nil, err 644 + } 645 + 646 + return &API{mgr: mgr}, nil 647 + } 648 + 649 + func (api *API) handleStats(w http.ResponseWriter, r *http.Request) { 650 + index := api.mgr.GetIndex() 651 + stats := index.GetStats() 652 + 653 + response := map[string]interface{}{ 654 + "bundles": stats["bundle_count"], 655 + "first": stats["first_bundle"], 656 + "last": stats["last_bundle"], 657 + "total_size": stats["total_size"], 658 + "start_time": stats["start_time"], 659 + "end_time": stats["end_time"], 660 + "updated_at": stats["updated_at"], 661 + } 662 + 663 + w.Header().Set("Content-Type", "application/json") 664 + json.NewEncoder(w).Encode(response) 665 + } 666 + 667 + func (api *API) handleOperations(w http.ResponseWriter, r *http.Request) { 668 + bundleNumStr := r.URL.Query().Get("bundle") 669 + if bundleNumStr == "" { 670 + http.Error(w, "bundle parameter required", http.StatusBadRequest) 671 + return 672 + } 673 + 674 + bundleNum, err := strconv.Atoi(bundleNumStr) 675 + if err != nil { 676 + http.Error(w, "invalid bundle number", http.StatusBadRequest) 677 + return 678 + } 679 + 680 + ctx := r.Context() 681 + bundle, err := api.mgr.Load(ctx, bundleNum) 682 + if err != nil { 683 + http.Error(w, err.Error(), http.StatusNotFound) 684 + return 685 + } 686 + 687 + w.Header().Set("Content-Type", "application/x-ndjson") 688 + encoder := json.NewEncoder(w) 689 + for _, op := range bundle.Operations { 690 + encoder.Encode(op) 691 + } 692 + } 693 + 694 + func (api *API) handleDID(w http.ResponseWriter, r *http.Request) { 695 + did := r.URL.Query().Get("did") 696 + if did == "" { 697 + http.Error(w, "did parameter required", http.StatusBadRequest) 698 + return 699 + } 700 + 701 + ctx := r.Context() 702 + 703 + // Search through bundles for this DID 704 + var operations []plcbundle.PLCOperation 705 + 706 + index := api.mgr.GetIndex() 707 + bundles := index.GetBundles() 708 + 709 + for _, meta := range bundles { 710 + bundle, err := api.mgr.Load(ctx, meta.BundleNumber) 711 + if err != nil { 712 + continue 713 + } 714 + 715 + for _, op := range bundle.Operations { 716 + if op.DID == did { 717 + operations = append(operations, op) 718 + } 719 + } 720 + } 721 + 722 + w.Header().Set("Content-Type", "application/json") 723 + json.NewEncoder(w).Encode(map[string]interface{}{ 724 + "did": did, 725 + "operations": operations, 726 + "count": len(operations), 727 + }) 728 + } 729 + 730 + func main() { 731 + api, err := NewAPI("./plc_data") 732 + if err != nil { 733 + log.Fatal(err) 734 + } 735 + 736 + http.HandleFunc("/stats", api.handleStats) 737 + http.HandleFunc("/operations", api.handleOperations) 738 + http.HandleFunc("/did", api.handleDID) 739 + 740 + log.Println("API listening on :8080") 741 + log.Fatal(http.ListenAndServe(":8080", nil)) 742 + } 743 + ``` 744 + 745 + **Usage:** 746 + ```bash 747 + # Get stats 748 + curl http://localhost:8080/stats 749 + 750 + # Get operations from bundle 1 751 + curl http://localhost:8080/operations?bundle=1 752 + 753 + # Get all operations for a DID 754 + curl http://localhost:8080/did?did=did:plc:example123 755 + ``` 756 + 757 + --- 758 + 759 + ## Building Applications 760 + 761 + ### Application 1: PDS Discovery Tool 762 + 763 + Find all PDS endpoints in the network: 764 + 765 + ```go 766 + package main 767 + 768 + import ( 769 + "context" 770 + "fmt" 771 + "log" 772 + 773 + plcbundle "tangled.org/atscan.net/plcbundle" 774 + ) 775 + 776 + type PDSTracker struct { 777 + mgr *plcbundle.Manager 778 + endpoints map[string]int // endpoint -> count 779 + } 780 + 781 + func NewPDSTracker(bundleDir string) (*PDSTracker, error) { 782 + mgr, err := plcbundle.New(bundleDir, "") 783 + if err != nil { 784 + return nil, err 785 + } 786 + 787 + return &PDSTracker{ 788 + mgr: mgr, 789 + endpoints: make(map[string]int), 790 + }, nil 791 + } 792 + 793 + func (pt *PDSTracker) Scan() error { 794 + ctx := context.Background() 795 + 796 + index := pt.mgr.GetIndex() 797 + bundles := index.GetBundles() 798 + 799 + log.Printf("Scanning %d bundles for PDS endpoints...", len(bundles)) 800 + 801 + for _, meta := range bundles { 802 + bundle, err := pt.mgr.Load(ctx, meta.BundleNumber) 803 + if err != nil { 804 + return err 805 + } 806 + 807 + for _, op := range bundle.Operations { 808 + if endpoint := pt.extractPDS(op); endpoint != "" { 809 + pt.endpoints[endpoint]++ 810 + } 811 + } 812 + } 813 + 814 + return nil 815 + } 816 + 817 + func (pt *PDSTracker) extractPDS(op plcbundle.PLCOperation) string { 818 + services, ok := op.Operation["services"].(map[string]interface{}) 819 + if !ok { 820 + return "" 821 + } 822 + 823 + pds, ok := services["atproto_pds"].(map[string]interface{}) 824 + if !ok { 825 + return "" 826 + } 827 + 828 + endpoint, ok := pds["endpoint"].(string) 829 + if !ok { 830 + return "" 831 + } 832 + 833 + return endpoint 834 + } 835 + 836 + func (pt *PDSTracker) PrintResults() { 837 + log.Printf("\nFound %d unique PDS endpoints:\n", len(pt.endpoints)) 838 + 839 + // Sort by count 840 + type endpointCount struct { 841 + endpoint string 842 + count int 843 + } 844 + 845 + var sorted []endpointCount 846 + for endpoint, count := range pt.endpoints { 847 + sorted = append(sorted, endpointCount{endpoint, count}) 848 + } 849 + 850 + sort.Slice(sorted, func(i, j int) bool { 851 + return sorted[i].count > sorted[j].count 852 + }) 853 + 854 + // Print top 20 855 + for i, ec := range sorted { 856 + if i >= 20 { 857 + break 858 + } 859 + fmt.Printf("%3d. %s (%d DIDs)\n", i+1, ec.endpoint, ec.count) 860 + } 861 + } 862 + 863 + func main() { 864 + tracker, err := NewPDSTracker("./plc_data") 865 + if err != nil { 866 + log.Fatal(err) 867 + } 868 + 869 + if err := tracker.Scan(); err != nil { 870 + log.Fatal(err) 871 + } 872 + 873 + tracker.PrintResults() 874 + } 875 + ``` 876 + 877 + ### Application 2: DID History Viewer 878 + 879 + View the complete history of a DID: 880 + 881 + ```go 882 + package main 883 + 884 + import ( 885 + "context" 886 + "encoding/json" 887 + "fmt" 888 + "log" 889 + "os" 890 + 891 + plcbundle "tangled.org/atscan.net/plcbundle" 892 + ) 893 + 894 + type DIDHistory struct { 895 + DID string `json:"did"` 896 + Operations []plcbundle.PLCOperation `json:"operations"` 897 + FirstSeen time.Time `json:"first_seen"` 898 + LastSeen time.Time `json:"last_seen"` 899 + OpCount int `json:"operation_count"` 900 + } 901 + 902 + func getDIDHistory(bundleDir, did string) (*DIDHistory, error) { 903 + mgr, err := plcbundle.New(bundleDir, "") 904 + if err != nil { 905 + return nil, err 906 + } 907 + defer mgr.Close() 908 + 909 + ctx := context.Background() 910 + 911 + history := &DIDHistory{ 912 + DID: did, 913 + Operations: make([]plcbundle.PLCOperation, 0), 914 + } 915 + 916 + index := mgr.GetIndex() 917 + bundles := index.GetBundles() 918 + 919 + log.Printf("Searching for DID %s...", did) 920 + 921 + for _, meta := range bundles { 922 + bundle, err := mgr.Load(ctx, meta.BundleNumber) 923 + if err != nil { 924 + continue 925 + } 926 + 927 + for _, op := range bundle.Operations { 928 + if op.DID == did { 929 + history.Operations = append(history.Operations, op) 930 + } 931 + } 932 + } 933 + 934 + if len(history.Operations) == 0 { 935 + return nil, fmt.Errorf("DID not found") 936 + } 937 + 938 + // Set timestamps 939 + history.FirstSeen = history.Operations[0].CreatedAt 940 + history.LastSeen = history.Operations[len(history.Operations)-1].CreatedAt 941 + history.OpCount = len(history.Operations) 942 + 943 + return history, nil 944 + } 945 + 946 + func main() { 947 + if len(os.Args) < 2 { 948 + log.Fatal("Usage: did-history <did>") 949 + } 950 + 951 + did := os.Args[1] 952 + 953 + history, err := getDIDHistory("./plc_data", did) 954 + if err != nil { 955 + log.Fatal(err) 956 + } 957 + 958 + // Print as JSON 959 + encoder := json.NewEncoder(os.Stdout) 960 + encoder.SetIndent("", " ") 961 + encoder.Encode(history) 962 + } 963 + ``` 964 + 965 + ### Application 3: Real-time Monitor 966 + 967 + Monitor new operations as they arrive: 968 + 969 + ```go 970 + package main 971 + 972 + import ( 973 + "context" 974 + "log" 975 + "time" 976 + 977 + plcbundle "tangled.org/atscan.net/plcbundle" 978 + ) 979 + 980 + type Monitor struct { 981 + mgr *plcbundle.Manager 982 + lastSeen int // Last bundle number processed 983 + pollInterval time.Duration 984 + } 985 + 986 + func NewMonitor(bundleDir string, pollInterval time.Duration) (*Monitor, error) { 987 + mgr, err := plcbundle.New(bundleDir, "https://plc.directory") 988 + if err != nil { 989 + return nil, err 990 + } 991 + 992 + // Get current position 993 + index := mgr.GetIndex() 994 + lastBundle := index.GetLastBundle() 995 + lastSeen := 0 996 + if lastBundle != nil { 997 + lastSeen = lastBundle.BundleNumber 998 + } 999 + 1000 + return &Monitor{ 1001 + mgr: mgr, 1002 + lastSeen: lastSeen, 1003 + pollInterval: pollInterval, 1004 + }, nil 1005 + } 1006 + 1007 + func (m *Monitor) Start() { 1008 + log.Println("Monitor started, watching for new bundles...") 1009 + 1010 + ticker := time.NewTicker(m.pollInterval) 1011 + defer ticker.Stop() 1012 + 1013 + for range ticker.C { 1014 + m.check() 1015 + } 1016 + } 1017 + 1018 + func (m *Monitor) check() { 1019 + ctx := context.Background() 1020 + 1021 + // Try to fetch next bundle 1022 + bundle, err := m.mgr.FetchNext(ctx) 1023 + if err != nil { 1024 + // Not an error if no new bundle available 1025 + return 1026 + } 1027 + 1028 + // New bundle! 1029 + log.Printf("🔔 New bundle: %d", bundle.BundleNumber) 1030 + log.Printf(" Operations: %d", len(bundle.Operations)) 1031 + log.Printf(" DIDs: %d", bundle.DIDCount) 1032 + log.Printf(" Time: %s", bundle.EndTime.Format("2006-01-02 15:04:05")) 1033 + 1034 + // Process new operations 1035 + m.processNewOperations(bundle) 1036 + 1037 + m.lastSeen = bundle.BundleNumber 1038 + } 1039 + 1040 + func (m *Monitor) processNewOperations(bundle *plcbundle.Bundle) { 1041 + for _, op := range bundle.Operations { 1042 + // Check for interesting operations 1043 + if op.IsNullified() { 1044 + log.Printf(" ⚠️ Nullified: %s", op.DID) 1045 + } 1046 + 1047 + // Check for new DIDs (operation type "create") 1048 + if opType, ok := op.Operation["type"].(string); ok && opType == "create" { 1049 + log.Printf(" ➕ New DID: %s", op.DID) 1050 + } 1051 + } 1052 + } 1053 + 1054 + func main() { 1055 + monitor, err := NewMonitor("./plc_data", 30*time.Second) 1056 + if err != nil { 1057 + log.Fatal(err) 1058 + } 1059 + 1060 + monitor.Start() 1061 + } 1062 + ``` 1063 + 1064 + --- 1065 + 1066 + ## Advanced Usage 1067 + 1068 + ### Custom Configuration 1069 + 1070 + Full control over bundle manager behavior: 1071 + 1072 + ```go 1073 + package main 1074 + 1075 + import ( 1076 + "log" 1077 + "runtime" 1078 + "time" 1079 + 1080 + "tangled.org/atscan.net/plcbundle/bundle" 1081 + "tangled.org/atscan.net/plcbundle/plc" 1082 + plcbundle "tangled.org/atscan.net/plcbundle" 1083 + ) 1084 + 1085 + func main() { 1086 + // Custom configuration 1087 + config := &bundle.Config{ 1088 + BundleDir: "./my_bundles", 1089 + VerifyOnLoad: true, // Verify hashes when loading 1090 + AutoRebuild: true, // Auto-rebuild index if needed 1091 + RebuildWorkers: runtime.NumCPU(), // Parallel workers for rebuild 1092 + Logger: &MyCustomLogger{}, // Custom logger 1093 + 1094 + // Progress callback for rebuild 1095 + RebuildProgress: func(current, total int) { 1096 + if current%100 == 0 { 1097 + log.Printf("Rebuild: %d/%d (%.1f%%)", 1098 + current, total, float64(current)/float64(total)*100) 1099 + } 1100 + }, 1101 + } 1102 + 1103 + // Custom PLC client with rate limiting 1104 + plcClient := plc.NewClient("https://plc.directory", 1105 + plc.WithRateLimit(60, time.Minute), // 60 req/min 1106 + plc.WithTimeout(30*time.Second), // 30s timeout 1107 + plc.WithLogger(&MyCustomLogger{}), // Custom logger 1108 + ) 1109 + 1110 + // Create manager 1111 + mgr, err := bundle.NewManager(config, plcClient) 1112 + if err != nil { 1113 + log.Fatal(err) 1114 + } 1115 + defer mgr.Close() 1116 + 1117 + log.Println("Manager created with custom configuration") 1118 + } 1119 + 1120 + // Custom logger implementation 1121 + type MyCustomLogger struct{} 1122 + 1123 + func (l *MyCustomLogger) Printf(format string, v ...interface{}) { 1124 + // Add custom formatting, filtering, etc. 1125 + log.Printf("[PLCBUNDLE] "+format, v...) 1126 + } 1127 + 1128 + func (l *MyCustomLogger) Println(v ...interface{}) { 1129 + log.Println(append([]interface{}{"[PLCBUNDLE]"}, v...)...) 1130 + } 1131 + ``` 1132 + 1133 + ### Streaming Data 1134 + 1135 + Stream bundle data without loading everything into memory: 1136 + 1137 + ```go 1138 + package main 1139 + 1140 + import ( 1141 + "bufio" 1142 + "context" 1143 + "encoding/json" 1144 + "io" 1145 + "log" 1146 + 1147 + plcbundle "tangled.org/atscan.net/plcbundle" 1148 + ) 1149 + 1150 + func streamBundle(mgr *plcbundle.Manager, bundleNumber int) error { 1151 + ctx := context.Background() 1152 + 1153 + // Get decompressed stream 1154 + reader, err := mgr.StreamDecompressed(ctx, bundleNumber) 1155 + if err != nil { 1156 + return err 1157 + } 1158 + defer reader.Close() 1159 + 1160 + // Read line by line (JSONL) 1161 + scanner := bufio.NewScanner(reader) 1162 + 1163 + // Set buffer size for large lines 1164 + buf := make([]byte, 0, 64*1024) 1165 + scanner.Buffer(buf, 1024*1024) 1166 + 1167 + lineNum := 0 1168 + for scanner.Scan() { 1169 + lineNum++ 1170 + 1171 + var op plcbundle.PLCOperation 1172 + if err := json.Unmarshal(scanner.Bytes(), &op); err != nil { 1173 + log.Printf("Warning: failed to parse line %d: %v", lineNum, err) 1174 + continue 1175 + } 1176 + 1177 + // Process operation without storing all in memory 1178 + processOperation(op) 1179 + } 1180 + 1181 + return scanner.Err() 1182 + } 1183 + 1184 + func processOperation(op plcbundle.PLCOperation) { 1185 + // Your processing logic 1186 + log.Printf("Processing: %s", op.DID) 1187 + } 1188 + 1189 + func main() { 1190 + mgr, err := plcbundle.New("./plc_data", "") 1191 + if err != nil { 1192 + log.Fatal(err) 1193 + } 1194 + defer mgr.Close() 1195 + 1196 + // Stream bundle 1 1197 + if err := streamBundle(mgr, 1); err != nil { 1198 + log.Fatal(err) 1199 + } 1200 + } 1201 + ``` 1202 + 1203 + ### Parallel Processing 1204 + 1205 + Process multiple bundles concurrently: 1206 + 1207 + ```go 1208 + package main 1209 + 1210 + import ( 1211 + "context" 1212 + "log" 1213 + "sync" 1214 + 1215 + plcbundle "tangled.org/atscan.net/plcbundle" 1216 + ) 1217 + 1218 + func processParallel(mgr *plcbundle.Manager, workers int) error { 1219 + ctx := context.Background() 1220 + 1221 + index := mgr.GetIndex() 1222 + bundles := index.GetBundles() 1223 + 1224 + // Create job channel 1225 + jobs := make(chan int, len(bundles)) 1226 + results := make(chan error, len(bundles)) 1227 + 1228 + // Start workers 1229 + var wg sync.WaitGroup 1230 + for w := 0; w < workers; w++ { 1231 + wg.Add(1) 1232 + go func() { 1233 + defer wg.Done() 1234 + for bundleNum := range jobs { 1235 + if err := processBundle(ctx, mgr, bundleNum); err != nil { 1236 + results <- err 1237 + } else { 1238 + results <- nil 1239 + } 1240 + } 1241 + }() 1242 + } 1243 + 1244 + // Send jobs 1245 + for _, meta := range bundles { 1246 + jobs <- meta.BundleNumber 1247 + } 1248 + close(jobs) 1249 + 1250 + // Wait for completion 1251 + go func() { 1252 + wg.Wait() 1253 + close(results) 1254 + }() 1255 + 1256 + // Collect results 1257 + errors := 0 1258 + for err := range results { 1259 + if err != nil { 1260 + log.Printf("Error: %v", err) 1261 + errors++ 1262 + } 1263 + } 1264 + 1265 + if errors > 0 { 1266 + return fmt.Errorf("%d bundles failed processing", errors) 1267 + } 1268 + 1269 + return nil 1270 + } 1271 + 1272 + func processBundle(ctx context.Context, mgr *plcbundle.Manager, bundleNum int) error { 1273 + bundle, err := mgr.Load(ctx, bundleNum) 1274 + if err != nil { 1275 + return err 1276 + } 1277 + 1278 + // Process operations 1279 + for _, op := range bundle.Operations { 1280 + // Your logic here 1281 + _ = op 1282 + } 1283 + 1284 + log.Printf("Processed bundle %d", bundleNum) 1285 + return nil 1286 + } 1287 + 1288 + func main() { 1289 + mgr, err := plcbundle.New("./plc_data", "") 1290 + if err != nil { 1291 + log.Fatal(err) 1292 + } 1293 + defer mgr.Close() 1294 + 1295 + // Process with 8 workers 1296 + if err := processParallel(mgr, 8); err != nil { 1297 + log.Fatal(err) 1298 + } 1299 + } 1300 + ``` 1301 + 1302 + ### Working with Mempool 1303 + 1304 + Access operations before they're bundled: 1305 + 1306 + ```go 1307 + package main 1308 + 1309 + import ( 1310 + "log" 1311 + 1312 + plcbundle "tangled.org/atscan.net/plcbundle" 1313 + ) 1314 + 1315 + func main() { 1316 + mgr, err := plcbundle.New("./plc_data", "https://plc.directory") 1317 + if err != nil { 1318 + log.Fatal(err) 1319 + } 1320 + defer mgr.Close() 1321 + 1322 + // Get mempool stats 1323 + stats := mgr.GetMempoolStats() 1324 + 1325 + count := stats["count"].(int) 1326 + targetBundle := stats["target_bundle"].(int) 1327 + canCreate := stats["can_create_bundle"].(bool) 1328 + 1329 + log.Printf("Mempool status:") 1330 + log.Printf(" Target bundle: %d", targetBundle) 1331 + log.Printf(" Operations: %d/%d", count, plcbundle.BUNDLE_SIZE) 1332 + log.Printf(" Ready: %v", canCreate) 1333 + 1334 + if count > 0 { 1335 + // Get mempool operations 1336 + ops, err := mgr.GetMempoolOperations() 1337 + if err != nil { 1338 + log.Fatal(err) 1339 + } 1340 + 1341 + log.Printf("Latest unbundled operations:") 1342 + for i, op := range ops { 1343 + if i >= 5 { 1344 + break 1345 + } 1346 + log.Printf(" %d. %s (%s)", i+1, op.DID, op.CreatedAt.Format("15:04:05")) 1347 + } 1348 + } 1349 + 1350 + // Validate chronological order 1351 + if err := mgr.ValidateMempool(); err != nil { 1352 + log.Printf("⚠️ Mempool validation failed: %v", err) 1353 + } else { 1354 + log.Println("✓ Mempool validated") 1355 + } 1356 + } 1357 + ``` 1358 + 1359 + --- 1360 + 1361 + ## Best Practices 1362 + 1363 + ### 1. Always Close the Manager 1364 + 1365 + Use `defer` to ensure cleanup: 1366 + 1367 + ```go 1368 + mgr, err := plcbundle.New("./plc_data", "https://plc.directory") 1369 + if err != nil { 1370 + return err 1371 + } 1372 + defer mgr.Close() // Always close! 1373 + ``` 1374 + 1375 + ### 2. Handle Context Cancellation 1376 + 1377 + Support graceful shutdown: 1378 + 1379 + ```go 1380 + ctx, cancel := context.WithCancel(context.Background()) 1381 + defer cancel() 1382 + 1383 + // Listen for interrupt 1384 + sigChan := make(chan os.Signal, 1) 1385 + signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM) 1386 + 1387 + go func() { 1388 + <-sigChan 1389 + log.Println("Interrupt received, stopping...") 1390 + cancel() 1391 + }() 1392 + 1393 + // Use context in operations 1394 + bundle, err := mgr.FetchNext(ctx) 1395 + if err == context.Canceled { 1396 + log.Println("Operation cancelled gracefully") 1397 + return nil 1398 + } 1399 + ``` 1400 + 1401 + ### 3. Check Errors Properly 1402 + 1403 + Distinguish between different error types: 1404 + 1405 + ```go 1406 + bundle, err := mgr.FetchNext(ctx) 1407 + if err != nil { 1408 + // Check if it's just "caught up" 1409 + if strings.Contains(err.Error(), "insufficient operations") { 1410 + log.Println("No new bundles available (caught up)") 1411 + return nil 1412 + } 1413 + 1414 + // Real error 1415 + return fmt.Errorf("fetch failed: %w", err) 1416 + } 1417 + ``` 1418 + 1419 + ### 4. Use Streaming for Large Datasets 1420 + 1421 + Don't load everything into memory: 1422 + 1423 + ```go 1424 + // ❌ Bad: Loads all operations into memory 1425 + index := mgr.GetIndex() 1426 + var allOps []plcbundle.PLCOperation 1427 + for _, meta := range index.GetBundles() { 1428 + bundle, _ := mgr.Load(ctx, meta.BundleNumber) 1429 + allOps = append(allOps, bundle.Operations...) 1430 + } 1431 + 1432 + // ✅ Good: Process one bundle at a time 1433 + for _, meta := range index.GetBundles() { 1434 + bundle, _ := mgr.Load(ctx, meta.BundleNumber) 1435 + for _, op := range bundle.Operations { 1436 + processOperation(op) 1437 + } 1438 + } 1439 + ``` 1440 + 1441 + ### 5. Enable Verification in Production 1442 + 1443 + ```go 1444 + config := plcbundle.DefaultConfig("./plc_data") 1445 + config.VerifyOnLoad = true // Verify hashes when loading 1446 + 1447 + mgr, err := plcbundle.NewManager(config, plcClient) 1448 + ``` 1449 + 1450 + ### 6. Log Appropriately 1451 + 1452 + Implement custom logger for production: 1453 + 1454 + ```go 1455 + type ProductionLogger struct { 1456 + logger *zap.Logger 1457 + } 1458 + 1459 + func (l *ProductionLogger) Printf(format string, v ...interface{}) { 1460 + l.logger.Sugar().Infof(format, v...) 1461 + } 1462 + 1463 + func (l *ProductionLogger) Println(v ...interface{}) { 1464 + l.logger.Sugar().Info(v...) 1465 + } 1466 + ``` 1467 + 1468 + ### 7. Handle Rate Limits 1469 + 1470 + Configure PLC client appropriately: 1471 + 1472 + ```go 1473 + // Production: Be conservative 1474 + plcClient := plc.NewClient("https://plc.directory", 1475 + plc.WithRateLimit(60, time.Minute), // 60 req/min max 1476 + plc.WithTimeout(60*time.Second), 1477 + ) 1478 + 1479 + // Development: Can be more aggressive (but respectful) 1480 + plcClient := plc.NewClient("https://plc.directory", 1481 + plc.WithRateLimit(90, time.Minute), 1482 + plc.WithTimeout(30*time.Second), 1483 + ) 1484 + ``` 1485 + 1486 + --- 1487 + 1488 + ## API Reference 1489 + 1490 + ### Manager Methods 1491 + 1492 + ```go 1493 + // Creation 1494 + New(bundleDir, plcURL string) (*Manager, error) 1495 + NewManager(config *Config, plcClient *PLCClient) (*Manager, error) 1496 + 1497 + // Lifecycle 1498 + Close() 1499 + 1500 + // Fetching 1501 + FetchNext(ctx) (*Bundle, error) 1502 + 1503 + // Loading 1504 + Load(ctx, bundleNumber int) (*Bundle, error) 1505 + 1506 + // Verification 1507 + Verify(ctx, bundleNumber int) (*VerificationResult, error) 1508 + VerifyChain(ctx) (*ChainVerificationResult, error) 1509 + 1510 + // Exporting 1511 + Export(ctx, afterTime time.Time, count int) ([]PLCOperation, error) 1512 + 1513 + // Streaming 1514 + StreamRaw(ctx, bundleNumber int) (io.ReadCloser, error) 1515 + StreamDecompressed(ctx, bundleNumber int) (io.ReadCloser, error) 1516 + 1517 + // Index 1518 + GetIndex() *Index 1519 + ScanBundle(path string, bundleNumber int) (*BundleMetadata, error) 1520 + Scan() (*DirectoryScanResult, error) 1521 + 1522 + // Mempool 1523 + GetMempoolStats() map[string]interface{} 1524 + GetMempoolOperations() ([]PLCOperation, error) 1525 + ValidateMempool() error 1526 + ClearMempool() error 1527 + 1528 + // Info 1529 + GetInfo() map[string]interface{} 1530 + IsBundleIndexed(bundleNumber int) bool 1531 + ``` 1532 + 1533 + ### Index Methods 1534 + 1535 + ```go 1536 + // Creation 1537 + NewIndex() *Index 1538 + LoadIndex(path string) (*Index, error) 1539 + 1540 + // Persistence 1541 + Save(path string) error 1542 + 1543 + // Queries 1544 + GetBundle(bundleNumber int) (*BundleMetadata, error) 1545 + GetLastBundle() *BundleMetadata 1546 + GetBundles() []*BundleMetadata 1547 + GetBundleRange(start, end int) []*BundleMetadata 1548 + 1549 + // Stats 1550 + Count() int 1551 + FindGaps() []int 1552 + GetStats() map[string]interface{} 1553 + ``` 1554 + 1555 + ### Configuration Types 1556 + 1557 + ```go 1558 + type Config struct { 1559 + BundleDir string 1560 + VerifyOnLoad bool 1561 + AutoRebuild bool 1562 + RebuildWorkers int 1563 + RebuildProgress func(current, total int) 1564 + Logger Logger 1565 + } 1566 + 1567 + type Logger interface { 1568 + Printf(format string, v ...interface{}) 1569 + Println(v ...interface{}) 1570 + } 1571 + ``` 1572 + 1573 + --- 1574 + 1575 + ## Troubleshooting 1576 + 1577 + ### Bundle Not Found Error 1578 + 1579 + ```go 1580 + bundle, err := mgr.Load(ctx, 999) 1581 + if err != nil { 1582 + if strings.Contains(err.Error(), "not in index") { 1583 + // Bundle doesn't exist 1584 + log.Printf("Bundle 999 hasn't been fetched yet") 1585 + } 1586 + } 1587 + ``` 1588 + 1589 + ### Insufficient Operations Error 1590 + 1591 + ```go 1592 + bundle, err := mgr.FetchNext(ctx) 1593 + if err != nil { 1594 + if strings.Contains(err.Error(), "insufficient operations") { 1595 + // Not enough operations for a complete bundle 1596 + // Check mempool 1597 + stats := mgr.GetMempoolStats() 1598 + count := stats["count"].(int) 1599 + log.Printf("Only %d operations available (need %d)", count, plcbundle.BUNDLE_SIZE) 1600 + } 1601 + } 1602 + ``` 1603 + 1604 + ### Memory Usage 1605 + 1606 + If processing large numbers of bundles: 1607 + 1608 + ```go 1609 + // Force garbage collection between bundles 1610 + for _, meta := range index.GetBundles() { 1611 + bundle, _ := mgr.Load(ctx, meta.BundleNumber) 1612 + processBundle(bundle) 1613 + 1614 + runtime.GC() // Help garbage collector 1615 + } 1616 + ``` 1617 + 1618 + --- 1619 + 1620 + ## Examples Repository 1621 + 1622 + Find complete, runnable examples at: 1623 + - https://github.com/plcbundle/examples 1624 + 1625 + Including: 1626 + - Complete sync service 1627 + - API server 1628 + - Analysis tools 1629 + - Monitoring services 1630 +
-7
plc_bundles.json
··· 1 - { 2 - "version": "1.0", 3 - "last_bundle": 0, 4 - "updated_at": "2025-10-28T20:48:55.692096Z", 5 - "total_size_bytes": 0, 6 - "bundles": [] 7 - }