# ATCR Quota System

This document describes ATCR's storage quota implementation, inspired by Harbor's proven approach to per-project blob tracking with deduplication.

## Table of Contents

- [Overview](#overview)
- [Harbor's Approach (Reference Implementation)](#harbors-approach-reference-implementation)
- [Storage Options](#storage-options)
- [Quota Data Model](#quota-data-model)
- [Push Flow (Detailed)](#push-flow-detailed)
- [Delete Flow](#delete-flow)
- [Garbage Collection](#garbage-collection)
- [Quota Reconciliation](#quota-reconciliation)
- [Configuration](#configuration)
- [Trade-offs & Design Decisions](#trade-offs--design-decisions)
- [Future Enhancements](#future-enhancements)

## Overview

ATCR implements per-user storage quotas to:

1. **Limit storage consumption** on shared hold services
2. **Track actual S3 costs** (what new data was added)
3. **Benefit from deduplication** (users only pay once per layer)
4. **Provide transparency** (show users their storage usage)

**Key principle:** Users pay for layers they've uploaded, but only ONCE per layer regardless of how many images reference it.

### Example Scenario

```
Alice pushes myapp:v1 (layers A, B, C - each 100MB)
→ Alice's quota: +300MB (all new layers)

Alice pushes myapp:v2 (layers A, B, D)
→ Layers A, B already claimed by Alice
→ Layer D is new (100MB)
→ Alice's quota: +100MB (only D is new)
→ Total: 400MB

Bob pushes his-app:latest (layers A, E)
→ Layer A already exists in S3 (uploaded by Alice)
→ Bob claims it for the first time → +100MB to Bob's quota
→ Layer E is new → +100MB to Bob's quota
→ Bob's quota: 200MB

Physical S3 storage: 500MB (A, B, C, D, E)
Claimed storage: 600MB (Alice: 400MB, Bob: 200MB)
Deduplication savings: 100MB (layer A shared)
```

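The accounting in the scenario above boils down to one rule: a layer costs its size the first time a given user claims it, and nothing afterwards. A minimal sketch (the `claimLayers` helper is illustrative, not part of the codebase):

```go
package main

import "fmt"

// claimLayers charges the user for each layer they have not claimed
// before, mutating their claimed-layers map, and returns the quota impact.
func claimLayers(claimed map[string]int64, layers map[string]int64) int64 {
	var impact int64
	for digest, size := range layers {
		if _, ok := claimed[digest]; !ok {
			claimed[digest] = size
			impact += size
		}
	}
	return impact
}

func main() {
	const mb = int64(1024 * 1024)
	alice := map[string]int64{}
	bob := map[string]int64{}

	// myapp:v1 - all three layers are new to Alice
	fmt.Println(claimLayers(alice, map[string]int64{"A": 100 * mb, "B": 100 * mb, "C": 100 * mb}) / mb) // 300
	// myapp:v2 - only D is new; A and B are already claimed
	fmt.Println(claimLayers(alice, map[string]int64{"A": 100 * mb, "B": 100 * mb, "D": 100 * mb}) / mb) // 100
	// Bob pays for A even though it physically exists (uploaded by Alice)
	fmt.Println(claimLayers(bob, map[string]int64{"A": 100 * mb, "E": 100 * mb}) / mb) // 200
}
```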
## Harbor's Approach (Reference Implementation)

Harbor is built on distribution/distribution (same as ATCR) and implements quotas as middleware. Their approach:

### Key Insights from Harbor

1. **"Shared blobs are only computed once per project"**
   - Each project tracks which blobs it has uploaded
   - The same blob used in multiple images counts only once per project
   - Different projects claiming the same blob each pay for it

2. **Quota checked when the manifest is pushed**
   - Blobs upload first (presigned URLs, can't intercept)
   - The manifest is pushed last → the quota check happens here
   - The manifest can be rejected if the quota is exceeded (orphaned blobs are cleaned by GC)

3. **Middleware-based implementation**
   - distribution/distribution has NO built-in quota support
   - Harbor added it as request-preprocessing middleware
   - Uses a database (PostgreSQL) or Redis for quota storage

4. **Per-project ownership model**
   - Blobs are physically deduplicated globally
   - Quota accounting is logical (per-project claims)
   - Total claimed storage can exceed physical storage

### References

- Harbor Quota Documentation: https://goharbor.io/docs/1.10/administration/configure-project-quotas/
- Harbor Source: https://github.com/goharbor/harbor (see `src/controller/quota`)

## Storage Options

The hold service needs to store quota data somewhere. Two options:

### Option 1: S3-Based Storage (Recommended for BYOS)

Store quota metadata alongside blobs in the same S3 bucket:

```
Bucket structure:
/docker/registry/v2/blobs/sha256/ab/abc123.../data   ← actual blobs
/atcr/quota/did:plc:alice.json                       ← quota tracking
/atcr/quota/did:plc:bob.json
```

**Pros:**
- ✅ No separate database needed
- ✅ Single S3 bucket (better UX - no second bucket to configure)
- ✅ Quota data lives with the blobs
- ✅ Hold service stays relatively stateless
- ✅ Works with any S3-compatible service (Storj, MinIO, UpCloud, Fly.io)

**Cons:**
- ❌ Slower than a local database (network round-trip)
- ❌ Eventual consistency issues
- ❌ Race conditions on concurrent updates
- ❌ Extra S3 API costs (GET/PUT per upload)

**Performance:**
- Each blob upload: 1 HEAD (blob exists?) + 1 GET (quota) + 1 PUT (update quota)
- Typical latency: 100-200ms total overhead
- For high-throughput registries, consider SQLite

### Option 2: SQLite Database (Recommended for Shared Holds)

Local database in the hold service:

```bash
/var/lib/atcr/hold-quota.db
```

**Pros:**
- ✅ Fast local queries (no network latency)
- ✅ ACID transactions (no race conditions)
- ✅ Efficient for high-throughput registries
- ✅ Can use foreign keys and joins

**Cons:**
- ❌ Makes the hold service stateful (persistent volume needed)
- ❌ Not ideal for ephemeral BYOS deployments
- ❌ Backup/restore complexity
- ❌ Multi-instance scaling requires a shared database

**Schema:**

```sql
CREATE TABLE user_quotas (
    did         TEXT PRIMARY KEY,
    quota_limit INTEGER NOT NULL DEFAULT 10737418240, -- 10GB
    quota_used  INTEGER NOT NULL DEFAULT 0,
    updated_at  TIMESTAMP
);

CREATE TABLE claimed_layers (
    did        TEXT NOT NULL,
    digest     TEXT NOT NULL,
    size       INTEGER NOT NULL,
    claimed_at TIMESTAMP,
    PRIMARY KEY (did, digest)
);
```

### Recommendation

- **BYOS (user-owned holds):** S3-based (keeps the hold service ephemeral)
- **Shared holds (multi-user):** SQLite (better performance and consistency)
- **High-traffic production:** SQLite or PostgreSQL (Harbor uses the latter)

## Quota Data Model

### Quota File Format (S3-based)

```json
{
  "did": "did:plc:alice123",
  "limit": 10737418240,
  "used": 5368709120,
  "claimed_layers": {
    "sha256:abc123...": 104857600,
    "sha256:def456...": 52428800,
    "sha256:789ghi...": 209715200
  },
  "last_updated": "2025-10-09T12:34:56Z",
  "version": 1
}
```

**Fields:**

- `did`: User's ATProto DID
- `limit`: Maximum storage in bytes (default: 10GB)
- `used`: Current storage usage in bytes (sum of `claimed_layers`)
- `claimed_layers`: Map of digest → size for all layers the user has uploaded
- `last_updated`: Timestamp of the last quota update
- `version`: Schema version for future migrations

### Why Track Individual Layers?

**Q: Can't we just track a counter?**

**A: We need per-layer tracking for:**

1. **Deduplication detection**
   - Check whether the user already claimed a layer → free upload
   - Example: updating an image reuses most layers

2. **Accurate deletes**
   - When a manifest is deleted, only decrement layers the user no longer references
   - A user may have 5 images sharing layer A - deleting 1 image doesn't free layer A

3. **Quota reconciliation**
   - Verify the quota matches reality by listing the user's manifests
   - Recalculate from layers in manifests vs. the claimed_layers map

4. **Auditing**
   - "Show me what I'm storing"
   - Users can see which layers consume their quota

## Push Flow (Detailed)

### Step-by-Step: User Pushes Image

```
┌──────────┐                 ┌──────────┐                 ┌──────────┐
│  Client  │                 │   Hold   │                 │    S3    │
│ (Docker) │                 │ Service  │                 │  Bucket  │
└──────────┘                 └──────────┘                 └──────────┘
      │                           │                           │
      │ 1. PUT /v2/.../blobs/     │                           │
      │    upload?digest=sha256:abc                           │
      ├──────────────────────────>│                           │
      │                           │                           │
      │                           │ 2. Check if blob exists   │
      │                           │    (Stat/HEAD request)    │
      │                           ├──────────────────────────>│
      │                           │<──────────────────────────┤
      │                           │    200 OK (exists) or     │
      │                           │    404 Not Found          │
      │                           │                           │
      │                           │ 3. Read user quota        │
      │                           │    GET /atcr/quota/{did}  │
      │                           ├──────────────────────────>│
      │                           │<──────────────────────────┤
      │                           │    quota.json             │
      │                           │                           │
      │                           │ 4. Calculate quota impact │
      │                           │    - If digest in         │
      │                           │      claimed_layers: 0    │
      │                           │    - Else: size           │
      │                           │                           │
      │                           │ 5. Check quota limit      │
      │                           │    used + impact <= limit?│
      │                           │                           │
      │                           │ 6. Update quota           │
      │                           │    PUT /atcr/quota/{did}  │
      │                           ├──────────────────────────>│
      │                           │<──────────────────────────┤
      │                           │    200 OK                 │
      │                           │                           │
      │ 7. Presigned URL          │                           │
      │<──────────────────────────┤                           │
      │    {url: "https://s3..."} │                           │
      │                           │                           │
      │ 8. Upload blob to S3      │                           │
      ├───────────────────────────┼──────────────────────────>│
      │                           │                           │
      │ 9. 200 OK                 │                           │
      │<──────────────────────────┼───────────────────────────┤
      │                           │                           │
```

### Implementation (Pseudocode)

```go
// cmd/hold/main.go - HandlePutPresignedURL

func (s *HoldService) HandlePutPresignedURL(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()

	var req PutPresignedURLRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid request body", http.StatusBadRequest)
		return
	}

	// Step 1: Check if the blob already exists in S3.
	// req.Digest is "sha256:<hex>" - split it into algorithm and hex.
	algorithm, hex, _ := strings.Cut(req.Digest, ":")
	blobPath := fmt.Sprintf("/docker/registry/v2/blobs/%s/%s/%s/data",
		algorithm, hex[:2], hex)

	_, err := s.driver.Stat(ctx, blobPath)
	blobExists := (err == nil)

	// Step 2: Read quota from S3 (or SQLite)
	quota, err := s.quotaManager.GetQuota(req.DID)
	if err != nil {
		// First upload - create quota with defaults
		quota = &Quota{
			DID:           req.DID,
			Limit:         s.config.QuotaDefaultLimit,
			Used:          0,
			ClaimedLayers: make(map[string]int64),
		}
	}

	// Step 3: Calculate quota impact
	quotaImpact := req.Size // Default: assume new layer

	if _, alreadyClaimed := quota.ClaimedLayers[req.Digest]; alreadyClaimed {
		// User already uploaded this layer before
		quotaImpact = 0
		log.Printf("Layer %s already claimed by %s, no quota impact",
			req.Digest, req.DID)
	} else if blobExists {
		// Blob exists in S3 (uploaded by another user), but this user is
		// claiming it for the first time - it still counts against their quota.
		log.Printf("Layer %s exists globally but new to %s, quota impact: %d",
			req.Digest, req.DID, quotaImpact)
	} else {
		// Brand new blob - will be uploaded to S3
		log.Printf("New layer %s for %s, quota impact: %d",
			req.Digest, req.DID, quotaImpact)
	}

	// Step 4: Check quota limit
	if quota.Used+quotaImpact > quota.Limit {
		http.Error(w, fmt.Sprintf(
			"quota exceeded: used=%d, impact=%d, limit=%d",
			quota.Used, quotaImpact, quota.Limit,
		), http.StatusPaymentRequired) // 402
		return
	}

	// Step 5: Update quota (optimistic - before upload completes)
	quota.Used += quotaImpact
	if quotaImpact > 0 {
		quota.ClaimedLayers[req.Digest] = req.Size
	}
	quota.LastUpdated = time.Now()

	if err := s.quotaManager.SaveQuota(quota); err != nil {
		http.Error(w, "failed to update quota", http.StatusInternalServerError)
		return
	}

	// Step 6: Generate presigned URL
	presignedURL, err := s.getUploadURL(ctx, req.Digest, req.Size, req.DID)
	if err != nil {
		// Roll back the quota update on error. Only drop the claim if this
		// request added it - otherwise we would unclaim a layer the user
		// already paid for.
		quota.Used -= quotaImpact
		if quotaImpact > 0 {
			delete(quota.ClaimedLayers, req.Digest)
		}
		s.quotaManager.SaveQuota(quota)

		http.Error(w, "failed to generate presigned URL", http.StatusInternalServerError)
		return
	}

	// Step 7: Return presigned URL + quota info
	resp := PutPresignedURLResponse{
		URL:       presignedURL,
		ExpiresAt: time.Now().Add(15 * time.Minute),
		QuotaInfo: QuotaInfo{
			Used:           quota.Used,
			Limit:          quota.Limit,
			Available:      quota.Limit - quota.Used,
			Impact:         quotaImpact,
			AlreadyClaimed: quotaImpact == 0,
		},
	}

	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(resp)
}
```

### Race Condition Handling

**Problem:** Two concurrent uploads of the same blob

```
Time    User A                    User B
0ms     Upload layer X (100MB)
10ms                              Upload layer X (100MB)
20ms    Check exists: NO          Check exists: NO
30ms    Quota impact: 100MB       Quota impact: 100MB
40ms    Update quota A: +100MB    Update quota B: +100MB
50ms    Generate presigned URL    Generate presigned URL
100ms   Upload to S3 completes    Upload to S3 (overwrites A's)
```

**Result:** Both users are charged 100MB, but only 100MB is stored in S3.

**Mitigation strategies:**

1. **Accept eventual consistency** (recommended for S3-based)
   - Run periodic reconciliation to fix discrepancies
   - A small inconsistency window (minutes) is acceptable
   - Reconciliation uses the PDS as the source of truth

2. **Optimistic locking** (S3 ETags)

   ```go
   // Use S3 ETags for conditional writes
   oldETag := getQuotaFileETag(did)
   err := putQuotaFileWithCondition(quota, oldETag)
   if err == ErrPreconditionFailed {
       // Retry with a fresh read
   }
   ```

3. **Database transactions** (SQLite-based)

   ```sql
   -- SQLite has no SELECT ... FOR UPDATE; BEGIN IMMEDIATE takes the
   -- write lock up front so the read-modify-write is atomic.
   BEGIN IMMEDIATE;
   UPDATE user_quotas SET used = used + ? WHERE did = ?;
   COMMIT;
   ```

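The optimistic-locking strategy is a read-modify-conditional-write loop. A self-contained sketch with an in-memory stand-in for the quota object and its ETag (all names here are illustrative; against real S3, the conditional PUT would fail with a 412 Precondition Failed instead of `ErrPreconditionFailed`):

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPreconditionFailed stands in for S3's 412 response to a conditional PUT.
var ErrPreconditionFailed = errors.New("precondition failed")

// store models a quota file whose integer version plays the role of an ETag.
type store struct {
	used int64
	etag int
}

func (s *store) read() (int64, int) { return s.used, s.etag }

// putIf writes only when the caller's etag matches, like an If-Match PUT.
func (s *store) putIf(used int64, etag int) error {
	if etag != s.etag {
		return ErrPreconditionFailed
	}
	s.used, s.etag = used, s.etag+1
	return nil
}

// addWithRetry is the optimistic-locking loop: read, modify,
// conditional write, and re-read on conflict.
func addWithRetry(s *store, delta int64) int64 {
	for {
		used, etag := s.read()
		if err := s.putIf(used+delta, etag); err == nil {
			return used + delta
		}
		// Conflict: another writer got in first - loop and retry.
	}
}

func main() {
	s := &store{}
	fmt.Println(addWithRetry(s, 100)) // 100
	fmt.Println(addWithRetry(s, 50))  // 150
}
```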
## Delete Flow

### Manifest Deletion via AppView UI

When a user deletes a manifest through the AppView web interface:

```
┌──────────┐      ┌──────────┐      ┌──────────┐      ┌──────────┐
│   User   │      │ AppView  │      │   Hold   │      │   PDS    │
│    UI    │      │ Database │      │ Service  │      │          │
└──────────┘      └──────────┘      └──────────┘      └──────────┘
     │                 │                 │                 │
     │ DELETE manifest │                 │                 │
     ├────────────────>│                 │                 │
     │                 │                 │                 │
     │                 │ 1. Get manifest │                 │
     │                 │    and layers   │                 │
     │                 │                 │                 │
     │                 │ 2. Check which  │                 │
     │                 │    layers are   │                 │
     │                 │    still used by│                 │
     │                 │    the user's   │                 │
     │                 │    other        │                 │
     │                 │    manifests    │                 │
     │                 │                 │                 │
     │                 │ 3. DELETE manifest from PDS       │
     │                 ├─────────────────┼────────────────>│
     │                 │                 │                 │
     │                 │ 4. POST /quota/decrement          │
     │                 ├────────────────>│                 │
     │                 │    {layers: [...]}                │
     │                 │                 │                 │
     │                 │                 │ 5. Update quota │
     │                 │                 │    Remove       │
     │                 │                 │    unclaimed    │
     │                 │                 │    layers       │
     │                 │                 │                 │
     │                 │ 6. 200 OK       │                 │
     │                 │<────────────────┤                 │
     │                 │                 │                 │
     │                 │ 7. Delete from the AppView DB     │
     │                 │                 │                 │
     │ 8. Success      │                 │                 │
     │<────────────────┤                 │                 │
     │                 │                 │                 │
```

### AppView Implementation

```go
// pkg/appview/handlers/manifest.go

func (h *ManifestHandler) DeleteManifest(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	did := ctx.Value("auth.did").(string)
	repository := chi.URLParam(r, "repository")
	digest := chi.URLParam(r, "digest")

	// Step 1: Get the manifest and its layers from the database
	manifest, err := db.GetManifest(h.db, digest)
	if err != nil {
		http.Error(w, "manifest not found", http.StatusNotFound)
		return
	}

	layers, err := db.GetLayersForManifest(h.db, manifest.ID)
	if err != nil {
		http.Error(w, "failed to get layers", http.StatusInternalServerError)
		return
	}

	// Step 2: For each layer, check whether the user still references it
	// in other manifests
	layersToDecrement := []LayerInfo{}

	for _, layer := range layers {
		// Query: does this user have other manifests using this layer?
		stillReferenced, err := db.CheckLayerReferencedByUser(
			h.db, did, repository, layer.Digest, manifest.ID,
		)
		if err != nil {
			http.Error(w, "failed to check layer references", http.StatusInternalServerError)
			return
		}

		if !stillReferenced {
			// This layer is no longer used by the user
			layersToDecrement = append(layersToDecrement, LayerInfo{
				Digest: layer.Digest,
				Size:   layer.Size,
			})
		}
	}

	// Step 3: Delete the manifest from the user's PDS
	accessToken := ctx.Value("auth.token").(string) // from the OAuth session
	atprotoClient := atproto.NewClient(manifest.PDSEndpoint, did, accessToken)
	err = atprotoClient.DeleteRecord(ctx, atproto.ManifestCollection, manifest.RKey)
	if err != nil {
		http.Error(w, "failed to delete from PDS", http.StatusInternalServerError)
		return
	}

	// Step 4: Notify the hold service to decrement the quota
	if len(layersToDecrement) > 0 {
		holdClient := &http.Client{}

		decrementReq := QuotaDecrementRequest{
			DID:    did,
			Layers: layersToDecrement,
		}

		body, _ := json.Marshal(decrementReq)
		resp, err := holdClient.Post(
			manifest.HoldEndpoint+"/quota/decrement",
			"application/json",
			bytes.NewReader(body),
		)
		if err != nil || resp.StatusCode != http.StatusOK {
			log.Printf("Warning: failed to update quota on hold service: %v", err)
			// Continue anyway - GC reconciliation will fix it
		}
		if resp != nil {
			resp.Body.Close()
		}
	}

	// Step 5: Delete from the AppView database
	err = db.DeleteManifest(h.db, did, repository, digest)
	if err != nil {
		http.Error(w, "failed to delete from database", http.StatusInternalServerError)
		return
	}

	w.WriteHeader(http.StatusNoContent)
}
```

### Hold Service Decrement Endpoint

```go
// cmd/hold/main.go

type QuotaDecrementRequest struct {
	DID    string      `json:"did"`
	Layers []LayerInfo `json:"layers"`
}

type LayerInfo struct {
	Digest string `json:"digest"`
	Size   int64  `json:"size"`
}

func (s *HoldService) HandleQuotaDecrement(w http.ResponseWriter, r *http.Request) {
	var req QuotaDecrementRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid request", http.StatusBadRequest)
		return
	}

	// Read the current quota
	quota, err := s.quotaManager.GetQuota(req.DID)
	if err != nil {
		http.Error(w, "quota not found", http.StatusNotFound)
		return
	}

	// Decrement the quota for each layer
	for _, layer := range req.Layers {
		if size, claimed := quota.ClaimedLayers[layer.Digest]; claimed {
			// Remove from claimed layers
			delete(quota.ClaimedLayers, layer.Digest)
			quota.Used -= size

			log.Printf("Decremented quota for %s: layer %s (%d bytes)",
				req.DID, layer.Digest, size)
		} else {
			log.Printf("Warning: layer %s not in claimed_layers for %s",
				layer.Digest, req.DID)
		}
	}

	// Ensure quota.Used doesn't go negative (defensive)
	if quota.Used < 0 {
		log.Printf("Warning: quota.Used went negative for %s, resetting to 0", req.DID)
		quota.Used = 0
	}

	// Save the updated quota
	quota.LastUpdated = time.Now()
	if err := s.quotaManager.SaveQuota(quota); err != nil {
		http.Error(w, "failed to save quota", http.StatusInternalServerError)
		return
	}

	// Return updated quota info
	json.NewEncoder(w).Encode(map[string]any{
		"used":  quota.Used,
		"limit": quota.Limit,
	})
}
```

### SQL Query: Check Layer References

```sql
-- pkg/appview/db/queries.go

-- Check whether the user still references this layer in other manifests
SELECT COUNT(*)
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.did = ?      -- User's DID
  AND l.digest = ?   -- Layer digest
  AND m.id != ?      -- Exclude the manifest being deleted
```

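The decision the query feeds into - "which of the deleted manifest's layers are now unreferenced by this user?" - can be sketched in memory, with a per-layer reference count standing in for the COUNT(*) result (helper and parameter names are illustrative):

```go
package main

import "fmt"

// layersToDecrement returns the layers of the deleted manifest that no other
// manifest of the same user still references. refCount[d] is how many of the
// user's manifests (including the one being deleted) reference digest d.
func layersToDecrement(deleted []string, refCount map[string]int) []string {
	var out []string
	for _, digest := range deleted {
		if refCount[digest] <= 1 { // only the deleted manifest referenced it
			out = append(out, digest)
		}
	}
	return out
}

func main() {
	// Layer "A" is shared with another of the user's manifests;
	// layer "C" was only used by the manifest being deleted.
	refs := map[string]int{"A": 2, "C": 1}
	fmt.Println(layersToDecrement([]string{"A", "C"}, refs)) // [C]
}
```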
## Garbage Collection

### Background: Orphaned Blobs

Orphaned blobs accumulate when:

1. A manifest push fails after the blobs are uploaded (presigned URLs bypass the hold)
2. The quota is exceeded - the manifest is rejected, but the blobs are already in S3
3. A user deletes a manifest - its blobs are no longer referenced

**GC periodically cleans these up.**

### GC Cron Implementation

Similar to AppView's backfill worker, the hold service can run periodic GC:

```go
// cmd/hold/gc/gc.go

type GarbageCollector struct {
	driver       storagedriver.StorageDriver
	appviewURL   string
	holdURL      string
	quotaManager *quota.Manager
}

// Run garbage collection
func (gc *GarbageCollector) Run(ctx context.Context) error {
	log.Println("Starting garbage collection...")

	// Step 1: Get the list of referenced blobs from AppView
	referenced, err := gc.getReferencedBlobs()
	if err != nil {
		return fmt.Errorf("failed to get referenced blobs: %w", err)
	}

	referencedSet := make(map[string]bool)
	for _, digest := range referenced {
		referencedSet[digest] = true
	}

	log.Printf("AppView reports %d referenced blobs", len(referenced))

	// Step 2: Walk S3 blobs
	deletedCount := 0
	reclaimedBytes := int64(0)

	err = gc.driver.Walk(ctx, "/docker/registry/v2/blobs", func(fileInfo storagedriver.FileInfo) error {
		if fileInfo.IsDir() {
			return nil // Skip directories
		}

		// Extract the digest from the path
		// Path: /docker/registry/v2/blobs/sha256/ab/abc123.../data
		digest := extractDigestFromPath(fileInfo.Path())

		if !referencedSet[digest] {
			// Unreferenced blob - delete it
			size := fileInfo.Size()

			if err := gc.driver.Delete(ctx, fileInfo.Path()); err != nil {
				log.Printf("Failed to delete blob %s: %v", digest, err)
				return nil // Continue anyway
			}

			deletedCount++
			reclaimedBytes += size

			log.Printf("GC: Deleted unreferenced blob %s (%d bytes)", digest, size)
		}

		return nil
	})

	if err != nil {
		return fmt.Errorf("failed to walk blobs: %w", err)
	}

	log.Printf("GC complete: deleted %d blobs, reclaimed %d bytes",
		deletedCount, reclaimedBytes)

	return nil
}

// Get referenced blobs from AppView
func (gc *GarbageCollector) getReferencedBlobs() ([]string, error) {
	// Query AppView for all blobs referenced by manifests
	// stored in THIS hold service
	endpoint := fmt.Sprintf("%s/internal/blobs/referenced?hold=%s",
		gc.appviewURL, url.QueryEscape(gc.holdURL))

	resp, err := http.Get(endpoint)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var result struct {
		Blobs []string `json:"blobs"`
	}

	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		return nil, err
	}

	return result.Blobs, nil
}
```

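The GC walk relies on recovering a digest from the registry's blob path layout. One way `extractDigestFromPath` might be implemented (this helper is a sketch, not the actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// extractDigestFromPath recovers "<algorithm>:<hex>" from a registry blob
// path such as /docker/registry/v2/blobs/sha256/ab/abc123.../data,
// returning "" for paths that do not match that shape.
func extractDigestFromPath(p string) string {
	parts := strings.Split(strings.Trim(p, "/"), "/")
	// Expected tail: .../blobs/<algorithm>/<two-char prefix>/<hex>/data
	if len(parts) < 4 || parts[len(parts)-1] != "data" {
		return ""
	}
	algorithm := parts[len(parts)-4]
	hex := parts[len(parts)-2]
	return algorithm + ":" + hex
}

func main() {
	fmt.Println(extractDigestFromPath("/docker/registry/v2/blobs/sha256/ab/abc123/data")) // sha256:abc123
}
```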
### AppView Internal API

```go
// pkg/appview/handlers/internal.go

// Get all referenced blobs for a specific hold
func (h *InternalHandler) GetReferencedBlobs(w http.ResponseWriter, r *http.Request) {
	holdEndpoint := r.URL.Query().Get("hold")
	if holdEndpoint == "" {
		http.Error(w, "missing hold parameter", http.StatusBadRequest)
		return
	}

	// Query the database for all layers in manifests stored in this hold
	query := `
		SELECT DISTINCT l.digest
		FROM layers l
		JOIN manifests m ON l.manifest_id = m.id
		WHERE m.hold_endpoint = ?
	`

	rows, err := h.db.Query(query, holdEndpoint)
	if err != nil {
		http.Error(w, "database error", http.StatusInternalServerError)
		return
	}
	defer rows.Close()

	blobs := []string{}
	for rows.Next() {
		var digest string
		if err := rows.Scan(&digest); err != nil {
			continue
		}
		blobs = append(blobs, digest)
	}

	json.NewEncoder(w).Encode(map[string]any{
		"blobs": blobs,
		"count": len(blobs),
		"hold":  holdEndpoint,
	})
}
```

### GC Cron Schedule

```go
// cmd/hold/main.go

func main() {
	// ... service setup ...

	// Start the GC cron if enabled
	if os.Getenv("GC_ENABLED") == "true" {
		gcInterval := 24 * time.Hour // Daily by default
		if v := os.Getenv("GC_INTERVAL"); v != "" {
			if d, err := time.ParseDuration(v); err == nil {
				gcInterval = d
			}
		}

		go func() {
			ticker := time.NewTicker(gcInterval)
			defer ticker.Stop()

			for range ticker.C {
				if err := garbageCollector.Run(context.Background()); err != nil {
					log.Printf("GC error: %v", err)
				}
			}
		}()

		log.Printf("GC cron started: runs every %v", gcInterval)
	}

	// Start server...
}
```

## Quota Reconciliation

### PDS as Source of Truth

**Key insight:** Manifest records in the PDS are publicly readable (no OAuth needed for reads).

Each manifest contains:

- Repository name
- Digest
- Layers array with digest + size
- Hold endpoint

The hold service can query the PDS to calculate the user's true quota:

```
1. List all io.atcr.manifest records for the user
2. Filter manifests where holdEndpoint == this hold service
3. Extract unique layers (deduplicate by digest)
4. Sum layer sizes = true quota usage
5. Compare to the quota file
6. Fix discrepancies
```

### Implementation

```go
// cmd/hold/quota/reconcile.go

type Reconciler struct {
	quotaManager    *Manager
	atprotoResolver *atproto.Resolver
	holdURL         string
}

// ReconcileUser recalculates a user's quota from their PDS manifests
func (r *Reconciler) ReconcileUser(ctx context.Context, did string) error {
	log.Printf("Reconciling quota for %s", did)

	// Step 1: Resolve the user's PDS endpoint
	identity, err := r.atprotoResolver.ResolveIdentity(ctx, did)
	if err != nil {
		return fmt.Errorf("failed to resolve DID: %w", err)
	}

	// Step 2: Create an unauthenticated ATProto client
	// (manifest records are public - no OAuth needed)
	client := atproto.NewClient(identity.PDSEndpoint, did, "")

	// Step 3: List all manifest records for this user
	manifests, err := client.ListRecords(ctx, atproto.ManifestCollection, 1000)
	if err != nil {
		return fmt.Errorf("failed to list manifests: %w", err)
	}

	// Step 4: Filter manifests stored in THIS hold service
	// and extract unique layers
	uniqueLayers := make(map[string]int64) // digest -> size

	for _, record := range manifests {
		var manifest atproto.ManifestRecord
		if err := json.Unmarshal(record.Value, &manifest); err != nil {
			log.Printf("Warning: failed to parse manifest: %v", err)
			continue
		}

		// Only count manifests stored in this hold
		if manifest.HoldEndpoint != r.holdURL {
			continue
		}

		// Add the config blob
		if manifest.Config.Digest != "" {
			uniqueLayers[manifest.Config.Digest] = manifest.Config.Size
		}

		// Add the layer blobs
		for _, layer := range manifest.Layers {
			uniqueLayers[layer.Digest] = layer.Size
		}
	}

	// Step 5: Calculate true quota usage
	trueUsage := int64(0)
	for _, size := range uniqueLayers {
		trueUsage += size
	}

	log.Printf("User %s true usage from PDS: %d bytes (%d unique layers)",
		did, trueUsage, len(uniqueLayers))

	// Step 6: Compare with the current quota file
	quota, err := r.quotaManager.GetQuota(did)
	if err != nil {
		log.Printf("No existing quota for %s, creating new", did)
		quota = &Quota{
			DID:           did,
			Limit:         r.quotaManager.DefaultLimit,
			ClaimedLayers: make(map[string]int64),
		}
	}

	// Step 7: Fix discrepancies
	if quota.Used != trueUsage || len(quota.ClaimedLayers) != len(uniqueLayers) {
		log.Printf("Quota mismatch for %s: recorded=%d, actual=%d (diff=%d)",
			did, quota.Used, trueUsage, trueUsage-quota.Used)

		// Update the quota to match PDS truth
		quota.Used = trueUsage
		quota.ClaimedLayers = uniqueLayers
		quota.LastUpdated = time.Now()

		if err := r.quotaManager.SaveQuota(quota); err != nil {
			return fmt.Errorf("failed to save reconciled quota: %w", err)
		}

		log.Printf("Reconciled quota for %s: %d bytes", did, trueUsage)
	} else {
		log.Printf("Quota for %s is accurate", did)
	}

	return nil
}

// ReconcileAll reconciles all users (run periodically)
func (r *Reconciler) ReconcileAll(ctx context.Context) error {
	// Get the list of all users with quota files
	users, err := r.quotaManager.ListUsers()
	if err != nil {
		return err
	}

	log.Printf("Starting reconciliation for %d users", len(users))

	for _, did := range users {
		if err := r.ReconcileUser(ctx, did); err != nil {
			log.Printf("Failed to reconcile %s: %v", did, err)
			// Continue with other users
		}
	}

	log.Println("Reconciliation complete")
	return nil
}
```

### Reconciliation Cron

```go
// cmd/hold/main.go

func main() {
	// ... setup ...

	// Start the reconciliation cron
	if os.Getenv("QUOTA_RECONCILE_ENABLED") == "true" {
		reconcileInterval := 24 * time.Hour // Daily by default
		if v := os.Getenv("QUOTA_RECONCILE_INTERVAL"); v != "" {
			if d, err := time.ParseDuration(v); err == nil {
				reconcileInterval = d
			}
		}

		go func() {
			ticker := time.NewTicker(reconcileInterval)
			defer ticker.Stop()

			for range ticker.C {
				if err := reconciler.ReconcileAll(context.Background()); err != nil {
					log.Printf("Reconciliation error: %v", err)
				}
			}
		}()

		log.Printf("Quota reconciliation cron started: runs every %v", reconcileInterval)
	}

	// ... start server ...
}
```

### Why PDS as Source of Truth Works

1. **Manifests are canonical** - if a manifest exists in the PDS, the user owns those layers
2. **Public reads** - no OAuth needed, just resolve DID → PDS endpoint
3. **ATProto durability** - the PDS is the user's authoritative data store
4. **AppView is a cache** - the AppView database might lag or have inconsistencies
5. **Reconciliation fixes drift** - periodic sync from the PDS ensures accuracy

**Example reconciliation scenarios:**

- **Orphaned quota entries:** the user deleted a manifest from the PDS, but the hold quota still counts it
  → Reconciliation removes it from claimed_layers

- **Missing quota entries:** the user pushed a manifest, but the quota update failed
  → Reconciliation adds it to claimed_layers

- **Race condition duplicates:** two concurrent pushes double-counted a layer
  → Reconciliation fixes the quota to actual usage

## Configuration

### Hold Service Environment Variables

```bash
# .env.hold

# ============================================================================
# Quota Configuration
# ============================================================================

# Enable quota enforcement
QUOTA_ENABLED=true

# Default quota limit per user (bytes)
# 10GB  = 10737418240
# 50GB  = 53687091200
# 100GB = 107374182400
QUOTA_DEFAULT_LIMIT=10737418240

# Storage backend for quota data
# Options: s3, sqlite
QUOTA_STORAGE_BACKEND=s3

# For S3-based storage:
# quota files are stored in the same bucket as the blobs
QUOTA_STORAGE_PREFIX=/atcr/quota/

# For SQLite-based storage:
QUOTA_DB_PATH=/var/lib/atcr/hold-quota.db

# ============================================================================
# Garbage Collection
# ============================================================================

# Enable periodic garbage collection
GC_ENABLED=true

# GC interval (default: 24h)
GC_INTERVAL=24h

# AppView URL for GC reference checking
APPVIEW_URL=https://atcr.io

# ============================================================================
# Quota Reconciliation
# ============================================================================

# Enable quota reconciliation from the PDS
QUOTA_RECONCILE_ENABLED=true

# Reconciliation interval (default: 24h)
QUOTA_RECONCILE_INTERVAL=24h

# ============================================================================
# Hold Service Identity (Required)
# ============================================================================

# Public URL of this hold service
HOLD_PUBLIC_URL=https://hold1.example.com

# Owner DID (for auto-registration)
HOLD_OWNER=did:plc:xyz123
```

### AppView Configuration

```bash
# .env.appview

# Internal API endpoint for hold services,
# used for GC reference checking
ATCR_INTERNAL_API_ENABLED=true

# Optional: authentication token for internal APIs
ATCR_INTERNAL_API_TOKEN=secret123
```

1071## Trade-offs & Design Decisions
1072
1073### 1. Claimed Storage vs Physical Storage
1074
1075**Decision:** Track claimed storage (logical accounting)
1076
1077**Why:**
1078- Predictable for users: "you pay for what you upload"
1079- No complex cross-user dependencies
1080- Delete always gives you quota back
1081- Matches Harbor's proven model
1082
1083**Trade-off:**
1084- Total claimed can exceed physical storage
1085- Users might complain "I uploaded 10GB but S3 only has 6GB"
1086
1087**Mitigation:**
1088- Show deduplication savings metric
1089- Educate users: "You claimed 10GB, but deduplication saved 4GB"
1090
### 2. S3 vs SQLite for Quota Storage

**Decision:** Support both, recommend based on use case

**S3 Pros:**
- No database to manage
- Quota data lives with blobs
- Better for ephemeral BYOS

**SQLite Pros:**
- Faster (no network round-trip)
- ACID transactions (no race conditions)
- Better for high-traffic shared holds

**Trade-off:**
- S3: eventual consistency, race conditions
- SQLite: stateful service, scaling challenges

**Mitigation:**
- Reconciliation fixes S3 inconsistencies
- SQLite can use a shared DB for multi-instance deployments

### 3. Optimistic Quota Update

**Decision:** Update quota BEFORE upload completes

**Why:**
- Prevent race conditions (two users uploading simultaneously)
- Can reject before the presigned URL is generated
- Simpler flow

**Trade-off:**
- If upload fails, quota is already incremented (user "paid" for nothing)

**Mitigation:**
- Reconciliation from PDS fixes orphaned quota entries
- Acceptable for MVP (upload failures are rare)

### 4. AppView as Intermediary

**Decision:** AppView notifies hold service on deletes

**Why:**
- AppView already has the manifest/layer database
- Can efficiently check if a layer is still referenced
- Hold service doesn't need to query PDS on every delete

**Trade-off:**
- AppView → Hold dependency
- Network hop on delete

**Mitigation:**
- If notification fails, reconciliation fixes quota
- Eventually consistent is acceptable

### 5. PDS as Source of Truth

**Decision:** Use PDS manifests for reconciliation

**Why:**
- Manifests in PDS are canonical user data
- Public reads (no OAuth for reconciliation)
- AppView database might lag or be inconsistent

**Trade-off:**
- Reconciliation requires PDS queries (slower)
- Limited to 1000 manifests per query

**Mitigation:**
- Run reconciliation daily (not real-time)
- Paginate if user has >1000 manifests

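The pagination mitigation reduces to a standard cursor loop. A sketch, with the PDS call stubbed out as a function parameter (`page`, `collectAll`, and the fetch callback are illustrative names; a real implementation would issue the HTTP request with the returned cursor):

```go
package main

import "fmt"

// page mimics one page of a cursor-based record listing response.
type page struct {
	digests []string
	cursor  string // empty when there are no more pages
}

// collectAll drains all pages by following the cursor until it is
// empty. fetch stands in for the actual PDS query.
func collectAll(fetch func(cursor string) page) []string {
	var all []string
	cursor := ""
	for {
		p := fetch(cursor)
		all = append(all, p.digests...)
		if p.cursor == "" {
			return all
		}
		cursor = p.cursor
	}
}

func main() {
	// Two fake pages keyed by cursor value.
	pages := map[string]page{
		"":   {digests: []string{"sha256:a", "sha256:b"}, cursor: "p2"},
		"p2": {digests: []string{"sha256:c"}, cursor: ""},
	}
	all := collectAll(func(c string) page { return pages[c] })
	fmt.Println(len(all)) // 3
}
```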
## Future Enhancements

### 1. Quota API Endpoints

```
GET  /quota/usage     - Get current user's quota
GET  /quota/breakdown - Get storage by repository
POST /quota/limit     - Update user's quota limit (admin)
GET  /quota/stats     - Get hold-wide statistics
```

### 2. Quota Alerts

Notify users when approaching limit:
- Email/webhook at 80%, 90%, 95%
- Reject uploads at 100% (currently implemented)
- Grace period: allow 105% temporarily

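Mapping usage to the proposed thresholds is a simple switch. A sketch under the thresholds listed above (the `alertLevel` function and its return values are illustrative, since alerts are not yet implemented):

```go
package main

import "fmt"

// alertLevel maps quota usage to the proposed notification
// thresholds. It returns "" when no alert should fire.
func alertLevel(used, limit int64) string {
	pct := float64(used) / float64(limit) * 100
	switch {
	case pct >= 100:
		return "reject" // uploads refused at 100%
	case pct >= 95:
		return "95%"
	case pct >= 90:
		return "90%"
	case pct >= 80:
		return "80%"
	}
	return ""
}

func main() {
	fmt.Println(alertLevel(85, 100))  // 80%
	fmt.Println(alertLevel(100, 100)) // reject
}
```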
### 3. Tiered Quotas

Different limits based on user tier:
- Free: 10GB
- Pro: 100GB
- Enterprise: unlimited

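A tier table could be as small as a map. The names, the zero-means-unlimited sentinel, and the deny-unknown-tier default below are all assumptions, since tiers are a future enhancement:

```go
package main

import "fmt"

// tierLimits sketches the proposed tiers; 0 means unlimited.
var tierLimits = map[string]int64{
	"free":       10 << 30,  // 10GB
	"pro":        100 << 30, // 100GB
	"enterprise": 0,         // unlimited
}

// withinQuota reports whether used bytes fit under the tier's limit.
// Unknown tiers are denied rather than treated as unlimited.
func withinQuota(tier string, used int64) bool {
	limit, ok := tierLimits[tier]
	if !ok {
		return false
	}
	return limit == 0 || used <= limit
}

func main() {
	fmt.Println(withinQuota("free", 11<<30))       // false
	fmt.Println(withinQuota("enterprise", 11<<30)) // true
}
```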
### 4. Quota Purchasing

Allow users to buy additional storage:
- Stripe integration
- $0.10/GB/month pricing
- Dynamic limit updates

### 5. Cross-Hold Deduplication

If multiple holds share the same S3 bucket:
- Track blob ownership globally
- Split costs proportionally
- More complex, but maximizes deduplication

### 6. Manifest-Based Quota (Alternative Model)

Instead of tracking layers, track manifests:
- Simpler: just count manifest sizes
- No deduplication benefits for users
- Might be acceptable for some use cases

### 7. Redis-Based Quota (High Performance)

For high-traffic registries:
- Use Redis instead of S3/SQLite
- Sub-millisecond quota checks
- Harbor-proven approach

### 8. Quota Visualizations

Web UI showing:
- Storage usage over time
- Top consumers by repository
- Deduplication savings graph
- Layer size distribution

## Appendix: SQL Queries

### Check if User Still References Layer

```sql
-- After deleting manifest, check if user has other manifests using this layer
SELECT COUNT(*)
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.did = ?      -- User's DID
  AND l.digest = ?   -- Layer digest to check
  AND m.id != ?      -- Exclude the manifest being deleted
```

### Get All Unique Layers for User

```sql
-- Calculate true quota usage for a user
SELECT DISTINCT l.digest, l.size
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.did = ?
  AND m.hold_endpoint = ?
```

### Get Referenced Blobs for Hold

```sql
-- For GC: get all blobs still referenced by any user of this hold
SELECT DISTINCT l.digest
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.hold_endpoint = ?
```
### Get Storage Stats by Repository

```sql
-- User's storage broken down by repository
SELECT
  m.repository,
  COUNT(DISTINCT m.id) AS manifest_count,
  COUNT(DISTINCT l.digest) AS unique_layers,
  SUM(l.size) AS total_size
FROM manifests m
JOIN layers l ON l.manifest_id = m.id
WHERE m.did = ?
  AND m.hold_endpoint = ?
GROUP BY m.repository
ORDER BY total_size DESC
```

## References

- **Harbor Quotas:** https://goharbor.io/docs/1.10/administration/configure-project-quotas/
- **Harbor Source:** https://github.com/goharbor/harbor
- **ATProto Spec:** https://atproto.com/specs/record
- **OCI Distribution Spec:** https://github.com/opencontainers/distribution-spec
- **S3 API Reference:** https://docs.aws.amazon.com/AmazonS3/latest/API/
- **Distribution GC:** https://github.com/distribution/distribution/blob/main/registry/storage/garbagecollect.go

---

**Document Version:** 1.0
**Last Updated:** 2025-10-09
**Author:** Generated from implementation research and Harbor analysis