A container registry that uses the AT Protocol for manifest storage and S3 for blob storage. atcr.io

clean up documentation

evan.jarrett.net 64a05d40 4dc66c09

verified
+118 -4099
+21 -20
CLAUDE.md
··· 92 92 ``` 93 93 1. Client: docker push atcr.io/alice/myapp:latest 94 94 2. HTTP Request → /v2/alice/myapp/manifests/latest 95 - 3. Registry Middleware (pkg/middleware/registry.go) 95 + 3. Registry Middleware (pkg/appview/middleware/registry.go) 96 96 → Resolves "alice" to DID and PDS endpoint 97 97 → Queries alice's sailor profile for defaultHold 98 98 → If not set, checks alice's io.atcr.hold records 99 99 → Falls back to AppView's default_storage_endpoint 100 100 → Stores DID/PDS/storage endpoint in context 101 - 4. Repository Middleware (pkg/middleware/repository.go) 101 + 4. Routing Repository (pkg/appview/storage/routing_repository.go) 102 102 → Creates RoutingRepository 103 103 → Returns ATProto ManifestStore for manifests 104 104 → Returns ProxyBlobStore for blobs ··· 151 151 152 152 ### Middleware System 153 153 154 - ATCR uses two levels of middleware: 154 + ATCR uses middleware and routing to handle requests: 155 155 156 - #### 1. Registry Middleware (`pkg/middleware/registry.go`) 156 + #### 1. Registry Middleware (`pkg/appview/middleware/registry.go`) 157 157 - Wraps `distribution.Namespace` 158 158 - Intercepts `Repository(name)` calls 159 159 - Performs name resolution (alice → did:plc:xyz → pds.example.com) 160 160 - Queries PDS for `io.atcr.hold` records to find storage endpoint 161 161 - Stores resolved identity and storage endpoint in context 162 162 163 - #### 2. Repository Middleware (`pkg/middleware/repository.go`) 164 - - Wraps `distribution.Repository` 163 + #### 2. Auth Middleware (`pkg/appview/middleware/auth.go`) 164 + - Validates JWT tokens from Docker clients 165 + - Extracts DID from token claims 166 + - Injects authenticated identity into context 167 + 168 + #### 3. 
Routing Repository (`pkg/appview/storage/routing_repository.go`) 169 + - Implements `distribution.Repository` 165 170 - Returns custom `Manifests()` and `Blobs()` implementations 166 171 - Routes manifests to ATProto, blobs to S3 or BYOS 167 172 ··· 208 213 209 214 **Authentication Flow:** 210 215 ``` 211 - 1. User runs: docker-credential-atcr configure 212 - 2. Helper generates ECDSA P-256 DPoP key 216 + 1. User configures Docker to use the credential helper (adds to config.json) 217 + 2. On first docker push/pull, helper generates ECDSA P-256 DPoP key 213 218 3. Resolve handle → DID → PDS endpoint 214 219 4. Discover OAuth server metadata from PDS 215 220 5. PAR request with DPoP header → get request_uri ··· 217 222 7. Exchange code for token with DPoP proof 218 223 8. Save: access token, refresh token, DPoP key, DID, handle 219 224 220 - Later (docker push): 225 + Later (subsequent docker push): 221 226 9. Docker calls credential helper 222 227 10. Helper loads token, refreshes if needed 223 228 11. Helper calls /auth/exchange with OAuth token + handle ··· 266 271 - Digest-based addressing (sha256:abc123 → record key) 267 272 - Converts between OCI and ATProto formats 268 273 269 - #### Storage Layer (`pkg/storage/`) 274 + #### Storage Layer (`pkg/appview/storage/`) 270 275 271 276 **routing_repository.go**: Routes content by type 272 277 - `Manifests()` → returns ATProto ManifestStore (caches instance for hold endpoint extraction) ··· 283 288 - **NOTE:** Simple in-memory cache for MVP. 
For production: use Redis or similar 284 289 - Prevents expensive ATProto lookups on every blob request 285 290 286 - **s3_blob_store.go**: S3 blob storage wrapper 287 - - Wraps distribution's built-in S3 driver 288 - - Inherits full `distribution.BlobStore` interface 289 - - Used for default shared storage 290 - 291 291 **proxy_blob_store.go**: External storage proxy 292 292 - Calls user's storage service for presigned URLs 293 293 - Issues HTTP redirects for blob uploads/downloads 294 294 - Implements full `distribution.BlobStore` interface 295 + - Supports multipart uploads for large blobs 295 296 - Used when user has `io.atcr.hold` record 296 297 297 298 #### AppView Web UI (`pkg/appview/`) ··· 484 485 ### Development Notes 485 486 486 487 **General:** 487 - - Middleware is registered via `init()` functions in `pkg/middleware/` 488 - - Import `_ "atcr.io/pkg/middleware"` in main.go to register middleware 488 + - Middleware is in `pkg/appview/middleware/` (auth.go, registry.go) 489 + - Storage routing is in `pkg/appview/storage/` (routing_repository.go, proxy_blob_store.go, hold_cache.go) 489 490 - Storage drivers imported as `_ "github.com/distribution/distribution/v3/registry/storage/driver/s3-aws"` 490 - - Storage service reuses distribution's driver factory for multi-backend support 491 + - Hold service reuses distribution's driver factory for multi-backend support 491 492 492 493 **OAuth implementation:** 493 494 - Client (`pkg/auth/oauth/client.go`) encapsulates all OAuth configuration ··· 514 515 4. Update client methods if needed 515 516 516 517 **Modifying storage routing**: 517 - 1. Edit `pkg/storage/routing_repository.go` 518 + 1. Edit `pkg/appview/storage/routing_repository.go` 518 519 2. Update `Blobs()` method to change routing logic 519 520 3. Consider context values: `storage.endpoint`, `atproto.did` 520 521 521 522 **Changing name resolution**: 522 523 1. Modify `pkg/atproto/resolver.go` for DID/handle resolution 523 - 2. 
Update `pkg/middleware/registry.go` if changing routing logic 524 + 2. Update `pkg/appview/middleware/registry.go` if changing routing logic 524 525 3. Remember: `findStorageEndpoint()` queries PDS for `io.atcr.hold` records 525 526 526 527 **Working with OAuth client**:
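The manifest/blob split that the CLAUDE.md changes above attribute to `RoutingRepository` (`pkg/appview/storage/routing_repository.go`) can be sketched in Go. The types below are hypothetical, simplified stand-ins for the `distribution` interfaces, not the actual implementation:

```go
package main

import "fmt"

// Simplified stand-ins for distribution's manifest/blob services; the real
// RoutingRepository implements distribution.Repository.
type ManifestStore interface{ Kind() string }
type BlobStore interface{ Kind() string }

// atprotoManifests represents the ATProto-backed manifest store
// (manifests stored as records in the user's PDS).
type atprotoManifests struct{}

func (atprotoManifests) Kind() string { return "atproto" }

// proxyBlobs represents the presigned-URL blob proxy (S3 or BYOS hold).
type proxyBlobs struct{}

func (proxyBlobs) Kind() string { return "proxy" }

// RoutingRepository routes manifests to ATProto and blobs to S3/BYOS,
// mirroring the Manifests()/Blobs() split described above.
type RoutingRepository struct{}

func (RoutingRepository) Manifests() ManifestStore { return atprotoManifests{} }
func (RoutingRepository) Blobs() BlobStore         { return proxyBlobs{} }

func main() {
	r := RoutingRepository{}
	fmt.Println(r.Manifests().Kind()) // atproto
	fmt.Println(r.Blobs().Kind())     // proxy
}
```

The point of the split is that a single `distribution.Repository` can present one storage model to Docker clients while fanning writes out to two very different backends.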
+88 -278
README.md
··· 1 1 # ATCR - ATProto Container Registry 2 2 3 - A container registry that uses the AT Protocol (ATProto) for manifest storage and S3 for blob storage. 3 + An OCI-compliant container registry that uses the AT Protocol for manifest storage and S3 for blob storage. 4 4 5 - ## Overview 5 + ## What is ATCR? 6 6 7 - ATCR is an OCI-compliant container registry that integrates with the AT Protocol ecosystem. It stores container image manifests as ATProto records in Personal Data Servers (PDS) while keeping the actual image layers in S3-compatible storage. 7 + ATCR integrates container registries with the AT Protocol ecosystem. Container image manifests are stored as ATProto records in your Personal Data Server (PDS), while layers are stored in S3-compatible storage. 8 + 9 + **Image names use your ATProto identity:** 10 + ``` 11 + atcr.io/alice.bsky.social/myapp:latest 12 + atcr.io/did:plc:xyz123/myapp:latest 13 + ``` 8 14 9 - ### Architecture 15 + ## Architecture 10 16 11 - ATCR consists of three main components: 17 + **Three components:** 12 18 13 - 1. **AppView** - OCI registry server + web UI 19 + 1. **AppView** - Registry API + web UI 14 20 - Serves OCI Distribution API (Docker push/pull) 15 - - Resolves identities (handle/DID → PDS endpoint) 16 - - Routes manifests to user's PDS, blobs to storage 17 - - Web interface for browsing and search 18 - - SQLite database for stars, pulls, metadata 21 + - Resolves handles/DIDs to PDS endpoints 22 + - Routes manifests to PDS, blobs to storage 23 + - Web interface for browsing/search 19 24 20 - 2. **Hold Service** - Optional storage service (BYOS) 21 - - Lightweight HTTP server for presigned URLs 22 - - Supports S3, Storj, Minio, filesystem, etc. 23 - - Authorization via ATProto records 24 - - Users can deploy their own hold 25 + 2. **Hold Service** - Storage service (optional BYOS) 26 + - Generates presigned URLs for S3/Storj/Minio/etc. 27 + - Users can deploy their own storage 25 28 26 - 3. 
**Credential Helper** - Client-side OAuth 29 + 3. **Credential Helper** - Client authentication 27 30 - ATProto OAuth with DPoP 28 - - Exchanges OAuth token for registry JWT 29 - - Seamless Docker integration 31 + - Automatic authentication on first push/pull 30 32 31 - **Storage Model:** 32 - - **Manifests** → ATProto records in user PDSs (small JSON metadata) 33 - - **Blobs/Layers** → S3 or user's hold service (large binary data) 34 - - **Name Resolution** → Supports both handles and DIDs 35 - - `atcr.io/alice.bsky.social/myimage:latest` 36 - - `atcr.io/did:plc:xyz123/myimage:latest` 33 + **Storage model:** 34 + - Manifests → ATProto records (small JSON) 35 + - Blobs → S3 or BYOS (large binaries) 37 36 38 37 ## Features 39 38 40 - ### Core Registry 41 - - **OCI Distribution Spec compliant** - Works with Docker, containerd, podman 42 - - **ATProto-native manifest storage** - Manifests stored as records in user PDSs 43 - - **Hybrid storage** - Small manifests in ATProto, large blobs in S3/BYOS 44 - - **DID/handle resolution** - Supports both handles and DIDs for image names 45 - - **Decentralized ownership** - Users own their manifest data via their PDS 46 - 47 - ### Web Interface 48 - - **Repository browser** - Browse and search container images 49 - - **Star repositories** - Favorite images for quick access 50 - - **Pull tracking** - View popularity and usage metrics 51 - - **OAuth authentication** - Sign in with your ATProto identity 52 - - **User profiles** - Manage your default storage hold 53 - 54 - ### Authentication 55 - - **ATProto OAuth with DPoP** - Cryptographic proof-of-possession tokens 56 - - **Docker credential helper** - Seamless `docker push/pull` workflow 57 - - **Token exchange** - OAuth tokens converted to registry JWTs 39 + - ✅ **OCI-compliant** - Works with Docker, containerd, podman 40 + - ✅ **Decentralized** - You own your manifest data via your PDS 41 + - ✅ **ATProto OAuth** - Secure authentication with DPoP 42 + - ✅ **BYOS** - Deploy your 
own storage service 43 + - ✅ **Web UI** - Browse, search, star repositories 44 + - ✅ **Multi-backend** - S3, Storj, Minio, Azure, GCS, filesystem 58 45 59 - ### Storage 60 - - **BYOS (Bring Your Own Storage)** - Deploy your own hold service 61 - - **Multi-backend support** - S3, Storj, Minio, Azure, GCS, filesystem 62 - - **Presigned URLs** - Direct client-to-storage uploads/downloads 63 - - **Hold discovery** - Automatic routing based on user preferences 46 + ## Quick Start 64 47 65 - ## Building 48 + ### Using the Registry 66 49 50 + **1. Install credential helper:** 67 51 ```bash 68 - # Build all binaries locally 69 - go build -o atcr-appview ./cmd/appview 70 - go build -o atcr-hold ./cmd/hold 71 - go build -o docker-credential-atcr ./cmd/credential-helper 72 - 73 - # Build Docker images 74 - docker build -t atcr.io/appview:latest . 75 - docker build -f Dockerfile.hold -t atcr.io/hold:latest . 52 + curl -fsSL https://atcr.io/install.sh | bash 76 53 ``` 77 54 78 - **Manual setup:** 79 - ```bash 80 - # 1. Create directories 81 - sudo mkdir -p /var/lib/atcr/{blobs,hold,auth} 82 - sudo chown -R $USER:$USER /var/lib/atcr 83 - 84 - # 2. Build binaries 85 - go build -o atcr-appview ./cmd/appview 86 - go build -o atcr-hold ./cmd/hold 87 - 88 - # 3. Configure environment 89 - cp .env.example .env 90 - # Edit .env - set ATPROTO_HANDLE and HOLD_PUBLIC_URL 91 - export $(cat .env | xargs) 92 - 93 - # 4. Start services 94 - # Terminal 1: 95 - ./atcr-appview serve config/config.yml 96 - # Terminal 2 (will prompt for OAuth): 97 - ./atcr-hold config/hold.yml 98 - # Follow OAuth URL in logs to authorize 99 - 100 - # 5. Test with Docker 101 - docker tag alpine:latest localhost:5000/alice/alpine:test 102 - docker push localhost:5000/alice/alpine:test 103 - docker pull localhost:5000/alice/alpine:test 55 + **2. 
Configure Docker** (add to `~/.docker/config.json`): 56 + ```json 57 + { 58 + "credHelpers": { 59 + "atcr.io": "atcr" 60 + } 61 + } 104 62 ``` 105 63 106 - ## Running 107 - 108 - ### Local Development 109 - 110 - **Configure environment:** 64 + **3. Push/pull images:** 111 65 ```bash 112 - # Copy and edit .env file 113 - cp .env.example .env 114 - # Edit .env with: 115 - # - ATPROTO_HANDLE (your Bluesky handle) 116 - # - HOLD_PUBLIC_URL (e.g., http://127.0.0.1:8080 or https://hold1.atcr.io) 117 - # - HOLD_AUTO_REGISTER=true 118 - 119 - # Load environment 120 - export $(cat .env | xargs) 66 + docker tag myapp:latest atcr.io/yourhandle/myapp:latest 67 + docker push atcr.io/yourhandle/myapp:latest # Authenticates automatically 68 + docker pull atcr.io/yourhandle/myapp:latest 121 69 ``` 122 70 123 - **AppView:** 124 - ```bash 125 - ./atcr-appview serve config/config.yml 126 - ``` 71 + See **[INSTALLATION.md](./INSTALLATION.md)** for detailed installation instructions. 127 72 128 - **Hold (Storage Service):** 129 - ```bash 130 - # Starts OAuth flow to register in your PDS 131 - ./atcr-hold config/hold.yml 132 - # Follow the OAuth URL in the logs to authorize 133 - ``` 73 + ### Running Your Own AppView 134 74 135 - ### Docker 136 - 137 - **Run with Docker Compose:** 75 + **Using Docker Compose:** 138 76 ```bash 77 + cp .env.appview.example .env.appview 78 + # Edit .env.appview with your configuration 139 79 docker-compose up -d 140 80 ``` 141 81 142 - **Or run containers separately:** 143 - 144 - **AppView:** 145 - ```bash 146 - docker run -d \ 147 - --name atcr-appview \ 148 - -p 5000:5000 \ 149 - -e ATPROTO_DID=did:plc:your-did \ 150 - -e ATPROTO_ACCESS_TOKEN=your-access-token \ 151 - -e AWS_ACCESS_KEY_ID=your-aws-key \ 152 - -e AWS_SECRET_ACCESS_KEY=your-aws-secret \ 153 - -v $(pwd)/config/config.yml:/etc/atcr/config.yml \ 154 - atcr.io/appview:latest 155 - ``` 156 - 157 - **Hold (Storage Service):** 158 - ```bash 159 - docker run -d \ 160 - --name atcr-hold \ 161 - 
-p 8080:8080 \ 162 - -e AWS_ACCESS_KEY_ID=your-aws-key \ 163 - -e AWS_SECRET_ACCESS_KEY=your-aws-secret \ 164 - -v $(pwd)/config/hold.yml:/etc/atcr/hold.yml \ 165 - atcr.io/hold:latest 166 - ``` 167 - 168 - ### Kubernetes 169 - 170 - ```yaml 171 - apiVersion: apps/v1 172 - kind: Deployment 173 - metadata: 174 - name: atcr-appview 175 - spec: 176 - replicas: 3 177 - selector: 178 - matchLabels: 179 - app: atcr-appview 180 - template: 181 - metadata: 182 - labels: 183 - app: atcr-appview 184 - spec: 185 - containers: 186 - - name: appview 187 - image: atcr.io/appview:latest 188 - ports: 189 - - containerPort: 5000 190 - env: 191 - - name: ATPROTO_DID 192 - valueFrom: 193 - secretKeyRef: 194 - name: atcr-secrets 195 - key: did 196 - - name: ATPROTO_ACCESS_TOKEN 197 - valueFrom: 198 - secretKeyRef: 199 - name: atcr-secrets 200 - key: access-token 201 - volumeMounts: 202 - - name: config 203 - mountPath: /etc/atcr 204 - volumes: 205 - - name: config 206 - configMap: 207 - name: atcr-config 208 - ``` 209 - 210 - ## Configuration 211 - 212 - See `config/config.yml` for full configuration options. 213 - 214 - Key settings: 215 - - **storage.s3**: S3 bucket configuration for blob storage 216 - - **middleware.repository**: ATProto routing middleware 217 - - **middleware.registry**: Name resolution middleware 218 - 219 - ## Installing Credential Helper 220 - 221 - **Quick Install:** 222 - 82 + **Local development:** 223 83 ```bash 224 - # Linux/macOS 225 - curl -fsSL https://atcr.io/install.sh | bash 84 + # Build 85 + go build -o bin/atcr-appview ./cmd/appview 86 + go build -o bin/atcr-hold ./cmd/hold 226 87 227 - # Windows (PowerShell as Administrator) 228 - iwr -useb https://atcr.io/install.ps1 | iex 229 - ``` 230 - 231 - For detailed installation instructions (Homebrew, manual install, etc.), see **[INSTALLATION.md](./INSTALLATION.md)**. 
232 - 233 - **Configure Docker:** 88 + # Configure 89 + cp .env.appview.example .env.appview 90 + # Edit .env.appview - set ATCR_DEFAULT_HOLD 91 + source .env.appview 234 92 235 - ```bash 236 - # Add to ~/.docker/config.json 237 - { 238 - "credHelpers": { 239 - "atcr.io": "atcr" 240 - } 241 - } 93 + # Run 94 + ./bin/atcr-appview serve 242 95 ``` 243 96 244 - ## Usage 97 + See **[deploy/README.md](./deploy/README.md)** for production deployment. 245 98 246 - ### Authenticate 99 + ## Development 247 100 248 - ```bash 249 - # Auto-authentication on first push/pull 250 - docker push atcr.io/yourhandle/myapp:latest 251 - ``` 252 - 253 - ### Pushing an Image 101 + ### Building from Source 254 102 255 103 ```bash 256 - # Tag your image 257 - docker tag myapp:latest atcr.io/alice/myapp:latest 104 + # Build all binaries 105 + go build -o bin/atcr-appview ./cmd/appview 106 + go build -o bin/atcr-hold ./cmd/hold 107 + go build -o bin/docker-credential-atcr ./cmd/credential-helper 258 108 259 - # Push to ATCR (credential helper handles auth) 260 - docker push atcr.io/alice/myapp:latest 109 + # Run tests 110 + go test ./... 111 + go test -race ./... 
261 112 ``` 262 113 263 - ### Pulling an Image 264 - 265 - ```bash 266 - # Pull from ATCR 267 - docker pull atcr.io/alice/myapp:latest 268 - ``` 269 - 270 - ### Web Interface 271 - 272 - Visit the AppView URL (default: http://localhost:5000) to: 273 - - Browse repositories 274 - - Search for images 275 - - Star your favorites 276 - - View pull statistics 277 - - Manage your storage settings 278 - 279 - ## Development 280 - 281 114 ### Project Structure 282 115 283 116 ``` 284 - atcr.io/ 285 - ├── cmd/ 286 - │ ├── appview/ # AppView entrypoint (registry + web UI) 287 - │ ├── hold/ # Hold service entrypoint (BYOS) 288 - │ └── credential-helper/ # Docker credential helper 289 - ├── pkg/ 290 - │ ├── appview/ # Web UI components 291 - │ │ ├── handlers/ # HTTP handlers (home, repo, search, auth) 292 - │ │ ├── db/ # SQLite database layer 293 - │ │ ├── jetstream/ # ATProto Jetstream consumer 294 - │ │ ├── static/ # JS, CSS assets 295 - │ │ └── templates/ # HTML templates 296 - │ ├── atproto/ # ATProto integration 297 - │ │ ├── client.go # PDS client 298 - │ │ ├── resolver.go # DID/handle resolution 299 - │ │ ├── manifest_store.go # OCI manifest store 300 - │ │ ├── lexicon.go # ATProto record schemas 301 - │ │ └── profile.go # Sailor profile management 302 - │ ├── storage/ # Storage layer 303 - │ │ ├── routing_repository.go # Routes manifests/blobs 304 - │ │ ├── proxy_blob_store.go # BYOS proxy 305 - │ │ ├── s3_blob_store.go # S3 wrapper 306 - │ │ └── hold_cache.go # Hold endpoint cache 307 - │ ├── middleware/ # Registry middleware 308 - │ │ ├── registry.go # Name resolution 309 - │ │ └── repository.go # Storage routing 310 - │ └── auth/ # Authentication 311 - │ ├── oauth/ # ATProto OAuth with DPoP 312 - │ ├── token/ # JWT issuer/validator 313 - │ └── atproto/ # Session validation 314 - ├── config/ # Configuration files 315 - ├── docs/ # Documentation 316 - └── Dockerfile 317 - ``` 117 + cmd/ 118 + ├── appview/ # Registry server + web UI 119 + ├── hold/ # Storage service 
(BYOS) 120 + └── credential-helper/ # Docker credential helper 318 121 319 - ### Testing 320 - 321 - ```bash 322 - # Run tests 323 - go test ./... 324 - 325 - # Run with race detector 326 - go test -race ./... 122 + pkg/ 123 + ├── appview/ 124 + │ ├── db/ # SQLite database (migrations, queries, stores) 125 + │ ├── handlers/ # HTTP handlers (home, repo, search, auth, settings) 126 + │ ├── jetstream/ # ATProto Jetstream consumer 127 + │ ├── middleware/ # Auth & registry middleware 128 + │ ├── storage/ # Storage routing (hold cache, blob proxy, repository) 129 + │ ├── static/ # Static assets (JS, CSS, install scripts) 130 + │ └── templates/ # HTML templates 131 + ├── atproto/ # ATProto client, records, manifest/tag stores 132 + ├── auth/ 133 + │ ├── oauth/ # OAuth client, server, refresher, storage 134 + │ ├── token/ # JWT issuer, validator, claims 135 + │ └── atproto/ # Session validation 136 + └── hold/ # Hold service (authorization, storage, multipart, S3) 327 137 ``` 328 138 329 139 ## License
+7 -40
deploy/README.md
··· 204 204 # (Build from source or download release) 205 205 go install atcr.io/cmd/docker-credential-atcr@latest 206 206 207 - # Configure Docker 208 - docker-credential-atcr configure 209 - 210 - # Enter your ATProto handle when prompted 211 - # Complete OAuth flow in browser 207 + # Configure Docker to use the credential helper 208 + # Add to ~/.docker/config.json: 209 + { 210 + "credHelpers": { 211 + "atcr.io": "atcr" 212 + } 213 + } 212 214 ``` 213 215 214 216 #### Test 3: Push a test image ··· 409 411 # Then push an image 410 412 ``` 411 413 412 - ## Security Hardening 413 - 414 - ### Firewall 415 - 416 - ```bash 417 - # Allow only necessary ports 418 - firewall-cmd --permanent --remove-service=cockpit 419 - firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="your-ip" service name="ssh" accept' 420 - firewall-cmd --reload 421 - ``` 422 - 423 414 ### Automatic Updates 424 415 425 416 ```bash ··· 472 463 - PostgreSQL for UI database (replace SQLite) 473 464 - Multiple hold services (geo-distributed) 474 465 475 - ## Cost Estimation 476 - 477 - **UpCloud Server:** 478 - - 2 GB RAM / 1 CPU: ~$15/month 479 - - 4 GB RAM / 2 CPU: ~$30/month 480 - 481 - **UpCloud Object Storage:** 482 - - Storage: $0.01/GB/month 483 - - Egress: $0.01/GB (first 1TB free in some regions) 484 - 485 - **Example monthly cost:** 486 - - Server: $15 487 - - Storage (100GB): $1 488 - - Transfer (500GB): $5 489 - - **Total: ~$21/month** 490 - 491 - **Bandwidth optimization:** 492 - - Presigned URLs mean hold service uses minimal bandwidth 493 - - Most costs are S3 storage + transfer (not server bandwidth) 494 - 495 466 ## Support 496 467 497 468 - Documentation: https://tangled.org/@evan.jarrett.net/at-container-registry 498 469 - Issues: https://github.com/your-org/atcr.io/issues 499 470 - Bluesky: @yourhandle.bsky.social 500 - 501 - ## License 502 - 503 - MIT
-434
docs/APPVIEW_OAUTH.md
··· 1 - # AppView-Mediated OAuth Architecture 2 - 3 - ## Overview 4 - 5 - ATCR uses a two-tier authentication model to support OAuth while allowing the AppView to write manifests to users' Personal Data Servers (PDS). 6 - 7 - ## The Problem 8 - 9 - OAuth with DPoP creates cryptographically bound tokens that cannot be delegated: 10 - 11 - - **Basic Auth**: App password is a shared secret that can be forwarded from client → AppView → PDS ✅ 12 - - **OAuth + DPoP**: Token is bound to client's keypair and cannot be reused by AppView ❌ 13 - 14 - This creates a challenge: How can the AppView write manifests to the user's PDS on their behalf? 15 - 16 - ## The Solution: Two-Tier Authentication 17 - 18 - ``` 19 - ┌──────────┐ ┌─────────┐ ┌────────────┐ 20 - │ Docker │◄───────►│ AppView │◄───────►│ PDS/Auth │ 21 - │ Client │ Auth1 │ (ATCR) │ Auth2 │ Server │ 22 - └──────────┘ └─────────┘ └────────────┘ 23 - ``` 24 - 25 - **Auth Tier 1** (Docker ↔ AppView): Registry authentication 26 - - Client authenticates to AppView using session tokens 27 - - AppView issues short-lived registry JWTs 28 - - Standard Docker registry auth protocol 29 - 30 - **Auth Tier 2** (AppView ↔ PDS): Resource access 31 - - AppView acts as OAuth client for each user 32 - - AppView stores refresh tokens per user 33 - - AppView gets access tokens on-demand to write manifests 34 - 35 - ## Complete Flows 36 - 37 - ### One-Time Authorization Flow 38 - 39 - ``` 40 - ┌────────┐ ┌──────────────┐ ┌─────────┐ ┌─────┐ 41 - │ User │ │ Credential │ │ AppView │ │ PDS │ 42 - │ │ │ Helper │ │ │ │ │ 43 - └───┬────┘ └──────┬───────┘ └────┬────┘ └──┬──┘ 44 - │ │ │ │ 45 - │ $ docker-credential-atcr configure │ │ 46 - │ Enter handle: evan.jarrett.net │ │ 47 - │─────────────────────>│ │ │ 48 - │ │ │ │ 49 - │ │ GET /auth/oauth/authorize?handle=... 
│ 50 - │ │─────────────────────>│ │ 51 - │ │ │ │ 52 - │ │ 302 Redirect to PDS │ │ 53 - │ │<─────────────────────│ │ 54 - │ │ │ │ 55 - │ [Browser opens] │ │ │ 56 - │<─────────────────────│ │ │ 57 - │ │ │ │ 58 - │ Authorize ATCR? │ │ │ 59 - │──────────────────────────────────────────────────────────────>│ 60 - │ │ │ │ 61 - │ │ │<─code────────────│ 62 - │ │ │ │ 63 - │ │ │ POST /token │ 64 - │ │ │ (exchange code) │ 65 - │ │ │ + DPoP proof │ 66 - │ │ │─────────────────>│ 67 - │ │ │ │ 68 - │ │ │<─refresh_token───│ 69 - │ │ │ access_token │ 70 - │ │ │ │ 71 - │ │ │ [Store tokens] │ 72 - │ │ │ DID → { │ 73 - │ │ │ refresh_token, │ 74 - │ │ │ dpop_key, │ 75 - │ │ │ pds_endpoint │ 76 - │ │ │ } │ 77 - │ │ │ │ 78 - │ │<─session_token───────│ │ 79 - │ │ │ │ 80 - │ [Store session] │ │ │ 81 - │<─────────────────────│ │ │ 82 - │ ~/.atcr/ │ │ │ 83 - │ session.json │ │ │ 84 - │ │ │ │ 85 - │ ✓ Authorization │ │ │ 86 - │ complete! │ │ │ 87 - │ │ │ │ 88 - ``` 89 - 90 - ### Docker Push Flow (Every Push) 91 - 92 - ``` 93 - ┌────────┐ ┌──────────┐ ┌─────────┐ ┌─────┐ 94 - │ Docker │ │ Cred │ │ AppView │ │ PDS │ 95 - │ │ │ Helper │ │ │ │ │ 96 - └───┬────┘ └────┬─────┘ └────┬────┘ └──┬──┘ 97 - │ │ │ │ 98 - │ docker push │ │ │ 99 - │──────────────>│ │ │ 100 - │ │ │ │ 101 - │ │ GET /auth/exchange │ 102 - │ │ Authorization: Bearer │ 103 - │ │ <session_token> │ 104 - │ │──────────────>│ │ 105 - │ │ │ │ 106 - │ │ │ [Validate │ 107 - │ │ │ session] │ 108 - │ │ │ │ 109 - │ │ │ [Issue JWT] │ 110 - │ │ │ │ 111 - │ │<──registry_jwt─│ │ 112 - │ │ │ │ 113 - │<─registry_jwt─│ │ │ 114 - │ │ │ │ 115 - │ PUT /v2/.../manifests/... 
│ │ 116 - │ Authorization: Bearer │ │ 117 - │ <registry_jwt> │ │ 118 - │──────────────────────────────>│ │ 119 - │ │ │ 120 - │ │ [Validate │ 121 - │ │ JWT] │ 122 - │ │ │ 123 - │ │ [Get fresh │ 124 - │ │ access │ 125 - │ │ token] │ 126 - │ │ │ 127 - │ │ POST /token │ 128 - │ │ (refresh) │ 129 - │ │ + DPoP │ 130 - │ │────────────>│ 131 - │ │ │ 132 - │ │<access_token│ 133 - │ │ │ 134 - │ │ PUT record │ 135 - │ │ (manifest) │ 136 - │ │ + DPoP │ 137 - │ │────────────>│ 138 - │ │ │ 139 - │ │<──201 OK────│ 140 - │ │ │ 141 - │<──────────201 OK──────────────│ │ 142 - │ │ │ 143 - ``` 144 - 145 - ## Components 146 - 147 - ### 1. OAuth Authorization Server (AppView) 148 - 149 - **File**: `pkg/auth/oauth/server.go` 150 - 151 - **Endpoints**: 152 - 153 - #### `GET /auth/oauth/authorize` 154 - 155 - Initiates OAuth flow for a user. 156 - 157 - **Query Parameters**: 158 - - `handle` (required): User's ATProto handle (e.g., `evan.jarrett.net`) 159 - 160 - **Flow**: 161 - 1. Resolve handle → DID → PDS endpoint 162 - 2. Discover PDS OAuth metadata 163 - 3. Generate state + PKCE verifier 164 - 4. Create PAR request to PDS 165 - 5. Redirect user to PDS authorization endpoint 166 - 167 - **Response**: `302 Redirect` to PDS authorization page 168 - 169 - #### `GET /auth/oauth/callback` 170 - 171 - Receives OAuth callback from PDS. 172 - 173 - **Query Parameters**: 174 - - `code`: Authorization code 175 - - `state`: State for CSRF protection 176 - 177 - **Flow**: 178 - 1. Validate state 179 - 2. Exchange code for tokens (POST to PDS token endpoint) 180 - 3. Use AppView's DPoP key for the exchange 181 - 4. Store refresh token + DPoP key for user's DID 182 - 5. Generate AppView session token 183 - 6. Redirect to success page with session token 184 - 185 - **Response**: HTML page with session token (user copies to credential helper) 186 - 187 - ### 2. 
Refresh Token Storage 188 - 189 - **File**: `pkg/auth/oauth/storage.go` 190 - 191 - **Storage Format**: 192 - 193 - ```json 194 - { 195 - "refresh_tokens": { 196 - "did:plc:abc123": { 197 - "refresh_token": "...", 198 - "dpop_key_pem": "-----BEGIN EC PRIVATE KEY-----\n...", 199 - "pds_endpoint": "https://bsky.social", 200 - "handle": "evan.jarrett.net", 201 - "created_at": "2025-10-04T...", 202 - "last_refreshed": "2025-10-04T..." 203 - } 204 - } 205 - } 206 - ``` 207 - 208 - **Location**: 209 - - Development: `~/.atcr/appview-tokens.json` 210 - - Production: Encrypted database or secret manager 211 - 212 - **Security**: 213 - - File permissions: `0600` (owner read/write only) 214 - - Consider encrypting DPoP keys at rest 215 - - Rotate refresh tokens periodically 216 - 217 - ### 3. Token Refresher 218 - 219 - **File**: `pkg/auth/oauth/refresher.go` 220 - 221 - **Interface**: 222 - 223 - ```go 224 - type Refresher interface { 225 - // GetAccessToken gets a fresh access token for a DID 226 - // Returns cached token if still valid, otherwise refreshes 227 - GetAccessToken(ctx context.Context, did string) (token string, dpopKey *ecdsa.PrivateKey, err error) 228 - 229 - // RefreshToken forces a token refresh 230 - RefreshToken(ctx context.Context, did string) error 231 - 232 - // RevokeToken removes stored refresh token 233 - RevokeToken(did string) error 234 - } 235 - ``` 236 - 237 - **Caching Strategy**: 238 - - Access tokens cached for 14 minutes (expire at 15min) 239 - - Refresh tokens stored persistently 240 - - Cache key: `did → {access_token, dpop_key, expires_at}` 241 - 242 - ### 4. Session Management 243 - 244 - **File**: `pkg/auth/session/handler.go` 245 - 246 - **Session Token Format**: 247 - ``` 248 - Base64(JSON({ 249 - "did": "did:plc:abc123", 250 - "handle": "evan.jarrett.net", 251 - "issued_at": "2025-10-04T...", 252 - "expires_at": "2025-11-03T..." 
// 30 days 253 - })).HMAC-SHA256(secret) 254 - ``` 255 - 256 - **Storage**: Stateless (validated by HMAC signature) 257 - 258 - **Endpoints**: 259 - 260 - #### `GET /auth/session/validate` 261 - 262 - Validates a session token. 263 - 264 - **Headers**: 265 - - `Authorization: Bearer <session_token>` 266 - 267 - **Response**: 268 - ```json 269 - { 270 - "did": "did:plc:abc123", 271 - "handle": "evan.jarrett.net", 272 - "valid": true 273 - } 274 - ``` 275 - 276 - ### 5. Updated Exchange Handler 277 - 278 - **File**: `pkg/auth/exchange/handler.go` 279 - 280 - **Changes**: 281 - - Accept session token instead of OAuth token 282 - - Validate session token → extract DID 283 - - Issue registry JWT with DID 284 - - Remove PDS token validation 285 - 286 - **Request**: 287 - ``` 288 - POST /auth/exchange 289 - Authorization: Bearer <session_token> 290 - 291 - { 292 - "scope": ["repository:*:pull,push"] 293 - } 294 - ``` 295 - 296 - **Response**: 297 - ```json 298 - { 299 - "token": "<registry-jwt>", 300 - "expires_in": 900 301 - } 302 - ``` 303 - 304 - ### 6. Credential Helper Updates 305 - 306 - **File**: `cmd/credential-helper/main.go` 307 - 308 - **Changes**: 309 - 310 - 1. **Configure command**: 311 - - Open browser to AppView: `http://127.0.0.1:5000/auth/oauth/authorize?handle=...` 312 - - User authorizes on PDS 313 - - AppView displays session token 314 - - User copies session token to helper 315 - - Helper stores session token 316 - 317 - 2. **Get command**: 318 - - Load session token from `~/.atcr/session.json` 319 - - Call `/auth/exchange` with session token 320 - - Return registry JWT to Docker 321 - 322 - 3. 
**Storage format**: 323 - ```json 324 - { 325 - "session_token": "...", 326 - "handle": "evan.jarrett.net", 327 - "appview_url": "http://127.0.0.1:5000" 328 - } 329 - ``` 330 - 331 - **Removed**: 332 - - DPoP key generation 333 - - OAuth client logic 334 - - Refresh token handling 335 - 336 - ## Security Considerations 337 - 338 - ### AppView as Trusted Component 339 - 340 - The AppView becomes a **trusted intermediary** that: 341 - - Stores refresh tokens for users 342 - - Acts on users' behalf to write manifests 343 - - Issues registry authentication tokens 344 - 345 - **Trust model**: 346 - - Users must trust the AppView operator 347 - - Similar to trusting a Docker registry operator 348 - - AppView has write access to manifests (not profile data) 349 - 350 - ### Scope Limitations 351 - 352 - AppView OAuth tokens are requested with minimal scopes: 353 - - `atproto` - Basic ATProto operations 354 - - Only needs: `com.atproto.repo.putRecord`, `com.atproto.repo.getRecord` 355 - - Does NOT need: profile updates, social graph access, etc. 356 - 357 - ### Token Security 358 - 359 - **Refresh Tokens**: 360 - - Stored encrypted at rest 361 - - File permissions: 0600 362 - - Rotated periodically (when used) 363 - - Can be revoked by user on PDS 364 - 365 - **Session Tokens**: 366 - - 30-day expiry 367 - - HMAC-signed (stateless validation) 368 - - Can be revoked by clearing storage 369 - 370 - **Access Tokens**: 371 - - Cached in-memory only 372 - - 15-minute expiry 373 - - Never stored persistently 374 - 375 - ### Audit Trail 376 - 377 - AppView should log: 378 - - OAuth authorizations (DID, timestamp) 379 - - Token refreshes (DID, timestamp) 380 - - Manifest writes (DID, repository, timestamp) 381 - 382 - ## Migration from Current OAuth 383 - 384 - Users currently using `docker-credential-atcr` with direct PDS OAuth will need to: 385 - 386 - 1. Run `docker-credential-atcr configure` again 387 - 2. Authorize AppView (new OAuth flow) 388 - 3. 
Old PDS tokens are no longer used 389 - 390 - ## Alternative: Bring Your Own AppView 391 - 392 - Users who don't trust a shared AppView can: 393 - 1. Run their own ATCR AppView instance 394 - 2. Configure credential helper to point at their AppView 395 - 3. Their AppView stores their refresh tokens locally 396 - 397 - ## Future Enhancements 398 - 399 - ### Multi-AppView Support 400 - 401 - Allow users to configure multiple AppViews: 402 - ```json 403 - { 404 - "appviews": { 405 - "default": "https://atcr.io", 406 - "personal": "http://localhost:5000" 407 - }, 408 - "sessions": { 409 - "https://atcr.io": {"session_token": "...", "handle": "..."}, 410 - "http://localhost:5000": {"session_token": "...", "handle": "..."} 411 - } 412 - } 413 - ``` 414 - 415 - ### Refresh Token Rotation 416 - 417 - Implement automatic refresh token rotation per OAuth best practices: 418 - - PDS issues new refresh token with each use 419 - - AppView updates stored token 420 - - Old refresh token invalidated 421 - 422 - ### Revocation UI 423 - 424 - Add web UI for users to: 425 - - View active sessions 426 - - Revoke AppView access 427 - - See audit log of manifest writes 428 - 429 - ## References 430 - 431 - - [ATProto OAuth Specification](https://atproto.com/specs/oauth) 432 - - [RFC 6749: OAuth 2.0](https://datatracker.ietf.org/doc/html/rfc6749) 433 - - [RFC 9449: DPoP](https://datatracker.ietf.org/doc/html/rfc9449) 434 - - [Docker Credential Helpers](https://github.com/docker/docker-credential-helpers)
-460
docs/HOLD_MULTIPART.md
··· 1 - # Hold Service Multipart Upload Architecture 2 - 3 - ## Overview 4 - 5 - The hold service supports multipart uploads through two modes: 6 - 1. **S3Native** - Uses S3's native multipart API with presigned URLs (optimal) 7 - 2. **Buffered** - Buffers parts in hold service memory, assembles on completion (fallback) 8 - 9 - This dual-mode approach enables the hold service to work with: 10 - - S3-compatible storage with presigned URL support (S3, Storj, MinIO, etc.) 11 - - S3-compatible storage WITHOUT presigned URL support 12 - - Filesystem storage 13 - - Any storage driver supported by distribution 14 - 15 - ## Current State 16 - 17 - ### What Works ✅ 18 - - **S3 Native Mode with presigned URLs**: Fully working! Direct uploads to S3 via presigned URLs 19 - - **Buffered mode with S3**: Tested and working with `DISABLE_PRESIGNED_URLS=true` 20 - - **Filesystem storage**: Tested and working! Buffered mode with filesystem driver 21 - - **AppView multipart client**: Implements chunked uploads via multipart API 22 - - **MultipartManager**: Session tracking, automatic cleanup, thread-safe operations 23 - - **Automatic fallback**: Falls back to buffered mode when S3 unavailable or disabled 24 - - **ETag normalization**: Handles quoted/unquoted ETags from S3 25 - - **Route handler**: `/multipart-parts/{uploadID}/{partNumber}` endpoint added and tested 26 - 27 - ### All Implementation Complete! 🎉 28 - All three multipart upload modes are fully implemented, tested, and working in production. 29 - 30 - ### Bugs Fixed 🔧 31 - - **Missing S3 parts in complete**: For S3Native mode, parts uploaded directly to S3 weren't being recorded. Fixed by storing parts from request in `HandleCompleteMultipart` before calling `CompleteMultipartUploadWithManager`. 32 - - **Malformed XML error from S3**: S3 requires ETags to be quoted in CompleteMultipartUpload XML. Added `normalizeETag()` function to ensure quotes are present. 
33 - - **Route missing**: `/multipart-parts/{uploadID}/{partNumber}` not registered in cmd/hold/main.go. Fixed by adding route handler with path parsing. 34 - - **MultipartMgr access**: Field was private, preventing route handler access. Fixed by exporting as `MultipartMgr`. 35 - - **DISABLE_PRESIGNED_URLS not logged**: `initS3Client()` didn't check the flag before initializing. Fixed with early return check and proper logging. 36 - 37 - ## Architecture 38 - 39 - ### Three Modes of Operation 40 - 41 - #### Mode 1: S3 Native Multipart ✅ WORKING 42 - ``` 43 - Docker → AppView → Hold → S3 (presigned URLs) 44 - 45 - Returns presigned URL 46 - 47 - Docker ──────────→ S3 (direct upload) 48 - ``` 49 - 50 - **Flow:** 51 - 1. AppView: `POST /start-multipart` → Hold starts S3 multipart, returns uploadID 52 - 2. AppView: `POST /part-presigned-url` → Hold returns S3 presigned URL 53 - 3. Docker → S3: Direct upload via presigned URL 54 - 4. AppView: `POST /complete-multipart` → Hold calls S3 CompleteMultipartUpload 55 - 56 - **Advantages:** 57 - - No data flows through hold service 58 - - Minimal bandwidth usage 59 - - Fast uploads 60 - 61 - #### Mode 2: S3 Proxy Mode (Buffered) ✅ WORKING 62 - ``` 63 - Docker → AppView → Hold → S3 (via driver) 64 - 65 - Buffers & proxies 66 - 67 - S3 68 - ``` 69 - 70 - **Flow:** 71 - 1. AppView: `POST /start-multipart` → Hold creates buffered session 72 - 2. AppView: `POST /part-presigned-url` → Hold returns proxy URL 73 - 3. Docker → Hold: `PUT /multipart-parts/{uploadID}/{part}` → Hold buffers 74 - 4. 
AppView: `POST /complete-multipart` → Hold uploads to S3 via driver 75 - 76 - **Use Cases:** 77 - - S3 provider doesn't support presigned URLs 78 - - S3 API fails to generate presigned URL 79 - - Fallback from Mode 1 80 - 81 - #### Mode 3: Filesystem Mode ✅ WORKING 82 - ``` 83 - Docker → AppView → Hold (filesystem driver) 84 - 85 - Buffers & writes 86 - 87 - Local filesystem 88 - ``` 89 - 90 - **Flow:** 91 - Same as Mode 2, but writes to filesystem driver instead of S3 driver. 92 - 93 - **Use Cases:** 94 - - Development/testing with local filesystem 95 - - Small deployments without S3 96 - - Air-gapped environments 97 - 98 - ## Implementation: pkg/hold/multipart.go 99 - 100 - ### Core Components 101 - 102 - #### MultipartManager 103 - ```go 104 - type MultipartManager struct { 105 - sessions map[string]*MultipartSession 106 - mu sync.RWMutex 107 - } 108 - ``` 109 - 110 - **Responsibilities:** 111 - - Track active multipart sessions 112 - - Clean up abandoned uploads (>24h inactive) 113 - - Thread-safe session access 114 - 115 - #### MultipartSession 116 - ```go 117 - type MultipartSession struct { 118 - UploadID string // Unique ID for this upload 119 - Digest string // Target blob digest 120 - Mode MultipartMode // S3Native or Buffered 121 - S3UploadID string // S3 upload ID (S3Native only) 122 - Parts map[int]*MultipartPart // Buffered parts (Buffered only) 123 - CreatedAt time.Time 124 - LastActivity time.Time 125 - } 126 - ``` 127 - 128 - **State Tracking:** 129 - - S3Native: Tracks S3 upload ID and part ETags 130 - - Buffered: Stores part data in memory 131 - 132 - #### MultipartPart 133 - ```go 134 - type MultipartPart struct { 135 - PartNumber int // Part number (1-indexed) 136 - Data []byte // Part data (Buffered mode only) 137 - ETag string // S3 ETag or computed hash 138 - Size int64 139 - } 140 - ``` 141 - 142 - ### Key Methods 143 - 144 - #### StartMultipartUploadWithManager 145 - ```go 146 - func (s *HoldService) StartMultipartUploadWithManager( 147 - 
ctx context.Context, 148 - digest string, 149 - manager *MultipartManager, 150 - ) (string, MultipartMode, error) 151 - ``` 152 - 153 - **Logic:** 154 - 1. Try S3 native multipart via `s.startMultipartUpload()` 155 - 2. If successful → Create S3Native session 156 - 3. If fails or no S3 client → Create Buffered session 157 - 4. Return uploadID and mode 158 - 159 - #### GetPartUploadURL 160 - ```go 161 - func (s *HoldService) GetPartUploadURL( 162 - ctx context.Context, 163 - session *MultipartSession, 164 - partNumber int, 165 - did string, 166 - ) (string, error) 167 - ``` 168 - 169 - **Logic:** 170 - - S3Native mode: Generate S3 presigned URL via `s.getPartPresignedURL()` 171 - - Buffered mode: Return proxy endpoint `/multipart-parts/{uploadID}/{part}` 172 - 173 - #### CompleteMultipartUploadWithManager 174 - ```go 175 - func (s *HoldService) CompleteMultipartUploadWithManager( 176 - ctx context.Context, 177 - session *MultipartSession, 178 - manager *MultipartManager, 179 - ) error 180 - ``` 181 - 182 - **Logic:** 183 - - S3Native: Call `s.completeMultipartUpload()` with S3 API 184 - - Buffered: Assemble parts in order, write via storage driver 185 - 186 - #### HandleMultipartPartUpload (New Endpoint) 187 - ```go 188 - func (s *HoldService) HandleMultipartPartUpload( 189 - w http.ResponseWriter, 190 - r *http.Request, 191 - uploadID string, 192 - partNumber int, 193 - did string, 194 - manager *MultipartManager, 195 - ) 196 - ``` 197 - 198 - **New HTTP endpoint:** `PUT /multipart-parts/{uploadID}/{partNumber}` 199 - 200 - **Purpose:** Receive part uploads in Buffered mode 201 - 202 - **Logic:** 203 - 1. Validate session exists and is in Buffered mode 204 - 2. Authorize write access 205 - 3. Read part data from request body 206 - 4. Store in session with computed ETag (SHA256) 207 - 5. 
Return ETag in response header 208 - 209 - ## Integration Plan 210 - 211 - ### Phase 1: Migrate to pkg/hold (COMPLETE) 212 - - [x] Extract code from cmd/hold/main.go to pkg/hold/ 213 - - [x] Create isolated multipart.go implementation 214 - - [x] Update cmd/hold/main.go to import pkg/hold 215 - - [x] Test existing functionality works 216 - 217 - ### Phase 2: Add Buffered Mode Support (COMPLETE ✅) 218 - - [x] Add MultipartManager to HoldService 219 - - [x] Update handlers to use `*WithManager` methods 220 - - [x] Add DISABLE_PRESIGNED_URLS environment variable for testing 221 - - [x] Implement presigned URL disable checks in all methods 222 - - [x] **Fixed: Record S3 parts from request in HandleCompleteMultipart** 223 - - [x] **Fixed: ETag normalization (add quotes for S3 XML)** 224 - - [x] **Test S3 native mode with presigned URLs** ✅ WORKING 225 - - [x] **Add route in cmd/hold/main.go** ✅ COMPLETE 226 - - [x] **Export MultipartMgr field for route handler access** ✅ COMPLETE 227 - - [x] **Test DISABLE_PRESIGNED_URLS=true with S3 storage** ✅ WORKING 228 - - [x] **Test filesystem storage with buffered multipart** ✅ WORKING 229 - 230 - ### Phase 3: Update AppView 231 - - [ ] Detect hold capabilities (presigned vs proxy) 232 - - [ ] Fallback to buffered mode when presigned fails 233 - - [ ] Handle `/multipart-parts/` proxy URLs 234 - 235 - ### Phase 4: Capability Discovery 236 - - [ ] Add capability endpoint: `GET /capabilities` 237 - - [ ] Return: `{"multipart": "native|buffered|both", "storage": "s3|filesystem"}` 238 - - [ ] AppView uses capabilities to choose upload strategy 239 - 240 - ## Testing Strategy 241 - 242 - ### Unit Tests 243 - - [ ] MultipartManager session lifecycle 244 - - [ ] Part buffering and assembly 245 - - [ ] Concurrent part uploads (thread safety) 246 - - [ ] Session cleanup (expired uploads) 247 - 248 - ### Integration Tests 249 - 250 - **S3 Native Mode:** 251 - - [x] Start multipart → get presigned URLs → upload parts → complete ✅ WORKING 252 
- - [x] Verify no data flows through hold service (only ~1KB API calls) 253 - - [ ] Test abort cleanup 254 - 255 - **Buffered Mode (S3 with DISABLE_PRESIGNED_URLS):** 256 - - [x] Start multipart → get proxy URLs → upload parts → complete ✅ WORKING 257 - - [x] Verify parts assembled correctly 258 - - [ ] Test missing part detection 259 - - [ ] Test abort cleanup 260 - 261 - **Buffered Mode (Filesystem):** 262 - - [x] Start multipart → get proxy URLs → upload parts → complete ✅ WORKING 263 - - [x] Verify parts assembled correctly ✅ WORKING 264 - - [x] Verify blobs written to filesystem ✅ WORKING 265 - - [ ] Test missing part detection 266 - - [ ] Test abort cleanup 267 - 268 - ### Load Tests 269 - - [ ] Concurrent multipart uploads (multiple sessions) 270 - - [ ] Large blobs (100MB+, many parts) 271 - - [ ] Memory usage with many buffered parts 272 - 273 - ## Performance Considerations 274 - 275 - ### Memory Usage (Buffered Mode) 276 - - Parts stored in memory until completion 277 - - Docker typically uses 5MB chunks (S3 minimum) 278 - - 100MB image = ~20 parts = ~100MB RAM during upload 279 - - Multiple concurrent uploads multiply memory usage 280 - 281 - **Mitigation:** 282 - - Session cleanup (24h timeout) 283 - - Consider disk-backed buffering for large parts (future optimization) 284 - - Monitor memory usage and set limits 285 - 286 - ### Network Bandwidth 287 - - S3Native: Minimal (only API calls) 288 - - Buffered: Full blob data flows through hold service 289 - - Filesystem: Always buffered (no presigned URL option) 290 - 291 - ## Configuration 292 - 293 - ### Environment Variables 294 - 295 - **Current (S3 only):** 296 - ```bash 297 - STORAGE_DRIVER=s3 298 - S3_BUCKET=my-bucket 299 - S3_ENDPOINT=https://s3.amazonaws.com 300 - AWS_ACCESS_KEY_ID=... 301 - AWS_SECRET_ACCESS_KEY=... 
302 - ``` 303 - 304 - **Filesystem:** 305 - ```bash 306 - STORAGE_DRIVER=filesystem 307 - STORAGE_ROOT_DIR=/var/lib/atcr/hold 308 - ``` 309 - 310 - ### Automatic Mode Selection 311 - No configuration needed - hold service automatically: 312 - 1. Tries S3 native multipart if S3 client exists 313 - 2. Falls back to buffered mode if S3 unavailable or fails 314 - 3. Always uses buffered mode for filesystem driver 315 - 316 - ## Security Considerations 317 - 318 - ### Authorization 319 - - All multipart operations require write authorization 320 - - Buffered mode: Check auth on every part upload 321 - - S3Native: Auth only on start/complete (presigned URLs have embedded auth) 322 - 323 - ### Resource Limits 324 - - Max upload size: Controlled by storage backend 325 - - Max concurrent uploads: Limited by memory 326 - - Session timeout: 24 hours (configurable) 327 - 328 - ### Attack Vectors 329 - - **Memory exhaustion**: Attacker uploads many large parts 330 - - Mitigation: Session limits, cleanup, auth 331 - - **Incomplete uploads**: Attacker starts but never completes 332 - - Mitigation: 24h timeout, cleanup goroutine 333 - - **Part flooding**: Upload many tiny parts 334 - - Mitigation: S3 has 10,000 part limit, could add to buffered mode 335 - 336 - ## Future Enhancements 337 - 338 - ### Disk-Backed Buffering 339 - Instead of memory, buffer parts to temporary disk location: 340 - - Reduces memory pressure 341 - - Supports larger uploads 342 - - Requires cleanup on completion/abort 343 - 344 - ### Parallel Part Assembly 345 - For large uploads, assemble parts in parallel: 346 - - Stream parts to writer as they arrive 347 - - Reduce memory footprint 348 - - Faster completion 349 - 350 - ### Chunked Completion 351 - For very large assembled blobs: 352 - - Stream to storage driver in chunks 353 - - Avoid loading entire blob in memory 354 - - Use `io.Copy()` with buffer 355 - 356 - ### Multi-Backend Support 357 - - Azure Blob Storage multipart 358 - - Google Cloud Storage 
resumable uploads 359 - - Backblaze B2 large file API 360 - 361 - ## Implementation Complete ✅ 362 - 363 - The buffered multipart mode is fully implemented with the following components: 364 - 365 - **Route Handler** (`cmd/hold/main.go:47-73`): 366 - - Endpoint: `PUT /multipart-parts/{uploadID}/{partNumber}` 367 - - Parses URL path to extract uploadID and partNumber 368 - - Delegates to `service.HandleMultipartPartUpload()` 369 - 370 - **Exported Manager** (`pkg/hold/service.go:20`): 371 - - Field `MultipartMgr` is now exported for route handler access 372 - - All handlers updated to use `s.MultipartMgr` 373 - 374 - **Configuration Check** (`pkg/hold/s3.go:20-25`): 375 - - `initS3Client()` checks `DISABLE_PRESIGNED_URLS` flag before initializing 376 - - Logs clear message when presigned URLs are disabled 377 - - Prevents misleading "S3 presigned URLs enabled" message 378 - 379 - ## Testing Multipart Modes 380 - 381 - ### Test 1: S3 Native Mode (presigned URLs) ✅ TESTED 382 - ```bash 383 - export STORAGE_DRIVER=s3 384 - export S3_BUCKET=your-bucket 385 - export AWS_ACCESS_KEY_ID=... 386 - export AWS_SECRET_ACCESS_KEY=... 387 - # Do NOT set DISABLE_PRESIGNED_URLS 388 - 389 - # Start hold service 390 - ./bin/atcr-hold 391 - 392 - # Push an image 393 - docker push atcr.io/yourdid/test:latest 394 - 395 - # Expected logs: 396 - # "✅ S3 presigned URLs enabled" 397 - # "Started S3 native multipart: uploadID=... s3UploadID=..." 398 - # "Completed multipart upload: digest=... uploadID=... parts=..." 399 - ``` 400 - 401 - **Status**: ✅ Working - Direct uploads to S3, minimal bandwidth through hold service 402 - 403 - ### Test 2: Buffered Mode with S3 (forced proxy) ✅ TESTED 404 - ```bash 405 - export STORAGE_DRIVER=s3 406 - export S3_BUCKET=your-bucket 407 - export AWS_ACCESS_KEY_ID=... 408 - export AWS_SECRET_ACCESS_KEY=... 
409 - export DISABLE_PRESIGNED_URLS=true # Force buffered mode 410 - 411 - # Start hold service 412 - ./bin/atcr-hold 413 - 414 - # Push an image 415 - docker push atcr.io/yourdid/test:latest 416 - 417 - # Expected logs: 418 - # "⚠️ S3 presigned URLs DISABLED by config (DISABLE_PRESIGNED_URLS=true)" 419 - # "Presigned URLs disabled (DISABLE_PRESIGNED_URLS=true), using buffered mode" 420 - # "Stored part: uploadID=... part=1 size=..." 421 - # "Assembled buffered parts: uploadID=... parts=... totalSize=..." 422 - # "Completed buffered multipart: uploadID=... size=... written=..." 423 - ``` 424 - 425 - **Status**: ✅ Working - Parts buffered in hold service memory, assembled and written to S3 via driver 426 - 427 - ### Test 3: Filesystem Mode (always buffered) ✅ TESTED 428 - ```bash 429 - export STORAGE_DRIVER=filesystem 430 - export STORAGE_ROOT_DIR=/tmp/atcr-hold-test 431 - # DISABLE_PRESIGNED_URLS not needed (filesystem never has presigned URLs) 432 - 433 - # Start hold service 434 - ./bin/atcr-hold 435 - 436 - # Push an image 437 - docker push atcr.io/yourdid/test:latest 438 - 439 - # Expected logs: 440 - # "Storage driver is filesystem (not S3), presigned URLs disabled" 441 - # "Started buffered multipart: uploadID=..." 442 - # "Stored part: uploadID=... part=1 size=..." 443 - # "Assembled buffered parts: uploadID=... parts=... totalSize=..." 444 - # "Completed buffered multipart: uploadID=... size=... written=..." 445 - 446 - # Verify blobs written under the configured storage root: 447 - ls -lh $STORAGE_ROOT_DIR/docker/registry/v2/blobs/sha256/ 448 - # Or from outside a container whose storage root is /var/lib/atcr/hold: 449 - docker exec atcr-hold ls -lh /var/lib/atcr/hold/docker/registry/v2/blobs/sha256/ 450 - ``` 451 - 452 - **Status**: ✅ Working - Parts buffered in memory, assembled, and written to filesystem via driver 453 - 454 - **Note**: Initial HEAD requests will show "Path not found" errors - this is normal! Docker checks if blobs exist before uploading. The errors occur for blobs that haven't been uploaded yet. 
After upload, subsequent HEAD checks succeed. 455 - 456 - ## References 457 - 458 - - S3 Multipart Upload API: https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html 459 - - Distribution Storage Driver Interface: https://github.com/distribution/distribution/blob/main/registry/storage/driver/storagedriver.go 460 - - OCI Distribution Spec (Blob Upload): https://github.com/opencontainers/distribution-spec/blob/main/spec.md#pushing-a-blob-in-chunks
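The two core buffered-mode fixes above — quoting ETags for S3's CompleteMultipartUpload XML, and assembling parts in order — can be sketched as follows. The `normalizeETag` name comes from the doc; `assembleParts` and its contiguity check are illustrative, assuming 1-indexed part numbers as described.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
	"strings"
)

// normalizeETag mirrors the fix described above: S3 rejects unquoted ETags
// in CompleteMultipartUpload XML, so ensure the value is wrapped in quotes
// whether or not the upstream response quoted it.
func normalizeETag(etag string) string {
	return `"` + strings.Trim(etag, `"`) + `"`
}

// assembleParts concatenates buffered parts in ascending part-number order,
// as the buffered completion path does, and rejects gaps in the sequence
// (the "missing part detection" case in the test matrix).
func assembleParts(parts map[int][]byte) ([]byte, error) {
	nums := make([]int, 0, len(parts))
	for n := range parts {
		nums = append(nums, n)
	}
	sort.Ints(nums)
	var out bytes.Buffer
	for i, n := range nums {
		if n != i+1 { // parts are 1-indexed and must be contiguous
			return nil, fmt.Errorf("missing part %d", i+1)
		}
		out.Write(parts[n])
	}
	return out.Bytes(), nil
}

func main() {
	fmt.Println(normalizeETag("abc123")) // "abc123"
	blob, _ := assembleParts(map[int][]byte{2: []byte("world"), 1: []byte("hello ")})
	fmt.Println(string(blob)) // hello world
}
```

Trimming before re-quoting makes the function idempotent, so it is safe to apply to ETags regardless of whether the S3-compatible backend returned them quoted.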
-570
docs/MULTIPART.md
··· 1 - S3 Multipart Upload Implementation Plan 2 - 3 - Problem Summary 4 - 5 - Current implementation uses a single presigned URL with a pipe for chunked uploads (PATCH). This causes: 6 - - Docker PATCH requests block waiting for pipe writes 7 - - S3 upload happens in background via single presigned URL 8 - - Docker times out → "client disconnected during blob PATCH" 9 - - Root cause: Single presigned URLs don't support OCI's chunked upload protocol 10 - 11 - Solution: S3 Multipart Upload API 12 - 13 - Implement proper S3 multipart upload to support Docker's chunked PATCH operations: 14 - - Each PATCH → separate S3 part upload with its own presigned URL 15 - - On Commit → complete multipart upload 16 - - No buffering, no pipes, no blocking 17 - 18 - --- 19 - Architecture Changes 20 - 21 - Current (Broken) Flow 22 - 23 - POST /blobs/uploads/ → Create() → Single presigned URL to temp location 24 - PATCH → Write to pipe → [blocks] → Background goroutine uploads via single URL 25 - PATCH → [blocks on pipe] → Docker timeout → disconnect ❌ 26 - 27 - New (Multipart) Flow 28 - 29 - POST /blobs/uploads/ → Create() → Initiate multipart upload, get upload ID 30 - PATCH #1 → Get presigned URL for part 1 → Upload part 1 to S3 → Store ETag 31 - PATCH #2 → Get presigned URL for part 2 → Upload part 2 to S3 → Store ETag 32 - PUT (commit) → Complete multipart upload with ETags → Done ✅ 33 - 34 - --- 35 - Implementation Details 36 - 37 - 1. 
Hold Service: Add Multipart Upload Endpoints 38 - 39 - File: cmd/hold/main.go 40 - 41 - New Request/Response Types 42 - 43 - // StartMultipartUploadRequest initiates a multipart upload 44 - type StartMultipartUploadRequest struct { 45 - DID string `json:"did"` 46 - Digest string `json:"digest"` 47 - } 48 - 49 - type StartMultipartUploadResponse struct { 50 - UploadID string `json:"upload_id"` 51 - ExpiresAt time.Time `json:"expires_at"` 52 - } 53 - 54 - // GetPartURLRequest requests a presigned URL for a specific part 55 - type GetPartURLRequest struct { 56 - DID string `json:"did"` 57 - Digest string `json:"digest"` 58 - UploadID string `json:"upload_id"` 59 - PartNumber int `json:"part_number"` 60 - } 61 - 62 - type GetPartURLResponse struct { 63 - URL string `json:"url"` 64 - ExpiresAt time.Time `json:"expires_at"` 65 - } 66 - 67 - // CompleteMultipartRequest completes a multipart upload 68 - type CompleteMultipartRequest struct { 69 - DID string `json:"did"` 70 - Digest string `json:"digest"` 71 - UploadID string `json:"upload_id"` 72 - Parts []CompletedPart `json:"parts"` 73 - } 74 - 75 - type CompletedPart struct { 76 - PartNumber int `json:"part_number"` 77 - ETag string `json:"etag"` 78 - } 79 - 80 - // AbortMultipartRequest aborts an in-progress upload 81 - type AbortMultipartRequest struct { 82 - DID string `json:"did"` 83 - Digest string `json:"digest"` 84 - UploadID string `json:"upload_id"` 85 - } 86 - 87 - New Endpoints 88 - 89 - POST /start-multipart 90 - func (s *HoldService) HandleStartMultipart(w http.ResponseWriter, r *http.Request) { 91 - // Validate DID authorization for WRITE 92 - // Build S3 key from digest 93 - // Call s3.CreateMultipartUploadRequest() 94 - // Generate presigned URL if needed, or return upload ID 95 - // Return upload ID to client 96 - } 97 - 98 - POST /part-presigned-url 99 - func (s *HoldService) HandleGetPartURL(w http.ResponseWriter, r *http.Request) { 100 - // Validate DID authorization for WRITE 101 - // Build S3 key 
from digest 102 - // Call s3.UploadPartRequest() with part number and upload ID 103 - // Generate presigned URL 104 - // Return presigned URL for this specific part 105 - } 106 - 107 - POST /complete-multipart 108 - func (s *HoldService) HandleCompleteMultipart(w http.ResponseWriter, r *http.Request) { 109 - // Validate DID authorization for WRITE 110 - // Build S3 key from digest 111 - // Prepare CompletedPart array with part numbers and ETags 112 - // Call s3.CompleteMultipartUpload() 113 - // Return success 114 - } 115 - 116 - POST /abort-multipart (for cleanup) 117 - func (s *HoldService) HandleAbortMultipart(w http.ResponseWriter, r *http.Request) { 118 - // Validate DID authorization for WRITE 119 - // Call s3.AbortMultipartUpload() 120 - // Return success 121 - } 122 - 123 - S3 Implementation 124 - 125 - // startMultipartUpload initiates a multipart upload and returns upload ID 126 - func (s *HoldService) startMultipartUpload(ctx context.Context, digest string) (string, error) { 127 - if s.s3Client == nil { 128 - return "", fmt.Errorf("S3 not configured") 129 - } 130 - 131 - path := blobPath(digest) 132 - s3Key := strings.TrimPrefix(path, "/") 133 - if s.s3PathPrefix != "" { 134 - s3Key = s.s3PathPrefix + "/" + s3Key 135 - } 136 - 137 - result, err := s.s3Client.CreateMultipartUploadWithContext(ctx, &s3.CreateMultipartUploadInput{ 138 - Bucket: aws.String(s.bucket), 139 - Key: aws.String(s3Key), 140 - }) 141 - if err != nil { 142 - return "", err 143 - } 144 - 145 - return *result.UploadId, nil 146 - } 147 - 148 - // getPartPresignedURL generates presigned URL for a specific part 149 - func (s *HoldService) getPartPresignedURL(ctx context.Context, digest, uploadID string, partNumber int) (string, error) { 150 - if s.s3Client == nil { 151 - return "", fmt.Errorf("S3 not configured") 152 - } 153 - 154 - path := blobPath(digest) 155 - s3Key := strings.TrimPrefix(path, "/") 156 - if s.s3PathPrefix != "" { 157 - s3Key = s.s3PathPrefix + "/" + s3Key 158 - } 159 - 
160 - req, _ := s.s3Client.UploadPartRequest(&s3.UploadPartInput{ 161 - Bucket: aws.String(s.bucket), 162 - Key: aws.String(s3Key), 163 - UploadId: aws.String(uploadID), 164 - PartNumber: aws.Int64(int64(partNumber)), 165 - }) 166 - 167 - return req.Presign(15 * time.Minute) 168 - } 169 - 170 - // completeMultipartUpload finalizes the multipart upload 171 - func (s *HoldService) completeMultipartUpload(ctx context.Context, digest, uploadID string, parts []CompletedPart) error { 172 - if s.s3Client == nil { 173 - return fmt.Errorf("S3 not configured") 174 - } 175 - 176 - path := blobPath(digest) 177 - s3Key := strings.TrimPrefix(path, "/") 178 - if s.s3PathPrefix != "" { 179 - s3Key = s.s3PathPrefix + "/" + s3Key 180 - } 181 - 182 - // Convert to S3 CompletedPart format 183 - s3Parts := make([]*s3.CompletedPart, len(parts)) 184 - for i, p := range parts { 185 - s3Parts[i] = &s3.CompletedPart{ 186 - PartNumber: aws.Int64(int64(p.PartNumber)), 187 - ETag: aws.String(p.ETag), 188 - } 189 - } 190 - 191 - _, err := s.s3Client.CompleteMultipartUploadWithContext(ctx, &s3.CompleteMultipartUploadInput{ 192 - Bucket: aws.String(s.bucket), 193 - Key: aws.String(s3Key), 194 - UploadId: aws.String(uploadID), 195 - MultipartUpload: &s3.CompletedMultipartUpload{ 196 - Parts: s3Parts, 197 - }, 198 - }) 199 - 200 - return err 201 - } 202 - 203 - --- 204 - 2. 
AppView: Rewrite ProxyBlobStore for Multipart 205 - 206 - File: pkg/storage/proxy_blob_store.go 207 - 208 - Remove Current Implementation 209 - 210 - - Remove pipe-based streaming 211 - - Remove background goroutine with single presigned URL 212 - - Remove global upload tracking map 213 - 214 - New ProxyBlobWriter Structure 215 - 216 - type ProxyBlobWriter struct { 217 - store *ProxyBlobStore 218 - options distribution.CreateOptions 219 - uploadID string // S3 multipart upload ID 220 - parts []CompletedPart // Track uploaded parts with ETags 221 - partNumber int // Current part number (starts at 1) 222 - buffer *bytes.Buffer // Buffer for current part 223 - size int64 // Total bytes written 224 - closed bool 225 - id string // Distribution's upload ID (for state) 226 - startedAt time.Time 227 - finalDigest string // Set on Commit 228 - } 229 - 230 - type CompletedPart struct { 231 - PartNumber int `json:"part_number"` // tags must match the hold service's CompletedPart JSON fields 232 - ETag string `json:"etag"` 233 - } 234 - 235 - New Create() - Initiate Multipart Upload 236 - 237 - func (p *ProxyBlobStore) Create(ctx context.Context, options ...distribution.BlobCreateOption) (distribution.BlobWriter, error) { 238 - var opts distribution.CreateOptions 239 - for _, option := range options { 240 - if err := option.Apply(&opts); err != nil { 241 - return nil, err 242 - } 243 - } 244 - 245 - // Use temp digest for upload location 246 - writerID := fmt.Sprintf("upload-%d", time.Now().UnixNano()) 247 - tempDigest := digest.Digest(fmt.Sprintf("uploads/temp-%s", writerID)) 248 - 249 - // Start multipart upload via hold service 250 - uploadID, err := p.startMultipartUpload(ctx, tempDigest) 251 - if err != nil { 252 - return nil, fmt.Errorf("failed to start multipart upload: %w", err) 253 - } 254 - 255 - writer := &ProxyBlobWriter{ 256 - store: p, 257 - options: opts, 258 - uploadID: uploadID, 259 - parts: make([]CompletedPart, 0), 260 - partNumber: 1, 261 - buffer: bytes.NewBuffer(make([]byte, 0, 5*1024*1024)), // 5MB buffer 262 - id: writerID, 263 - startedAt: 
time.Now(), 264 - } 265 - 266 - // Store in global map for Resume() 267 - globalUploadsMu.Lock() 268 - globalUploads[writer.id] = writer 269 - globalUploadsMu.Unlock() 270 - 271 - return writer, nil 272 - } 273 - 274 - New Write() - Buffer and Flush Parts 275 - 276 - func (w *ProxyBlobWriter) Write(p []byte) (int, error) { 277 - if w.closed { 278 - return 0, fmt.Errorf("writer closed") 279 - } 280 - 281 - n, err := w.buffer.Write(p) 282 - w.size += int64(n) 283 - 284 - // Flush if buffer reaches 5MB (S3 minimum part size) 285 - if w.buffer.Len() >= 5*1024*1024 { 286 - if err := w.flushPart(); err != nil { 287 - return n, err 288 - } 289 - } 290 - 291 - return n, err 292 - } 293 - 294 - func (w *ProxyBlobWriter) flushPart() error { 295 - if w.buffer.Len() == 0 { 296 - return nil 297 - } 298 - 299 - ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute) 300 - defer cancel() 301 - 302 - // Get presigned URL for this part 303 - tempDigest := digest.Digest(fmt.Sprintf("uploads/temp-%s", w.id)) 304 - url, err := w.store.getPartPresignedURL(ctx, tempDigest, w.uploadID, w.partNumber) 305 - if err != nil { 306 - return fmt.Errorf("failed to get part presigned URL: %w", err) 307 - } 308 - 309 - // Upload part to S3 310 - req, err := http.NewRequestWithContext(ctx, "PUT", url, bytes.NewReader(w.buffer.Bytes())) 311 - if err != nil { 312 - return err 313 - } 314 - 315 - resp, err := w.store.httpClient.Do(req) 316 - if err != nil { 317 - return err 318 - } 319 - defer resp.Body.Close() 320 - 321 - if resp.StatusCode != http.StatusOK && resp.StatusCode != http.StatusCreated { 322 - return fmt.Errorf("part upload failed: status %d", resp.StatusCode) 323 - } 324 - 325 - // Store ETag for completion 326 - etag := resp.Header.Get("ETag") 327 - if etag == "" { 328 - return fmt.Errorf("no ETag in response") 329 - } 330 - 331 - w.parts = append(w.parts, CompletedPart{ 332 - PartNumber: w.partNumber, 333 - ETag: etag, 334 - }) 335 - 336 - // Reset buffer and increment 
part number 337 - w.buffer.Reset() 338 - w.partNumber++ 339 - 340 - return nil 341 - } 342 - 343 - New Commit() - Complete Multipart and Move 344 - 345 - func (w *ProxyBlobWriter) Commit(ctx context.Context, desc distribution.Descriptor) (distribution.Descriptor, error) { 346 - if w.closed { 347 - return distribution.Descriptor{}, fmt.Errorf("writer closed") 348 - } 349 - w.closed = true 350 - 351 - // Flush any remaining buffered data 352 - if w.buffer.Len() > 0 { 353 - if err := w.flushPart(); err != nil { 354 - // Try to abort multipart on error 355 - w.store.abortMultipartUpload(ctx, w.uploadID) 356 - return distribution.Descriptor{}, err 357 - } 358 - } 359 - 360 - // Complete multipart upload at temp location 361 - tempDigest := digest.Digest(fmt.Sprintf("uploads/temp-%s", w.id)) 362 - if err := w.store.completeMultipartUpload(ctx, tempDigest, w.uploadID, w.parts); err != nil { 363 - return distribution.Descriptor{}, err 364 - } 365 - 366 - // Move from temp → final location (server-side S3 copy) 367 - tempPath := fmt.Sprintf("uploads/temp-%s", w.id) 368 - finalPath := desc.Digest.String() 369 - 370 - moveURL := fmt.Sprintf("%s/move?from=%s&to=%s&did=%s", 371 - w.store.storageEndpoint, tempPath, finalPath, w.store.did) 372 - 373 - req, err := http.NewRequestWithContext(ctx, "POST", moveURL, nil) 374 - if err != nil { 375 - return distribution.Descriptor{}, err 376 - } 377 - 378 - resp, err := w.store.httpClient.Do(req) 379 - if err != nil { 380 - return distribution.Descriptor{}, err 381 - } 382 - defer resp.Body.Close() 383 - 384 - if resp.StatusCode != http.StatusOK && resp.StatusCode != http.StatusCreated { 385 - bodyBytes, _ := io.ReadAll(resp.Body) 386 - return distribution.Descriptor{}, fmt.Errorf("move failed: %d, %s", resp.StatusCode, bodyBytes) 387 - } 388 - 389 - // Remove from global map 390 - globalUploadsMu.Lock() 391 - delete(globalUploads, w.id) 392 - globalUploadsMu.Unlock() 393 - 394 - return distribution.Descriptor{ 395 - Digest: 
        Digest:    desc.Digest,
        Size:      w.size,
        MediaType: desc.MediaType,
    }, nil
}

Add Hold Service Client Methods

func (p *ProxyBlobStore) startMultipartUpload(ctx context.Context, dgst digest.Digest) (string, error) {
    reqBody := map[string]any{
        "did":    p.did,
        "digest": dgst.String(),
    }
    body, _ := json.Marshal(reqBody)

    url := fmt.Sprintf("%s/start-multipart", p.storageEndpoint)
    req, _ := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(body))
    req.Header.Set("Content-Type", "application/json")

    resp, err := p.httpClient.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()

    var result struct {
        UploadID string `json:"upload_id"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return "", err
    }

    return result.UploadID, nil
}

func (p *ProxyBlobStore) getPartPresignedURL(ctx context.Context, dgst digest.Digest, uploadID string, partNumber int) (string, error) {
    reqBody := map[string]any{
        "did":         p.did,
        "digest":      dgst.String(),
        "upload_id":   uploadID,
        "part_number": partNumber,
    }
    body, _ := json.Marshal(reqBody)

    url := fmt.Sprintf("%s/part-presigned-url", p.storageEndpoint)
    req, _ := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(body))
    req.Header.Set("Content-Type", "application/json")

    resp, err := p.httpClient.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()

    var result struct {
        URL string `json:"url"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return "", err
    }

    return result.URL, nil
}

func (p *ProxyBlobStore) completeMultipartUpload(ctx context.Context, dgst digest.Digest, uploadID string, parts []CompletedPart) error {
    reqBody := map[string]any{
        "did":       p.did,
        "digest":    dgst.String(),
        "upload_id": uploadID,
        "parts":     parts,
    }
    body, _ := json.Marshal(reqBody)

    url := fmt.Sprintf("%s/complete-multipart", p.storageEndpoint)
    req, _ := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(body))
    req.Header.Set("Content-Type", "application/json")

    resp, err := p.httpClient.Do(req)
    if err != nil {
        return err
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("complete multipart failed: status %d", resp.StatusCode)
    }

    return nil
}

---
Testing Plan

1. Unit Tests

- Test multipart upload initiation
- Test part upload with presigned URLs
- Test completion with ETags
- Test abort on errors

2. Integration Tests

- Push small images (< 5MB, single part)
- Push medium images (10MB, 2 parts)
- Push large images (100MB, 20 parts)
- Test with Upcloud S3
- Test with Storj S3

3. Validation

- Monitor logs for "client disconnected" errors (should be gone)
- Check Docker push success rate
- Verify blobs stored correctly in S3
- Check bandwidth usage on hold service (should be minimal)

---
Migration & Deployment

Backward Compatibility

- Keep /put-presigned-url endpoint for fallback
- Keep /move endpoint (still needed)
- New multipart endpoints are additive

Deployment Steps

1. Update hold service with new endpoints
2. Update AppView ProxyBlobStore
3. Deploy hold service first
4. Deploy AppView
5. Test with sample push
6. Monitor logs

Rollback Plan

- Revert AppView to previous version (uses old presigned URL method)
- Hold service keeps both old and new endpoints

---
Documentation Updates

Update docs/PRESIGNED_URLS.md

- Add section "Multipart Upload for Chunked Data"
- Explain why single presigned URLs don't work with PATCH
- Document new endpoints and flow
- Add S3 part size recommendations (5MB-64MB for Storj)

Add Troubleshooting Section

- "Client disconnected during PATCH" → resolved by multipart
- Storj-specific considerations (64MB parts recommended)
- Upcloud compatibility notes

---
Performance Impact

Before (Broken)

- Docker PATCH → blocks on pipe → timeout → retry → fail
- Unable to push large images reliably

After (Multipart)

- Each PATCH → independent part upload → immediate response
- No blocking, no timeouts
- Parallel part uploads possible (future optimization)
- Reliable pushes for any image size

Bandwidth

- Hold service: only API calls (~1KB per part)
- Direct S3 uploads: full blob data
- S3 copy for move: server-side (no hold bandwidth)

Estimated savings: 99.98% hold service bandwidth reduction (same as before, but now actually works!)
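The 99.98% figure above can be sanity-checked with a quick back-of-the-envelope calculation. This sketch uses the doc's own numbers (~1KB per hold-service API call, 5MB parts, a 100MB image); the two extra calls for start/complete are an assumption about the control-plane flow, not measured values.

```go
package main

import "fmt"

// holdOverhead estimates the bytes the hold service handles for one push:
// one start-multipart call, one part-presigned-url call per part, and one
// complete-multipart call, each roughly apiCallSize bytes.
func holdOverhead(imageSize, partSize, apiCallSize int) int {
	parts := imageSize / partSize
	return (parts + 2) * apiCallSize
}

func main() {
	const mb = 1024 * 1024
	overhead := holdOverhead(100*mb, 5*mb, 1024) // 100MB image, 5MB parts
	savings := 100 * (1 - float64(overhead)/float64(100*mb))
	fmt.Printf("hold bytes: %d, savings: %.2f%%\n", overhead, savings)
}
```

With these inputs the hold service sees about 22KB of control traffic for a 100MB push, which is where the ~99.98% reduction comes from.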
-448
docs/MULTIPART_OLD.md
S3 Multipart Upload Implementation Plan

Problem Summary

Current implementation uses a single presigned URL with a pipe for chunked uploads (PATCH). This causes:

- Docker PATCH requests block waiting for pipe writes
- S3 upload happens in background via single presigned URL
- Docker times out → "client disconnected during blob PATCH"
- Root cause: single presigned URLs don't support OCI's chunked upload protocol

Solution: S3 Multipart Upload API

Implement proper S3 multipart upload to support Docker's chunked PATCH operations:

- Each PATCH → separate S3 part upload with its own presigned URL
- On Commit → complete multipart upload
- No buffering, no pipes, no blocking

---
Architecture Changes

Current (Broken) Flow

POST /blobs/uploads/ → Create() → Single presigned URL to temp location
PATCH → Write to pipe → [blocks] → Background goroutine uploads via single URL
PATCH → [blocks on pipe] → Docker timeout → disconnect ❌

New (Multipart) Flow

POST /blobs/uploads/ → Create() → Initiate multipart upload, get upload ID
PATCH #1 → Get presigned URL for part 1 → Upload part 1 to S3 → Store ETag
PATCH #2 → Get presigned URL for part 2 → Upload part 2 to S3 → Store ETag
PUT (commit) → Complete multipart upload with ETags → Done ✅

---
Implementation Details

1. Hold Service: Add Multipart Upload Endpoints

File: cmd/hold/main.go

New Request/Response Types

// StartMultipartUploadRequest initiates a multipart upload
type StartMultipartUploadRequest struct {
    DID    string `json:"did"`
    Digest string `json:"digest"`
}

type StartMultipartUploadResponse struct {
    UploadID  string    `json:"upload_id"`
    ExpiresAt time.Time `json:"expires_at"`
}

// GetPartURLRequest requests a presigned URL for a specific part
type GetPartURLRequest struct {
    DID        string `json:"did"`
    Digest     string `json:"digest"`
    UploadID   string `json:"upload_id"`
    PartNumber int    `json:"part_number"`
}

type GetPartURLResponse struct {
    URL       string    `json:"url"`
    ExpiresAt time.Time `json:"expires_at"`
}

// CompleteMultipartRequest completes a multipart upload
type CompleteMultipartRequest struct {
    DID      string          `json:"did"`
    Digest   string          `json:"digest"`
    UploadID string          `json:"upload_id"`
    Parts    []CompletedPart `json:"parts"`
}

type CompletedPart struct {
    PartNumber int    `json:"part_number"`
    ETag       string `json:"etag"`
}

// AbortMultipartRequest aborts an in-progress upload
type AbortMultipartRequest struct {
    DID      string `json:"did"`
    Digest   string `json:"digest"`
    UploadID string `json:"upload_id"`
}

New Endpoints

POST /start-multipart

func (s *HoldService) HandleStartMultipart(w http.ResponseWriter, r *http.Request) {
    // Validate DID authorization for WRITE
    // Build S3 key from digest
    // Call s3.CreateMultipartUploadRequest()
    // Generate presigned URL if needed, or return upload ID
    // Return upload ID to client
}

POST /part-presigned-url

func (s *HoldService) HandleGetPartURL(w http.ResponseWriter, r *http.Request) {
    // Validate DID authorization for WRITE
    // Build S3 key from digest
    // Call s3.UploadPartRequest() with part number and upload ID
    // Generate presigned URL
    // Return presigned URL for this specific part
}

POST /complete-multipart

func (s *HoldService) HandleCompleteMultipart(w http.ResponseWriter, r *http.Request) {
    // Validate DID authorization for WRITE
    // Build S3 key from digest
    // Prepare CompletedPart array with part numbers and ETags
    // Call s3.CompleteMultipartUpload()
    // Return success
}

POST /abort-multipart (for cleanup)

func (s *HoldService) HandleAbortMultipart(w http.ResponseWriter, r *http.Request) {
    // Validate DID authorization for WRITE
    // Call s3.AbortMultipartUpload()
    // Return success
}

S3 Implementation

// startMultipartUpload initiates a multipart upload and returns upload ID
func (s *HoldService) startMultipartUpload(ctx context.Context, digest string) (string, error) {
    if s.s3Client == nil {
        return "", fmt.Errorf("S3 not configured")
    }

    path := blobPath(digest)
    s3Key := strings.TrimPrefix(path, "/")
    if s.s3PathPrefix != "" {
        s3Key = s.s3PathPrefix + "/" + s3Key
    }

    result, err := s.s3Client.CreateMultipartUploadWithContext(ctx, &s3.CreateMultipartUploadInput{
        Bucket: aws.String(s.bucket),
        Key:    aws.String(s3Key),
    })
    if err != nil {
        return "", err
    }

    return *result.UploadId, nil
}

// getPartPresignedURL generates presigned URL for a specific part
func (s *HoldService) getPartPresignedURL(ctx context.Context, digest, uploadID string, partNumber int) (string, error) {
    if s.s3Client == nil {
        return "", fmt.Errorf("S3 not configured")
    }

    path := blobPath(digest)
    s3Key := strings.TrimPrefix(path, "/")
    if s.s3PathPrefix != "" {
        s3Key = s.s3PathPrefix + "/" + s3Key
    }

    req, _ := s.s3Client.UploadPartRequest(&s3.UploadPartInput{
        Bucket:     aws.String(s.bucket),
        Key:        aws.String(s3Key),
        UploadId:   aws.String(uploadID),
        PartNumber: aws.Int64(int64(partNumber)),
    })

    return req.Presign(15 * time.Minute)
}

// completeMultipartUpload finalizes the multipart upload
func (s *HoldService) completeMultipartUpload(ctx context.Context, digest, uploadID string, parts []CompletedPart) error {
    if s.s3Client == nil {
        return fmt.Errorf("S3 not configured")
    }

    path := blobPath(digest)
    s3Key := strings.TrimPrefix(path, "/")
    if s.s3PathPrefix != "" {
        s3Key = s.s3PathPrefix + "/" + s3Key
    }

    // Convert to S3 CompletedPart format
    s3Parts := make([]*s3.CompletedPart, len(parts))
    for i, p := range parts {
        s3Parts[i] = &s3.CompletedPart{
            PartNumber: aws.Int64(int64(p.PartNumber)),
            ETag:       aws.String(p.ETag),
        }
    }

    _, err := s.s3Client.CompleteMultipartUploadWithContext(ctx, &s3.CompleteMultipartUploadInput{
        Bucket:   aws.String(s.bucket),
        Key:      aws.String(s3Key),
        UploadId: aws.String(uploadID),
        MultipartUpload: &s3.CompletedMultipartUpload{
            Parts: s3Parts,
        },
    })
    return err
}

---
2. AppView: Rewrite ProxyBlobStore for Multipart

File: pkg/storage/proxy_blob_store.go

Remove Current Implementation

- Remove pipe-based streaming
- Remove background goroutine with single presigned URL
- Remove global upload tracking map

New ProxyBlobWriter Structure

type ProxyBlobWriter struct {
    store       *ProxyBlobStore
    options     distribution.CreateOptions
    uploadID    string          // S3 multipart upload ID
    parts       []CompletedPart // Track uploaded parts with ETags
    partNumber  int             // Current part number (starts at 1)
    buffer      *bytes.Buffer   // Buffer for current part
    size        int64           // Total bytes written
    closed      bool
    id          string // Distribution's upload ID (for state)
    startedAt   time.Time
    finalDigest string // Set on Commit
}

type CompletedPart struct {
    PartNumber int
    ETag       string
}

New Create() - Initiate Multipart Upload

func (p *ProxyBlobStore) Create(ctx context.Context, options ...distribution.BlobCreateOption) (distribution.BlobWriter, error) {
    var opts distribution.CreateOptions
    for _, option := range options {
        if err := option.Apply(&opts); err != nil {
            return nil, err
        }
    }

    // Use temp digest for upload location
    writerID := fmt.Sprintf("upload-%d", time.Now().UnixNano())
    tempDigest := digest.Digest(fmt.Sprintf("uploads/temp-%s", writerID))

    // Start multipart upload via hold service
    uploadID, err := p.startMultipartUpload(ctx, tempDigest)
    if err != nil {
        return nil, fmt.Errorf("failed to start multipart upload: %w", err)
    }

    writer := &ProxyBlobWriter{
        store:      p,
        options:    opts,
        uploadID:   uploadID,
        parts:      make([]CompletedPart, 0),
        partNumber: 1,
        buffer:     bytes.NewBuffer(make([]byte, 0, 5*1024*1024)), // 5MB buffer
        id:         writerID,
        startedAt:  time.Now(),
    }

    // Store in global map for Resume()
    globalUploadsMu.Lock()
    globalUploads[writer.id] = writer
    globalUploadsMu.Unlock()

    return writer, nil
}

New Write() - Buffer and Flush Parts

func (w *ProxyBlobWriter) Write(p []byte) (int, error) {
    if w.closed {
        return 0, fmt.Errorf("writer closed")
    }

    n, err := w.buffer.Write(p)
    w.size += int64(n)

    // Flush if buffer reaches 5MB (S3 minimum part size)
    if w.buffer.Len() >= 5*1024*1024 {
        if err := w.flushPart(); err != nil {
            return n, err
        }
    }

    return n, err
}

func (w *ProxyBlobWriter) flushPart() error {
    if w.buffer.Len() == 0 {
        return nil
    }

    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
    defer cancel()

    // Get presigned URL for this part
    tempDigest := digest.Digest(fmt.Sprintf("uploads/temp-%s", w.id))
    url, err := w.store.getPartPresignedURL(ctx, tempDigest, w.uploadID, w.partNumber)
    if err != nil {
        return fmt.Errorf("failed to get part presigned URL: %w", err)
    }

    // Upload part to S3
    req, err := http.NewRequestWithContext(ctx, "PUT", url, bytes.NewReader(w.buffer.Bytes()))
    if err != nil {
        return err
    }

    resp, err := w.store.httpClient.Do(req)
    if err != nil {
        return err
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK && resp.StatusCode != http.StatusCreated {
        return fmt.Errorf("part upload failed: status %d", resp.StatusCode)
    }

    // Store ETag for completion
    etag := resp.Header.Get("ETag")
    if etag == "" {
        return fmt.Errorf("no ETag in response")
    }
    w.parts = append(w.parts, CompletedPart{
        PartNumber: w.partNumber,
        ETag:       etag,
    })

    // Reset buffer and increment part number
    w.buffer.Reset()
    w.partNumber++

    return nil
}

New Commit() - Complete Multipart and Move

func (w *ProxyBlobWriter) Commit(ctx context.Context, desc distribution.Descriptor) (distribution.Descriptor, error) {
    if w.closed {
        return distribution.Descriptor{}, fmt.Errorf("writer closed")
    }
    w.closed = true

    // Flush any remaining buffered data
    if w.buffer.Len() > 0 {
        if err := w.flushPart(); err != nil {
            // Try to abort multipart on error
            w.store.abortMultipartUpload(ctx, w.uploadID)
            return distribution.Descriptor{}, err
        }
    }

    // Complete multipart upload at temp location
    tempDigest := digest.Digest(fmt.Sprintf("uploads/temp-%s", w.id))
    if err := w.store.completeMultipartUpload(ctx, tempDigest, w.uploadID, w.parts); err != nil {
        return distribution.Descriptor{}, err
    }

    // Move from temp → final location (server-side S3 copy)
    tempPath := fmt.Sprintf("uploads/temp-%s", w.id)
    finalPath := desc.Digest.String()
    moveURL := fmt.Sprintf("%s/move?from=%s&to=%s&did=%s",
        w.store.storageEndpoint, tempPath, finalPath, w.store.did)

    req, err := http.NewRequestWithContext(ctx, "POST", moveURL, nil)
    if err != nil {
        return distribution.Descriptor{}, err
    }

    resp, err := w.store.httpClient.Do(req)
    if err != nil {
        return distribution.Descriptor{}, err
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK && resp.StatusCode != http.StatusCreated {
        bodyBytes, _ := io.ReadAll(resp.Body)
        return distribution.Descriptor{}, fmt.Errorf("move failed: %d, %s", resp.StatusCode, bodyBytes)
    }

    // Remove from global map
    globalUploadsMu.Lock()
    delete(globalUploads, w.id)
    globalUploadsMu.Unlock()

    return distribution.Descriptor{
        Digest:    desc.Digest,
        Size:      w.size,
        MediaType: desc.MediaType,
    }, nil
}

Add Hold Service Client Methods

func (p *ProxyBlobStore) startMultipartUpload(ctx context.Context, dgst digest.Digest) (string, error) {
    reqBody := map[string]any{
        "did":    p.did,
        "digest": dgst.String(),
    }
    body, _ := json.Marshal(reqBody)

    url := fmt.Sprintf("%s/start-multipart", p.storageEndpoint)
    req, _ := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(body))
    req.Header.Set("Content-Type", "application/json")

    resp, err := p.httpClient.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()

    var result struct {
        UploadID string `json:"upload_id"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return "", err
    }

    return result.UploadID, nil
}

func (p *ProxyBlobStore) getPartPresignedURL(ctx context.Context, dgst digest.Digest, uploadID string, partNumber int) (string, error) {
    reqBody := map[string]any{
        "did":         p.did,
        "digest":      dgst.String(),
        "upload_id":   uploadID,
        "part_number": partNumber,
    }
    body, _ := json.Marshal(reqBody)

    url := fmt.Sprintf("%s/part-presigned-url", p.storageEndpoint)
    req, _ := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(body))
    req.Header.Set("Content-Type", "application/json")

    resp, err := p.httpClient.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()

    var result struct {
        URL string `json:"url"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return "", err
    }

    return result.URL, nil
}

func (p *ProxyBlobStore) completeMultipartUpload(ctx context.Context, dgst digest.Digest, uploadID string, parts []CompletedPart) error {
    reqBody := map[string]any{
        "did":       p.did,
        "digest":    dgst.String(),
        "upload_id": uploadID,
        "parts":     parts,
    }
    body, _ := json.Marshal(reqBody)

    url := fmt.Sprintf("%s/complete-multipart", p.storageEndpoint)
    req, _ := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(body))
    req.Header.Set("Content-Type", "application/json")

    resp, err := p.httpClient.Do(req)
    if err != nil {
        return err
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("complete multipart failed: status %d", resp.StatusCode)
    }

    return nil
}

---
Testing Plan

1. Unit Tests

- Test multipart upload initiation
- Test part upload with presigned URLs
- Test completion with ETags
- Test abort on errors

2. Integration Tests

- Push small images (< 5MB, single part)
- Push medium images (10MB, 2 parts)
- Push large images (100MB, 20 parts)
- Test with Upcloud S3
- Test with Storj S3

3. Validation

- Monitor logs for "client disconnected" errors (should be gone)
- Check Docker push success rate
- Verify blobs stored correctly in S3
- Check bandwidth usage on hold service (should be minimal)

---
Migration & Deployment

Backward Compatibility

- Keep /put-presigned-url endpoint for fallback
- Keep /move endpoint (still needed)
- New multipart endpoints are additive

Deployment Steps

1. Update hold service with new endpoints
2. Update AppView ProxyBlobStore
3. Deploy hold service first
4. Deploy AppView
5. Test with sample push
6. Monitor logs

Rollback Plan

- Revert AppView to previous version (uses old presigned URL method)
- Hold service keeps both old and new endpoints

---
Documentation Updates

Update docs/PRESIGNED_URLS.md

- Add section "Multipart Upload for Chunked Data"
- Explain why single presigned URLs don't work with PATCH
- Document new endpoints and flow
- Add S3 part size recommendations (5MB-64MB for Storj)

Add Troubleshooting Section

- "Client disconnected during PATCH" → resolved by multipart
- Storj-specific considerations (64MB parts recommended)
- Upcloud compatibility notes

---
Performance Impact

Before (Broken)

- Docker PATCH → blocks on pipe → timeout → retry → fail
- Unable to push large images reliably

After (Multipart)

- Each PATCH → independent part upload → immediate response
- No blocking, no timeouts
- Parallel part uploads possible (future optimization)
- Reliable pushes for any image size

Bandwidth

- Hold service: only API calls (~1KB per part)
- Direct S3 uploads: full blob data
- S3 copy for move: server-side (no hold bandwidth)

Estimated savings: 99.98% hold service bandwidth reduction (same as before, but now actually works!)
-1017
docs/PRESIGNED_UPLOADS.md
# Presigned Upload URLs Implementation Guide

## Current Architecture (Proxy Mode)

### Upload Flow Today
1. **AppView** receives blob upload request from Docker
2. **ProxyBlobStore.Create()** creates streaming upload via pipe
3. Data streams to **Hold Service** temp location: `uploads/temp-{id}`
4. Hold service uploads to S3 via storage driver
5. **ProxyBlobWriter.Commit()** moves blob: temp → final digest-based path
6. Hold service performs S3 Move operation

### Why Uploads Don't Use Presigned URLs Today
- `Create()` doesn't know the blob digest upfront
- Presigned S3 URLs require the full object key (which includes digest)
- Current approach streams to temp location, calculates digest, then moves

### Bandwidth Flow (Current)
```
Docker → AppView → Hold Service → S3/Storj
         (proxy)     (proxy)
```

All upload bandwidth flows through Hold Service.

---

## Proposed Architecture (Presigned Uploads)

### New Upload Flow
1. **AppView** receives blob upload request from Docker
2. **ProxyBlobStore.Create()** creates buffered upload writer
3. Data buffered in memory during `Write()` calls
4. **ProxyBlobWriter.Commit()** calculates digest from buffer
5. Request presigned PUT URL from Hold Service with digest
6. Upload buffered data directly to S3 via presigned URL
7. No move operation needed (uploaded to final path)

### Bandwidth Flow (Presigned)
```
Docker → AppView → S3/Storj (direct via presigned URL)
         (buffer)

Hold Service only issues presigned URLs (minimal bandwidth)
```

---

## Detailed Implementation

### Phase 1: Add Buffering to ProxyBlobWriter

**File:** `pkg/storage/proxy_blob_store.go`

#### Changes to ProxyBlobWriter struct

```go
type ProxyBlobWriter struct {
	store   *ProxyBlobStore
	options distribution.CreateOptions

	// Remove pipe-based streaming
	// pipeWriter *io.PipeWriter
	// pipeReader *io.PipeReader
	// digestChan chan string
	// uploadErr  chan error

	// Add buffering
	buffer *bytes.Buffer   // In-memory buffer for blob data
	hasher digest.Digester // Calculate digest while writing

	finalDigest string
	size        int64
	closed      bool
	id          string
	startedAt   time.Time
}
```

**Rationale:**
- Remove pipe mechanism (no longer streaming to temp)
- Add buffer to store blob data in memory
- Add hasher to calculate digest incrementally

#### Modify Create() method

**Before (lines 208-312):**
```go
func (p *ProxyBlobStore) Create(ctx context.Context, options ...distribution.BlobCreateOption) (distribution.BlobWriter, error) {
	// Creates pipe and starts background goroutine for streaming
	pipeReader, pipeWriter := io.Pipe()
	// ... streams to temp location
}
```

**After:**
```go
func (p *ProxyBlobStore) Create(ctx context.Context, options ...distribution.BlobCreateOption) (distribution.BlobWriter, error) {
	fmt.Printf("🔧 [proxy_blob_store/Create] Starting buffered upload for presigned URL\n")

	// Parse options
	var opts distribution.CreateOptions
	for _, option := range options {
		if err := option.Apply(&opts); err != nil {
			return nil, err
		}
	}

	// Create buffered writer
	writer := &ProxyBlobWriter{
		store:     p,
		options:   opts,
		buffer:    new(bytes.Buffer),
		hasher:    digest.Canonical.Digester(), // Usually SHA256
		id:        fmt.Sprintf("upload-%d", time.Now().UnixNano()),
		startedAt: time.Now(),
	}

	// Store in global uploads map for resume support
	globalUploadsMu.Lock()
	globalUploads[writer.id] = writer
	globalUploadsMu.Unlock()

	fmt.Printf("   Upload ID: %s\n", writer.id)
	fmt.Printf("   Repository: %s\n", p.repository)

	return writer, nil
}
```

**Key Changes:**
- No more pipe creation
- No background goroutine
- Initialize buffer and hasher
- Everything else stays synchronous

#### Modify Write() method

**Before (lines 440-455):**
```go
func (w *ProxyBlobWriter) Write(p []byte) (int, error) {
	// Writes to pipe, streams to hold service
	n, err := w.pipeWriter.Write(p)
	w.size += int64(n)
	return n, nil
}
```

**After:**
```go
func (w *ProxyBlobWriter) Write(p []byte) (int, error) {
	if w.closed {
		return 0, fmt.Errorf("writer closed")
	}

	// Write to buffer
	n, err := w.buffer.Write(p)
	if err != nil {
		return n, fmt.Errorf("failed to buffer data: %w", err)
	}

	// Update hasher for digest calculation
	w.hasher.Hash().Write(p)

	w.size += int64(n)

	// Memory pressure check (optional safety)
	if w.buffer.Len() > 500*1024*1024 { // 500MB limit
		return n, fmt.Errorf("blob too large for buffered upload: %d bytes", w.buffer.Len())
	}

	return n, nil
}
```

**Key Changes:**
- Write to in-memory buffer instead of pipe
- Update hasher incrementally (efficient)
- Add safety check for excessive memory usage
- No streaming to hold service yet

#### Modify Commit() method

**Before (lines 493-548):**
```go
func (w *ProxyBlobWriter) Commit(ctx context.Context, desc distribution.Descriptor) (distribution.Descriptor, error) {
	// Close pipe, send digest to goroutine
	// Wait for temp upload
	// Move temp → final
}
```

**After:**
```go
func (w *ProxyBlobWriter) Commit(ctx context.Context, desc distribution.Descriptor) (distribution.Descriptor, error) {
	if w.closed {
		return distribution.Descriptor{}, fmt.Errorf("writer closed")
	}
	w.closed = true

	// Remove from global uploads map
	globalUploadsMu.Lock()
	delete(globalUploads, w.id)
	globalUploadsMu.Unlock()

	// Calculate digest from buffered data
	calculatedDigest := w.hasher.Digest()

	// Verify digest matches if provided
	if desc.Digest != "" && desc.Digest != calculatedDigest {
		return distribution.Descriptor{}, fmt.Errorf(
			"digest mismatch: expected %s, got %s",
			desc.Digest, calculatedDigest,
		)
	}

	finalDigest := calculatedDigest
	if desc.Digest != "" {
		finalDigest = desc.Digest
	}

	fmt.Printf("📤 [ProxyBlobWriter.Commit] Uploading via presigned URL\n")
	fmt.Printf("   Digest: %s\n", finalDigest)
	fmt.Printf("   Size: %d bytes\n", w.size)
	fmt.Printf("   Buffered: %d bytes\n", w.buffer.Len())

	// Get presigned upload URL from hold service
	url, err := w.store.getUploadURL(ctx, finalDigest, w.size)
	if err != nil {
		return distribution.Descriptor{}, fmt.Errorf("failed to get presigned upload URL: %w", err)
	}

	fmt.Printf("   Presigned URL: %s\n", url)

	// Upload directly to S3 via presigned URL
	req, err := http.NewRequestWithContext(ctx, "PUT", url, bytes.NewReader(w.buffer.Bytes()))
	if err != nil {
		return distribution.Descriptor{}, fmt.Errorf("failed to create upload request: %w", err)
	}
	req.Header.Set("Content-Type", "application/octet-stream")
	req.ContentLength = w.size

	resp, err := w.store.httpClient.Do(req)
	if err != nil {
		return distribution.Descriptor{}, fmt.Errorf("presigned upload failed: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK && resp.StatusCode != http.StatusCreated {
		bodyBytes, _ := io.ReadAll(resp.Body)
		return distribution.Descriptor{}, fmt.Errorf(
			"presigned upload failed: status %d, body: %s",
			resp.StatusCode, string(bodyBytes),
		)
	}

	fmt.Printf("✅ [ProxyBlobWriter.Commit] Upload successful\n")

	// Clear buffer to free memory
	w.buffer = nil

	return distribution.Descriptor{
		Digest:    finalDigest,
		Size:      w.size,
		MediaType: desc.MediaType,
	}, nil
}
```

**Key Changes:**
- Calculate digest from hasher (already computed incrementally)
- Verify digest if provided by client
- Get presigned upload URL with final digest
- Upload buffer contents directly to S3
- No temp location, no move operation
- Clear buffer to free memory immediately

#### Modify Cancel() method

**Before (lines 551-572):**
```go
func (w *ProxyBlobWriter) Cancel(ctx context.Context) error {
	// Close pipe, cancel temp upload
}
```

**After:**
```go
func (w *ProxyBlobWriter) Cancel(ctx context.Context) error {
	w.closed = true

	// Remove from global uploads map
	globalUploadsMu.Lock()
	delete(globalUploads, w.id)
	globalUploadsMu.Unlock()

	// Clear buffer to free memory
	w.buffer = nil

	fmt.Printf("[ProxyBlobWriter.Cancel] Upload cancelled: id=%s\n", w.id)
	return nil
}
```

**Key Changes:**
- Simply clear buffer
- No pipe cleanup needed
- No temp cleanup needed (nothing uploaded yet)

---

### Phase 2: Update Hold Service (Optional Enhancement)

The current `getUploadURL()` implementation in `cmd/hold/main.go` (lines 528-587) already supports presigned uploads correctly. No changes needed unless you want to add additional logging.

**Optional logging enhancement at line 547:**

```go
url, err := req.Presign(15 * time.Minute)
if err != nil {
	log.Printf("Failed to generate presigned upload URL: %v", err)
	return s.getProxyUploadURL(digest, did), nil
}

log.Printf("🔑 Generated presigned upload URL:")
log.Printf("   Digest: %s", digest)
log.Printf("   S3 Key: %s", s3Key)
log.Printf("   Size: %d bytes", size)
log.Printf("   URL length: %d chars", len(url))
log.Printf("   Expires: 15min")

return url, nil
```

---

### Phase 3: Memory Management Considerations

#### Add Configuration for Max Buffer Size

**File:** `pkg/storage/proxy_blob_store.go`

Add constants at top of file:

```go
const (
	maxChunkSize = 5 * 1024 * 1024 // 5MB (existing)

	// Maximum blob size for in-memory buffering
	// Blobs larger than this will fail (alternative: fallback to proxy mode)
	maxBufferedBlobSize = 500 * 1024 * 1024 // 500MB
)
```

#### Alternative: Disk-Based Buffering

For very large blobs, consider disk-based buffering:

```go
type ProxyBlobWriter struct {
	// ... existing fields ...

	// Choose one:
	buffer *bytes.Buffer // Memory buffer (current)
	// OR
	tempFile   *os.File // Disk buffer (for large blobs)
	bufferSize int64
}
```

**Memory buffer (simple, fast):**
- Pro: Fast, no disk I/O
- Con: Limited by available RAM
- Use for: Blobs < 500MB

**Disk buffer (scalable):**
- Pro: No memory limit
- Con: Slower, disk I/O overhead
- Use for: Blobs > 500MB

#### Hybrid Approach (Recommended)

```go
const (
	memoryBufferThreshold = 50 * 1024 * 1024 // 50MB
)

func (w *ProxyBlobWriter) Write(p []byte) (int, error) {
	// If buffer exceeds threshold, switch to disk
	if w.buffer != nil && w.buffer.Len() > memoryBufferThreshold {
		return 0, fmt.Errorf("blob exceeds memory buffer threshold, disk buffering not implemented")
		// TODO: Implement disk buffering or fallback to proxy mode
	}

	// Otherwise use memory buffer
	// ... existing Write() logic ...
}
```

---

## Optional Enhancement: Presigned HEAD URLs

### Motivation

Currently HEAD requests (blob verification) are proxied through the Hold Service. This is fine because HEAD bandwidth is negligible (~300 bytes per request), but we can eliminate this round-trip by using presigned HEAD URLs.
### Implementation

#### Step 1: Add getHeadURL() to Hold Service

**File:** `cmd/hold/main.go`

Add a new function after `getDownloadURL()`:

```go
// getHeadURL generates a presigned HEAD URL for blob verification
func (s *HoldService) getHeadURL(ctx context.Context, digest string) (string, error) {
	// Check if the blob exists first
	path := blobPath(digest)
	_, err := s.driver.Stat(ctx, path)
	if err != nil {
		return "", fmt.Errorf("blob not found: %w", err)
	}

	// If an S3 client is available, generate a presigned HEAD URL
	if s.s3Client != nil {
		s3Key := strings.TrimPrefix(path, "/")
		if s.s3PathPrefix != "" {
			s3Key = s.s3PathPrefix + "/" + s3Key
		}

		// Generate presigned HEAD URL (method-specific!)
		req, _ := s.s3Client.HeadObjectRequest(&s3.HeadObjectInput{
			Bucket: aws.String(s.bucket),
			Key:    aws.String(s3Key),
		})

		log.Printf("🔍 [getHeadURL] Generating presigned HEAD URL:")
		log.Printf("   Digest: %s", digest)
		log.Printf("   S3 Key: %s", s3Key)

		url, err := req.Presign(15 * time.Minute)
		if err != nil {
			log.Printf("[getHeadURL] Presign failed: %v", err)
			// Fall back to proxy URL
			return s.getProxyHeadURL(digest), nil
		}

		log.Printf("✅ [getHeadURL] Presigned HEAD URL generated")
		return url, nil
	}

	// Fallback: return proxy URL
	return s.getProxyHeadURL(digest), nil
}

// getProxyHeadURL returns a proxy URL for HEAD requests
func (s *HoldService) getProxyHeadURL(digest string) string {
	// HEAD requests don't need DID in the query string (read-only check)
	return fmt.Sprintf("%s/blobs/%s", s.config.Server.PublicURL, digest)
}
```

#### Step 2: Add HTTP endpoint for presigned HEAD URLs

**File:** `cmd/hold/main.go`

Add a handler similar to `HandleGetPresignedURL()`:

```go
// HeadPresignedURLRequest represents a request for a presigned HEAD URL
type HeadPresignedURLRequest struct {
	DID    string `json:"did"`
	Digest string `json:"digest"`
}

// HeadPresignedURLResponse contains the presigned HEAD URL
type HeadPresignedURLResponse struct {
	URL       string    `json:"url"`
	ExpiresAt time.Time `json:"expires_at"`
}

// HandleHeadPresignedURL handles requests for HEAD URLs
func (s *HoldService) HandleHeadPresignedURL(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}

	var req HeadPresignedURLRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, fmt.Sprintf("invalid request: %v", err), http.StatusBadRequest)
		return
	}

	// Validate DID authorization for READ
	if !s.isAuthorizedRead(req.DID) {
		if req.DID == "" {
			http.Error(w, "unauthorized: authentication required", http.StatusUnauthorized)
		} else {
			http.Error(w, "forbidden: access denied", http.StatusForbidden)
		}
		return
	}

	// Generate presigned HEAD URL (use the request context, not context.Background())
	ctx := r.Context()
	expiry := time.Now().Add(15 * time.Minute)

	url, err := s.getHeadURL(ctx, req.Digest)
	if err != nil {
		http.Error(w, fmt.Sprintf("failed to generate URL: %v", err), http.StatusInternalServerError)
		return
	}

	resp := HeadPresignedURLResponse{
		URL:       url,
		ExpiresAt: expiry,
	}

	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(resp)
}
```

#### Step 3: Register endpoint in main()

**File:** `cmd/hold/main.go`

In the `main()` function, add the route:

```go
mux.HandleFunc("/head-presigned-url", service.HandleHeadPresignedURL)
```

#### Step 4: Update ProxyBlobStore.ServeBlob()

**File:** `pkg/storage/proxy_blob_store.go`

Modify the HEAD handling (currently lines 197-224):

**Before:**
```go
if r.Method == http.MethodHead {
	// Check if the blob exists via a hold service HEAD request
	url := fmt.Sprintf("%s/blobs/%s?did=%s", p.storageEndpoint, dgst.String(), p.did)
	req, err := http.NewRequestWithContext(ctx, "HEAD", url, nil)
	// ... proxy through hold service ...
}
```

**After:**
```go
if r.Method == http.MethodHead {
	// Get a presigned HEAD URL from the hold service
	headURL, err := p.getHeadURL(ctx, dgst)
	if err != nil {
		return distribution.ErrBlobUnknown
	}

	// Redirect to the presigned HEAD URL
	http.Redirect(w, r, headURL, http.StatusTemporaryRedirect)
	return nil
}
```

#### Step 5: Add getHeadURL() to ProxyBlobStore

**File:** `pkg/storage/proxy_blob_store.go`

Add after `getDownloadURL()`:

```go
// getHeadURL requests a presigned HEAD URL from the storage service
func (p *ProxyBlobStore) getHeadURL(ctx context.Context, dgst digest.Digest) (string, error) {
	reqBody := map[string]any{
		"did":    p.did,
		"digest": dgst.String(),
	}

	body, err := json.Marshal(reqBody)
	if err != nil {
		return "", err
	}

	url := fmt.Sprintf("%s/head-presigned-url", p.storageEndpoint)
	req, err := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := p.httpClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("failed to get HEAD URL: status %d", resp.StatusCode)
	}

	var result struct {
		URL string `json:"url"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		return "", err
	}

	return result.URL, nil
}
```

### Presigned HEAD URLs: Trade-offs

**Benefits:**
- Offloads HEAD requests from the Hold Service
- Docker verifies blobs directly against S3
- Slightly lower latency (one fewer hop)

**Costs:**
- Requires a round-trip to get the presigned HEAD URL
- More complex code
- Two HTTP requests instead of one proxy request

**Bandwidth Analysis:**
- Current: 1 HEAD request to Hold Service (~300 bytes)
- Presigned: 1 POST to get URL (~200 bytes) + 1 HEAD to S3 (~300 bytes)
- **Net difference: Adds ~200 bytes per verification**

**Recommendation:** Optional enhancement. The current proxied HEAD approach is simpler, and the bandwidth difference is negligible. Only implement if:
- The Hold Service is becoming a bottleneck
- You want to minimize Hold Service load completely
- HEAD request latency becomes noticeable

---

## Testing & Validation

### Test Plan for Presigned Uploads

#### 1. Small Blob Upload (< 1MB)
```bash
# Build test image with small layers
echo "FROM scratch" > Dockerfile
echo "COPY small-file /" >> Dockerfile
dd if=/dev/urandom of=small-file bs=1024 count=512  # 512KB

docker build -t atcr.io/youruser/test:small .
docker push atcr.io/youruser/test:small
```

**Expected behavior:**
- Blob buffered in memory
- Presigned upload URL requested with the correct digest
- Direct upload to S3 via presigned URL
- No temp location, no move operation

**Verify in logs:**
```
📤 [ProxyBlobWriter.Commit] Uploading via presigned URL
   Digest: sha256:...
   Size: 524288 bytes
   Presigned URL: https://gateway.storjshare.io/...
✅ [ProxyBlobWriter.Commit] Upload successful
```

#### 2. Medium Blob Upload (10-50MB)
```bash
dd if=/dev/urandom of=medium-file bs=1048576 count=25  # 25MB

docker build -t atcr.io/youruser/test:medium .
docker push atcr.io/youruser/test:medium
```

**Monitor memory usage:**
```bash
# While the push is running
docker stats atcr-appview
```

You should see a ~25MB spike during buffer + upload.

#### 3. Large Blob Upload (100-500MB)
```bash
dd if=/dev/urandom of=large-file bs=1048576 count=200  # 200MB

docker build -t atcr.io/youruser/test:large .
docker push atcr.io/youruser/test:large
```

**Monitor:**
- Memory usage (expect a ~200MB spike)
- Upload completes successfully
- S3 shows the blob in the correct location

#### 4. Concurrent Uploads
```bash
# Push multiple images in parallel
docker push atcr.io/youruser/test1:tag &
docker push atcr.io/youruser/test2:tag &
docker push atcr.io/youruser/test3:tag &
wait
```

**Verify:**
- All uploads complete successfully
- Memory usage peaks but doesn't OOM
- No data corruption (digests match)

#### 5. Error Handling Tests

**Test presigned URL failure:**
- Temporarily break the S3 credentials
- Verify a graceful error message
- Check for memory leaks (buffer cleared on error)

**Test digest mismatch:**
- This shouldn't happen in practice, but verify the error handling
- The buffer should be cleared even on error

**Test network interruption:**
- Kill the network during upload
- Verify proper error propagation
- Check for hanging goroutines

### Test Plan for Presigned HEAD URLs (Optional)

#### 1. HEAD Request Redirect
```bash
# Pull image (triggers HEAD verification)
docker pull atcr.io/youruser/test:tag
```

**Expected behavior:**
- AppView redirects HEAD to the presigned HEAD URL
- Docker follows the redirect to S3
- S3 responds to the HEAD request successfully

**Verify in logs:**
```
🔍 [getHeadURL] Generating presigned HEAD URL:
   Digest: sha256:...
✅ [getHeadURL] Presigned HEAD URL generated
```

#### 2. Method Verification
```bash
# Manually verify the presigned HEAD URL works
curl -I "presigned-head-url-here"
```

Should return 200 OK with a Content-Length header.

```bash
# Verify it ONLY works with HEAD (not GET)
curl "presigned-head-url-here"
```

Should return 403 Forbidden (method mismatch).
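The 403 on a method mismatch falls out of how SigV4 works: the HTTP method is the first line of the canonical request that gets signed, so a URL presigned for HEAD carries a signature that cannot validate for GET. A heavily simplified sketch (the canonical request below omits most real SigV4 fields; the host and path are made up):

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// canonicalRequest builds a simplified AWS SigV4 canonical request.
// Real SigV4 includes the full canonical query string, canonical headers,
// and a payload hash; this sketch keeps just enough structure to show
// that the HTTP method is part of what gets signed.
func canonicalRequest(method, path string) string {
	return strings.Join([]string{
		method,              // HTTPMethod: first line of the canonical request
		path,                // CanonicalURI
		"X-Amz-Expires=900", // CanonicalQueryString (simplified)
		"host:s3.example.com", // CanonicalHeaders (simplified)
		"host",                // SignedHeaders
		"UNSIGNED-PAYLOAD",    // payload placeholder used by presigned URLs
	}, "\n")
}

func main() {
	head := sha256.Sum256([]byte(canonicalRequest("HEAD", "/bucket/blobs/sha256/ab/abc/data")))
	get := sha256.Sum256([]byte(canonicalRequest("GET", "/bucket/blobs/sha256/ab/abc/data")))
	// Different methods → different canonical requests → different signatures,
	// which is why S3 rejects a GET against a HEAD-presigned URL.
	fmt.Println(head != get) // true
}
```

This is also why the document stresses using `HeadObjectRequest` (not `GetObjectRequest`) when presigning for verification.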
---

## Performance Comparison

### Current Architecture (Proxy Mode)

**Upload:**
```
Client → AppView (stream) → Hold Service (stream) → S3
         ~0ms delay         ~0ms delay              ~100ms
```
- Total latency: ~100ms + upload time
- Bandwidth: All through Hold Service

**Download:**
```
Client → AppView (redirect) → S3 (presigned GET)
         ~5ms                 ~50ms
```
- Total latency: ~55ms + download time
- Bandwidth: Direct from S3 ✅

**Verification (HEAD):**
```
Client → AppView (redirect) → Hold Service (proxy HEAD) → S3
         ~5ms                 ~10ms                       ~50ms
```
- Total latency: ~65ms
- Bandwidth: ~300 bytes through Hold Service

### Presigned Upload Architecture

**Upload:**
```
Client → AppView (buffer) → S3 (presigned PUT)
         ~0ms               ~100ms
```
- Total latency: ~100ms + upload time (same)
- Bandwidth: Direct to S3 ✅
- Memory: +blob_size during buffering

**Download:** (unchanged)
```
Client → AppView (redirect) → S3 (presigned GET)
```

**Verification (HEAD):** (if presigned HEAD enabled)
```
Client → AppView (redirect) → S3 (presigned HEAD)
         ~5ms                 ~50ms
```
- Total latency: ~55ms (10ms faster)
- Bandwidth: Direct to S3 ✅

---

## Trade-offs Summary

### Presigned Uploads

| Aspect | Proxy Mode (Current) | Presigned URLs |
|--------|---------------------|----------------|
| **Upload Bandwidth** | Through Hold Service | Direct to S3 ✅ |
| **Hold Service Load** | High (all upload traffic) | Low (only URL generation) ✅ |
| **Memory Usage** | Low (streaming) | High (buffering) ⚠️ |
| **Disk Usage** | None | Optional temp files for large blobs |
| **Code Complexity** | Simple ✅ | Moderate |
| **Max Blob Size** | Unlimited ✅ | Limited by memory (~500MB) ⚠️ |
| **Latency** | Same | Same |
| **Error Recovery** | Simple (cancel stream) | More complex (clear buffer) |

### Presigned HEAD URLs

| Aspect | Proxy Mode (Current) | Presigned HEAD |
|--------|---------------------|----------------|
| **Bandwidth** | 300 bytes (negligible) | 500 bytes (still negligible) |
| **Hold Service Load** | Low (HEAD is tiny) | Lower (but minimal gain) |
| **Latency** | 65ms | 55ms (10ms faster) |
| **Code Complexity** | Simple ✅ | More complex |
| **Reliability** | High (fewer moving parts) ✅ | Moderate (more failure modes) |

---

## Recommendations

### Presigned Uploads

**Implement if:**
- ✅ Hold Service bandwidth is a concern
- ✅ You want to minimize Hold Service load
- ✅ Most blobs are < 100MB (typical Docker layers)
- ✅ The AppView has sufficient memory (2-4GB+ RAM)

**Skip if:**
- ⚠️ Memory is constrained
- ⚠️ You regularly push very large layers (> 500MB)
- ⚠️ The current proxy mode is working fine
- ⚠️ Simplicity is the priority

### Presigned HEAD URLs

**Implement if:**
- ✅ You want complete S3 offloading
- ✅ You're already implementing presigned uploads
- ✅ The Hold Service is CPU/bandwidth constrained

**Skip if:**
- ⚠️ Current HEAD proxying works fine (it does)
- ⚠️ You want to minimize code complexity
- ⚠️ The 10ms latency difference doesn't matter

### Suggested Approach

**Phase 1:** Implement presigned uploads first
- Bigger performance win (offloads upload bandwidth)
- More valuable for write-heavy workflows
- Test thoroughly with various blob sizes

**Phase 2:** Monitor and evaluate
- Check Hold Service load after presigned uploads
- Measure HEAD request impact
- Assess if presigned HEAD is worth the complexity
**Phase 3:** Optionally add presigned HEAD
- Only if the Hold Service is still bottlenecked
- Or if you want feature completeness

---

## Migration Path

### Step 1: Feature Flag
Add a configuration option to enable/disable presigned uploads:

```go
// In AppView config
type Config struct {
	// ... existing fields ...

	UsePresignedUploads bool `yaml:"use_presigned_uploads"` // Default: false
}
```

### Step 2: Gradual Rollout
1. Deploy with `use_presigned_uploads: false` (current behavior)
2. Test in staging with `use_presigned_uploads: true`
3. Roll out to production incrementally
4. Monitor memory usage and error rates

### Step 3: Fallback Mechanism
If the presigned upload fails, fall back to proxy mode:

```go
func (w *ProxyBlobWriter) Commit(...) {
	// Try presigned upload
	url, err := w.store.getUploadURL(ctx, finalDigest, w.size)
	if err != nil {
		// Fallback: use proxy mode
		log.Printf("⚠️ Presigned upload unavailable, falling back to proxy")
		return w.proxyUpload(ctx, desc)
	}
	// ... presigned upload ...
}
```

---

## Appendix: Memory Profiling

To monitor memory usage during development:

```bash
# Enable Go memory profiling
go tool pprof http://localhost:5000/debug/pprof/heap
```

Or use runtime metrics:

```go
import "runtime"

var m runtime.MemStats
runtime.ReadMemStats(&m)
fmt.Printf("Alloc = %v MB", m.Alloc/1024/1024)
```

Monitor these metrics:
- `Alloc`: Current memory allocation
- `TotalAlloc`: Cumulative allocation (detect leaks)
- `Sys`: Total memory from OS
- `NumGC`: Garbage collection count

Expected behavior with presigned uploads:
- Memory spikes during `Write()` calls
- Memory drops after `Commit()` completes
- No memory leaks (`TotalAlloc` should plateau)

---

## Questions for Decision

Before implementing, answer:

1. **What's the typical size of your Docker layers?**
   - < 50MB: Presigned uploads are a perfect fit
   - 50-200MB: Acceptable with memory monitoring
   - > 200MB: Consider disk buffering or stick with proxy

2. **What's your AppView's available memory?**
   - 1GB: Skip presigned uploads
   - 2-4GB: Fine for typical workloads
   - 8GB+: No concerns

3. **Is Hold Service bandwidth currently a problem?**
   - No: Current proxy mode is fine
   - Yes: Presigned uploads will help significantly

4. **How important is code simplicity?**
   - Very: Stick with proxy mode
   - Moderate: Implement presigned uploads only
   - Low: Implement both presigned uploads and HEAD

5. **What's your deployment model?**
   - Single Hold Service: Bandwidth matters more
   - Multiple Hold Services: Less critical

---

## Implementation Checklist

### Presigned Uploads
- [ ] Modify `ProxyBlobWriter` struct (remove pipe, add buffer/hasher)
- [ ] Update `Create()` to initialize the buffer
- [ ] Update `Write()` to buffer + hash data
- [ ] Update `Commit()` to upload via presigned URL
- [ ] Update `Cancel()` to clear the buffer
- [ ] Add memory usage monitoring
- [ ] Add configuration flag
- [ ] Test with small blobs (< 1MB)
- [ ] Test with medium blobs (10-50MB)
- [ ] Test with large blobs (100-500MB)
- [ ] Test concurrent uploads
- [ ] Test error scenarios
- [ ] Update documentation
- [ ] Deploy to staging
- [ ] Monitor production rollout

### Presigned HEAD URLs (Optional)
- [ ] Add `getHeadURL()` to Hold Service
- [ ] Add `HandleHeadPresignedURL()` endpoint
- [ ] Register `/head-presigned-url` route
- [ ] Add `getHeadURL()` to ProxyBlobStore
- [ ] Update `ServeBlob()` to redirect HEAD requests
- [ ] Test HEAD redirects
- [ ] Verify method-specific signatures
- [ ] Test with Docker pull operations
- [ ] Deploy to staging
- [ ] Monitor production rollout
---

**File: `docs/PRESIGNED_URLS.md`**
··· 1 - # S3 Presigned URLs Implementation 2 - 3 - ## Overview 4 - 5 - Currently, ATCR's hold service acts as a proxy for all blob data, meaning every byte flows through the hold service when uploading or downloading container images. This document describes the implementation of **S3 presigned URLs** to eliminate this bottleneck, allowing direct data transfer between clients and S3-compatible storage. 6 - 7 - ### Current Architecture (Proxy Mode) 8 - 9 - ``` 10 - Downloads: Docker → AppView → Hold Service → S3 → Hold Service → AppView → Docker 11 - Uploads: Docker → AppView → Hold Service → S3 12 - ``` 13 - 14 - **Problems:** 15 - - All blob data flows through hold service 16 - - Hold service bandwidth = total image bandwidth 17 - - Latency from extra hops 18 - - Hold service becomes bottleneck for large images 19 - 20 - ### Target Architecture (Presigned URLs) 21 - 22 - ``` 23 - Downloads: Docker → AppView (gets presigned URL) → S3 (direct download) 24 - Uploads: Docker → AppView → S3 (via presigned URL) 25 - Move: AppView → Hold Service → S3 (server-side CopyObject API) 26 - ``` 27 - 28 - **Benefits:** 29 - - ✅ Hold service only orchestrates (no data transfer) 30 - - ✅ Blob data never touches hold service 31 - - ✅ Direct S3 uploads/downloads at wire speed 32 - - ✅ Hold service can run on minimal resources 33 - - ✅ Works with all S3-compatible services 34 - 35 - ## How Presigned URLs Work 36 - 37 - ### For Downloads (GET) 38 - 39 - 1. **Docker requests blob:** `GET /v2/alice/myapp/blobs/sha256:abc123` 40 - 2. **AppView asks hold service:** `POST /get-presigned-url` 41 - ```json 42 - {"did": "did:plc:alice123", "digest": "sha256:abc123"} 43 - ``` 44 - 3. 
**Hold service generates presigned URL:** 45 - ```go 46 - req, _ := s3Client.GetObjectRequest(&s3.GetObjectInput{ 47 - Bucket: "my-bucket", 48 - Key: "blobs/sha256/ab/abc123.../data", 49 - }) 50 - url, _ := req.Presign(15 * time.Minute) 51 - // Returns: https://gateway.storjshare.io/bucket/blobs/...?X-Amz-Signature=... 52 - ``` 53 - 4. **AppView redirects Docker:** `HTTP 307 Location: <presigned-url>` 54 - 5. **Docker downloads directly from S3** using the presigned URL 55 - 56 - **Data path:** Docker → S3 (direct) 57 - **Hold service bandwidth:** ~1KB (API request/response) 58 - 59 - ### For Uploads (PUT) 60 - 61 - **Small blobs (< 5MB) using Put():** 62 - 63 - 1. **Docker sends blob to AppView:** `PUT /v2/alice/myapp/blobs/uploads/{uuid}` 64 - 2. **AppView asks hold service:** `POST /put-presigned-url` 65 - ```json 66 - {"did": "did:plc:alice123", "digest": "sha256:abc123", "size": 1024} 67 - ``` 68 - 3. **Hold service generates presigned URL:** 69 - ```go 70 - req, _ := s3Client.PutObjectRequest(&s3.PutObjectInput{ 71 - Bucket: "my-bucket", 72 - Key: "blobs/sha256/ab/abc123.../data", 73 - }) 74 - url, _ := req.Presign(15 * time.Minute) 75 - ``` 76 - 4. **AppView uploads to S3** using presigned URL 77 - 5. **AppView confirms to Docker:** `201 Created` 78 - 79 - **Data path:** Docker → AppView → S3 (via presigned URL) 80 - **Hold service bandwidth:** ~1KB (API request/response) 81 - 82 - ### For Streaming Uploads (Create/Commit) 83 - 84 - **Large blobs (> 5MB) using streaming:** 85 - 86 - 1. **Docker starts upload:** `POST /v2/alice/myapp/blobs/uploads/` 87 - 2. **AppView creates upload session** with UUID 88 - 3. **AppView gets presigned URL for temp location:** 89 - ```json 90 - POST /put-presigned-url 91 - {"did": "...", "digest": "uploads/temp-{uuid}", "size": 0} 92 - ``` 93 - 4. **Docker streams data:** `PATCH /v2/alice/myapp/blobs/uploads/{uuid}` 94 - 5. **AppView streams to S3** using presigned URL to `uploads/temp-{uuid}/data` 95 - 6. 
**Docker finalizes:** `PUT /v2/.../uploads/{uuid}?digest=sha256:abc123` 96 - 7. **AppView requests move:** `POST /move?from=uploads/temp-{uuid}&to=sha256:abc123` 97 - 8. **Hold service executes S3 server-side copy:** 98 - ```go 99 - s3.CopyObject(&s3.CopyObjectInput{ 100 - Bucket: "my-bucket", 101 - CopySource: "/my-bucket/uploads/temp-{uuid}/data", 102 - Key: "blobs/sha256/ab/abc123.../data", 103 - }) 104 - s3.DeleteObject(&s3.DeleteObjectInput{ 105 - Key: "uploads/temp-{uuid}/data", 106 - }) 107 - ``` 108 - 109 - **Data path:** Docker → AppView → S3 (temp location) 110 - **Move path:** S3 internal copy (no data transfer!) 111 - **Hold service bandwidth:** ~2KB (presigned URL + CopyObject API) 112 - 113 - ### For Chunked Uploads (Multipart Upload) 114 - 115 - **Large blobs with OCI chunked protocol (Docker PATCH requests):** 116 - 117 - The OCI Distribution Spec uses chunked uploads via multiple PATCH requests. Single presigned URLs don't support this - we need **S3 Multipart Upload**. 118 - 119 - 1. **Docker starts upload:** `POST /v2/alice/myapp/blobs/uploads/` 120 - 2. **AppView initiates multipart:** 121 - ```json 122 - POST /start-multipart 123 - {"did": "...", "digest": "uploads/temp-{uuid}"} 124 - → Returns: {"upload_id": "xyz123"} 125 - ``` 126 - 3. **Docker sends chunk 1:** `PATCH /v2/.../uploads/{uuid}` (5MB data) 127 - 4. **AppView gets part URL:** 128 - ```json 129 - POST /part-presigned-url 130 - {"did": "...", "digest": "uploads/temp-{uuid}", "upload_id": "xyz123", "part_number": 1} 131 - → Returns: {"url": "https://s3.../part?uploadId=xyz123&partNumber=1&..."} 132 - ``` 133 - 5. **AppView uploads part 1** using presigned URL → Gets ETag 134 - 6. **Docker sends chunk 2:** `PATCH /v2/.../uploads/{uuid}` (5MB data) 135 - 7. **Repeat steps 4-5** for part 2 (and subsequent parts) 136 - 8. **Docker finalizes:** `PUT /v2/.../uploads/{uuid}?digest=sha256:abc123` 137 - 9. 
**AppView completes multipart:** 138 - ```json 139 - POST /complete-multipart 140 - {"did": "...", "digest": "uploads/temp-{uuid}", "upload_id": "xyz123", 141 - "parts": [{"part_number": 1, "etag": "..."}, {"part_number": 2, "etag": "..."}]} 142 - ``` 143 - 10. **AppView requests move:** `POST /move?from=uploads/temp-{uuid}&to=sha256:abc123` 144 - 11. **Hold service executes S3 server-side copy** (same as above) 145 - 146 - **Data path:** Docker → AppView (buffers 5MB) → S3 (via presigned URL per part) 147 - **Each PATCH:** Independent, non-blocking, immediate response 148 - **Hold service bandwidth:** ~1KB per part + ~1KB for completion 149 - 150 - **Why This Fixes "Client Disconnected" Errors:** 151 - - Previous implementation: Single presigned URL + pipe → PATCH blocks → Docker timeout 152 - - New implementation: Each PATCH → separate part upload → immediate response → no blocking 153 - 154 - ## Why the Temp → Final Move is Required 155 - 156 - This is **not an ATCR implementation detail** — it's required by the [OCI Distribution Specification](https://github.com/opencontainers/distribution-spec/blob/main/spec.md#push). 157 - 158 - ### The Problem: Unknown Digest 159 - 160 - Docker doesn't know the blob's digest until **after** uploading: 161 - 162 - 1. **Streaming data:** Can't buffer 5GB layer in memory to calculate digest first 163 - 2. **Stdin pipes:** `docker build . | docker push` generates data on-the-fly 164 - 3. **Chunked uploads:** Multiple PATCH requests, digest calculated as data streams 165 - 166 - ### The Solution: Upload to Temp, Verify, Move 167 - 168 - **All OCI registries do this:** 169 - 170 - 1. Client: `POST /v2/{name}/blobs/uploads/` → Get upload UUID 171 - 2. Client: `PATCH /v2/{name}/blobs/uploads/{uuid}` → Stream data to temp location 172 - 3. Client: `PUT /v2/{name}/blobs/uploads/{uuid}?digest=sha256:abc` → Provide digest 173 - 4. Registry: Verify digest matches uploaded data 174 - 5. 
Registry: Move `uploads/{uuid}` → `blobs/sha256/abc123...` 175 - 176 - **Docker Hub, GHCR, ECR, Harbor — all use this pattern.** 177 - 178 - ### Why It's Efficient with S3 179 - 180 - **For S3, the move is a CopyObject API call:** 181 - 182 - ```go 183 - // This happens INSIDE S3 servers - no data transfer! 184 - s3.CopyObject(&s3.CopyObjectInput{ 185 - Bucket: "my-bucket", 186 - CopySource: "/my-bucket/uploads/temp-12345/data", // 5GB blob 187 - Key: "blobs/sha256/ab/abc123.../data", 188 - }) 189 - // S3 copies internally, hold service only sends ~1KB API request 190 - ``` 191 - 192 - **For a 5GB layer:** 193 - - Hold service bandwidth: **~1KB** (API request/response) 194 - - S3 internal copy: Instant (metadata operation on S3 side) 195 - - No data leaves S3, no network transfer 196 - 197 - This is why the move operation is essentially free! 198 - 199 - ## Implementation Details 200 - 201 - ### 1. Add S3 Client to Hold Service 202 - 203 - **File: `cmd/hold/main.go`** 204 - 205 - Modify `HoldService` struct: 206 - ```go 207 - type HoldService struct { 208 - driver storagedriver.StorageDriver 209 - config *Config 210 - s3Client *s3.S3 // NEW: S3 client for presigned URLs 211 - bucket string // NEW: Bucket name 212 - s3PathPrefix string // NEW: Path prefix (if any) 213 - } 214 - ``` 215 - 216 - Add initialization function: 217 - ```go 218 - func (s *HoldService) initS3Client() error { 219 - if s.config.Storage.Type() != "s3" { 220 - log.Printf("Storage driver is %s (not S3), presigned URLs disabled", s.config.Storage.Type()) 221 - return nil 222 - } 223 - 224 - params := s.config.Storage.Parameters()["s3"].(configuration.Parameters) 225 - 226 - // Build AWS config 227 - awsConfig := &aws.Config{ 228 - Region: aws.String(params["region"].(string)), 229 - Credentials: credentials.NewStaticCredentials( 230 - params["accesskey"].(string), 231 - params["secretkey"].(string), 232 - "", 233 - ), 234 - } 235 - 236 - // Add custom endpoint for S3-compatible services (Storj, 
MinIO, etc.) 237 - if endpoint, ok := params["regionendpoint"].(string); ok && endpoint != "" { 238 - awsConfig.Endpoint = aws.String(endpoint) 239 - awsConfig.S3ForcePathStyle = aws.Bool(true) // Required for MinIO, Storj 240 - } 241 - 242 - sess, err := session.NewSession(awsConfig) 243 - if err != nil { 244 - return fmt.Errorf("failed to create AWS session: %w", err) 245 - } 246 - 247 - s.s3Client = s3.New(sess) 248 - s.bucket = params["bucket"].(string) 249 - 250 - log.Printf("S3 presigned URLs enabled for bucket: %s", s.bucket) 251 - return nil 252 - } 253 - ``` 254 - 255 - Call during service initialization: 256 - ```go 257 - func NewHoldService(cfg *Config) (*HoldService, error) { 258 - // ... existing driver creation ... 259 - 260 - service := &HoldService{ 261 - driver: driver, 262 - config: cfg, 263 - } 264 - 265 - // Initialize S3 client for presigned URLs 266 - if err := service.initS3Client(); err != nil { 267 - log.Printf("WARNING: S3 presigned URLs disabled: %v", err) 268 - } 269 - 270 - return service, nil 271 - } 272 - ``` 273 - 274 - ### 2. 
Implement Presigned URL Generation 275 - 276 - **For Downloads:** 277 - 278 - ```go 279 - func (s *HoldService) getDownloadURL(ctx context.Context, digest string, did string) (string, error) { 280 - path := blobPath(digest) 281 - 282 - // Check if blob exists 283 - if _, err := s.driver.Stat(ctx, path); err != nil { 284 - return "", fmt.Errorf("blob not found: %w", err) 285 - } 286 - 287 - // If S3 client available, generate presigned URL 288 - if s.s3Client != nil { 289 - s3Key := strings.TrimPrefix(path, "/") 290 - 291 - req, _ := s.s3Client.GetObjectRequest(&s3.GetObjectInput{ 292 - Bucket: aws.String(s.bucket), 293 - Key: aws.String(s3Key), 294 - }) 295 - 296 - url, err := req.Presign(15 * time.Minute) 297 - if err != nil { 298 - log.Printf("WARN: Presigned URL generation failed, falling back to proxy: %v", err) 299 - return s.getProxyDownloadURL(digest, did), nil 300 - } 301 - 302 - log.Printf("Generated presigned download URL for %s (expires in 15min)", digest) 303 - return url, nil 304 - } 305 - 306 - // Fallback: return proxy URL 307 - return s.getProxyDownloadURL(digest, did), nil 308 - } 309 - 310 - func (s *HoldService) getProxyDownloadURL(digest, did string) string { 311 - return fmt.Sprintf("%s/blobs/%s?did=%s", s.config.Server.PublicURL, digest, did) 312 - } 313 - ``` 314 - 315 - **For Uploads:** 316 - 317 - ```go 318 - func (s *HoldService) getUploadURL(ctx context.Context, digest string, size int64, did string) (string, error) { 319 - path := blobPath(digest) 320 - 321 - // If S3 client available, generate presigned URL 322 - if s.s3Client != nil { 323 - s3Key := strings.TrimPrefix(path, "/") 324 - 325 - req, _ := s.s3Client.PutObjectRequest(&s3.PutObjectInput{ 326 - Bucket: aws.String(s.bucket), 327 - Key: aws.String(s3Key), 328 - }) 329 - 330 - url, err := req.Presign(15 * time.Minute) 331 - if err != nil { 332 - log.Printf("WARN: Presigned URL generation failed, falling back to proxy: %v", err) 333 - return s.getProxyUploadURL(digest, did), nil 
334 - } 335 - 336 - log.Printf("Generated presigned upload URL for %s (expires in 15min)", digest) 337 - return url, nil 338 - } 339 - 340 - // Fallback: return proxy URL 341 - return s.getProxyUploadURL(digest, did), nil 342 - } 343 - 344 - func (s *HoldService) getProxyUploadURL(digest, did string) string { 345 - return fmt.Sprintf("%s/blobs/%s?did=%s", s.config.Server.PublicURL, digest, did) 346 - } 347 - ``` 348 - 349 - ### 3. Multipart Upload Endpoints (Required for Chunked Uploads) 350 - 351 - **File: `cmd/hold/main.go`** 352 - 353 - #### Start Multipart Upload 354 - 355 - ```go 356 - func (s *HoldService) HandleStartMultipart(w http.ResponseWriter, r *http.Request) { 357 - var req StartMultipartUploadRequest // {did, digest} 358 - 359 - // Validate DID authorization for WRITE 360 - if !s.isAuthorizedWrite(req.DID) { 361 - // Return 403 Forbidden 362 - } 363 - 364 - // Initiate S3 multipart upload 365 - result, err := s.s3Client.CreateMultipartUploadWithContext(ctx, &s3.CreateMultipartUploadInput{ 366 - Bucket: aws.String(s.bucket), 367 - Key: aws.String(s3Key), 368 - }) 369 - 370 - // Return upload ID 371 - json.NewEncoder(w).Encode(StartMultipartUploadResponse{ 372 - UploadID: *result.UploadId, 373 - ExpiresAt: time.Now().Add(24 * time.Hour), 374 - }) 375 - } 376 - ``` 377 - 378 - **Route:** `POST /start-multipart` 379 - 380 - #### Get Part Presigned URL 381 - 382 - ```go 383 - func (s *HoldService) HandleGetPartURL(w http.ResponseWriter, r *http.Request) { 384 - var req GetPartURLRequest // {did, digest, upload_id, part_number} 385 - 386 - // Generate presigned URL for specific part 387 - req, _ := s.s3Client.UploadPartRequest(&s3.UploadPartInput{ 388 - Bucket: aws.String(s.bucket), 389 - Key: aws.String(s3Key), 390 - UploadId: aws.String(uploadID), 391 - PartNumber: aws.Int64(int64(partNumber)), 392 - }) 393 - 394 - url, err := req.Presign(15 * time.Minute) 395 - 396 - json.NewEncoder(w).Encode(GetPartURLResponse{URL: url}) 397 - } 398 - ``` 399 - 400 - 
**Route:** `POST /part-presigned-url`

#### Complete Multipart Upload

```go
func (s *HoldService) HandleCompleteMultipart(w http.ResponseWriter, r *http.Request) {
    var req CompleteMultipartRequest // {did, digest, upload_id, parts: [{part_number, etag}]}

    // Convert parts to S3 format
    s3Parts := make([]*s3.CompletedPart, len(req.Parts))
    for i, p := range req.Parts {
        s3Parts[i] = &s3.CompletedPart{
            PartNumber: aws.Int64(int64(p.PartNumber)),
            ETag:       aws.String(p.ETag),
        }
    }

    // Complete multipart upload
    _, err := s.s3Client.CompleteMultipartUploadWithContext(ctx, &s3.CompleteMultipartUploadInput{
        Bucket:          aws.String(s.bucket),
        Key:             aws.String(s3Key),
        UploadId:        aws.String(uploadID),
        MultipartUpload: &s3.CompletedMultipartUpload{Parts: s3Parts},
    })
}
```

**Route:** `POST /complete-multipart`

#### Abort Multipart Upload

```go
func (s *HoldService) HandleAbortMultipart(w http.ResponseWriter, r *http.Request) {
    var req AbortMultipartRequest // {did, digest, upload_id}

    // Abort and cleanup parts
    _, err := s.s3Client.AbortMultipartUploadWithContext(ctx, &s3.AbortMultipartUploadInput{
        Bucket:   aws.String(s.bucket),
        Key:      aws.String(s3Key),
        UploadId: aws.String(uploadID),
    })
}
```

**Route:** `POST /abort-multipart`

### 4. Move Operation (No Changes)

The existing `/move` endpoint already uses `driver.Move()`, which for S3:
- Calls `s3.CopyObject()` (server-side copy)
- Calls `s3.DeleteObject()` (delete source)
- No data transfer through the hold service!

**File: `cmd/hold/main.go:393` (already exists, no changes needed)**

```go
func (s *HoldService) HandleMove(w http.ResponseWriter, r *http.Request) {
    // ... existing auth and parsing ...
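    // For illustration only, the elided parsing above might decode a JSON
    // body of the shape {did, from_path, to_digest}; the type and field
    // names below are hypothetical, not the actual hold service code:
    //
    //     var req MoveRequest
    //     if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
    //         http.Error(w, "invalid request body", http.StatusBadRequest)
    //         return
    //     }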
    sourcePath := blobPath(fromPath) // uploads/temp-{uuid}/data
    destPath := blobPath(toDigest)   // blobs/sha256/ab/abc123.../data

    // For S3, this does CopyObject + DeleteObject (server-side)
    if err := s.driver.Move(ctx, sourcePath, destPath); err != nil {
        // ... error handling ...
    }
}
```

### 5. AppView Changes (Multipart Upload Implementation)

**File: `pkg/appview/storage/proxy_blob_store.go:228`**

Currently streams to the hold service proxy URL. Could be optimized to use a presigned URL:

```go
// In Create() - line 228
go func() {
    defer pipeReader.Close()

    tempPath := fmt.Sprintf("uploads/temp-%s", writer.id)

    // Try to get presigned URL for temp location
    url, err := p.getUploadURL(ctx, digest.FromString(tempPath), 0)
    if err != nil {
        // Fallback to direct proxy URL
        url = fmt.Sprintf("%s/blobs/%s?did=%s", p.storageEndpoint, tempPath, p.did)
    }

    req, err := http.NewRequestWithContext(uploadCtx, "PUT", url, pipeReader)
    // ... rest unchanged
}()
```

**Note:** This optimization is optional. The presigned URL will be returned by the hold service's `getUploadURL()` anyway.

## S3-Compatible Service Support

### Storj

```bash
# .env file
STORAGE_DRIVER=s3
AWS_ACCESS_KEY_ID=your-storj-access-key
AWS_SECRET_ACCESS_KEY=your-storj-secret-key
S3_BUCKET=your-bucket-name
S3_REGION=global
S3_ENDPOINT=https://gateway.storjshare.io
```

**Presigned URL example:**
```
https://gateway.storjshare.io/your-bucket/blobs/sha256/ab/abc123.../data?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=...&X-Amz-Signature=...
```

### MinIO

```bash
STORAGE_DRIVER=s3
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
S3_BUCKET=registry
S3_REGION=us-east-1
S3_ENDPOINT=http://minio.example.com:9000
```

### Backblaze B2

```bash
STORAGE_DRIVER=s3
AWS_ACCESS_KEY_ID=your-b2-key-id
AWS_SECRET_ACCESS_KEY=your-b2-application-key
S3_BUCKET=your-bucket-name
S3_REGION=us-west-002
S3_ENDPOINT=https://s3.us-west-002.backblazeb2.com
```

### Cloudflare R2

```bash
STORAGE_DRIVER=s3
AWS_ACCESS_KEY_ID=your-r2-access-key-id
AWS_SECRET_ACCESS_KEY=your-r2-secret-access-key
S3_BUCKET=your-bucket-name
S3_REGION=auto
S3_ENDPOINT=https://<account-id>.r2.cloudflarestorage.com
```

**All these services support presigned URLs with AWS SDK v1!**

## Performance Impact

### Bandwidth Savings

**Before (proxy mode):**
- 5GB layer upload: Hold service receives 5GB, sends 5GB to S3 = **10GB** bandwidth
- 5GB layer download: S3 sends 5GB to hold, hold sends 5GB to client = **10GB** bandwidth
- **Total for push+pull: 20GB hold service bandwidth**

**After (presigned URLs):**
- 5GB layer upload: Hold generates URL (1KB), AppView → S3 direct (5GB), CopyObject API (1KB) = **~2KB** hold bandwidth
- 5GB layer download: Hold generates URL (1KB), client → S3 direct = **~1KB** hold bandwidth
- **Total for push+pull: ~3KB hold service bandwidth**

**Savings: 99.98% reduction in hold service bandwidth!**

### Latency Improvements

**Before:**
- Download: Client → AppView → Hold → S3 → Hold → AppView → Client (4 hops)
- Upload: Client → AppView → Hold → S3 (3 hops)

**After:**
- Download: Client → AppView (redirect) → S3 (1 hop to data)
- Upload: Client → AppView → S3 (2 hops)
- Move: S3 internal (no network hops)

### Resource Requirements

**Before:**
- Hold service bandwidth scales with the sum of all image operations
- 100 concurrent 1GB pushes means moving ~200GB through the hold service (1GB in from each client, 1GB back out to S3)
- Expensive, hard to scale

**After:**
- Hold service needs minimal CPU for presigned URL signing
- 100 concurrent 1GB pushes need only a few hundred KB of API traffic through the hold service
- Can run on a $5/month instance!

## Security Considerations

### Presigned URL Expiration

- Default: **15 minutes** expiration
- Presigned URL includes embedded credentials in query params
- After expiry, URL becomes invalid (S3 rejects with 403)
- No long-lived URLs floating around

### Authorization Flow

1. **AppView validates user** via ATProto OAuth
2. **AppView passes DID to hold service** in presigned URL request
3. **Hold service validates DID** (owner or crew member)
4. **Hold service generates presigned URL** if authorized
5. **Client uses presigned URL** directly with S3

**Security boundary:** Hold service controls who gets presigned URLs; S3 validates the URLs.

### Fallback Security

If presigned URL generation fails:
- Falls back to proxy URLs (existing behavior)
- Still requires hold service authorization
- Data flows through hold service (original security model)

## Testing & Validation

### Verify Presigned URLs are Used

**1. Check hold service logs:**
```bash
docker logs atcr-hold | grep -i presigned
# Should see: "Generated presigned download/upload URL for sha256:..."
```

**2. Monitor network traffic:**
```bash
# Before: Large data transfers to/from hold service
docker stats atcr-hold

# After: Minimal network usage on hold service
docker stats atcr-hold
```

**3. Inspect redirect responses:**
```bash
# Should see 307 redirect to S3 URL
curl -v http://appview:5000/v2/alice/myapp/blobs/sha256:abc123 \
  -H "Authorization: Bearer $TOKEN"

# Look for:
# < HTTP/1.1 307 Temporary Redirect
# < Location: https://gateway.storjshare.io/...?X-Amz-Signature=...
```

### Test Fallback Behavior

**1. With filesystem driver (should use proxy URLs):**
```bash
STORAGE_DRIVER=filesystem docker-compose up atcr-hold
# Logs should show: "Storage driver is filesystem (not S3), presigned URLs disabled"
```

**2. With S3 but invalid credentials (should fall back):**
```bash
AWS_ACCESS_KEY_ID=invalid docker-compose up atcr-hold
# Logs should show: "WARN: Presigned URL generation failed, falling back to proxy"
```

### Bandwidth Monitoring

**Track hold service bandwidth over time:**
```bash
# Install bandwidth monitoring
docker exec atcr-hold apt-get update && apt-get install -y vnstat

# Monitor
docker exec atcr-hold vnstat -l
```

**Expected results:**
- Before: Bandwidth correlates with image operations
- After: Bandwidth stays minimal regardless of image operations

## Migration Guide

### For Existing ATCR Deployments

**1. Update hold service code** (this implementation)

**2. No configuration changes needed** if already using S3:
```bash
# Existing S3 config works automatically
STORAGE_DRIVER=s3
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
S3_BUCKET=...
S3_ENDPOINT=...
```

**3. Restart hold service:**
```bash
docker-compose restart atcr-hold
```

**4. Verify in logs:**
```
S3 presigned URLs enabled for bucket: my-bucket
```

**5. Test with image push/pull:**
```bash
docker push atcr.io/alice/myapp:latest
docker pull atcr.io/alice/myapp:latest
```

**6. Monitor bandwidth** to confirm reduction

### Rollback Plan

If issues arise:

**Option 1: Force proxy mode via env var**
```bash
DISABLE_PRESIGNED_URLS=true docker-compose restart atcr-hold
```

**Option 2: Revert code changes** to previous hold service version

The implementation has automatic fallbacks, so partial failures won't break functionality.

## Testing with DISABLE_PRESIGNED_URLS

### Environment Variable

Set `DISABLE_PRESIGNED_URLS=true` to force proxy/buffered mode even when S3 is configured.
**Use cases:**
- Testing proxy/buffered code paths with S3 storage
- Debugging multipart uploads in buffered mode
- Simulating S3 providers that don't support presigned URLs
- Verifying fallback behavior works correctly

### How It Works

When `DISABLE_PRESIGNED_URLS=true`:

**Single blob operations:**
- `getDownloadURL()` returns proxy URL instead of S3 presigned URL
- `getHeadURL()` returns proxy URL instead of S3 presigned HEAD URL
- `getUploadURL()` returns proxy URL instead of S3 presigned PUT URL
- Client uses `/blobs/{digest}` endpoints (proxy through hold service)

**Multipart uploads:**
- `StartMultipartUploadWithManager()` creates **Buffered** session instead of **S3Native**
- `GetPartUploadURL()` returns `/multipart-parts/{uploadID}/{partNumber}` instead of S3 presigned URL
- Parts are buffered in memory in the hold service
- `CompleteMultipartUploadWithManager()` assembles parts and writes via storage driver

### Testing Example

```bash
# Test S3 with forced proxy mode
export STORAGE_DRIVER=s3
export S3_BUCKET=my-bucket
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export DISABLE_PRESIGNED_URLS=true  # Force buffered/proxy mode

./bin/atcr-hold

# Push an image - should use proxy mode
docker push atcr.io/yourdid/test:latest

# Check logs for:
# "Presigned URLs disabled, using proxy URL"
# "Presigned URLs disabled (DISABLE_PRESIGNED_URLS=true), using buffered mode"
# "Stored part: uploadID=... part=1 size=..."
```

## Future Enhancements

### 1. Configurable Expiration

Allow customizing presigned URL expiry:
```bash
PRESIGNED_URL_EXPIRY=30m  # Default: 15m
```

### 2. Presigned URL Caching

Cache presigned URLs for frequently accessed blobs (with shorter TTL).

### 3. CloudFront/CDN Integration

For downloads, use CloudFront presigned URLs instead of direct S3:
- Better global distribution
- Lower egress costs
- Faster downloads

### 4. Multipart Upload Support

For very large layers (>5GB), use presigned URLs with multipart upload:
- Generate presigned URLs for each part
- Client uploads parts directly to S3
- Hold service finalizes multipart upload

### 5. Metrics & Monitoring

Track presigned URL usage:
- Count of presigned URLs generated
- Fallback rate (proxy vs presigned)
- Bandwidth savings metrics

## References

- [OCI Distribution Specification - Push](https://github.com/opencontainers/distribution-spec/blob/main/spec.md#push)
- [AWS SDK Go v1 - Presigned URLs](https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/s3-example-presigned-urls.html)
- [Storj - Using Presigned URLs](https://docs.storj.io/dcs/api-reference/s3-compatible-gateway/using-presigned-urls)
- [MinIO - Presigned Upload via Browser](https://docs.min.io/community/minio-object-store/integrations/presigned-put-upload-via-browser.html)
- [Cloudflare R2 - Presigned URLs](https://developers.cloudflare.com/r2/api/s3/presigned-urls/)
- [Backblaze B2 - S3 Compatible API](https://help.backblaze.com/hc/en-us/articles/360047815993-Does-the-B2-S3-Compatible-API-support-Pre-Signed-URLs)

## Summary

Implementing S3 presigned URLs transforms ATCR's hold service from a **data proxy** to a **lightweight orchestrator**:

✅ **99.98% bandwidth reduction** for hold service
✅ **Direct client → S3 transfers** for maximum speed
✅ **Works with all S3-compatible services** (Storj, MinIO, R2, B2)
✅ **OCI-compliant** temp → final move pattern
✅ **Automatic fallback** to proxy mode for non-S3 drivers
✅ **No breaking changes** to existing deployments

This makes BYOS (Bring Your Own Storage) truly scalable and cost-effective, as users can run hold services on minimal infrastructure while serving arbitrarily large container images.
+1 -4
pkg/appview/static/install.ps1
··· 126 126 Write-Host ""
127 127 Write-Host "Installation complete!" -ForegroundColor Green
128 128 Write-Host ""
129 - Write-Host "To use ATCR with Docker, configure your credentials:" -ForegroundColor Yellow
130 - Write-Host " docker-credential-atcr configure"
131 - Write-Host ""
132 - Write-Host "Then configure Docker to use this credential helper:" -ForegroundColor Yellow
129 + Write-Host "To use ATCR with Docker, configure Docker to use this credential helper:" -ForegroundColor Yellow
133 130 Write-Host ' Edit %USERPROFILE%\.docker\config.json and add:'
134 131 Write-Host ' {
135 132 "credHelpers": {
+1 -4
pkg/appview/static/install.sh
··· 101 101 echo ""
102 102 echo -e "${GREEN}Installation complete!${NC}"
103 103 echo ""
104 - echo -e "${YELLOW}To use ATCR with Docker, configure your credentials:${NC}"
105 - echo -e " ${BINARY_NAME} configure"
106 - echo ""
107 - echo -e "${YELLOW}Then configure Docker to use this credential helper:${NC}"
104 + echo -e "${YELLOW}To use ATCR with Docker, configure Docker to use this credential helper:${NC}"
108 105 echo -e ' echo '\''{"credHelpers": {"atcr.io": "atcr"}}'\'' > ~/.docker/config.json'
109 106 echo ""
110 107 echo -e "${YELLOW}Or add to existing config.json:${NC}"