# Blob Support Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Add blob (image/video) upload, storage, and retrieval to the PDS using Cloudflare R2.
Architecture: Blobs stored in R2 bucket keyed by {did}/{cid}. Metadata tracked in SQLite tables (blob, record_blob) within each Durable Object. Orphan cleanup via DO alarm. MIME sniffing for security.
Tech Stack: Cloudflare R2, Durable Object SQLite, Web Crypto API (SHA-256 for CID generation)
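For reference, an atproto blob CID is a CIDv1 over the raw codec (0x55) with a SHA-256 multihash, rendered as base32lower with a "b" multibase prefix. The plan reuses the repo's existing createCid/cidToString helpers for this; the sketch below is a standalone illustration of what those helpers are assumed to produce for raw blob bytes, not the repo's code:
// Illustrative sketch only — the plan uses the existing createCid/cidToString helpers.
// CIDv1 = 0x01, raw codec = 0x55, sha2-256 multihash prefix = 0x12 0x20 (32-byte digest),
// then base32lower with the "b" multibase prefix.
const BASE32 = 'abcdefghijklmnopqrstuvwxyz234567';
function base32Encode(bytes) {
  let bits = 0;
  let value = 0;
  let out = '';
  for (const byte of bytes) {
    value = (value << 8) | byte;
    bits += 8;
    while (bits >= 5) {
      out += BASE32[(value >>> (bits - 5)) & 31];
      bits -= 5;
    }
  }
  if (bits > 0) out += BASE32[(value << (5 - bits)) & 31];
  return out;
}
async function blobCidSketch(bytes) {
  const digest = new Uint8Array(await crypto.subtle.digest('SHA-256', bytes));
  const cidBytes = new Uint8Array([0x01, 0x55, 0x12, 0x20, ...digest]);
  return 'b' + base32Encode(cidBytes);
}
A 32-byte SHA-256 digest yields the familiar 59-character "bafkrei…" form seen in the examples later in this plan.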
## Task 1: Add R2 Bucket Binding
Files:
- Modify: `wrangler.toml`
Step 1: Add R2 binding to wrangler.toml
Add after the existing migrations section:
[[r2_buckets]]
binding = "BLOBS"
bucket_name = "pds-blobs"
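With this binding, the bucket is exposed on the environment object (env.BLOBS in the Worker, this.env.BLOBS inside the Durable Object, as the later tasks assume). A quick sketch of the three R2 calls the plan relies on, using a placeholder key:
// Sketch of the R2 operations used later in this plan (key format: `${did}/${cid}`).
await env.BLOBS.put('did:plc:example/bafkreiexample', bytes, {
  httpMetadata: { contentType: 'image/png' }, // served back as Content-Type
});
const obj = await env.BLOBS.get('did:plc:example/bafkreiexample'); // null if missing
await env.BLOBS.delete('did:plc:example/bafkreiexample');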
Step 2: Create the R2 bucket (if it does not already exist)
Run: npx wrangler r2 bucket create pds-blobs
Step 3: Commit
git add wrangler.toml
git commit -m "feat: add R2 bucket binding for blob storage"
## Task 2: Add Blob Database Schema
Files:
- Modify: `src/pds.js:1162-1190` (constructor schema initialization)
Step 1: Add blob and record_blob tables
In the PersonalDataServer constructor, after the existing CREATE TABLE statements (around line 1186), add:
CREATE TABLE IF NOT EXISTS blob (
cid TEXT PRIMARY KEY,
mimeType TEXT NOT NULL,
size INTEGER NOT NULL,
createdAt TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS record_blob (
blobCid TEXT NOT NULL,
recordUri TEXT NOT NULL,
PRIMARY KEY (blobCid, recordUri)
);
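These statements run from the constructor the same way the existing schema does; a minimal sketch, assuming the class exposes the Durable Object's SQLite storage as this.sql (the handle the later tasks use):
// Sketch — assumes this.sql is the Durable Object's ctx.storage.sql handle.
this.sql.exec(`CREATE TABLE IF NOT EXISTS blob (
  cid TEXT PRIMARY KEY,
  mimeType TEXT NOT NULL,
  size INTEGER NOT NULL,
  createdAt TEXT NOT NULL
)`);
this.sql.exec(`CREATE TABLE IF NOT EXISTS record_blob (
  blobCid TEXT NOT NULL,
  recordUri TEXT NOT NULL,
  PRIMARY KEY (blobCid, recordUri)
)`);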
Step 2: Verify schema creation
Deploy; the CREATE TABLE IF NOT EXISTS statements run the next time the Durable Object starts, and any schema error will surface on the first request:
npx wrangler deploy
Step 3: Commit
git add src/pds.js
git commit -m "feat: add blob and record_blob tables to schema"
## Task 3: Implement MIME Type Sniffing
Files:
- Modify: `src/pds.js` (add after error helper, around line 30)
- Test: `test/pds.test.js`
Step 1: Write the failing test
Add to test/pds.test.js:
import {
// ... existing imports ...
sniffMimeType,
} from '../src/pds.js';
describe('MIME Type Sniffing', () => {
test('detects JPEG', () => {
const bytes = new Uint8Array([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10]);
assert.strictEqual(sniffMimeType(bytes), 'image/jpeg');
});
test('detects PNG', () => {
const bytes = new Uint8Array([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]);
assert.strictEqual(sniffMimeType(bytes), 'image/png');
});
test('detects GIF', () => {
const bytes = new Uint8Array([0x47, 0x49, 0x46, 0x38, 0x39, 0x61]);
assert.strictEqual(sniffMimeType(bytes), 'image/gif');
});
test('detects WebP', () => {
const bytes = new Uint8Array([
0x52, 0x49, 0x46, 0x46, // RIFF
0x00, 0x00, 0x00, 0x00, // size (ignored)
0x57, 0x45, 0x42, 0x50, // WEBP
]);
assert.strictEqual(sniffMimeType(bytes), 'image/webp');
});
test('detects MP4', () => {
const bytes = new Uint8Array([
0x00, 0x00, 0x00, 0x18, // size
0x66, 0x74, 0x79, 0x70, // ftyp
]);
assert.strictEqual(sniffMimeType(bytes), 'video/mp4');
});
test('returns null for unknown', () => {
const bytes = new Uint8Array([0x00, 0x01, 0x02, 0x03]);
assert.strictEqual(sniffMimeType(bytes), null);
});
});
Step 2: Run test to verify it fails
Run: npm test
Expected: FAIL with an import error, because sniffMimeType is not yet exported from src/pds.js
Step 3: Write minimal implementation
Add to src/pds.js after the error helper (around line 30):
// === MIME TYPE SNIFFING ===
// Detect file type from magic bytes (first 12 bytes)
/**
* Sniff MIME type from file magic bytes
* @param {Uint8Array|ArrayBuffer} bytes - File bytes (only first 12 needed)
* @returns {string|null} Detected MIME type or null if unknown
*/
export function sniffMimeType(bytes) {
const arr = new Uint8Array(bytes.slice(0, 12));
// JPEG: FF D8 FF
if (arr[0] === 0xff && arr[1] === 0xd8 && arr[2] === 0xff) {
return 'image/jpeg';
}
// PNG: 89 50 4E 47 0D 0A 1A 0A
if (
arr[0] === 0x89 &&
arr[1] === 0x50 &&
arr[2] === 0x4e &&
arr[3] === 0x47 &&
arr[4] === 0x0d &&
arr[5] === 0x0a &&
arr[6] === 0x1a &&
arr[7] === 0x0a
) {
return 'image/png';
}
// GIF: 47 49 46 38 (GIF8)
if (
arr[0] === 0x47 &&
arr[1] === 0x49 &&
arr[2] === 0x46 &&
arr[3] === 0x38
) {
return 'image/gif';
}
// WebP: RIFF....WEBP
if (
arr[0] === 0x52 &&
arr[1] === 0x49 &&
arr[2] === 0x46 &&
arr[3] === 0x46 &&
arr[8] === 0x57 &&
arr[9] === 0x45 &&
arr[10] === 0x42 &&
arr[11] === 0x50
) {
return 'image/webp';
}
// MP4/MOV: ....ftyp at byte 4
if (
arr[4] === 0x66 &&
arr[5] === 0x74 &&
arr[6] === 0x79 &&
arr[7] === 0x70
) {
return 'video/mp4';
}
return null;
}
Step 4: Run test to verify it passes
Run: npm test
Expected: PASS
Step 5: Commit
git add src/pds.js test/pds.test.js
git commit -m "feat: add MIME type sniffing from magic bytes"
## Task 4: Implement Blob Ref Detection
Files:
- Modify: `src/pds.js` (add after sniffMimeType)
- Test: `test/pds.test.js`
Step 1: Write the failing test
Add to test/pds.test.js:
import {
// ... existing imports ...
findBlobRefs,
} from '../src/pds.js';
describe('Blob Ref Detection', () => {
test('finds blob ref in simple object', () => {
const record = {
$type: 'app.bsky.feed.post',
text: 'Hello',
embed: {
$type: 'app.bsky.embed.images',
images: [
{
image: {
$type: 'blob',
ref: { $link: 'bafkreiabc123' },
mimeType: 'image/jpeg',
size: 1234,
},
alt: 'test image',
},
],
},
};
const refs = findBlobRefs(record);
assert.deepStrictEqual(refs, ['bafkreiabc123']);
});
test('finds multiple blob refs', () => {
const record = {
images: [
{ image: { $type: 'blob', ref: { $link: 'cid1' }, mimeType: 'image/png', size: 100 } },
{ image: { $type: 'blob', ref: { $link: 'cid2' }, mimeType: 'image/png', size: 200 } },
],
};
const refs = findBlobRefs(record);
assert.deepStrictEqual(refs, ['cid1', 'cid2']);
});
test('returns empty array when no blobs', () => {
const record = { text: 'Hello world', count: 42 };
const refs = findBlobRefs(record);
assert.deepStrictEqual(refs, []);
});
test('handles null and primitives', () => {
assert.deepStrictEqual(findBlobRefs(null), []);
assert.deepStrictEqual(findBlobRefs('string'), []);
assert.deepStrictEqual(findBlobRefs(42), []);
});
});
Step 2: Run test to verify it fails
Run: npm test
Expected: FAIL with an import error, because findBlobRefs is not yet exported from src/pds.js
Step 3: Write minimal implementation
Add to src/pds.js after sniffMimeType:
// === BLOB REF DETECTION ===
// Recursively find blob references in records
/**
* Find all blob CID references in a record
* @param {*} obj - Record value to scan
* @param {string[]} refs - Accumulator array (internal)
* @returns {string[]} Array of blob CID strings
*/
export function findBlobRefs(obj, refs = []) {
if (!obj || typeof obj !== 'object') {
return refs;
}
// Check if this object is a blob ref
if (obj.$type === 'blob' && obj.ref?.$link) {
refs.push(obj.ref.$link);
}
// Recurse into arrays and objects
if (Array.isArray(obj)) {
for (const item of obj) {
findBlobRefs(item, refs);
}
} else {
for (const value of Object.values(obj)) {
findBlobRefs(value, refs);
}
}
return refs;
}
Step 4: Run test to verify it passes
Run: npm test
Expected: PASS
Step 5: Commit
git add src/pds.js test/pds.test.js
git commit -m "feat: add blob ref detection for records"
## Task 5: Implement uploadBlob Endpoint
Files:
- Modify: `src/pds.js` (add route and handler)
Step 1: Add route to pdsRoutes
In pdsRoutes object (around line 1055), add:
'/xrpc/com.atproto.repo.uploadBlob': {
method: 'POST',
handler: (pds, req, _url) => pds.handleUploadBlob(req),
},
Step 2: Add handler method to PersonalDataServer class
Add method to the class (after existing handlers):
async handleUploadBlob(request) {
// Require auth
const authResult = await this.requireAuth(request);
if (authResult instanceof Response) return authResult;
const did = await this.getDid();
if (!did) {
return errorResponse('InvalidRequest', 'PDS not initialized', 400);
}
// Read body as ArrayBuffer
const bodyBytes = await request.arrayBuffer();
const size = bodyBytes.byteLength;
// Check size limit (50MB)
const MAX_BLOB_SIZE = 50 * 1024 * 1024;
if (size > MAX_BLOB_SIZE) {
return errorResponse(
'BlobTooLarge',
`Blob size ${size} exceeds maximum ${MAX_BLOB_SIZE}`,
400,
);
}
// Sniff MIME type, fall back to Content-Type header
const contentType = request.headers.get('Content-Type') || 'application/octet-stream';
const sniffed = sniffMimeType(bodyBytes);
const mimeType = sniffed || contentType;
// Compute CID (reuse existing createCid)
const cid = await createCid(new Uint8Array(bodyBytes));
const cidStr = cidToString(cid);
// Check if blob already exists
const existing = this.sql
.exec('SELECT cid FROM blob WHERE cid = ?', cidStr)
.toArray();
if (existing.length === 0) {
// Upload to R2
const r2Key = `${did}/${cidStr}`;
await this.env.BLOBS.put(r2Key, bodyBytes, {
httpMetadata: { contentType: mimeType },
});
// Insert metadata
const createdAt = new Date().toISOString();
this.sql.exec(
'INSERT INTO blob (cid, mimeType, size, createdAt) VALUES (?, ?, ?, ?)',
cidStr,
mimeType,
size,
createdAt,
);
}
// Return BlobRef
return Response.json({
blob: {
$type: 'blob',
ref: { $link: cidStr },
mimeType,
size,
},
});
}
Step 3: Verify deployment
Run: npx wrangler deploy
Step 4: Test manually with curl
curl -X POST \
-H "Authorization: Bearer <access-token>" \
-H "Content-Type: image/png" \
--data-binary @test-image.png \
https://your-pds.workers.dev/xrpc/com.atproto.repo.uploadBlob
Expected: JSON response with blob ref
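The handler from Step 2 returns a body of this shape (the values here are placeholders):
{
  "blob": {
    "$type": "blob",
    "ref": { "$link": "bafkreiexample" },
    "mimeType": "image/png",
    "size": 1234
  }
}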
Step 5: Commit
git add src/pds.js
git commit -m "feat: implement uploadBlob endpoint with R2 storage"
## Task 6: Implement getBlob Endpoint
Files:
- Modify: `src/pds.js` (add route and handler)
Step 1: Add route to pdsRoutes
'/xrpc/com.atproto.sync.getBlob': {
handler: (pds, _req, url) => pds.handleGetBlob(url),
},
Step 2: Add handler method
async handleGetBlob(url) {
const did = url.searchParams.get('did');
const cid = url.searchParams.get('cid');
if (!did || !cid) {
return errorResponse('InvalidRequest', 'missing did or cid parameter', 400);
}
// Verify DID matches this DO
const myDid = await this.getDid();
if (did !== myDid) {
return errorResponse('InvalidRequest', 'DID does not match this repo', 400);
}
// Look up blob metadata
const rows = this.sql
.exec('SELECT mimeType, size FROM blob WHERE cid = ?', cid)
.toArray();
if (rows.length === 0) {
return errorResponse('BlobNotFound', 'blob not found', 404);
}
const { mimeType, size } = rows[0];
// Fetch from R2
const r2Key = `${did}/${cid}`;
const object = await this.env.BLOBS.get(r2Key);
if (!object) {
return errorResponse('BlobNotFound', 'blob not found in storage', 404);
}
// Return blob with security headers
return new Response(object.body, {
headers: {
'Content-Type': mimeType,
'Content-Length': String(size),
'X-Content-Type-Options': 'nosniff',
'Content-Security-Policy': "default-src 'none'; sandbox",
'Cache-Control': 'public, max-age=31536000, immutable',
},
});
}
Step 3: Deploy and test
Run: npx wrangler deploy
Test:
curl "https://your-pds.workers.dev/xrpc/com.atproto.sync.getBlob?did=did:plc:xxx&cid=bafkrei..."
Step 4: Commit
git add src/pds.js
git commit -m "feat: implement getBlob endpoint"
## Task 7: Implement listBlobs Endpoint
Files:
- Modify: `src/pds.js` (add route and handler)
Step 1: Add route to pdsRoutes
'/xrpc/com.atproto.sync.listBlobs': {
handler: (pds, _req, url) => pds.handleListBlobs(url),
},
Step 2: Add handler method
async handleListBlobs(url) {
const did = url.searchParams.get('did');
const cursor = url.searchParams.get('cursor');
const limit = Math.min(Number(url.searchParams.get('limit')) || 500, 1000);
if (!did) {
return errorResponse('InvalidRequest', 'missing did parameter', 400);
}
// Verify DID matches this DO
const myDid = await this.getDid();
if (did !== myDid) {
return errorResponse('InvalidRequest', 'DID does not match this repo', 400);
}
// Query blobs with pagination
let query = 'SELECT cid, createdAt FROM blob';
const params = [];
if (cursor) {
query += ' WHERE createdAt > ?';
params.push(cursor);
}
query += ' ORDER BY createdAt ASC LIMIT ?';
params.push(limit + 1); // Fetch one extra to detect if there's more
const rows = this.sql.exec(query, ...params).toArray();
// Determine if there's a next page
let nextCursor = null;
if (rows.length > limit) {
rows.pop(); // Remove the extra row
nextCursor = rows[rows.length - 1].createdAt;
}
return Response.json({
cids: rows.map((r) => r.cid),
cursor: nextCursor,
});
}
Step 3: Deploy and test
Run: npx wrangler deploy
Test:
curl "https://your-pds.workers.dev/xrpc/com.atproto.sync.listBlobs?did=did:plc:xxx"
Step 4: Commit
git add src/pds.js
git commit -m "feat: implement listBlobs endpoint"
## Task 8: Integrate Blob Association with createRecord
Files:
- Modify: `src/pds.js:1253` (createRecord method)
Step 1: Add blob association after record storage
In createRecord method, after storing the record in the records table (around line 1280), add:
// Associate blobs with this record
const blobRefs = findBlobRefs(record);
for (const blobCid of blobRefs) {
// Verify blob exists
const blobExists = this.sql
.exec('SELECT cid FROM blob WHERE cid = ?', blobCid)
.toArray();
if (blobExists.length === 0) {
throw new Error(`BlobNotFound: ${blobCid}`);
}
// Create association
this.sql.exec(
'INSERT OR IGNORE INTO record_blob (blobCid, recordUri) VALUES (?, ?)',
blobCid,
uri,
);
}
Step 2: Deploy and test
Test by uploading a blob, then creating a post that references it:
# Upload blob
BLOB=$(curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: image/png" --data-binary @test.png \
https://your-pds.workers.dev/xrpc/com.atproto.repo.uploadBlob)
echo $BLOB # Get the CID
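# Optional convenience (assumes jq is installed): extract the CID from the response
CID=$(echo "$BLOB" | jq -r '.blob.ref."$link"')
echo "$CID"  # use this value as <cid-from-upload> below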
# Create post with image
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
https://your-pds.workers.dev/xrpc/com.atproto.repo.createRecord \
-d '{
"repo": "did:plc:xxx",
"collection": "app.bsky.feed.post",
"record": {
"$type": "app.bsky.feed.post",
"text": "Hello with image!",
"createdAt": "2026-01-06T12:00:00.000Z",
"embed": {
"$type": "app.bsky.embed.images",
"images": [{
"image": {
"$type": "blob",
"ref": {"$link": "<cid-from-upload>"},
"mimeType": "image/png",
"size": 1234
},
"alt": "test"
}]
}
}
}'
Step 3: Commit
git add src/pds.js
git commit -m "feat: associate blobs with records on createRecord"
## Task 9: Implement Blob Cleanup on deleteRecord
Files:
- Modify: `src/pds.js:1391` (deleteRecord method)
Step 1: Add blob cleanup after record deletion
In deleteRecord method, after deleting the record from the records table, add:
// Get blobs associated with this record
const associatedBlobs = this.sql
.exec('SELECT blobCid FROM record_blob WHERE recordUri = ?', uri)
.toArray();
// Remove associations for this record
this.sql.exec('DELETE FROM record_blob WHERE recordUri = ?', uri);
// Check each blob for orphan status and delete if unreferenced
const did = await this.getDid();
for (const { blobCid } of associatedBlobs) {
const stillReferenced = this.sql
.exec('SELECT 1 FROM record_blob WHERE blobCid = ? LIMIT 1', blobCid)
.toArray();
if (stillReferenced.length === 0) {
// Blob is orphaned: delete from R2 and from the blob table
await this.env.BLOBS.delete(`${did}/${blobCid}`);
this.sql.exec('DELETE FROM blob WHERE cid = ?', blobCid);
}
}
Step 2: Deploy and test
Test by creating a post with an image, then deleting it:
# Delete the post
curl -X POST -H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
https://your-pds.workers.dev/xrpc/com.atproto.repo.deleteRecord \
-d '{
"repo": "did:plc:xxx",
"collection": "app.bsky.feed.post",
"rkey": "<rkey>"
}'
# Verify blob is gone
curl "https://your-pds.workers.dev/xrpc/com.atproto.sync.listBlobs?did=did:plc:xxx"
Step 3: Commit
git add src/pds.js
git commit -m "feat: cleanup orphaned blobs on record deletion"
## Task 10: Implement Orphan Cleanup Alarm
Files:
- Modify: `src/pds.js` (add alarm handler and scheduling)
Step 1: Add alarm scheduling in initIdentity
In the initIdentity method (or after successful init), add:
// Schedule blob cleanup alarm (runs daily)
const currentAlarm = await this.state.storage.getAlarm();
if (!currentAlarm) {
await this.state.storage.setAlarm(Date.now() + 24 * 60 * 60 * 1000);
}
Step 2: Add alarm handler to PersonalDataServer class
async alarm() {
await this.cleanupOrphanedBlobs();
// Reschedule for next day
await this.state.storage.setAlarm(Date.now() + 24 * 60 * 60 * 1000);
}
async cleanupOrphanedBlobs() {
const did = await this.getDid();
if (!did) return;
// Find orphans: blobs not in record_blob, older than 24h
const cutoff = new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString();
const orphans = this.sql
.exec(
`SELECT b.cid FROM blob b
LEFT JOIN record_blob rb ON b.cid = rb.blobCid
WHERE rb.blobCid IS NULL AND b.createdAt < ?`,
cutoff,
)
.toArray();
for (const { cid } of orphans) {
await this.env.BLOBS.delete(`${did}/${cid}`);
this.sql.exec('DELETE FROM blob WHERE cid = ?', cid);
}
if (orphans.length > 0) {
console.log(`Cleaned up ${orphans.length} orphaned blobs`);
}
}
Step 3: Deploy
Run: npx wrangler deploy
Step 4: Commit
git add src/pds.js
git commit -m "feat: add DO alarm for orphaned blob cleanup"
## Task 11: Update README
Files:
- Modify: `README.md`
Step 1: Update feature checklist
Change:
- [ ] Blob storage (uploadBlob, getBlob, listBlobs)
To:
- [x] Blob storage (uploadBlob, getBlob, listBlobs)
Step 2: Add blob configuration section
Add under configuration:
### Blob Storage
Blobs (images, videos) are stored in Cloudflare R2:
1. Create an R2 bucket: `npx wrangler r2 bucket create pds-blobs`
2. The binding is already configured in `wrangler.toml`
Detected formats (magic-byte sniffing): JPEG, PNG, GIF, WebP, MP4; other uploads fall back to the declared Content-Type header
Max size: 50MB
Orphaned blobs are automatically cleaned up after 24 hours.
Step 3: Commit
git add README.md
git commit -m "docs: update README with blob storage feature"
## Summary
| Task | Description | Files Modified |
|---|---|---|
| 1 | Add R2 bucket binding | wrangler.toml |
| 2 | Add blob database schema | src/pds.js |
| 3 | Implement MIME sniffing | src/pds.js, test/pds.test.js |
| 4 | Implement blob ref detection | src/pds.js, test/pds.test.js |
| 5 | Implement uploadBlob endpoint | src/pds.js |
| 6 | Implement getBlob endpoint | src/pds.js |
| 7 | Implement listBlobs endpoint | src/pds.js |
| 8 | Integrate blob association | src/pds.js |
| 9 | Cleanup blobs on delete | src/pds.js |
| 10 | Add orphan cleanup alarm | src/pds.js |
| 11 | Update README | README.md |
Estimated additions: ~250 lines to src/pds.js, ~60 lines to test/pds.test.js