prototypey.org - atproto lexicon typescript toolkit - mirror https://github.com/tylersayshi/prototypey
remove md
Tyler
4 months ago
9f6e6d82
80ef5398
-544
1 changed file
aislop/plan-emit.md

# Plan: Lexicon Emission for Prototypey

## Current State

- **Project**: Type-safe lexicon inference library (similar to Arktype's approach)
- **Structure**: TypeScript library with `src/`, `lib/` (compiled output), `samples/` (example JSON lexicons)
- **Build**: Uses `tsdown` for bundling, pnpm for package management

## Emission Strategy

### 1. Two-Track Approach

Since prototypey is about **type inference** from lexicons (not traditional codegen), we should support both:

#### Track A: Traditional Code Generation (compatibility)

- Install `@atproto/lex-cli` as a dev dependency
- Emit standard TypeScript files like other atproto projects
- Useful for projects that want traditional generated types

#### Track B: Type Inference (prototypey's core value)

- Leverage your existing inference engine (`src/infer.ts`)
- Generate minimal runtime code with inferred types
- This is your differentiator from standard atproto tooling

### 2. Directory Structure

```
prototypey/
├── lexicons/                    # NEW: Input lexicon schemas
│   └── (empty initially, users add their schemas here)
├── samples/                     # Keep existing samples
│   └── *.json
├── src/
│   ├── cli/                     # NEW: CLI tool for codegen
│   │   ├── index.ts             # Main CLI entry
│   │   ├── commands/
│   │   │   ├── gen-types.ts     # Track A: Standard codegen
│   │   │   └── gen-inferred.ts  # Track B: Inference-based
│   │   └── templates/
│   └── ...existing code
├── generated/                   # NEW: Default output directory
│   ├── types/                   # Track A output
│   └── inferred/                # Track B output
└── package.json
```

### 3. CLI Commands

Add to `package.json`:

```json
{
  "bin": {
    "prototypey": "./lib/cli/index.js"
  },
  "scripts": {
    "codegen": "prototypey gen-inferred ./generated/inferred './lexicons/**/*.json'"
  }
}
```

The glob is quoted so the shell passes it to the CLI unexpanded; an unquoted `**` is not recursive in a plain `sh` script.

Provide these commands:

- `prototypey gen-inferred <outdir> <schemas...>` - Generate type-inferred code (your unique approach)
- `prototypey gen-types <outdir> <schemas...>` - Generate standard TypeScript (delegates to @atproto/lex-cli)
- `prototypey init` - Initialize a new lexicon project with sample configs

### 4. Track B: Inferred Code Generation (Your Secret Sauce)

Generate minimal runtime code that leverages your inference:

```typescript
// Example output: generated/inferred/app/bsky/feed/post.ts
import type { Infer } from "prototypey";
// Five levels up reaches the repository root from generated/inferred/app/bsky/feed/
import schema from "../../../../../lexicons/app/bsky/feed/post.json" with { type: "json" };

export type Post = Infer<typeof schema>;

// Minimal runtime helpers
export const PostSchema = schema;
export const isPost = (v: unknown): v is Post => {
  return (
    typeof v === "object" &&
    v !== null &&
    "$type" in v &&
    v.$type === "app.bsky.feed.post"
  );
};
```

Benefits:

- **No validation code duplication** - reuse @atproto/lexicon at runtime
- **Type inference magic** - your core competency
- **Smaller bundle size** - minimal generated code
- **Simpler output** - easier to understand
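
To make the type inference benefit concrete, here is a minimal sketch of deriving a TypeScript type from a schema literal at compile time. It is not prototypey's actual `Infer` implementation: `FieldSchema`, `ObjectSchema`, `InferObject`, and `postSchema` are hypothetical names, the schema shape is heavily simplified, and only three field types are handled.

```typescript
// Hypothetical, simplified schema shape (real lexicons have many more field types).
type FieldSchema =
  | { readonly type: "string" }
  | { readonly type: "integer" }
  | { readonly type: "boolean" };

type ObjectSchema = {
  readonly type: "object";
  readonly required: readonly string[];
  readonly properties: Readonly<Record<string, FieldSchema>>;
};

// Map one field schema to the TypeScript type it describes.
type InferField<F> =
  F extends { type: "string" } ? string :
  F extends { type: "integer" } ? number :
  F extends { type: "boolean" } ? boolean :
  never;

// Keys listed in `required` become mandatory; every other key becomes optional.
type InferObject<S extends ObjectSchema> = {
  -readonly [K in keyof S["properties"] as K extends S["required"][number] ? K : never]:
    InferField<S["properties"][K]>;
} & {
  -readonly [K in keyof S["properties"] as K extends S["required"][number] ? never : K]?:
    InferField<S["properties"][K]>;
};

// `as const` keeps the literal types that the inference depends on.
const postSchema = {
  type: "object",
  required: ["text"],
  properties: {
    text: { type: "string" },
    likeCount: { type: "integer" },
  },
} as const;

type Post = InferObject<typeof postSchema>;

// `text` is required, `likeCount` is optional.
const examplePost: Post = { text: "hello" };
const likedPost: Post = { text: "hi", likeCount: 3 };
```

No code is emitted for the type aliases themselves, which is where the smaller-bundle claim comes from: only the schema object exists at runtime.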

### 5. Dependencies to Add

```json
{
  "dependencies": {
    "@atproto/lexicon": "^0.3.0"
  },
  "devDependencies": {
    "@atproto/lex-cli": "^0.9.1",
    "commander": "^12.0.0",
    "glob": "^10.0.0"
  },
  "peerDependencies": {
    "typescript": ">=5.0.0"
  }
}
```

### 6. Build Pipeline Integration

Update `package.json` scripts:

```json
{
  "scripts": {
    "build": "tsdown",
    "build:cli": "tsdown --entry src/cli/index.ts --format esm --dts false",
    "codegen:samples": "prototypey gen-inferred ./generated/samples ./samples/*.json",
    "prepack": "pnpm build && pnpm build:cli"
  }
}
```

### 7. Configuration File (optional)

`prototypey.config.json`:

```json
{
  "lexicons": "./lexicons",
  "output": {
    "inferred": "./generated/inferred",
    "types": "./generated/types"
  },
  "include": ["**/*.json"],
  "exclude": ["**/node_modules/**"]
}
```
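
One way the CLI could consume this file is to merge a partial user config over defaults. The sketch below assumes the defaults are the values shown in the example above; `PrototypeyConfig`, `UserConfig`, `DEFAULT_CONFIG`, and `resolveConfig` are hypothetical names, not part of prototypey.

```typescript
// Hypothetical config shape, mirroring the example prototypey.config.json.
interface PrototypeyConfig {
  lexicons: string;
  output: { inferred: string; types: string };
  include: string[];
  exclude: string[];
}

// Every key is optional in the user's file, including the nested output paths.
type UserConfig = Partial<Omit<PrototypeyConfig, "output">> & {
  output?: Partial<PrototypeyConfig["output"]>;
};

// Assumed defaults, copied from the example config.
const DEFAULT_CONFIG: PrototypeyConfig = {
  lexicons: "./lexicons",
  output: { inferred: "./generated/inferred", types: "./generated/types" },
  include: ["**/*.json"],
  exclude: ["**/node_modules/**"],
};

// Parse the raw JSON text and fill in anything the user left out.
function resolveConfig(rawJson: string): PrototypeyConfig {
  const user = JSON.parse(rawJson) as UserConfig;
  return {
    ...DEFAULT_CONFIG,
    ...user,
    // Merge `output` one level deeper so a single overridden path keeps the other default.
    output: { ...DEFAULT_CONFIG.output, ...user.output },
  };
}

const resolved = resolveConfig('{"output": {"inferred": "./out"}}');
```

Here `resolved.output.inferred` is the user's `./out`, while `resolved.output.types` and the other keys fall back to the defaults.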

### 8. Documentation Updates

Create docs for:

1. **Quick Start**: How to run codegen on your lexicons
2. **Track Comparison**: When to use inferred vs. standard generation
3. **Migration Guide**: Moving from @atproto/lex-cli to prototypey
4. **Type Inference Deep Dive**: How your inference works (marketing!)

## Key Differentiators

### Prototypey's Unique Value

1. **Compile-time type inference** - No runtime validation code needed
2. **Smaller bundles** - Minimal generated code
3. **Better DX** - Types are inferred, not generated boilerplate
4. **Same safety guarantees** - Full TypeScript type checking

### vs. Standard @atproto/lex-cli

- **Standard**: Generates verbose validation code
- **Prototypey**: Generates minimal code + type inference
- **Both**: Same type safety, but prototypey is leaner

## Implementation Priority

1. ✅ **Phase 1**: Basic CLI structure + Track B (inferred generation) - COMPLETE
2. ✅ **Phase 2**: File organization + output directory structure - COMPLETE
3. ✅ **Phase 3**: Convert to pnpm workspaces monorepo - marked COMPLETE, but the repo still has both a top-level `src/` and `packages/`
4. **Phase 4**: Track A (standard generation, delegate to lex-cli)
5. **Phase 5**: Configuration file support
6. **Phase 6**: Documentation + examples

## Phase 1 & 2 Implementation Notes

### ✅ Completed (2025-10-16)

**Tech Stack Choices:**

- Used `sade` instead of `commander` (modern, minimal CLI framework from awesome-e18e)
- Used `tinyglobby` instead of `glob` (faster, modern alternative)
- Built with `tsdown` for CLI bundling

**Structure Created:**

```
prototypey/
├── src/cli/
│   ├── index.ts              # CLI entry with sade
│   ├── commands/
│   │   └── gen-inferred.ts   # Track B implementation
│   └── templates/
│       └── inferred.ts       # Code generation template
├── generated/
│   └── inferred/             # Generated type files
├── lexicons/                 # Input directory (empty, ready for user schemas)
└── lib/cli/                  # Built CLI output
```

**Generated Code Pattern:**

```typescript
// generated/inferred/app/bsky/actor/profile.ts
import type { Infer } from "prototypey";
import schema from "../../../../../samples/demo.json" with { type: "json" };

export type Profile = Infer<typeof schema>;
export const ProfileSchema = schema;
export function isProfile(v: unknown): v is Profile { ... }
```

**CLI Usage:**

```bash
# Build CLI
pnpm build:cli

# Generate from samples
pnpm codegen:samples

# Direct usage
node lib/cli/index.js gen-inferred ./generated/inferred './samples/*.json'
```

**Key Features:**

- Converts NSID to file paths: `app.bsky.feed.post` → `app/bsky/feed/post.ts`
- Generates minimal runtime code with type inference
- Auto-creates directory structure
- Skips invalid schemas gracefully
- Type guard functions for runtime checks
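
The NSID-to-path conversion and the skipping of invalid schemas could look roughly like the sketch below. It is an illustration rather than the code in `gen-inferred.ts`; `nsidToPath`, `nsidToTypeName`, and `readNsid` are hypothetical helper names, and the only schema check assumed is that a lexicon document carries its NSID in a string `id` field.

```typescript
// "app.bsky.feed.post" -> "app/bsky/feed/post.ts"
function nsidToPath(nsid: string): string {
  return `${nsid.split(".").join("/")}.ts`;
}

// "app.bsky.actor.profile" -> "Profile" (used for the exported type name)
function nsidToTypeName(nsid: string): string {
  const name = nsid.split(".").pop() ?? "";
  return name.charAt(0).toUpperCase() + name.slice(1);
}

// Returns the NSID of a parsed lexicon document, or undefined when the
// document does not look like a lexicon, so the caller can skip it.
function readNsid(doc: unknown): string | undefined {
  if (typeof doc !== "object" || doc === null) return undefined;
  const id = (doc as { id?: unknown }).id;
  return typeof id === "string" && id.includes(".") ? id : undefined;
}

// Example: decide what to emit for a batch of parsed JSON files.
const parsedFiles: unknown[] = [
  { lexicon: 1, id: "app.bsky.feed.post" },
  { notALexicon: true }, // skipped
];

const outputs = parsedFiles
  .map(readNsid)
  .filter((id): id is string => id !== undefined)
  .map((id) => ({ file: nsidToPath(id), typeName: nsidToTypeName(id) }));
```

Filtering on `undefined` instead of throwing keeps one malformed file from aborting the whole run.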

**Testing:**

- Successfully generated types from sample lexicons
- Runtime validation works (tested with node)
- Schema imports work correctly with JSON modules

## Phase 3: Monorepo Strategy

### Why Monorepo?

The CLI tool should be a separate package from the core inference library for several reasons:

1. **Separation of concerns**: Core inference types vs. code generation tooling
2. **Dependency isolation**: CLI needs `sade`, `tinyglobby`, etc. - consumers of the core library don't
3. **Bundle size**: Users importing just types don't want CLI bloat
4. **Independent versioning**: CLI can evolve separately from type inference
5. **Better organization**: Clear boundaries between runtime and build-time code

### Proposed Structure

```
prototypey/
├── package.json                 # Root workspace config
├── pnpm-workspace.yaml          # Workspace definition
├── packages/
│   ├── prototypey/              # Core inference library
│   │   ├── package.json         # Main package (prototypey)
│   │   ├── src/
│   │   │   ├── index.ts
│   │   │   ├── infer.ts
│   │   │   ├── lib.ts
│   │   │   └── type-utils.ts
│   │   ├── lib/                 # Built output
│   │   └── tests/
│   │
│   └── cli/                     # CLI package
│       ├── package.json         # Separate package (@prototypey/cli)
│       ├── src/
│       │   ├── index.ts
│       │   ├── commands/
│       │   │   └── gen-inferred.ts
│       │   └── templates/
│       │       └── inferred.ts
│       └── lib/                 # Built CLI output
│
├── samples/                     # Shared samples
├── generated/                   # Generated output (gitignored)
└── lexicons/                    # Input lexicons (gitignored)
```

### Package Configurations

**Root `pnpm-workspace.yaml`:**

```yaml
packages:
  - "packages/*"
```

**Root `package.json`:**

```json
{
  "name": "prototypey-monorepo",
  "private": true,
  "scripts": {
    "build": "pnpm -r build",
    "test": "pnpm -r test",
    "lint": "pnpm -r lint",
    "format": "prettier . --write"
  }
}
```

**`packages/prototypey/package.json`:**

```json
{
  "name": "prototypey",
  "version": "0.0.0",
  "main": "lib/index.js",
  "exports": {
    ".": "./lib/index.js",
    "./infer": "./lib/infer.js"
  },
  "dependencies": {},
  "scripts": {
    "build": "tsdown",
    "test": "vitest run"
  }
}
```

**`packages/cli/package.json`:**

```json
{
  "name": "@prototypey/cli",
  "version": "0.0.0",
  "bin": {
    "prototypey": "./lib/index.js"
  },
  "dependencies": {
    "prototypey": "workspace:*",
    "sade": "^1.8.1",
    "tinyglobby": "^0.2.15"
  },
  "scripts": {
    "build": "tsdown --entry src/index.ts --format esm --dts false"
  }
}
```

### Migration Steps

1. Create `pnpm-workspace.yaml` at root
2. Create `packages/prototypey/` and move core files
3. Create `packages/cli/` and move CLI files
4. Update import paths in CLI to use `prototypey` package
5. Update root `package.json` to be private workspace root
6. Update build scripts to use `pnpm -r` (recursive)
7. Test both packages build independently
8. Update documentation

### Benefits

- **Cleaner dependency tree**: Core has zero dependencies
- **Better DX**: Users can `npm install prototypey` for types only
- **CLI as optional tool**: `npm install -D @prototypey/cli` when needed
- **Easier testing**: Each package can have its own test suite
- **Future expansion**: Easy to add more packages (e.g., `@prototypey/validator`)

## ATProto Lexicon Background Research

### Official Tooling: @atproto/lex-cli

ATProto projects use **lexicon schemas** (JSON files) to define data structures, API endpoints, and event streams. These schemas are then automatically transformed into type-safe TypeScript code using the **@atproto/lex-cli** code generation tool.

#### Installation

```bash
npm install @atproto/lex-cli
```

#### Available Commands

- **`lex gen-api <outdir> <schemas...>`** - Generate TypeScript client API
- **`lex gen-server <outdir> <schemas...>`** - Generate TypeScript server API
- **`lex gen-ts-obj <schemas...>`** - Generate a TS file that exports an array of schemas
- **`lex gen-md <schemas...>`** - Generate markdown documentation
- **`lex new [options] <nsid> [outfile]`** - Create a new schema JSON file

#### Common Options

- **`--yes`** - Auto-confirm overwrites during generation

### Typical Project Structure

```
project-root/
├── lexicons/                    # Input: JSON schema definitions
│   ├── com/
│   │   └── atproto/
│   │       ├── repo/
│   │       │   ├── getRecord.json
│   │       │   └── createRecord.json
│   │       └── server/
│   │           └── defs.json
│   └── app/
│       └── bsky/
│           ├── feed/
│           │   └── post.json
│           └── richtext/
│               └── facet.json
├── src/
│   ├── client/                  # Output: Generated client code
│   │   └── types/
│   │       ├── com/
│   │       │   └── atproto/
│   │       │       └── repo/
│   │       │           └── getRecord.ts
│   │       └── app/
│   │           └── bsky/
│   │               └── richtext/
│   │                   └── facet.ts
│   └── lexicon/                 # Output: Generated server code
└── package.json
```

### Naming Conventions

**NSIDs (Namespaced Identifiers)**:

- Format: Reverse-DNS + name (e.g., `com.atproto.repo.getRecord`)
- Domain authority: all segments before the name, in reverse-DNS order (`com.atproto.repo`, i.e. the domain `repo.atproto.com`)
- Name segment: `getRecord`
- File path mirrors NSID: `lexicons/com/atproto/repo/getRecord.json`
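
As a rough illustration of these conventions, the sketch below splits an NSID into its leading segments and name, and derives the mirrored file path. `LOOSE_NSID`, `splitNsid`, and `lexiconPathForNsid` are hypothetical names, and the regular expression is a deliberately loose shape check, not the full NSID grammar from the AT Protocol specification.

```typescript
// Loose shape check: at least three dot-separated segments, each starting with a letter.
// This is an approximation for illustration, not the official NSID grammar.
const LOOSE_NSID = /^[a-zA-Z][a-zA-Z0-9-]*(\.[a-zA-Z][a-zA-Z0-9-]*)+\.[a-zA-Z][a-zA-Z0-9]*$/;

interface NsidParts {
  namespace: string[]; // segments before the name, e.g. ["com", "atproto", "repo"]
  name: string;        // final segment, e.g. "getRecord"
}

function splitNsid(nsid: string): NsidParts {
  if (!LOOSE_NSID.test(nsid)) {
    throw new Error(`Not a plausible NSID: ${nsid}`);
  }
  const segments = nsid.split(".");
  return {
    namespace: segments.slice(0, -1),
    name: segments[segments.length - 1],
  };
}

// Mirror the NSID in the filesystem under a lexicon root directory.
function lexiconPathForNsid(nsid: string, root = "lexicons"): string {
  const { namespace, name } = splitNsid(nsid);
  return [root, ...namespace, `${name}.json`].join("/");
}

const recordPath = lexiconPathForNsid("com.atproto.repo.getRecord");
```

With the default root, `recordPath` matches the path given in the list above.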

**Definition Naming**:

- Records: Single nouns, not pluralized (e.g., `post`, `profile`)
- XRPC methods: verbNoun format (e.g., `getProfile`, `createRecord`)
- Shared definitions: Use `*.defs` lexicons (e.g., `com.atproto.server.defs`)

### Generated TypeScript Code Structure

The generated TypeScript file includes:

1. **TypeScript Interfaces** with explicit `$type` properties
2. **Type Guard Functions** (`is*`) for runtime type checking
3. **Validation Functions** (`validate*`) for schema validation

Example:

```typescript
/**
 * GENERATED CODE - DO NOT MODIFY
 */
import { ValidationResult, BlobRef } from "@atproto/lexicon";
import { lexicons } from "../../../../lexicons";
import { isObj, hasProp } from "../../../../util";
import { CID } from "multiformats/cid";

export interface Main {
  $type?: "app.bsky.richtext.facet";
  index: ByteSlice;
  features: (Mention | Link | Tag | { $type: string; [k: string]: unknown })[];
  [k: string]: unknown;
}

export function isMain(v: unknown): v is Main {
  return (
    isObj(v) &&
    hasProp(v, "$type") &&
    (v.$type === "app.bsky.richtext.facet#main" ||
      v.$type === "app.bsky.richtext.facet")
  );
}

export function validateMain(v: unknown): ValidationResult {
  return lexicons.validate("app.bsky.richtext.facet#main", v);
}
```
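
The `isObj` and `hasProp` helpers imported from the generated `util` module are not shown above. A plausible minimal version, written here as an assumption rather than copied from lex-cli output, is sketched below together with a guard that uses them; `isFacetLike` is a hypothetical name.

```typescript
// Assumed shape of the helpers; the real generated util module may differ.
function isObj(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

function hasProp<K extends PropertyKey>(
  data: object,
  prop: K,
): data is Record<K, unknown> {
  return prop in data;
}

// A guard in the same style as the generated `isMain`: it checks only the
// `$type` discriminator, not the rest of the object's shape.
function isFacetLike(v: unknown): boolean {
  return (
    isObj(v) &&
    hasProp(v, "$type") &&
    (v.$type === "app.bsky.richtext.facet#main" ||
      v.$type === "app.bsky.richtext.facet")
  );
}

const matches = isFacetLike({ $type: "app.bsky.richtext.facet", features: [] });
const rejectsNull = isFacetLike(null);
```

Because the guard inspects only `$type`, full shape checking is left to the `validate*` function.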

### Build Scripts & Integration

Example `package.json` scripts:

```json
{
  "scripts": {
    "codegen": "lex gen-api --yes ./src/client ../../lexicons/com/atproto/*/* ../../lexicons/app/bsky/*/*",
    "build": "tsc --build tsconfig.build.json"
  },
  "devDependencies": {
    "@atproto/lex-cli": "^0.9.1"
  }
}
```

### Best Practices

1. **Use reverse-DNS NSIDs** for your domain (e.g., `com.example.*`)
2. **Group related schemas** by namespace hierarchy
3. **Create `*.defs` lexicons** for shared definitions used across multiple schemas
4. **Store lexicons in `/lexicons` directory** at repository root
5. **Mirror NSID structure in filesystem** (e.g., `lexicons/com/example/thing.json`)
6. **Run codegen before build** in your npm scripts
7. **Generate to predictable directories** (e.g., `./src/client`, `./src/lexicon`)

### Schema Evolution Rules

1. **New fields must be optional** to maintain backward compatibility
2. **Cannot remove non-optional fields** without breaking changes
3. **Cannot change field types** without creating new lexicon
4. **Cannot rename fields** - must deprecate and add new field
5. **Breaking changes require new NSID** (e.g., `v2` suffix)
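
The first three rules are mechanical enough to check in code. The sketch below compares two versions of a simplified object definition and reports violations; `ObjectDef` and `findBreakingChanges` are hypothetical names, and the schema shape is reduced to `required` plus a flat `properties` map for illustration. A rename shows up as a removal plus an addition, so rule 4 is covered only indirectly.

```typescript
// Simplified stand-in for a lexicon object definition.
interface ObjectDef {
  required?: string[];
  properties: Record<string, { type: string }>;
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Returns a human-readable list of rule violations; empty means compatible.
function findBreakingChanges(prev: ObjectDef, next: ObjectDef): string[] {
  const problems: string[] = [];
  const prevRequired = new Set(prev.required ?? []);
  const nextRequired = new Set(next.required ?? []);

  for (const [name, field] of Object.entries(prev.properties)) {
    if (!hasOwn(next.properties, name)) {
      // Rule 2: only non-optional fields are a problem when removed.
      if (prevRequired.has(name)) problems.push(`removed required field "${name}"`);
      continue;
    }
    // Rule 3: an existing field must keep its type.
    const updated = next.properties[name];
    if (updated.type !== field.type) {
      problems.push(`changed type of "${name}" from ${field.type} to ${updated.type}`);
    }
  }

  for (const name of Object.keys(next.properties)) {
    // Rule 1: fields added in the new version must not be required.
    if (!hasOwn(prev.properties, name) && nextRequired.has(name)) {
      problems.push(`new field "${name}" must be optional`);
    }
  }
  return problems;
}

const before: ObjectDef = {
  required: ["text"],
  properties: { text: { type: "string" }, langs: { type: "array" } },
};
const after: ObjectDef = {
  required: ["text", "title"],
  properties: { text: { type: "integer" }, title: { type: "string" } },
};

const problems = findBreakingChanges(before, after);
```

In this example the type change on `text` and the new required `title` are reported, while dropping the optional `langs` is not.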

### Type Categories in Lexicons

#### Primary Types (one per file)

- **record** - Repository-stored objects
- **query** - XRPC HTTP GET endpoints
- **procedure** - XRPC HTTP POST endpoints
- **subscription** - WebSocket event streams

#### Field Types

- **Primitives**: null, boolean, integer, string, bytes
- **Special**: cid-link, blob, unknown
- **Structures**: array, object, params
- **References**: ref, union, token

### Real-World Examples

- **Official ATProto Repository**: https://github.com/bluesky-social/atproto
  - Lexicons: `/lexicons/com/atproto/*`, `/lexicons/app/bsky/*`
  - Generated Client: `/packages/api/src/client/`
  - Generated Server: `/packages/pds/src/lexicon/`

## Next Steps

Start with **Phase 1** - building the CLI and the inferred code generation, since that's prototypey's core differentiator.