# Plan: Lexicon Emission for Prototypey

## Current State

- **Project**: Type-safe lexicon inference library (similar to Arktype's approach)
- **Structure**: TypeScript library with `src/`, `lib/` (compiled output), `samples/` (example JSON lexicons)
- **Build**: Uses `tsdown` for bundling, pnpm for package management

## Emission Strategy

### 1. Two-Track Approach

Since prototypey is about **type inference** from lexicons (not traditional codegen), we should support both:

#### Track A: Traditional Code Generation (compatibility)

- Install `@atproto/lex-cli` as a dev dependency
- Emit standard TypeScript files like other atproto projects
- Useful for projects that want traditional generated types

#### Track B: Type Inference (prototypey's core value)

- Leverage your existing inference engine (`src/infer.ts`)
- Generate minimal runtime code with inferred types
- This is your differentiator from standard atproto tooling

### 2. Directory Structure

```
prototypey/
├── lexicons/                    # NEW: Input lexicon schemas
│   └── (empty initially, users add their schemas here)
├── samples/                     # Keep existing samples
│   └── *.json
├── src/
│   ├── cli/                     # NEW: CLI tool for codegen
│   │   ├── index.ts             # Main CLI entry
│   │   ├── commands/
│   │   │   ├── gen-types.ts     # Track A: Standard codegen
│   │   │   └── gen-inferred.ts  # Track B: Inference-based
│   │   └── templates/
│   └── ...existing code
├── generated/                   # NEW: Default output directory
│   ├── types/                   # Track A output
│   └── inferred/                # Track B output
└── package.json
```

### 3. CLI Commands

Add to `package.json`:

```json
{
  "bin": {
    "prototypey": "./lib/cli/index.js"
  },
  "scripts": {
    "codegen": "prototypey gen-inferred ./generated/inferred \"./lexicons/**/*.json\""
  }
}
```

The glob is quoted so that the CLI, not the shell, expands `**`; an unquoted `**` in a plain `sh` script matches only one directory level.

Provide these commands:

- `prototypey gen-inferred <outdir> <schemas...>` - Generate type-inferred code (your unique approach)
- `prototypey gen-types <outdir> <schemas...>` - Generate standard TypeScript (delegates to @atproto/lex-cli)
- `prototypey init` - Initialize a new lexicon project with sample configs

### 4. Track B: Inferred Code Generation (Your Secret Sauce)

Generate minimal runtime code that leverages your inference:

```typescript
// Example output: generated/inferred/app/bsky/feed/post.ts
import type { Infer } from 'prototypey'
// Five levels up from generated/inferred/app/bsky/feed/ to the repository root
import schema from '../../../../../lexicons/app/bsky/feed/post.json' with { type: 'json' }

export type Post = Infer<typeof schema>

// Minimal runtime helpers
export const PostSchema = schema
export const isPost = (v: unknown): v is Post => {
  return typeof v === 'object' && v !== null && '$type' in v &&
    v.$type === 'app.bsky.feed.post'
}
```

Benefits:

- **No validation code duplication** - reuse @atproto/lexicon at runtime
- **Type inference magic** - your core competency
- **Smaller bundle size** - minimal generated code
- **Simpler output** - easier to understand
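
The output shape above could be produced by a small string template. The following is a minimal sketch under that assumption, not the project's actual `templates/inferred.ts`; `renderInferredModule` and its option names are hypothetical.

```typescript
// Hypothetical template helper; names are illustrative, not prototypey's API.
interface RenderOptions {
  nsid: string; // e.g. "app.bsky.feed.post"
  typeName: string; // e.g. "Post"
  schemaImportPath: string; // relative path from the generated file to the schema JSON
}

export function renderInferredModule(opts: RenderOptions): string {
  const { nsid, typeName, schemaImportPath } = opts;
  // JSON.stringify yields correctly escaped, double-quoted string literals.
  const nsidLiteral = JSON.stringify(nsid);
  const importLiteral = JSON.stringify(schemaImportPath);
  return [
    "// Generated by prototypey - DO NOT EDIT",
    `// Source: ${nsid}`,
    'import type { Infer } from "prototypey";',
    `import schema from ${importLiteral} with { type: "json" };`,
    "",
    `export type ${typeName} = Infer<typeof schema>;`,
    `export const ${typeName}Schema = schema;`,
    "",
    `export function is${typeName}(v: unknown): v is ${typeName} {`,
    "  return (",
    '    typeof v === "object" &&',
    "    v !== null &&",
    '    "$type" in v &&',
    `    v.$type === ${nsidLiteral}`,
    "  );",
    "}",
    "",
  ].join("\n");
}
```

Keeping the template a pure function from options to a string makes it easy to snapshot-test without touching the filesystem.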

### 5. Dependencies to Add

```json
{
  "dependencies": {
    "@atproto/lexicon": "^0.3.0"
  },
  "devDependencies": {
    "@atproto/lex-cli": "^0.9.1",
    "commander": "^12.0.0",
    "glob": "^10.0.0"
  },
  "peerDependencies": {
    "typescript": ">=5.0.0"
  }
}
```

### 6. Build Pipeline Integration

Update `package.json` scripts:

```json
{
  "scripts": {
    "build": "tsdown",
    "build:cli": "tsdown --entry src/cli/index.ts --format esm --dts false",
    "codegen:samples": "prototypey gen-inferred ./generated/samples ./samples/*.json",
    "prepack": "pnpm build && pnpm build:cli"
  }
}
```

### 7. Configuration File (optional)

`prototypey.config.json`:

```json
{
  "lexicons": "./lexicons",
  "output": {
    "inferred": "./generated/inferred",
    "types": "./generated/types"
  },
  "include": ["**/*.json"],
  "exclude": ["**/node_modules/**"]
}
```
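
Since the config file is optional, every field needs a fallback. The sketch below shows one way to merge a partial config with defaults; the `PrototypeyConfig` type and `resolveConfig` function are hypothetical names, and the default values are simply the ones from the example above.

```typescript
// Hypothetical config shape mirroring the JSON example; every field is optional.
interface PrototypeyConfig {
  lexicons?: string;
  output?: { inferred?: string; types?: string };
  include?: string[];
  exclude?: string[];
}

interface ResolvedConfig {
  lexicons: string;
  output: { inferred: string; types: string };
  include: string[];
  exclude: string[];
}

// Defaults taken from the example config in this plan.
const DEFAULTS: ResolvedConfig = {
  lexicons: "./lexicons",
  output: { inferred: "./generated/inferred", types: "./generated/types" },
  include: ["**/*.json"],
  exclude: ["**/node_modules/**"],
};

export function resolveConfig(user: PrototypeyConfig = {}): ResolvedConfig {
  return {
    lexicons: user.lexicons ?? DEFAULTS.lexicons,
    output: {
      // Resolve each output path separately so a partial `output` keeps the other default.
      inferred: user.output?.inferred ?? DEFAULTS.output.inferred,
      types: user.output?.types ?? DEFAULTS.output.types,
    },
    include: user.include ?? DEFAULTS.include,
    exclude: user.exclude ?? DEFAULTS.exclude,
  };
}
```

With this shape, `resolveConfig({ output: { inferred: "./out" } })` overrides only the inferred output directory and leaves every other setting at its default.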

### 8. Documentation Updates

Create docs for:

1. **Quick Start**: How to run codegen on your lexicons
2. **Track Comparison**: When to use inferred vs. standard generation
3. **Migration Guide**: Moving from @atproto/lex-cli to prototypey
4. **Type Inference Deep Dive**: How your inference works (marketing!)

## Key Differentiators

### Prototypey's Unique Value

1. **Compile-time type inference** - No generated validation code; runtime validation is delegated to @atproto/lexicon
2. **Smaller bundles** - Minimal generated code
3. **Better DX** - Types are inferred, not generated boilerplate
4. **Same safety guarantees** - Full TypeScript type checking

### vs. Standard @atproto/lex-cli

- **Standard**: Generates verbose validation code
- **Prototypey**: Generates minimal code + type inference
- **Both**: Same type safety, but prototypey is leaner

## Implementation Priority

1. ✅ **Phase 1**: Basic CLI structure + Track B (inferred generation) - COMPLETE
2. ✅ **Phase 2**: File organization + output directory structure - COMPLETE
3. **Phase 3**: Track A (standard generation, delegate to lex-cli)
4. **Phase 4**: Configuration file support
5. **Phase 5**: Documentation + examples

## Phase 1 & 2 Implementation Notes

### ✅ Completed (2025-10-16)

**Tech Stack Choices:**
- Used `sade` instead of `commander` (modern, minimal CLI framework from awesome-e18e)
- Used `tinyglobby` instead of `glob` (faster, modern alternative)
- Built with `tsdown` for CLI bundling

**Structure Created:**
```
prototypey/
├── src/cli/
│   ├── index.ts              # CLI entry with sade
│   ├── commands/
│   │   └── gen-inferred.ts   # Track B implementation
│   └── templates/
│       └── inferred.ts       # Code generation template
├── generated/
│   └── inferred/             # Generated type files
├── lexicons/                 # Input directory (empty, ready for user schemas)
└── lib/cli/                  # Built CLI output
```

**Generated Code Pattern:**
```typescript
// generated/inferred/app/bsky/actor/profile.ts
import type { Infer } from "prototypey";
import schema from "../../../../../samples/demo.json" with { type: "json" };

export type Profile = Infer<typeof schema>;
export const ProfileSchema = schema;
export function isProfile(v: unknown): v is Profile { ... }
```

**CLI Usage:**
```bash
# Build CLI
pnpm build:cli

# Generate from samples
pnpm codegen:samples

# Direct usage
node lib/cli/index.js gen-inferred ./generated/inferred './samples/*.json'
```

**Key Features:**
- Converts NSID to file paths: `app.bsky.feed.post` → `app/bsky/feed/post.ts`
- Generates minimal runtime code with type inference
- Auto-creates directory structure
- Skips invalid schemas gracefully
- Type guard functions for runtime checks
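
The NSID-to-path conversion and the export names seen in the generated files can be derived from the NSID alone. This is a rough sketch of that mapping; `namesForNsid` is a hypothetical helper, and the naming rule (capitalize the final segment) is inferred from the examples in this plan rather than taken from the implementation.

```typescript
// Hypothetical helper; naming rules are inferred from the examples in this plan.
export interface GeneratedNames {
  filePath: string; // output path relative to the output directory
  typeName: string; // exported type alias
  schemaName: string; // exported schema constant
  guardName: string; // exported type guard
}

export function namesForNsid(nsid: string): GeneratedNames {
  const segments = nsid.split(".");
  const last = segments[segments.length - 1];
  // Reject anything that cannot be an NSID: fewer than three segments or empty segments.
  if (segments.length < 3 || !last || segments.some((s) => s.length === 0)) {
    throw new Error(`Invalid NSID: ${nsid}`);
  }
  const typeName = last.charAt(0).toUpperCase() + last.slice(1);
  return {
    filePath: `${segments.join("/")}.ts`,
    typeName,
    schemaName: `${typeName}Schema`,
    guardName: `is${typeName}`,
  };
}
```

A caller that wants to skip invalid schemas instead of aborting can wrap the call in `try`/`catch` and continue with the next file.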

**Testing:**
- Successfully generated types from sample lexicons
- Runtime type guards work (tested with node)
- Schema imports work correctly with JSON modules

## ATProto Lexicon Background Research

### Official Tooling: @atproto/lex-cli

ATProto projects use **lexicon schemas** (JSON files) to define data structures, API endpoints, and event streams. These schemas are then automatically transformed into type-safe TypeScript code using the **@atproto/lex-cli** code generation tool.

#### Installation

```bash
npm install @atproto/lex-cli
```

#### Available Commands

- **`lex gen-api <outdir> <schemas...>`** - Generate TypeScript client API
- **`lex gen-server <outdir> <schemas...>`** - Generate TypeScript server API
- **`lex gen-ts-obj <schemas...>`** - Generate a TS file that exports an array of schemas
- **`lex gen-md <schemas...>`** - Generate markdown documentation
- **`lex new [options] <nsid> [outfile]`** - Create a new schema JSON file

#### Common Options

- **`--yes`** - Auto-confirm overwrites during generation

### Typical Project Structure

```
project-root/
├── lexicons/                    # Input: JSON schema definitions
│   ├── com/
│   │   └── atproto/
│   │       ├── repo/
│   │       │   ├── getRecord.json
│   │       │   └── createRecord.json
│   │       └── server/
│   │           └── defs.json
│   └── app/
│       └── bsky/
│           ├── feed/
│           │   └── post.json
│           └── richtext/
│               └── facet.json
├── src/
│   ├── client/                  # Output: Generated client code
│   │   └── types/
│   │       ├── com/
│   │       │   └── atproto/
│   │       │       └── repo/
│   │       │           └── getRecord.ts
│   │       └── app/
│   │           └── bsky/
│   │               └── richtext/
│   │                   └── facet.ts
│   └── lexicon/                 # Output: Generated server code
└── package.json
```

### Naming Conventions

**NSIDs (Namespaced Identifiers)**:

- Format: Reverse-DNS + name (e.g., `com.atproto.repo.getRecord`)
- Domain authority: every segment except the last, here `com.atproto.repo` (reverse DNS of `repo.atproto.com`)
- Name segment: `getRecord`
- File path mirrors NSID: `lexicons/com/atproto/repo/getRecord.json`
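
To make the NSID structure concrete, here is a simplified parser. It assumes the rule from the AT Protocol NSID specification that the authority is every segment before the final name; the character check is deliberately looser than the full grammar, and `parseNsid` is a hypothetical name.

```typescript
// Simplified NSID parser; the full grammar in the AT Protocol spec is stricter.
export interface ParsedNsid {
  authority: string; // domain authority in normal DNS order
  name: string; // final segment
  lexiconPath: string; // conventional location of the schema JSON
}

export function parseNsid(nsid: string): ParsedNsid {
  const segments = nsid.split(".");
  // An NSID needs at least two authority segments plus a name.
  if (segments.length < 3 || segments.some((s) => !/^[a-zA-Z0-9-]+$/.test(s))) {
    throw new Error(`Invalid NSID: ${nsid}`);
  }
  const name = segments[segments.length - 1] as string;
  // Reverse the authority segments to recover the DNS name.
  const authority = segments.slice(0, -1).reverse().join(".");
  return { authority, name, lexiconPath: `lexicons/${segments.join("/")}.json` };
}
```

For `com.atproto.repo.getRecord` this yields the name `getRecord` and the path `lexicons/com/atproto/repo/getRecord.json`, matching the convention above.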

**Definition Naming**:

- Records: Single nouns, not pluralized (e.g., `post`, `profile`)
- XRPC methods: verbNoun format (e.g., `getProfile`, `createRecord`)
- Shared definitions: Use `*.defs` lexicons (e.g., `com.atproto.server.defs`)

### Generated TypeScript Code Structure

The generated TypeScript file includes:

1. **TypeScript Interfaces** with explicit `$type` properties
2. **Type Guard Functions** (`is*`) for runtime type checking
3. **Validation Functions** (`validate*`) for schema validation

Example:

```typescript
/**
 * GENERATED CODE - DO NOT MODIFY
 */
import { ValidationResult, BlobRef } from '@atproto/lexicon'
import { lexicons } from '../../../../lexicons'
import { isObj, hasProp } from '../../../../util'
import { CID } from 'multiformats/cid'

export interface Main {
  $type?: 'app.bsky.richtext.facet'
  index: ByteSlice
  features: (Mention | Link | Tag | { $type: string; [k: string]: unknown })[]
  [k: string]: unknown
}

export function isMain(v: unknown): v is Main {
  return (
    isObj(v) &&
    hasProp(v, '$type') &&
    (v.$type === 'app.bsky.richtext.facet#main' ||
      v.$type === 'app.bsky.richtext.facet')
  )
}

export function validateMain(v: unknown): ValidationResult {
  return lexicons.validate('app.bsky.richtext.facet#main', v)
}
```

### Build Scripts & Integration

Example `package.json` scripts:

```json
{
  "scripts": {
    "codegen": "lex gen-api --yes ./src/client ../../lexicons/com/atproto/*/* ../../lexicons/app/bsky/*/*",
    "build": "tsc --build tsconfig.build.json"
  },
  "devDependencies": {
    "@atproto/lex-cli": "^0.9.1"
  }
}
```

### Best Practices

1. **Use reverse-DNS NSIDs** for your domain (e.g., `com.example.*`)
2. **Group related schemas** by namespace hierarchy
3. **Create `*.defs` lexicons** for shared definitions used across multiple schemas
4. **Store lexicons in `/lexicons` directory** at repository root
5. **Mirror NSID structure in filesystem** (e.g., `lexicons/com/example/thing.json`)
6. **Run codegen before build** in your npm scripts
7. **Generate to predictable directories** (e.g., `./src/client`, `./src/lexicon`)

### Schema Evolution Rules

1. **New fields must be optional** to maintain backward compatibility
2. **Cannot remove non-optional fields** without breaking changes
3. **Cannot change field types** without creating new lexicon
4. **Cannot rename fields** - must deprecate and add new field
5. **Breaking changes require new NSID** (e.g., `v2` suffix)
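
The rules above can be checked mechanically by comparing two versions of an object definition. The sketch below is illustrative only: `ObjectDef` is a simplified stand-in for a lexicon object definition, and `findBreakingChanges` is a hypothetical helper, not part of any atproto package.

```typescript
// Simplified object definition: only the parts these rules need.
interface ObjectDef {
  required?: string[];
  properties: Record<string, { type: string }>;
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

export function findBreakingChanges(prev: ObjectDef, next: ObjectDef): string[] {
  const problems: string[] = [];
  const prevRequired = new Set(prev.required ?? []);
  const nextRequired = new Set(next.required ?? []);

  for (const [field, def] of Object.entries(prev.properties)) {
    const updated = hasOwn(next.properties, field) ? next.properties[field] : undefined;
    if (updated === undefined) {
      // Rules 2 and 4: a removed required field breaks old data
      // (a rename shows up here as a removal plus an addition).
      if (prevRequired.has(field)) problems.push(`removed required field "${field}"`);
    } else if (updated.type !== def.type) {
      // Rule 3: field types are frozen.
      problems.push(`changed type of "${field}" from ${def.type} to ${updated.type}`);
    }
  }
  for (const field of Object.keys(next.properties)) {
    // Rule 1: fields that did not exist before must be optional.
    if (!hasOwn(prev.properties, field) && nextRequired.has(field)) {
      problems.push(`new field "${field}" must be optional`);
    }
  }
  return problems;
}
```

An empty result means the change fits rules 1 to 4; any reported problem means the new version should be published under a new NSID, as rule 5 states.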

### Type Categories in Lexicons

#### Primary Types (one per file)

- **record** - Repository-stored objects
- **query** - XRPC HTTP GET endpoints
- **procedure** - XRPC HTTP POST endpoints
- **subscription** - WebSocket event streams
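
A generator often needs to know which primary type a schema declares before deciding what to emit. The sketch below assumes the usual lexicon document layout, where the primary definition lives under `defs.main`; `primaryTypeOf` is a hypothetical helper.

```typescript
const PRIMARY_TYPES = ["record", "query", "procedure", "subscription"] as const;
type PrimaryType = (typeof PRIMARY_TYPES)[number];

// Returns the primary type declared by `defs.main`, or undefined when the
// document has no primary definition (for example a `*.defs` lexicon).
export function primaryTypeOf(doc: unknown): PrimaryType | undefined {
  if (typeof doc !== "object" || doc === null || !("defs" in doc)) return undefined;
  const defs = doc.defs;
  if (typeof defs !== "object" || defs === null || !("main" in defs)) return undefined;
  const main = defs.main;
  if (typeof main !== "object" || main === null || !("type" in main)) return undefined;
  const declared = main.type;
  // Non-primary types such as "object" fall through to undefined.
  return PRIMARY_TYPES.find((t) => t === declared);
}
```

Because the input is typed as `unknown`, the function is safe to call on arbitrary parsed JSON, which fits a CLI that has to skip files that are not lexicons.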

#### Field Types

- **Primitives**: null, boolean, integer, string, bytes
- **Special**: cid-link, blob, unknown
- **Structures**: array, object, params
- **References**: ref, union, token

### Real-World Examples

- **Official ATProto Repository**: https://github.com/bluesky-social/atproto
  - Lexicons: `/lexicons/com/atproto/*`, `/lexicons/app/bsky/*`
  - Generated Client: `/packages/api/src/client/`
  - Generated Server: `/packages/pds/src/lexicon/`

## Next Steps

Phases 1 and 2 (the CLI and inferred code generation, prototypey's core differentiator) are complete. Continue with **Phase 3**: Track A, standard generation delegated to `@atproto/lex-cli`.
+26
generated/inferred/app/bsky/actor/defs.ts
// Generated by prototypey - DO NOT EDIT
// Source: app.bsky.actor.defs
import type { Infer } from "prototypey";
import schema from "../../../../../samples/actor-namespace.json" with { type: "json" };

/**
 * Type-inferred from lexicon schema: app.bsky.actor.defs
 */
export type Defs = Infer<typeof schema>;

/**
 * The lexicon schema object
 */
export const DefsSchema = schema;

/**
 * Type guard to check if a value is a Defs
 */
export function isDefs(v: unknown): v is Defs {
  return (
    typeof v === "object" &&
    v !== null &&
    "$type" in v &&
    v.$type === "app.bsky.actor.defs"
  );
}
+26
generated/inferred/app/bsky/actor/profile.ts
// Generated by prototypey - DO NOT EDIT
// Source: app.bsky.actor.profile
import type { Infer } from "prototypey";
import schema from "../../../../../samples/profile-namespace.json" with { type: "json" };

/**
 * Type-inferred from lexicon schema: app.bsky.actor.profile
 */
export type Profile = Infer<typeof schema>;

/**
 * The lexicon schema object
 */
export const ProfileSchema = schema;

/**
 * Type guard to check if a value is a Profile
 */
export function isProfile(v: unknown): v is Profile {
  return (
    typeof v === "object" &&
    v !== null &&
    "$type" in v &&
    v.$type === "app.bsky.actor.profile"
  );
}
+23
generated/test-generated-types.ts
// Test file to verify generated types work correctly
import type { Profile } from "./inferred/app/bsky/actor/profile.js";
import type { Defs } from "./inferred/app/bsky/actor/defs.js";
import { isProfile } from "./inferred/app/bsky/actor/profile.js";
import { isDefs } from "./inferred/app/bsky/actor/defs.js";

// Test that the types are inferred correctly
const profile: Profile = {
  $type: "app.bsky.actor.profile",
  displayName: "Tyler",
  description: "Building cool stuff",
};

// Test type guard
const unknownValue: unknown = { $type: "app.bsky.actor.defs" };
if (isDefs(unknownValue)) {
  // Type should be narrowed to Defs
  const defs: Defs = unknownValue;
  console.log("Is Defs:", defs.$type);
}

console.log("Types work! ✓");
console.log("Profile:", profile);