# Tagged String

Extract structured data from strings using tag-based syntax. Zero dependencies, runs natively on Node.js v24+.

```typescript
import { TaggedStringParser } from 'tagged-string';

const parser = new TaggedStringParser();
const result = parser.parse('[operation:deploy] started with [changes:5] to [stack:prod-stack]');

console.log(result.entities);
// [
//   { type: 'operation', value: 'deploy', parsedValue: 'deploy', inferredType: 'string', ... },
//   { type: 'changes', value: '5', parsedValue: 5, inferredType: 'number', ... },
//   { type: 'stack', value: 'prod-stack', parsedValue: 'prod-stack', inferredType: 'string', ... }
// ]
```

The library focuses on parsing - extracting typed entities from plain strings. Generating tagged strings is trivial (just string interpolation); however, there is a `TaggedStringGenerator` class available as a reference or parent for your own implementation.

## Installation

```bash
npm install tagged-string
```

Requires Node.js v24 or later for native TypeScript support.

## Usage

### Basic Parsing

The parser extracts `[type:value]` tags from strings and automatically infers types:

```typescript
const parser = new TaggedStringParser();
const result = parser.parse('[count:42] items processed, [enabled:true] flag set');

result.entities.forEach(entity => {
  console.log(entity.type, entity.parsedValue, entity.inferredType);
});
// count 42 number
// enabled true boolean
```

### Schema-Based Parsing

Define a schema to enforce types and add formatters:

```typescript
const parser = new TaggedStringParser({
  schema: {
    operation: { type: 'string', format: (v) => v.toUpperCase() },
    changes: { type: 'number', format: (n) => `${n} changes` },
    stack: 'string', // shorthand without formatter
  }
});

const result = parser.parse('[operation:deploy] started with [changes:5] to [stack:prod-stack]');
console.log(result.format());
// "DEPLOY started with 5 changes to prod-stack"
```

### Filtering Entities

```typescript
const result = parser.parse('[action:create] [resource:function] with [count:3] instances');

result.getEntitiesByType('action'); // [{ type: 'action', parsedValue: 'create', ... }]
result.getAllTypes(); // ['action', 'resource', 'count']
```

### Custom Delimiters

Configure the parser to use different delimiters:

```typescript
const parser = new TaggedStringParser({
  openDelimiter: '{{',
  closeDelimiter: '}}',
  typeSeparator: '=',
  schema: {
    user: { type: 'string', format: (v) => `@${v}` }
  }
});

const result = parser.parse('User {{user=john}} performed {{count=10}} actions');
console.log(result.format());
// "User @john performed 10 actions"
```

### Delimiter-Free Mode

Parse key-value patterns without surrounding delimiters:

```typescript
const parser = new TaggedStringParser({
  delimiters: false, // or delimiters: []
  typeSeparator: '=' // default is ':', use '=' for key=value syntax
});

const result = parser.parse('order=1337 was placed with status=pending');
console.log(result.entities);
// [
//   { type: 'order', value: '1337', parsedValue: 1337, inferredType: 'number', ... },
//   { type: 'status', value: 'pending', parsedValue: 'pending', inferredType: 'string', ... }
// ]
```

Delimiter-free mode extracts `key=value` or `key:value` patterns from natural language strings. Entities are identified by whitespace boundaries.

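The whitespace-boundary rule can be illustrated with a minimal sketch. This is not the library's implementation: `extractPairs` and `Pair` are hypothetical names introduced here, and the sketch deliberately ignores quoted strings, escapes, and type inference. It only shows the idea of splitting on whitespace and keeping tokens that have a non-empty key and a non-empty value around the separator.

```typescript
// Hypothetical helper for illustration only; not part of tagged-string.
interface Pair {
  type: string;
  value: string;
}

function extractPairs(message: string, separator = '='): Pair[] {
  const pairs: Pair[] = [];
  // Tokens are bounded by runs of whitespace.
  for (const token of message.split(/\s+/)) {
    const index = token.indexOf(separator);
    // Skip tokens with no separator, an empty key, or an empty value.
    if (index <= 0 || index === token.length - separator.length) continue;
    pairs.push({
      type: token.slice(0, index),
      value: token.slice(index + separator.length),
    });
  }
  return pairs;
}

const pairs = extractPairs('order=1337 was placed with status=pending');
// pairs holds { type: 'order', value: '1337' } and { type: 'status', value: 'pending' }
```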
**Unified delimiter configuration:**

```typescript
// Delimiter-free mode
new TaggedStringParser({ delimiters: false })
new TaggedStringParser({ delimiters: [] })

// Delimited mode with custom delimiters
new TaggedStringParser({ delimiters: ['{{', '}}'] })

// Backward compatible (still works)
new TaggedStringParser({ openDelimiter: '[', closeDelimiter: ']' })
```

### Quoted Strings

Use double quotes to include spaces and special characters in keys or values:

```typescript
const parser = new TaggedStringParser({
  delimiters: false,
  typeSeparator: '='
});

// Quoted values preserve spaces
const result1 = parser.parse('order="number 42" was placed');
console.log(result1.entities[0].value); // "number 42"

// Quoted keys allow spaces in type names
const result2 = parser.parse('"store order"=1337 was cancelled');
console.log(result2.entities[0].type); // "store order"

// Works in delimited mode too
const parser2 = new TaggedStringParser({
  delimiters: ['[', ']'],
  typeSeparator: '='
});
const result3 = parser2.parse('Server ["linux server"=home] is running');
console.log(result3.entities[0].type); // "linux server"
```

### Escape Sequences

Use backslash to include literal quotes within quoted strings:

```typescript
const parser = new TaggedStringParser({
  delimiters: false,
  typeSeparator: '='
});

// Escape quotes with \"
const result1 = parser.parse('msg="say \\"hello\\"" was sent');
console.log(result1.entities[0].value); // 'say "hello"'

// Escape backslashes with \\
const result2 = parser.parse('path="C:\\\\Users\\\\file.txt" opened');
console.log(result2.entities[0].value); // 'C:\Users\file.txt'
```

**Supported escape sequences:**
- `\"` → literal double quote (`"`)
- `\\` → literal backslash (`\`)

Escape sequences only work inside quoted strings.
Backslashes in unquoted text are treated as literal characters.

## API

### `TaggedStringParser`

```typescript
constructor(config?: ParserConfig)
```

**Config options:**
- `delimiters` - Unified delimiter configuration:
  - `false` or `[]` - Enable delimiter-free mode (parse `key=value` patterns)
  - `[open, close]` - Use specified delimiters (e.g., `['{{', '}}']`)
  - If omitted, uses `openDelimiter` and `closeDelimiter` options
- `openDelimiter` (default: `'['`) - Opening tag delimiter (legacy, use `delimiters` instead)
- `closeDelimiter` (default: `']'`) - Closing tag delimiter (legacy, use `delimiters` instead)
- `typeSeparator` (default: `':'`) - Separator between type and value
- `schema` - Entity type definitions with optional formatters

```typescript
parse(message: string): ParseResult
```

Extracts all tagged entities from the message.

### `ParseResult`

**Properties:**
- `originalMessage: string` - The input message
- `entities: Entity[]` - Extracted entities in order

**Methods:**
- `getEntitiesByType(type: string): Entity[]` - Filter entities by type
- `getAllTypes(): string[]` - Get unique entity types
- `format(): string` - Reconstruct message with formatted values

### `Entity`

```typescript
interface Entity {
  type: string; // Entity type name (can contain spaces if quoted)
  value: string; // Raw string value (quotes removed, escapes processed)
  parsedValue: string | number | boolean; // Typed value
  formattedValue: string; // Formatted display value
  inferredType: 'string' | 'number' | 'boolean';
  position: number; // Start position in message
  endPosition: number; // End position in message
}
```

### `EntitySchema`

```typescript
type EntitySchema = Record<string, PrimitiveType | EntityDefinition>;

interface EntityDefinition {
  type: 'string' | 'number' | 'boolean';
  format?: (value: any) => string;
}
```

## Type Inference

Without a schema, the parser infers types automatically:

- **number**: Matches `/^-?\d+(\.\d+)?$/` (integers and decimals)
- **boolean**: `'true'` or `'false'` (case-insensitive)
- **string**: Everything else

## Error Handling

The parser is lenient by design:

**Delimited mode:**
- Malformed tags are skipped
- Unclosed tags at end of string are ignored
- Empty tag content is skipped

**Delimiter-free mode:**
- Unclosed quoted strings are skipped
- Keys without values are skipped
- Empty keys or values are skipped
- Malformed patterns are ignored, parsing continues

**Configuration errors** (throw on construction):
- Invalid `delimiters` value
- Empty delimiter strings in delimited mode
- Same open and close delimiters

## Complete Examples

### Delimited Mode (Traditional)

```typescript
const parser = new TaggedStringParser({
  schema: {
    operation: { type: 'string', format: (v) => v.toUpperCase() },
    count: { type: 'number', format: (n) => `${n} items` }
  }
});

const result = parser.parse('[operation:deploy] completed with [count:42]');
console.log(result.format());
// "DEPLOY completed with 42 items"
```

### Delimiter-Free Mode

```typescript
const parser = new TaggedStringParser({
  delimiters: false,
  typeSeparator: '=',
  schema: {
    order: 'number',
    status: 'string'
  }
});

const result = parser.parse('Processing order=1337 with status=pending');
console.log(result.entities);
// [
//   { type: 'order', parsedValue: 1337, inferredType: 'number', ... },
//   { type: 'status', parsedValue: 'pending', inferredType: 'string', ... }
// ]
```

### Quoted Strings with Spaces

```typescript
const parser = new TaggedStringParser({
  delimiters: false,
  typeSeparator: '='
});

// Quoted values
const msg1 = 'description="high priority task" created';
const result1 = parser.parse(msg1);
console.log(result1.entities[0].value); // "high priority task"

// Quoted keys
const msg2 = '"user name"=john logged in';
const result2 = parser.parse(msg2);
console.log(result2.entities[0].type); // "user name"

// Both quoted
const msg3 = '"store order"="order number 42" processed';
const result3 = parser.parse(msg3);
console.log(result3.entities[0].type); // "store order"
console.log(result3.entities[0].value); // "order number 42"
```

### Escape Sequences

```typescript
const parser = new TaggedStringParser({
  delimiters: false,
  typeSeparator: '='
});

// Escaped quotes
const msg1 = 'message="She said \\"hello\\"" sent';
const result1 = parser.parse(msg1);
console.log(result1.entities[0].value); // 'She said "hello"'

// Escaped backslashes
const msg2 = 'path="C:\\\\Program Files\\\\app.exe" opened';
const result2 = parser.parse(msg2);
console.log(result2.entities[0].value); // 'C:\Program Files\app.exe'

// Mixed escapes
const msg3 = 'text="Line 1\\nLine 2" contains \\n';
const result3 = parser.parse(msg3);
console.log(result3.entities[0].value); // 'Line 1\nLine 2'
```

### Mixed Mode Usage

```typescript
// Delimited mode for structured logs
const delimitedParser = new TaggedStringParser({
  delimiters: ['[', ']'],
  schema: { level: 'string', code: 'number' }
});

// Delimiter-free for natural language
const freeParser = new TaggedStringParser({
  delimiters: false,
  typeSeparator: '=',
  schema: { order: 'number', user: 'string' }
});

const log = delimitedParser.parse('[level:ERROR] [code:500] Server error');
const event = freeParser.parse('User user=alice placed order=1337');

console.log(log.getEntitiesByType('level')[0].value); // "ERROR"
console.log(event.getEntitiesByType('order')[0].value); // "1337"
```

## Generating Tagged Strings

While you can create tagged strings with simple string interpolation, `TaggedStringGenerator` provides a reference implementation:

```typescript
import { TaggedStringGenerator } from 'tagged-string';

const generator = new TaggedStringGenerator();

// Simple tag generation
const tag = generator.tag('operation', 'deploy');
// "[operation:deploy]"

// Or just use template literals
const message = `[operation:deploy] started with [changes:${5}]`;
```

The generator is useful when you need to ensure delimiter consistency across a system.

### `TaggedStringGenerator`

```typescript
constructor(config?: GeneratorConfig)
```

**Config options:**
- `openDelimiter` (default: `'['`) - Opening tag delimiter
- `closeDelimiter` (default: `']'`) - Closing tag delimiter
- `typeSeparator` (default: `':'`) - Separator between type and value

```typescript
tag(type: string, value: unknown): string
```

Generates a single tagged entity. Values are converted to strings automatically.

```typescript
embed(message: string, type: string, value: unknown): string
```

Convenience method that concatenates a message with a generated tag.

## Examples

Run the included examples:

```bash
node src/examples.ts
```

## Development

```bash
npm test # Run tests
node src/examples.ts # Run examples
```

## License

MIT