An MCP server for Osprey
# SML Rules Guide for AI Agents

This document provides a comprehensive guide to writing SML (Some Madeup Language) rules in Osprey. SML is a statically-typed subset of Python designed for writing detection and classification rules.

## Table of Contents

1. [Overview](#overview)
2. [Basic Concepts](#basic-concepts)
3. [Rule Structure](#rule-structure)
4. [Data Types](#data-types)
5. [Operators](#operators)
6. [Built-in Functions (UDFs)](#built-in-functions-udfs)
7. [File Organization](#file-organization)
8. [Wiring Rules to Effects](#wiring-rules-to-effects)
9. [Null Handling](#null-handling-critical)
10. [Complete Examples](#complete-examples)
11. [Common Patterns](#common-patterns)
12. [Validation Rules](#validation-rules)

---

## Overview

SML rules are used to evaluate incoming action data and trigger effects like verdicts and labels. Key characteristics:

- **Statically typed**: All types are checked at validation time
- **Python-like syntax**: Familiar syntax with restrictions for safety
- **Stateless logic**: Rules define conditions, not procedures
- **Event-driven**: Rules evaluate against incoming action JSON data

---

## Basic Concepts

### What is a Rule?

A rule is a named boolean expression that evaluates conditions against action data:

```python
MyRule = Rule(
    when_all=[
        # List of conditions - ALL must be True for the rule to pass
        Condition1,
        Condition2,
    ],
    description='Human-readable description of what this rule detects'
)
```

### What is an Effect?

Effects are actions taken when rules pass. They are triggered through `WhenRules()`:

- `DeclareVerdict(verdict='reject')` - Returns a verdict to the caller
- `LabelAdd(entity=UserId, label='flagged')` - Adds a label to an entity
- `LabelRemove(entity=UserId, label='flagged')` - Removes a label from an entity

### What is an Entity?

Entities are typed identifiers (like User IDs, emails, IPs) that can have labels attached:

```python
UserId: Entity[str] = EntityJson(type='User', path='$.user_id')
```

---

## Rule Structure

### Basic Rule Definition

```python
RuleName = Rule(
    when_all=[
        # Conditions go here - ALL must be True
    ],
    description='Description string or f-string'
)
```

### Syntax Requirements

1. **Rule names must be non-local variables** (cannot start with `_`):
   ```python
   # Valid
   MyRule = Rule(...)

   # Invalid - will fail validation
   _MyRule = Rule(...)
   ```

2. **Descriptions must be string or f-string literals** (not variables):
   ```python
   # Valid
   description='Static description'
   description=f'User {UserId} triggered rule'

   # Invalid - will fail validation
   my_desc = 'description'
   description=my_desc
   ```

3. **when_all accepts a list of boolean conditions**:
   - All conditions must evaluate to True for the rule to pass
   - Conditions can be comparisons, function calls, other rules, or boolean values
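
Because SML is a subset of Python, the first two requirements can be illustrated with Python's standard `ast` module. The sketch below only mirrors the rules as stated in this guide; it is not Osprey's validator, and `check_rule_source` and its messages are hypothetical.

```python
import ast


def check_rule_source(source: str) -> list[str]:
    """Return problems found in `Name = Rule(...)` assignments.

    Hypothetical helper that mirrors the syntax requirements in this guide.
    """
    problems = []
    for node in ast.parse(source).body:
        # Only inspect plain assignments whose value is a call to Rule(...).
        if not (isinstance(node, ast.Assign) and isinstance(node.value, ast.Call)):
            continue
        call = node.value
        if not (isinstance(call.func, ast.Name) and call.func.id == 'Rule'):
            continue
        # Requirement 1: the rule name must not start with an underscore.
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id.startswith('_'):
                problems.append(f'{target.id}: rule name must not start with "_"')
        # Requirement 2: description must be a string or f-string literal.
        for kw in call.keywords:
            if kw.arg != 'description':
                continue
            is_str = isinstance(kw.value, ast.Constant) and isinstance(kw.value.value, str)
            is_fstr = isinstance(kw.value, ast.JoinedStr)
            if not (is_str or is_fstr):
                problems.append('description must be a string or f-string literal')
    return problems


print(check_rule_source("MyRule = Rule(when_all=[A], description='ok')"))  # []
```

The source is parsed but never executed, so names such as `A` or `UserId` do not need to exist for the check to run.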

---

## Data Types

### Basic Types

```python
# Integer
Count: int = JsonData(path='$.count')

# String
Name: str = JsonData(path='$.name')

# Boolean
IsActive: bool = JsonData(path='$.active')

# List
Items: list = JsonData(path='$.items')
```

### Entity Types

Entities are special types for identifiers that can have labels:

```python
# Entity with string ID
UserId: Entity[str] = EntityJson(
    type='User',
    path='$.user_id',
    coerce_type=True  # Optional: convert value to expected type
)

# Entity with integer ID
PostId: Entity[int] = EntityJson(
    type='Post',
    path='$.post_id'
)

# Manually created entity
MyEntity = Entity(type='MyType', id='some_value')
```

### Optional Types

For fields that may not exist:

```python
# Optional field - won't fail if missing
OptionalField: Optional[str] = JsonData(path='$.maybe_exists', required=False)
```
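
To picture what "may not exist" means for a path, here is a simplified Python model of a path lookup. It assumes only plain dotted paths such as `$.message.text` and uses `None` to stand in for Null; `lookup` is a hypothetical helper, not Osprey's `JsonData` implementation.

```python
def lookup(action: dict, path: str):
    """Resolve a dotted path like '$.message.text'; return None if a key is missing."""
    if not path.startswith('$.'):
        raise ValueError('path must start with "$."')
    value = action
    for key in path[2:].split('.'):
        if not isinstance(value, dict) or key not in value:
            return None  # stands in for SML's Null
        value = value[key]
    return value


action = {'user_id': 'u1', 'message': {'text': 'hello'}}
print(lookup(action, '$.message.text'))   # hello
print(lookup(action, '$.maybe_exists'))   # None
```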

---

## Operators

### Comparison Operators

```python
Value == 5              # Equals
Value != 5              # Not equals
Value > 5               # Greater than
Value >= 5              # Greater than or equal
Value < 5               # Less than
Value <= 5              # Less than or equal
Value in [1, 2, 3]      # In list
Value not in [1, 2, 3]  # Not in list
```

### Arithmetic Operators

```python
5 + 3   # Addition
5 - 3   # Subtraction
5 * 3   # Multiplication
5 / 3   # Division
5 // 3  # Floor division
5 % 3   # Modulo
5 ** 3  # Power
```

### Boolean Operators

```python
Condition1 and Condition2  # Logical AND
Condition1 or Condition2   # Logical OR
not Condition1             # Logical NOT

# In when_all, conditions are implicitly AND-ed:
Rule(when_all=[
    Cond1,
    Cond2,  # Both must be True
])

# Use 'or' for explicit OR logic:
Rule(when_all=[
    (Cond1 or Cond2),  # Either Cond1 or Cond2
    Cond3,             # AND Cond3
])
```

### Null Checking

```python
Value != Null  # Check if value is NOT null
Value == Null  # Check if value IS null
```

---

## File Organization

### Import - Include Other Files

Use `Import` to include rules/features from other files:

```python
Import(rules=[
    'models/base.sml',
    'models/user.sml',
    'rules/common.sml',
])
```

**Requirements:**
- File paths must be relative
- List must be **lexicographically sorted**
- No duplicates allowed
- Imported variables/rules are accessible in current file
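
The first three requirements are easy to check before submitting a file. The following Python sketch is a hypothetical pre-check (`check_import_list` is not part of Osprey), and it assumes that "lexicographically sorted" matches Python's default string ordering.

```python
import posixpath


def check_import_list(paths: list[str]) -> list[str]:
    """Return the Import requirements that a list of paths violates."""
    problems = []
    if any(posixpath.isabs(p) for p in paths):
        problems.append('paths must be relative')
    if len(set(paths)) != len(paths):
        problems.append('duplicates are not allowed')
    if paths != sorted(paths):
        problems.append('list must be lexicographically sorted')
    return problems


print(check_import_list(['models/base.sml', 'models/user.sml', 'rules/common.sml']))  # []
print(check_import_list(['rules/common.sml', 'models/base.sml']))
# ['list must be lexicographically sorted']
```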

### Require - Conditionally Include Files

Use `Require` for conditional or template-based includes:

```python
# Always include
Require(rule='expensive_check.sml')

# Conditional include
Require(rule='ai_check.sml', require_if=ActionName == 'register')

# Template-based (f-string)
Require(rule=f'actions/{ActionName}.sml')
```

**Note:** Unlike Import, outputs from Required files are NOT accessible in the parent file.

### Typical File Structure

```
rules/
├── main.sml                 # Entry point
├── models/
│   ├── base.sml             # Common entities (UserId, etc.)
│   ├── user.sml             # User-related features
│   └── content.sml          # Content-related features
├── rules/
│   ├── spam.sml             # Spam detection rules
│   └── abuse.sml            # Abuse detection rules
└── actions/
    ├── register.sml         # Action-specific rules
    └── send_message.sml
```

---

## Wiring Rules to Effects

Rules by themselves don't do anything. Use `WhenRules()` to connect rules to effects:

```python
WhenRules(
    rules_any=[
        Rule1,
        Rule2,
        Rule3,
    ],
    then=[
        DeclareVerdict(verdict='reject'),
        LabelAdd(entity=UserId, label='flagged'),
    ],
)
```

**Semantics:**
- If **ANY** rule in `rules_any` evaluates to True, **ALL** effects in `then` execute
- Failed rules don't prevent other rules from being checked
- Failed effects don't prevent other effects from executing

### Conditional Effects

Use `apply_if` to make individual effects conditional:

```python
WhenRules(
    rules_any=[MainRule],
    then=[
        LabelAdd(entity=UserId, label='basic_flag'),
        LabelAdd(entity=UserId, label='severe_flag', apply_if=SevereRule),
        LabelAdd(entity=UserId, label='repeat_offender', apply_if=RepeatOffenderRule),
    ],
)
```
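
The combination of `rules_any` and `apply_if` can be modeled in a few lines of plain Python. This is a sketch of the semantics described in this section, not Osprey's engine: `Effect` and `when_rules` are hypothetical names, rule results are passed in as already-evaluated booleans, and Null-valued conditions are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Effect:
    name: str
    apply_if: bool = True  # True when no apply_if condition is given


def when_rules(rules_any: list, then: list) -> list:
    """Return the names of the effects that would run."""
    # Nothing runs unless at least one rule evaluated to True.
    if not any(result is True for result in rules_any):
        return []
    # Unconditional effects always run; conditional ones need apply_if to be True.
    return [effect.name for effect in then if effect.apply_if is True]


effects = [
    Effect('basic_flag'),
    Effect('severe_flag', apply_if=False),      # SevereRule did not pass
    Effect('repeat_offender', apply_if=True),   # RepeatOffenderRule passed
]
print(when_rules([True], effects))   # ['basic_flag', 'repeat_offender']
print(when_rules([False], effects))  # []
```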

---

## Null Handling (CRITICAL)

SML has unique null semantics that differ from most languages. **Understanding this is critical.**

### Null Propagation Rule

If a value evaluates to Null:
1. The containing rule evaluates to **Null** (not False!)
2. Any rule depending on it also becomes **Null**
3. This propagates through the entire dependency chain

### Example of Null Propagation

```python
# If $.missing_property doesn't exist...
Thing: int = JsonData(path='$.missing_property')

# This rule becomes Null (NOT False)
MyRule = Rule(when_all=[
    Thing > 1,
])

# This rule ALSO becomes Null (propagates!)
DependentRule = Rule(when_all=[
    MyRule,
])
```
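
The same chain can be modeled in plain Python, with `None` standing in for Null. This is only an illustrative sketch, not how Osprey evaluates rules: `rule` and `greater_than` are hypothetical helpers, and the sketch assumes a Null condition always makes the rule Null, without modeling guards such as `Thing != Null`.

```python
def greater_than(value, threshold):
    """Comparison that yields None (Null) when its input is None."""
    return None if value is None else value > threshold


def rule(when_all):
    """Evaluate a list of conditions; any None makes the whole rule None."""
    if any(cond is None for cond in when_all):
        return None
    return all(when_all)


thing = None                                # '$.missing_property' was absent
my_rule = rule([greater_than(thing, 1)])    # None, not False
dependent_rule = rule([my_rule])            # None as well
print(my_rule, dependent_rule)              # None None
```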

### Solutions to Null Issues

**Solution 1: Use `required=False`**
```python
Thing: Optional[int] = JsonData(path='$.maybe_exists', required=False)
```

**Solution 2: Explicit null checks**
```python
SafeRule = Rule(when_all=[
    Thing != Null,  # Guard against null
    Thing > 1,
])
```

**Solution 3: Use ResolveOptional**
```python
SafeThing: int = ResolveOptional(
    optional_value=MaybeThing,
    default_value=0
)
```
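
As a rough mental model of Solution 3, the Python function below substitutes a default only when the value is missing. It assumes `ResolveOptional` behaves this way and uses `None` for Null; `resolve_optional` is a hypothetical stand-in, not the real UDF.

```python
def resolve_optional(optional_value, default_value):
    """Return default_value only when optional_value is None (Null)."""
    return default_value if optional_value is None else optional_value


print(resolve_optional(None, 0))  # 0
print(resolve_optional(7, 0))     # 7
print(resolve_optional(0, 5))     # 0  (a present but falsy value is kept)
```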

---

## Complete Examples

### Example 1: Basic Spam Detection

```python
# models/base.sml
UserId: Entity[str] = EntityJson(
    type='User',
    path='$.user_id',
    coerce_type=True
)

MessageText: str = JsonData(path='$.message.text')
EventType: str = JsonData(path='$.event_type')

# rules/spam.sml
Import(rules=['models/base.sml'])

MessageLength = StringLength(s=MessageText)

ContainsSpamWords = RegexMatch(
    target=MessageText,
    pattern=r'(free money|click here|buy now)',
    case_insensitive=True
)

SpamMessage = Rule(
    when_all=[
        EventType == 'send_message',
        ContainsSpamWords,
        MessageLength > 50,
    ],
    description=f'Spam detected from user {UserId}'
)

WhenRules(
    rules_any=[SpamMessage],
    then=[
        DeclareVerdict(verdict='reject'),
        LabelAdd(entity=UserId, label='spammer', expires_after=TimeDelta(days=30)),
    ],
)
```

### Example 2: New Account Risk Detection

```python
# models/user.sml
Import(rules=['models/base.sml'])

AccountCreatedAt: str = JsonData(path='$.user.created_at')
AccountAge = TimeSince(timestamp=AccountCreatedAt)
IsNewAccount = AccountAge < TimeDelta(days=7)

EmailAddress: Entity[str] = EntityJson(
    type='Email',
    path='$.user.email',
    coerce_type=True
)

EmailDomainStr = EmailDomain(email=JsonData(path='$.user.email'))

# rules/new_account.sml
Import(rules=['models/base.sml', 'models/user.sml'])

IsSuspiciousEmailDomain = EmailDomainStr in ['tempmail.com', 'throwaway.net']

HighRiskNewAccount = Rule(
    when_all=[
        IsNewAccount,
        IsSuspiciousEmailDomain,
        not HasLabel(entity=UserId, label='verified'),
    ],
    description=f'High-risk new account: {UserId} using {EmailAddress}'
)

WhenRules(
    rules_any=[HighRiskNewAccount],
    then=[
        DeclareVerdict(verdict='challenge'),
        LabelAdd(entity=UserId, label='needs_verification'),
        LabelAdd(entity=EmailAddress, label='suspicious_domain'),
    ],
)
```

### Example 3: Multi-Tier Detection

```python
Import(rules=['models/base.sml', 'models/user.sml'])

# Tier 1: Basic suspicious activity
BasicSuspicious = Rule(
    when_all=[
        HasLabel(entity=UserId, label='previously_warned'),
        EventType == 'create_post',
    ],
    description=f'Previously warned user {UserId} creating content'
)

# Tier 2: Escalated risk
EscalatedRisk = Rule(
    when_all=[
        BasicSuspicious,
        HasLabel(entity=UserId, label='multiple_violations'),
    ],
    description=f'Repeat offender {UserId} detected'
)

# Tier 3: Severe risk
SevereRisk = Rule(
    when_all=[
        EscalatedRisk,
        IsNewAccount,
    ],
    description=f'Severe risk: new repeat offender {UserId}'
)

WhenRules(
    rules_any=[BasicSuspicious, EscalatedRisk, SevereRisk],
    then=[
        # Always apply basic flag
        LabelAdd(entity=UserId, label='flagged'),
        # Conditional escalations
        LabelAdd(entity=UserId, label='review_queue', apply_if=EscalatedRisk),
        DeclareVerdict(verdict='reject', apply_if=SevereRisk),
    ],
)
```

---

## Common Patterns

### Pattern 1: Safe Field Access

```python
# For potentially missing fields, use required=False
MaybeField: Optional[str] = JsonData(path='$.optional.field', required=False)

# Then check for null before using
SafeRule = Rule(when_all=[
    MaybeField != Null,
    StringLength(s=MaybeField) > 10,
])
```

### Pattern 2: Action-Specific Rules

```python
# main.sml
ActionName = GetActionName()
Require(rule=f'actions/{ActionName}.sml')
```

### Pattern 3: Reusable Feature Definitions

```python
# models/features.sml
MessageLength = StringLength(s=MessageText)
IsLongMessage = MessageLength > 500
IsShortMessage = MessageLength < 10
ContainsUrls = ListLength(list=StringExtractURLs(s=MessageText)) > 0

# rules/detection.sml
Import(rules=['models/features.sml'])

SuspiciousLongMessage = Rule(when_all=[
    IsLongMessage,
    ContainsUrls,
])
```

### Pattern 4: Label-Based State Machine

```python
# First offense
FirstOffense = Rule(when_all=[
    ViolatesPolicy,
    not HasLabel(entity=UserId, label='warned'),
])

# Second offense
SecondOffense = Rule(when_all=[
    ViolatesPolicy,
    HasLabel(entity=UserId, label='warned'),
    not HasLabel(entity=UserId, label='suspended'),
])

WhenRules(
    rules_any=[FirstOffense],
    then=[
        LabelAdd(entity=UserId, label='warned', expires_after=TimeDelta(days=30)),
    ],
)

WhenRules(
    rules_any=[SecondOffense],
    then=[
        LabelAdd(entity=UserId, label='suspended'),
        DeclareVerdict(verdict='reject'),
    ],
)
```
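
To see how the labels drive the progression in Pattern 4, the Python sketch below replays three violating events for one user. It is a simplified model under stated assumptions: `ViolatesPolicy` is True for every event, label expiry is ignored, labels added by one event are only visible to later events, and `handle_violation` is a hypothetical helper rather than anything Osprey provides.

```python
def handle_violation(labels: set) -> str:
    """Apply the two rules to a user's label set and return the verdict."""
    if 'warned' not in labels:
        # FirstOffense: add the warning label, no verdict is declared.
        labels.add('warned')
        return 'none'
    if 'suspended' not in labels:
        # SecondOffense: suspend the user and reject the action.
        labels.add('suspended')
        return 'reject'
    # Already suspended: neither rule matches.
    return 'none'


labels = set()
verdicts = [handle_violation(labels) for _ in range(3)]
print(verdicts)        # ['none', 'reject', 'none']
print(sorted(labels))  # ['suspended', 'warned']
```

Note that once a user is suspended, neither rule fires, so a further rule would be needed if suspended users should keep being rejected.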

---

## Validation Rules

SML validates rules at compile time. Common validation errors:

| Error | Cause | Fix |
|-------|-------|-----|
| "rules must be stored in non-local features" | Rule name starts with `_` | Remove underscore prefix |
| "use local features when possible" | Feature that isn't useful to a moderator or evaluator in the UI does not start with `_` | Add underscore prefix |
| "requires either a string literal or an f-string" | Using a variable for description | Use string or f-string literal |
| "import rules are not sorted" | Import list not alphabetized | Sort imports lexicographically |
| "imported file not found" | Invalid path in Import | Check file path exists |
| "unknown label" | Label not in config | Add label to labels config |
| "invalid regex pattern" | Bad regex syntax | Fix regex pattern |

---

## Quick Reference

### Rule Definition
```python
RuleName = Rule(
    when_all=[conditions],
    description='string'  # or f'f-string with {Variable}'
)
```

### Effect Wiring
```python
WhenRules(
    rules_any=[Rule1, Rule2],
    then=[Effect1, Effect2],
)
```

### Null Safety Checklist
- [ ] Check if optional fields use `required=False`
- [ ] Add explicit `!= Null` checks before using potentially null values
- [ ] Consider using `ResolveOptional` for default values
- [ ] Test rules with missing data scenarios