sandbox/APPROVAL_SYSTEM.md at main

a digital entity named phi that roams bsky
zzstoatzz.io refactor: remove unused approval system, document for future reference 4mo ago
6c5db0b8
# approval system (deprecated)

## purpose

the approval system was designed to enable phi to modify itself through conditional operator permission. the idea: phi could take certain actions that would be executed only after the operator (nate) explicitly approved them.

## use case: self-modification

the primary motivation was **personality/identity editing through empirical learning**. for example:

1. phi observes through interactions that certain responses work better
2. phi proposes a modification to its personality file or core memories
3. this proposal is stored as an "approval request" in sqlite
4. the operator is notified (via bluesky thread or other channel)
5. operator reviews and approves/denies via some interface
6. if approved, phi applies the change to itself

## implementation (removed)

the system was implemented in `src/bot/database.py` (now removed) with:

### database schema
```sql
CREATE TABLE approval_requests (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    request_type TEXT NOT NULL,              -- e.g., "personality_edit", "memory_update"
    request_data TEXT NOT NULL,              -- JSON with the proposed change
    status TEXT NOT NULL DEFAULT 'pending',  -- 'pending', 'approved', 'denied', 'expired'
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    resolved_at TIMESTAMP,
    resolver_comment TEXT,
    applied_at TIMESTAMP,
    thread_uri TEXT,                         -- bluesky thread where request was made
    notified_at TIMESTAMP,                   -- when thread was notified of resolution
    operator_notified_at TIMESTAMP           -- when operator was notified of request
)
```

### api methods
- `create_approval_request(request_type, request_data, thread_uri)` - create new request
- `get_pending_approvals(include_notified=True)` - fetch pending requests
- `resolve_approval(approval_id, approved, comment)` - approve/deny
- `get_approval_by_id(approval_id)` - fetch specific request
- `mark_approval_notified(approval_id)` - mark thread notified
- `mark_operator_notified(approval_ids)` - mark operator notified
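
the original code is gone, so the following is only a minimal reconstruction sketch using python's stdlib `sqlite3`, not the removed implementation. it reuses the schema above and covers three of the listed methods. the explicit `conn` parameter, the `connect` helper, and the guess that `include_notified=False` filters on `operator_notified_at` are all assumptions made for illustration.

```python
import json
import sqlite3
from typing import Optional

SCHEMA = """
CREATE TABLE IF NOT EXISTS approval_requests (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    request_type TEXT NOT NULL,
    request_data TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    resolved_at TIMESTAMP,
    resolver_comment TEXT,
    applied_at TIMESTAMP,
    thread_uri TEXT,
    notified_at TIMESTAMP,
    operator_notified_at TIMESTAMP
)
"""


def connect(path: str = ":memory:") -> sqlite3.Connection:
    """hypothetical helper: open a database and make sure the table exists"""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # rows can be indexed by column name
    conn.execute(SCHEMA)
    return conn


def create_approval_request(
    conn: sqlite3.Connection,
    request_type: str,
    request_data: dict,
    thread_uri: Optional[str] = None,
) -> int:
    """insert a pending request and return its id"""
    cur = conn.execute(
        "INSERT INTO approval_requests (request_type, request_data, thread_uri) "
        "VALUES (?, ?, ?)",
        (request_type, json.dumps(request_data), thread_uri),
    )
    conn.commit()
    return cur.lastrowid


def get_pending_approvals(conn: sqlite3.Connection, include_notified: bool = True) -> list:
    """fetch pending requests, oldest first"""
    query = "SELECT * FROM approval_requests WHERE status = 'pending'"
    if not include_notified:
        # assumption: "notified" means the operator has already been told
        query += " AND operator_notified_at IS NULL"
    return conn.execute(query + " ORDER BY id").fetchall()


def resolve_approval(
    conn: sqlite3.Connection, approval_id: int, approved: bool, comment: str = ""
) -> bool:
    """approve or deny a pending request; returns False if it was not pending"""
    cur = conn.execute(
        "UPDATE approval_requests "
        "SET status = ?, resolved_at = CURRENT_TIMESTAMP, resolver_comment = ? "
        "WHERE id = ? AND status = 'pending'",
        ("approved" if approved else "denied", comment, approval_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

restricting the update to `status = 'pending'` means a request can only be resolved once, so a late or duplicate operator reply cannot flip an earlier decision.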

## why it was removed

the approval system was never integrated with the current MCP-based architecture. it was built for an earlier iteration of phi and became orphaned code (164 lines) during the refactor to pydanticai + MCP.

## future integration considerations

if we want to reintroduce self-modification with approval, here's how it could work with the current architecture:

### option 1: mcp tool for approval requests

create an MCP tool `request_operator_approval(action_type, proposal)` that:
1. stores the request in turbopuffer (not sqlite) with metadata
2. posts to a dedicated bluesky thread for operator review
3. operator replies with "approve" or "deny"
4. phi polls for operator's response and executes if approved

**pros:**
- uses existing memory infrastructure (turbopuffer)
- natural interface (bluesky threads)
- no additional database needed

**cons:**
- approval state is in turbopuffer, which is append-only
- need to poll bluesky threads for operator responses
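
one way to live with append-only storage is to never update a request in place and instead append events, deriving the current status by replaying them. the sketch below shows that idea in plain python. it does not touch turbopuffer: the `ApprovalEvent` type, the event kinds, and the transition table are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ApprovalEvent:
    approval_id: str
    kind: str         # "requested", "approved", "denied", or "applied"
    timestamp: float  # seconds since epoch, used only for ordering


# which event kinds may follow the current status
_TRANSITIONS = {
    None: {"requested"},
    "requested": {"approved", "denied"},
    "approved": {"applied"},
    "denied": set(),
    "applied": set(),
}


def current_status(events: Iterable[ApprovalEvent], approval_id: str) -> Optional[str]:
    """replay events in time order; events that are not a valid next step are ignored"""
    status = None
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.approval_id != approval_id:
            continue
        if event.kind in _TRANSITIONS[status]:
            status = event.kind
    return status
```

with this shape, a second decision arriving after the first (for example a "denied" after an "approved") is simply ignored rather than overwriting history.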

### option 2: dedicated approval service

build a separate service (fastapi endpoint or slack bot) that:
1. phi calls via MCP tool
2. service sends notification to operator (email, slack, webhook)
3. operator approves via web UI or slack command
4. service stores approval in postgres/sqlite
5. phi polls service for approval status

**pros:**
- clean separation of concerns
- flexible notification channels
- persistent approval history

**cons:**
- more infrastructure
- another service to run and maintain
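
step 5 (polling) is the part phi itself would own. a minimal sketch of that loop, assuming a hypothetical `fetch_status` coroutine that asks the service for a request's status and returns one of the status strings from the old schema:

```python
import asyncio
from typing import Awaitable, Callable


async def wait_for_decision(
    fetch_status: Callable[[], Awaitable[str]],  # hypothetical: queries the approval service
    interval: float = 30.0,   # seconds between polls
    timeout: float = 3600.0,  # give up after this many seconds
) -> str:
    """poll until the status leaves 'pending', or return 'expired' at the deadline"""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await fetch_status()
        if status != "pending":
            return status
        # stop if the next poll would land past the deadline
        if loop.time() + interval > deadline:
            return "expired"
        await asyncio.sleep(interval)
```

because the wait uses `asyncio.sleep`, other tasks keep running between polls. the default interval and timeout here are placeholders, not tuned values.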

### option 3: human-in-the-loop via pydanticai

use pydanticai's built-in human-in-the-loop features:
1. agent proposes action that requires approval
2. pydanticai pauses execution and waits for human input
3. operator provides approval via some interface
4. agent resumes and executes

**pros:**
- leverages pydanticai primitives
- minimal custom code

**cons:**
- unclear how this works with async/notification-driven architecture
- may require blocking operations
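
whether pydanticai's own features fit is left open here. as a library-independent illustration of the pause/resume idea in steps 2 to 4, the sketch below parks a coroutine on an `asyncio.Future` until a decision arrives, without blocking the event loop. this is not pydanticai's API, and `ApprovalGate` is a hypothetical name.

```python
import asyncio


class ApprovalGate:
    """holds one future per pending request so a coroutine can await its decision"""

    def __init__(self) -> None:
        self._pending: dict = {}

    async def wait(self, approval_id: str) -> bool:
        """suspend the caller until resolve() is called for this id"""
        future = asyncio.get_running_loop().create_future()
        self._pending[approval_id] = future
        try:
            return await future
        finally:
            self._pending.pop(approval_id, None)

    def resolve(self, approval_id: str, approved: bool) -> bool:
        """deliver a decision; returns False if nobody is waiting on this id"""
        future = self._pending.get(approval_id)
        if future is None or future.done():
            return False
        future.set_result(approved)
        return True
```

the notification handler would call `resolve` when the operator responds. the waiting state lives only in memory, so a restart loses every pending wait, which is one reason a persisted approach may suit a notification-driven bot better.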

## recommended approach

if we reintroduce this, i'd recommend **option 1** (mcp tool + turbopuffer):

```python
# in MCP server
# assumes `server`, `memory`, and `atproto` already exist at module level.
# `memory.store_approval_request` and `atproto.post` are proposed helpers
# that would still need to be written.
@server.tool()
async def request_operator_approval(
    action_type: str,    # "personality_edit", "memory_update", etc.
    proposal: str,       # description of what phi wants to do
    justification: str,  # why phi thinks this is a good idea
) -> str:
    """request operator approval for a self-modification action"""

    # store in turbopuffer with special namespace
    approval_id = await memory.store_approval_request(
        action_type=action_type,
        proposal=proposal,
        justification=justification,
    )

    # post the request for operator review; it only reaches the operator's
    # notifications if the post mentions them or replies in a thread they started
    await atproto.post(
        f"🤖 approval request #{approval_id}\n\n"
        f"action: {action_type}\n"
        f"proposal: {proposal}\n\n"
        f"justification: {justification}\n\n"
        f"reply 'approve' or 'deny'"
    )

    return f"approval request #{approval_id} submitted"
```

then in the notification handler, check for operator replies to approval threads and execute the approved action.
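
a minimal sketch of that handler logic, assuming the reply text and the author's DID are already extracted from the notification. `OPERATOR_DID`, `ACTIONS`, `parse_decision`, and `handle_reply` are hypothetical names, and the DID value is a placeholder.

```python
import re
from typing import Callable, Optional

OPERATOR_DID = "did:plc:operator-placeholder"  # placeholder, not a real DID

# accept "approve"/"approved" or "deny"/"denied" at the start of the reply
_DECISION = re.compile(r"^\s*(approved?|deny|denied)\b", re.IGNORECASE)

# maps action_type to the function that applies an approved proposal
ACTIONS: dict = {}


def parse_decision(reply_text: str, author_did: str) -> Optional[bool]:
    """True for approve, False for deny, None if this is not an operator decision"""
    if author_did != OPERATOR_DID:
        return None  # only the operator may decide
    match = _DECISION.match(reply_text)
    if match is None:
        return None
    return match.group(1).lower().startswith("approve")


def handle_reply(action_type: str, proposal: str, reply_text: str, author_did: str) -> str:
    """apply the proposal if the operator approved it; report what happened"""
    decision = parse_decision(reply_text, author_did)
    if decision is None:
        return "ignored"
    if not decision:
        return "denied"
    handler: Optional[Callable[[str], None]] = ACTIONS.get(action_type)
    if handler is None:
        return "no handler"
    handler(proposal)
    return "applied"
```

checking the author before parsing matters: anyone can reply "approve" in a public thread, so the decision has to be tied to the operator's identity rather than to the text alone.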

## examples of self-modification actions

what kinds of things might phi want operator approval for?

1. **personality edits** - "i notice people respond better when i'm more concise. can i add 'prefer brevity' to my guidelines?"

2. **capability expansion** - "i've been asked about weather 5 times this week. can i add a weather API tool?"

3. **memory pruning** - "i have 10,000 memories for @alice but most are low-value small talk. can i archive memories older than 30 days with low importance?"

4. **behavior changes** - "i'm getting rate limited on likes. can i reduce my like threshold from 0.7 to 0.8?"

5. **relationship updates** - "based on our conversations, i think @bob prefers technical depth over casual chat. can i update his user context?"

## philosophical notes

self-modification with approval is interesting because:

- it preserves operator agency (you control what phi becomes)
- it enables empirical learning (phi adapts based on real interactions)
- it creates a collaborative evolution (phi proposes, you decide)

but it also raises questions:

- what if phi proposes changes you don't understand?
- what if approval becomes a bottleneck (too many requests)?
- what if phi learns to game the approval system?

worth thinking through before reintroducing.

## references

- original implementation: `git log --all --grep="approval"` (if committed)
- related: `sandbox/void_self_modification.md` (void's approach to self-modification)