a digital entity named phi that roams bsky
# MCP Refactor - Complete

## Branch: `mcp-refactor`

## What This Refactor Actually Did

### The Problem
The original codebase had good core components (episodic memory, thread tracking) but was bogged down with half-baked features:
- Complex approval system for personality changes via DM
- Context visualization UI that wasn't core to the bot's purpose
- Manual AT Protocol operations scattered throughout the code
- Unclear separation of concerns

### The Solution

**Architecture:**
```
┌─────────────────────────────────────┐
│        Notification Arrives         │
└──────────────┬──────────────────────┘
               ↓
┌─────────────────────────────────────┐
│        PhiAgent (PydanticAI)        │
│  ┌───────────────────────────────┐  │
│  │ System Prompt: personality.md │  │
│  └───────────────────────────────┘  │
│              ↓                      │
│  ┌───────────────────────────────┐  │
│  │ Context Building:             │  │
│  │ • Thread history (SQLite)     │  │
│  │ • Episodic memory (TurboPuffer)│ │
│  │   - Semantic search           │  │
│  │   - User-specific memories    │  │
│  └───────────────────────────────┘  │
│              ↓                      │
│  ┌───────────────────────────────┐  │
│  │ Tools (MCP):                  │  │
│  │ • post() - create posts       │  │
│  │ • like() - like content       │  │
│  │ • repost() - share content    │  │
│  │ • follow() - follow users     │  │
│  └───────────────────────────────┘  │
│              ↓                      │
│  ┌───────────────────────────────┐  │
│  │ Structured Output:            │  │
│  │ Response(action, text, reason)│  │
│  └───────────────────────────────┘  │
└─────────────────────────────────────┘
               ↓
┌─────────────────────────────────────┐
│           MessageHandler            │
│           Executes action           │
└─────────────────────────────────────┘
```

### What Was Kept ✅

1. **TurboPuffer Episodic Memory**
   - Semantic search for relevant context
   - Namespace separation (core vs user memories)
   - OpenAI embeddings for retrieval
   - This is ESSENTIAL for consciousness exploration

2. **Thread Context (SQLite)**
   - Conversation history per thread
   - Used alongside episodic memory

3. **Online/Offline Status**
   - Profile updates when bot starts/stops

4. **Status Page**
   - Simple monitoring at `/status`

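As a rough illustration of the thread-context piece, the sketch below stores and reads back per-thread conversation history with the standard-library `sqlite3` module. The table name `thread_messages`, its columns, and the helper functions are assumptions made for this example, not the actual schema in `database.py`.

```python
import sqlite3

# Hypothetical schema: the real tables in database.py may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS thread_messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    thread_uri TEXT NOT NULL,
    author TEXT NOT NULL,
    text TEXT NOT NULL
)
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_message(conn: sqlite3.Connection, thread_uri: str, author: str, text: str) -> None:
    """Append one message to a thread's history."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO thread_messages (thread_uri, author, text) VALUES (?, ?, ?)",
            (thread_uri, author, text),
        )


def thread_history(conn: sqlite3.Connection, thread_uri: str, limit: int = 10) -> list:
    """Return up to `limit` of the most recent messages in a thread, oldest first."""
    rows = conn.execute(
        "SELECT author, text FROM thread_messages "
        "WHERE thread_uri = ? ORDER BY id DESC LIMIT ?",
        (thread_uri, limit),
    ).fetchall()
    return rows[::-1]


conn = open_db()
add_message(conn, "at://example/thread/1", "alice.example", "hi phi")
add_message(conn, "at://example/thread/1", "phi", "hello!")
print(thread_history(conn, "at://example/thread/1"))
```

Keying rows by thread keeps the lookup a plain indexed query, which is all the thread-history side needs; semantic retrieval stays the job of the episodic memory.
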
### What Was Removed ❌

1. **Approval System**
   - `src/bot/core/dm_approval.py`
   - `src/bot/personality/editor.py`
   - Approval tables in database
   - DM checking in notification poller
   - This was half-baked and over-complicated

2. **Context Visualization UI**
   - `src/bot/ui/` entire directory
   - `/context` endpoints
   - Not core to the bot's purpose

3. **Google Search Tool**
   - `src/bot/tools/google_search.py`
   - Can add back via MCP if needed

4. **Old Agent Implementation**
   - `src/bot/agents/anthropic_agent.py`
   - `src/bot/response_generator.py`
   - Replaced with MCP-enabled agent

### What Was Added ✨

1. **`src/bot/agent.py`** - MCP-Enabled Agent
   ```python
   from pydantic_ai import Agent
   from pydantic_ai.mcp import MCPServerStdio

   # Import path assumed from the file structure (src/bot/memory/namespace_memory.py)
   from bot.memory.namespace_memory import NamespaceMemory


   class PhiAgent:
       def __init__(self):
           # Episodic memory (TurboPuffer)
           self.memory = NamespaceMemory(...)

           # External ATProto MCP server (stdio)
           atproto_mcp = MCPServerStdio(...)

           # PydanticAI agent with tools
           self.agent = Agent(
               toolsets=[atproto_mcp],
               model="anthropic:claude-3-5-haiku-latest"
           )
   ```

2. **ATProto MCP Server Connection**
   - Runs externally via stdio
   - Located in `.eggs/fastmcp/examples/atproto_mcp`
   - Provides tools: post, like, repost, follow, search
   - Agent can use these tools directly

3. **Simplified Flow**
   - Notification → Agent (with memory context) → Structured Response → Execute
   - No complex intermediary layers

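The context-building step of this flow can be pictured as plain string assembly: thread history and retrieved memories are merged into one block of text that accompanies the mention. This is a minimal sketch only; `build_context`, its parameters, and the section labels are hypothetical and are not taken from `agent.py`.

```python
def build_context(
    thread_history: list,
    core_memories: list,
    user_memories: list,
    max_items: int = 5,
) -> str:
    """Assemble the text context handed to the agent alongside a mention.

    thread_history is a list of (author, text) pairs, oldest first;
    the memory arguments are lists of already-retrieved memory strings.
    """
    sections = []
    if core_memories:
        lines = "\n".join(f"- {m}" for m in core_memories[:max_items])
        sections.append(f"[core memories]\n{lines}")
    if user_memories:
        lines = "\n".join(f"- {m}" for m in user_memories[:max_items])
        sections.append(f"[user memories]\n{lines}")
    if thread_history:
        # Keep only the tail of the thread so the prompt stays small.
        recent = thread_history[-max_items:]
        lines = "\n".join(f"@{author}: {text}" for author, text in recent)
        sections.append(f"[thread so far]\n{lines}")
    return "\n\n".join(sections)


print(build_context(
    thread_history=[("alice.example", "what do you remember about me?")],
    core_memories=["phi explores information integration"],
    user_memories=["alice asked about IIT last week"],
))
```

Empty sources are skipped entirely, so a first-time mention with no memories yields a context containing only the thread.
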
## Key Design Decisions

### Why Keep TurboPuffer?

Episodic memory with semantic search is **core to the project's vision**. phi is exploring consciousness through information integration (IIT). You can't do that with plain relational DB queries - you need:
- Semantic similarity search
- Contextual retrieval based on current conversation
- Separate namespaces for different memory types

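To make the difference from a relational lookup concrete, here is a toy, in-memory version of namespaced semantic retrieval that ranks stored texts by cosine similarity to a query vector. The `ToyMemory` class and the hand-written 3-dimensional vectors are illustrative assumptions; the bot itself relies on TurboPuffer for storage and OpenAI embeddings for the vectors, and neither API is shown here.

```python
import math
from collections import defaultdict


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyMemory:
    """Hypothetical stand-in for a namespaced vector store."""

    def __init__(self):
        self._store = defaultdict(list)  # namespace -> [(vector, text), ...]

    def add(self, namespace: str, vector, text: str) -> None:
        self._store[namespace].append((vector, text))

    def search(self, namespace: str, query_vector, top_k: int = 2) -> list:
        """Return the top_k texts in one namespace, most similar first."""
        scored = sorted(
            self._store.get(namespace, []),
            key=lambda item: cosine(item[0], query_vector),
            reverse=True,
        )
        return [text for _, text in scored[:top_k]]


memory = ToyMemory()
memory.add("core", [1.0, 0.0, 0.0], "phi explores information integration")
memory.add("user:alice", [0.9, 0.1, 0.0], "alice asked about IIT")
memory.add("user:alice", [0.0, 0.0, 1.0], "alice likes birdwatching")
print(memory.search("user:alice", [1.0, 0.0, 0.0], top_k=1))
# → ['alice asked about IIT']
```

A `WHERE text LIKE ...` query would only match shared keywords, whereas the ranking here depends on vector proximity, and the namespace argument keeps one user's memories out of another's results.
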
### Why External MCP Server?

The ATProto MCP server should be a separate service, not vendored into the codebase:
- Cleaner separation of concerns
- Can be updated/replaced independently
- Follows MCP patterns (servers as tools)
- Runs via stdio: `MCPServerStdio(command="uv", args=[...])`

### Why Still Have MessageHandler?

The agent returns a structured `Response(action, text, reason)` but doesn't directly post to Bluesky. This gives us control over:
- When we actually post (important for testing!)
- Storing responses in thread history
- Error handling around posting
- Observability (logging actions taken)

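A minimal sketch of that split, assuming a `Response` with the three fields named above and a handler that receives its posting function from outside. The action values (`"reply"`, `"ignore"`), the `post_fn` parameter, and the in-memory `history` list are assumptions for illustration, not the contents of `message_handler.py`.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional

logger = logging.getLogger("phi.handler")


@dataclass
class Response:
    action: str            # assumed values: "reply" or "ignore"
    text: Optional[str]
    reason: str


class MessageHandler:
    """Executes the agent's decision; the agent itself never posts."""

    def __init__(self, post_fn: Callable[[str], object]):
        self.post_fn = post_fn   # injected, so tests can pass a fake
        self.history = []        # stand-in for thread-history storage

    def handle(self, response: Response) -> bool:
        """Return True only if something was actually posted."""
        logger.info("action=%s reason=%s", response.action, response.reason)
        if response.action != "reply" or not response.text:
            return False
        try:
            self.post_fn(response.text)
        except Exception:
            logger.exception("posting failed")
            return False
        self.history.append(response.text)
        return True


sent = []
handler = MessageHandler(post_fn=sent.append)
handler.handle(Response(action="reply", text="hello!", reason="direct mention"))
handler.handle(Response(action="ignore", text=None, reason="spam"))
print(sent)
# → ['hello!']
```

Because posting is just a callable handed to the handler, a test can swap in `list.append` or any other fake and assert on what would have been sent.
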
## File Structure After Refactor

```
src/bot/
├── agent.py                    # NEW: MCP-enabled agent
├── config.py                   # Config
├── database.py                 # Thread history + simplified tables
├── logging_config.py           # Logging setup
├── main.py                     # Simplified FastAPI app
├── status.py                   # Status tracking
├── core/
│   ├── atproto_client.py       # AT Protocol client wrapper
│   ├── profile_manager.py      # Online/offline status
│   └── rich_text.py            # Text formatting
├── memory/
│   ├── __init__.py
│   └── namespace_memory.py     # TurboPuffer episodic memory
└── services/
    ├── message_handler.py      # Simplified handler using agent
    └── notification_poller.py  # Simplified poller (no approvals)
```

## Testing Strategy

Since the bot can now actually post via MCP tools, testing needs to be careful:

1. **Unit Tests** - Test memory, agent initialization
2. **Integration Tests** - Mock MCP server responses
3. **Manual Testing** - Run with real credentials but monitor logs
4. **Dry Run Mode** - Could add a config flag to prevent actual posting

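The dry-run idea could look roughly like the following. The `PHI_DRY_RUN` environment variable and both helper names are hypothetical; nothing like them exists in the codebase yet.

```python
import os
from typing import Callable


def dry_run_enabled(env=None) -> bool:
    """Read the hypothetical PHI_DRY_RUN flag; unset, empty, 0, false and no mean off."""
    env = os.environ if env is None else env
    return env.get("PHI_DRY_RUN", "").strip().lower() not in ("", "0", "false", "no")


def guarded(post_fn: Callable[[str], object], dry_run: bool, log: Callable[[str], object] = print):
    """Wrap a posting function so that dry-run mode logs instead of posting."""

    def wrapper(text: str) -> bool:
        if dry_run:
            log(f"[dry-run] would post: {text}")
            return False
        post_fn(text)
        return True

    return wrapper


posted = []
post = guarded(posted.append, dry_run=dry_run_enabled({"PHI_DRY_RUN": "1"}))
post("hello world")   # logged only, nothing is appended to `posted`
```

Wrapping the posting function at one choke point means the same code path runs in tests and in production, with only the final side effect suppressed.
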
## Next Steps

1. **Test the agent** - Verify it can process mentions without posting
2. **Test memory** - Confirm episodic context is retrieved correctly
3. **Test MCP connection** - Ensure ATProto server connects via stdio
4. **Production deploy** - Once tested, deploy and monitor

## What I Learned

My first refactor attempt was wrong because I:
- Removed TurboPuffer, thinking it was "over-complicated"
- Replaced it with plain SQLite (which can't do semantic search!)
- Vendored the MCP server into the codebase
- Missed the entire point of the project (consciousness exploration via information integration)

The correct refactor:
- **Keeps the sophisticated memory system** (essential!)
- **Uses MCP properly** (external servers as tools)
- **Removes actual cruft** (approvals, viz)
- **Simplifies architecture** (fewer layers, clearer flow)

## Dependencies

- `turbopuffer` - Episodic memory storage
- `openai` - Embeddings for semantic search
- `fastmcp` - MCP server/client
- `pydantic-ai` - Agent framework
- `atproto` (from git) - Bluesky protocol

Total codebase reduction: **2,720 lines** of cruft removed! 🎉

## Post-Refactor Improvements

### Session Persistence (Rate Limit Fix)

After the refactor, we discovered Bluesky has aggressive IP-based rate limits (10 logins/day) that were being hit during testing. Fixed by implementing session persistence:

**Before:**
- Every agent init → new authentication → hits rate limit fast
- Tests would fail after 5 runs
- Dev mode with `--reload` would fail after 10 code changes

**After:**
- Session tokens saved to `.session` file
- Tokens automatically refresh every ~2 hours
- Only re-authenticates after ~2 months when refresh token expires
- Tests reuse session across runs
- Rate limits essentially eliminated

**Implementation:**
- Added `SessionEvent` callback in `atproto_client.py`
- Session automatically saved on CREATE and REFRESH events
- Authentication tries session reuse before creating new session
- Invalid sessions automatically cleaned up and recreated
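
The save-on-event and reuse-before-login logic can be sketched without the SDK as follows. The `SessionEvent` enum here is a local stand-in that models only the two events named above, and `SessionStore`, `authenticate`, `resume`, and `login` are hypothetical names; the real callback wiring lives in `atproto_client.py` and may differ.

```python
from enum import Enum
from pathlib import Path
from typing import Callable, Optional


class SessionEvent(Enum):
    # Local stand-in for the SDK's event type (assumption for this sketch).
    CREATE = "create"
    REFRESH = "refresh"


class SessionStore:
    """Persists a serialized session string to a file such as `.session`."""

    def __init__(self, path: str = ".session"):
        self.path = Path(path)

    def load(self) -> Optional[str]:
        try:
            text = self.path.read_text(encoding="utf-8").strip()
        except FileNotFoundError:
            return None
        return text or None

    def save(self, session_string: str) -> None:
        self.path.write_text(session_string, encoding="utf-8")

    def clear(self) -> None:
        self.path.unlink(missing_ok=True)

    def on_session_change(self, event: SessionEvent, session_string: str) -> None:
        """Callback: persist the session whenever it is created or refreshed."""
        if event in (SessionEvent.CREATE, SessionEvent.REFRESH):
            self.save(session_string)


def authenticate(
    store: SessionStore,
    resume: Callable[[str], str],
    login: Callable[[], str],
) -> str:
    """Try the saved session first; fall back to a fresh login if it is missing or invalid."""
    saved = store.load()
    if saved is not None:
        try:
            return resume(saved)
        except Exception:
            store.clear()  # invalid session: remove it and recreate below
    session = login()
    store.on_session_change(SessionEvent.CREATE, session)
    return session
```

Here `resume` and `login` stand in for whatever the client wrapper does to restore or create a session; only `login` counts against the daily limit, so it runs at most once per valid session file.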