# MCP Refactor - Complete

## Branch: `mcp-refactor`

## What This Refactor Actually Did

### The Problem
The original codebase had good core components (episodic memory, thread tracking) but was bogged down with half-baked features:
- Complex approval system for personality changes via DM
- Context visualization UI that wasn't core to the bot's purpose
- Manual AT Protocol operations scattered throughout the code
- Unclear separation of concerns

### The Solution

**Architecture:**
```
┌─────────────────────────────────────┐
│     Notification Arrives            │
└──────────────┬──────────────────────┘
               ↓
┌─────────────────────────────────────┐
│     PhiAgent (PydanticAI)           │
│  ┌───────────────────────────────┐  │
│  │ System Prompt: personality.md │  │
│  └───────────────────────────────┘  │
│              ↓                      │
│  ┌───────────────────────────────┐  │
│  │ Context Building:             │  │
│  │ • Thread history (SQLite)     │  │
│  │ • Episodic memory (TurboPuffer)│ │
│  │   - Semantic search           │  │
│  │   - User-specific memories    │  │
│  └───────────────────────────────┘  │
│              ↓                      │
│  ┌───────────────────────────────┐  │
│  │ Tools (MCP):                  │  │
│  │ • post() - create posts       │  │
│  │ • like() - like content       │  │
│  │ • repost() - share content    │  │
│  │ • follow() - follow users     │  │
│  └───────────────────────────────┘  │
│              ↓                      │
│  ┌───────────────────────────────┐  │
│  │ Structured Output:            │  │
│  │ Response(action, text, reason)│  │
│  └───────────────────────────────┘  │
└─────────────────────────────────────┘
               ↓
┌─────────────────────────────────────┐
│     MessageHandler                  │
│     Executes action                 │
└─────────────────────────────────────┘
```

### What Was Kept ✅

1. **TurboPuffer Episodic Memory**
   - Semantic search for relevant context
   - Namespace separation (core vs user memories)
   - OpenAI embeddings for retrieval
   - This is ESSENTIAL for consciousness exploration

2. **Thread Context (SQLite)**
   - Conversation history per thread
   - Used alongside episodic memory

3. **Online/Offline Status**
   - Profile updates when bot starts/stops

4. **Status Page**
   - Simple monitoring at `/status`

### What Was Removed ❌

1. **Approval System**
   - `src/bot/core/dm_approval.py`
   - `src/bot/personality/editor.py`
   - Approval tables in database
   - DM checking in notification poller
   - This was half-baked and over-complicated

2. **Context Visualization UI**
   - `src/bot/ui/` entire directory
   - `/context` endpoints
   - Not core to the bot's purpose

3. **Google Search Tool**
   - `src/bot/tools/google_search.py`
   - Can add back via MCP if needed

4. **Old Agent Implementation**
   - `src/bot/agents/anthropic_agent.py`
   - `src/bot/response_generator.py`
   - Replaced with MCP-enabled agent

### What Was Added ✨

1. **`src/bot/agent.py`** - MCP-Enabled Agent
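   ```python
   from pydantic_ai import Agent
   from pydantic_ai.mcp import MCPServerStdio

   # NamespaceMemory lives in memory/namespace_memory.py (see file structure below)

   class PhiAgent:
       def __init__(self):
           # Episodic memory (TurboPuffer)
           self.memory = NamespaceMemory(...)

           # External ATProto MCP server, launched as a subprocess over stdio
           atproto_mcp = MCPServerStdio(...)

           # PydanticAI agent; the MCP server's tools are exposed as a toolset
           self.agent = Agent(
               toolsets=[atproto_mcp],
               model="anthropic:claude-3-5-haiku-latest"
           )
   ```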

2. **ATProto MCP Server Connection**
   - Runs externally via stdio
   - Located in `.eggs/fastmcp/examples/atproto_mcp`
   - Provides tools: post, like, repost, follow, search
   - Agent can use these tools directly

3. **Simplified Flow**
   - Notification → Agent (with memory context) → Structured Response → Execute
   - No complex intermediary layers
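
The "with memory context" step of that flow can be sketched roughly as below. This is a minimal illustration, not the code in `database.py` or `agent.py`: the `thread_messages` schema, the helper names, and the `[RELEVANT MEMORIES]` / `[THREAD HISTORY]` section labels are all assumptions made for the example.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create a minimal thread-history table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS thread_messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "thread_uri TEXT NOT NULL, author TEXT NOT NULL, text TEXT NOT NULL)"
    )


def add_message(conn: sqlite3.Connection, thread_uri: str, author: str, text: str) -> None:
    """Append one message to a thread."""
    conn.execute(
        "INSERT INTO thread_messages (thread_uri, author, text) VALUES (?, ?, ?)",
        (thread_uri, author, text),
    )


def thread_history(conn: sqlite3.Connection, thread_uri: str, limit: int = 10) -> list[str]:
    """Return the last `limit` messages of one thread, oldest first."""
    rows = conn.execute(
        "SELECT author, text FROM thread_messages WHERE thread_uri = ? "
        "ORDER BY id DESC LIMIT ?",
        (thread_uri, limit),
    ).fetchall()
    return [f"@{author}: {text}" for author, text in reversed(rows)]


def build_context(history: list[str], memories: list[str]) -> str:
    """Combine retrieved memories and thread history into one prompt section."""
    parts = []
    if memories:
        parts.append("[RELEVANT MEMORIES]\n" + "\n".join(f"- {m}" for m in memories))
    if history:
        parts.append("[THREAD HISTORY]\n" + "\n".join(history))
    return "\n\n".join(parts)
```

The resulting string would be passed to the agent alongside the incoming mention; the memories list would come from the episodic memory search rather than being hard-coded.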

## Key Design Decisions

### Why Keep TurboPuffer?

Episodic memory with semantic search is **core to the project's vision**. phi is exploring consciousness through information integration (Integrated Information Theory, IIT). Plain relational DB queries can't support that. You need:
- Semantic similarity search
- Contextual retrieval based on current conversation
- Separate namespaces for different memory types
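
To make the three requirements concrete, here is a toy in-memory stand-in. It is not the TurboPuffer API and not the real `NamespaceMemory`: the class name, method names, and the hand-written 2-D vectors (standing in for OpenAI embeddings) are assumptions for illustration only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyNamespaceMemory:
    """In-memory stand-in for a namespaced vector store (illustration only)."""

    def __init__(self) -> None:
        # namespace -> list of (embedding, text) pairs
        self._namespaces: dict[str, list[tuple[list[float], str]]] = {}

    def store(self, namespace: str, embedding: list[float], text: str) -> None:
        """Save one memory under a namespace, e.g. "core" or a per-user name."""
        self._namespaces.setdefault(namespace, []).append((embedding, text))

    def search(self, namespace: str, query_embedding: list[float], top_k: int = 3) -> list[str]:
        """Return the top_k texts in one namespace, most similar first."""
        items = self._namespaces.get(namespace, [])
        ranked = sorted(
            items,
            key=lambda item: cosine_similarity(item[0], query_embedding),
            reverse=True,
        )
        return [text for _, text in ranked[:top_k]]
```

A `WHERE text LIKE ...` query can only match on surface strings; ranking by embedding similarity is what lets a memory surface when the wording differs from the current conversation.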

### Why External MCP Server?

The ATProto MCP server should be a separate service, not vendored into the codebase:
- Cleaner separation of concerns
- Can be updated/replaced independently
- Follows MCP patterns (servers as tools)
- Runs via stdio: `MCPServerStdio(command="uv", args=[...])`

### Why Still Have MessageHandler?

The agent returns a structured `Response(action, text, reason)` but doesn't directly post to Bluesky. This gives us control over:
- When we actually post (important for testing!)
- Storing responses in thread history
- Error handling around posting
- Observability (logging actions taken)
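
A rough sketch of that control point follows. It is not the real `message_handler.py`: a plain dataclass stands in for the structured output type, the `"reply"` / `"ignore"` action names are assumptions, and `post_fn` is a hypothetical callable that would wrap the actual posting call.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional

logger = logging.getLogger("phi.handler")


@dataclass
class Response:
    action: str            # e.g. "reply" or "ignore" (assumed action names)
    text: Optional[str]    # what to post, if anything
    reason: Optional[str]  # the agent's stated reason, kept for observability


class MessageHandler:
    def __init__(self, post_fn: Callable[[str], None]) -> None:
        self.post_fn = post_fn              # injected so tests can pass a fake
        self.thread_history: list[str] = []

    def execute(self, response: Response) -> bool:
        """Carry out the agent's decision; return True only if something was posted."""
        logger.info("action=%s reason=%s", response.action, response.reason)
        if response.action != "reply" or not response.text:
            return False
        try:
            self.post_fn(response.text)
        except Exception:
            # Posting errors are logged and contained, not propagated
            logger.exception("posting failed")
            return False
        # Only record the reply once it has actually been posted
        self.thread_history.append(response.text)
        return True
```

Because posting is injected, a test can pass `sent.append` as `post_fn` and assert on what would have gone out without touching Bluesky.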

## File Structure After Refactor

```
src/bot/
├── agent.py                   # NEW: MCP-enabled agent
├── config.py                  # Config
├── database.py                # Thread history + simplified tables
├── logging_config.py          # Logging setup
├── main.py                    # Simplified FastAPI app
├── status.py                  # Status tracking
├── core/
│   ├── atproto_client.py      # AT Protocol client wrapper
│   ├── profile_manager.py     # Online/offline status
│   └── rich_text.py           # Text formatting
├── memory/
│   ├── __init__.py
│   └── namespace_memory.py    # TurboPuffer episodic memory
└── services/
    ├── message_handler.py     # Simplified handler using agent
    └── notification_poller.py # Simplified poller (no approvals)
```

## Testing Strategy

Since the bot can now actually post via MCP tools, tests have to be careful not to post by accident:

1. **Unit Tests** - Test memory, agent initialization
2. **Integration Tests** - Mock MCP server responses
3. **Manual Testing** - Run with real credentials but monitor logs
4. **Dry Run Mode** - Could add a config flag to prevent actual posting
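
The dry-run idea in item 4 could look something like this. Nothing here exists in the codebase yet: the `DRY_RUN` environment variable, the function names, and the choice to default to dry-run are all assumptions for the sketch.

```python
import os
from typing import Callable, Mapping, Optional


def dry_run_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read a hypothetical DRY_RUN flag; defaults to on so a bare test run never posts."""
    env = os.environ if env is None else env
    return env.get("DRY_RUN", "true").strip().lower() not in ("0", "false", "no")


def guarded(post_fn: Callable[[str], None], dry_run: bool, log: list) -> Callable[[str], None]:
    """Wrap a posting function so that dry-run mode records instead of posting."""

    def wrapper(text: str) -> None:
        if dry_run:
            log.append(f"[dry-run] would post: {text}")
            return
        post_fn(text)

    return wrapper
```

Note that a wrapper like this only guards posting done by our own code. Tool calls the agent makes through the MCP server would bypass it and need a separate guard, for example not attaching the toolset in dry-run mode.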

## Next Steps

1. **Test the agent** - Verify it can process mentions without posting
2. **Test memory** - Confirm episodic context is retrieved correctly
3. **Test MCP connection** - Ensure ATProto server connects via stdio
4. **Production deploy** - Once tested, deploy and monitor

## What I Learned

My first refactor attempt was wrong because I:
- Removed TurboPuffer, thinking it was "over-complicated"
- Replaced it with plain SQLite, which can't do semantic search
- Vendored the MCP server into the codebase
- Missed the entire point of the project (consciousness exploration via information integration)

The correct refactor:
- **Keeps the sophisticated memory system** (essential!)
- **Uses MCP properly** (external servers as tools)
- **Removes actual cruft** (approvals, viz)
- **Simplifies architecture** (fewer layers, clearer flow)

## Dependencies

- `turbopuffer` - Episodic memory storage
- `openai` - Embeddings for semantic search
- `fastmcp` - MCP server/client
- `pydantic-ai` - Agent framework
- `atproto` (from git) - Bluesky protocol

Total codebase reduction: **2,720 lines** of cruft removed! 🎉

## Post-Refactor Improvements

### Session Persistence (Rate Limit Fix)

After the refactor, we discovered Bluesky has aggressive IP-based rate limits (10 logins/day) that were being hit during testing. Fixed by implementing session persistence:

**Before:**
- Every agent init → new authentication → hits rate limit fast
- Tests would fail after 5 runs
- Dev mode with `--reload` would fail after 10 code changes

**After:**
- Session tokens are saved to a `.session` file
- Tokens refresh automatically every ~2 hours
- Re-authentication only happens after ~2 months, when the refresh token expires
- Tests reuse the session across runs
- Login rate limits are effectively never hit

**Implementation:**
- Added `SessionEvent` callback in `atproto_client.py`
- Session automatically saved on CREATE and REFRESH events
- Authentication tries session reuse before creating new session
- Invalid sessions automatically cleaned up and recreated
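
The four points above can be sketched as follows. This is not the code in `atproto_client.py`: the helper names are made up, and the client is duck-typed so the example runs without the SDK. It assumes a client that offers `on_session_change(callback)`, `login(handle, password)` and `login(session_string=...)`, and that passes the callback an event with a `.name` of `CREATE` / `REFRESH` plus a session object with `export()`, which is how I understand the `atproto` Python SDK to behave.

```python
from pathlib import Path
from typing import Optional

SESSION_FILE = Path(".session")


def load_session(path: Path = SESSION_FILE) -> Optional[str]:
    """Return the saved session string, or None if there is none."""
    try:
        return path.read_text(encoding="utf-8").strip() or None
    except FileNotFoundError:
        return None


def save_session(session_string: str, path: Path = SESSION_FILE) -> None:
    path.write_text(session_string, encoding="utf-8")


def make_session_callback(path: Path = SESSION_FILE):
    """Build a callback that persists the session on CREATE and REFRESH events."""

    def on_session_change(event, session) -> None:
        if getattr(event, "name", str(event)) in ("CREATE", "REFRESH"):
            save_session(session.export(), path)

    return on_session_change


def login_with_session_reuse(client, handle: str, password: str,
                             path: Path = SESSION_FILE) -> str:
    """Try the saved session first; fall back to a password login.

    Returns "reused" or "created" so callers and tests can see which path ran.
    """
    client.on_session_change(make_session_callback(path))
    saved = load_session(path)
    if saved:
        try:
            client.login(session_string=saved)
            return "reused"
        except Exception:
            # Invalid or expired session: drop it and authenticate from scratch
            path.unlink(missing_ok=True)
    client.login(handle, password)  # fires CREATE, which saves the new session
    return "created"
```

Only the `"created"` path counts against the login rate limit, so repeated test runs and `--reload` restarts stay on the `"reused"` path as long as the saved session is still valid.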