# Following No Reposts Feed Generator A production-ready Bluesky feed generator written in Rust that shows posts from people you follow, excluding all reposts. Built using Jetstream for efficient real-time data consumption with full JWT signature verification. ## Table of Contents - [Features](#features) - [How It Works](#how-it-works) - [Prerequisites](#prerequisites) - [Installation](#installation) - [Configuration](#configuration) - [Running Locally](#running-locally) - [Deployment](#deployment) - [Publishing Your Feed](#publishing-your-feed) - [Architecture](#architecture) - [API Endpoints](#api-endpoints) - [Performance](#performance) - [Development](#development) - [Troubleshooting](#troubleshooting) - [Contributing](#contributing) - [License](#license) ## Features - **🚫 No Reposts**: Automatically filters out all reposts, showing only original content - **πŸ‘₯ Personalized**: Shows only posts from accounts you follow - **⚑ Real-time**: Updates in real-time as new posts are created - **πŸ”’ Secure**: Full ES256K JWT signature verification with DID resolution - **πŸ“‘ Efficient**: Uses Jetstream for lightweight event consumption (~850 MB/day vs 200+ GB/day) - **πŸ—„οΈ Smart Caching**: Automatic cleanup of posts older than 48 hours - **πŸ”„ Auto-recovery**: Automatic reconnection on Jetstream disconnects - **πŸ“Š Observable**: Structured logging with configurable verbosity - **πŸ—οΈ Production Ready**: Battle-tested error handling and recovery mechanisms ## How It Works This feed generator: 1. **Consumes Events**: Connects to Bluesky's Jetstream to receive real-time events for posts and follows 2. **Filters Content**: Only subscribes to `app.bsky.feed.post` and `app.bsky.graph.follow` collections 3. **Stores Data**: Maintains a local SQLite database of recent posts and follow relationships 4. **Serves Feeds**: Provides personalized feeds via AT Protocol's `app.bsky.feed.getFeedSkeleton` endpoint 5. **Authenticates Users**: Validates JWT tokens by resolving user DIDs and verifying signatures ## Prerequisites - **Rust** 1.70+ (install via [rustup](https://rustup.rs)) - **SQLite** 3.35+ (usually pre-installed on modern systems) - **Domain** with HTTPS (required for production deployment) - **Bluesky Account** (for publishing the feed) ## Installation ### 1. Clone the Repository ```bash git clone https://github.com/vitorpy/noreposts-atproto-feed.git cd noreposts-atproto-feed ``` ### 2. Build the Project ```bash cargo build --release ``` The compiled binary will be at `target/release/following-no-reposts-feed`. ## Configuration ### Environment Variables Create a `.env` file in the project root (see `.env.example` for reference): ```bash # Required: Database location DATABASE_URL=sqlite:./feed.db # Required: Server port PORT=3000 # Required: Your domain name FEEDGEN_HOSTNAME=your-domain.com # Required: Your service DID FEEDGEN_SERVICE_DID=did:web:your-domain.com # Optional: Jetstream server (defaults to jetstream1.us-east.bsky.network) JETSTREAM_HOSTNAME=jetstream1.us-east.bsky.network ``` ### Service DID Setup Your `FEEDGEN_SERVICE_DID` should match your domain. For `did:web`, it's typically: - Domain: `feed.example.com` β†’ DID: `did:web:feed.example.com` - Domain: `example.com` β†’ DID: `did:web:example.com` ## Running Locally ### Quick Start ```bash # Set up environment cp .env.example .env # Edit .env with your configuration nano .env # Run the server cargo run --release ``` The server will: 1. Automatically run database migrations 2. Connect to Jetstream and start consuming events 3. Start the HTTP server on the configured port 4. Serve the DID document at `/.well-known/did.json` ### Testing Locally ```bash # Test DID document endpoint curl http://localhost:3000/.well-known/did.json # Test feed endpoint (requires authentication in production) curl "http://localhost:3000/xrpc/app.bsky.feed.getFeedSkeleton?feed=at://did:web:your-domain.com/app.bsky.feed.generator/following-no-reposts&limit=10" ``` ### Command-Line Options ```bash # Override environment variables ./following-no-reposts-feed --port 8080 --hostname feed.example.com # Run database migrations only ./following-no-reposts-feed migrate # Publish feed to your Bluesky account ./following-no-reposts-feed publish \ --handle your-handle.bsky.social \ --password your-app-password \ --record-name following-no-reposts \ --display-name "Following (No Reposts)" \ --description "See posts from people you follow, without any reposts" # Backfill posts from firehose (optional) ./following-no-reposts-feed backfill --cursor ``` ## Deployment ### 1. Build for Production ```bash cargo build --release --locked ``` ### 2. Set Up Your Server Transfer the binary to your server: ```bash scp target/release/following-no-reposts-feed user@your-server:/opt/feed-generator/ ``` ### 3. Create a Systemd Service Create `/etc/systemd/system/feed-generator.service`: ```ini [Unit] Description=Bluesky Feed Generator - Following No Reposts After=network.target [Service] Type=simple User=feedgen WorkingDirectory=/opt/feed-generator Environment="DATABASE_URL=sqlite:/opt/feed-generator/feed.db" Environment="PORT=3000" Environment="FEEDGEN_HOSTNAME=your-domain.com" Environment="FEEDGEN_SERVICE_DID=did:web:your-domain.com" ExecStart=/opt/feed-generator/following-no-reposts-feed Restart=always RestartSec=10 [Install] WantedBy=multi-user.target ``` Enable and start the service: ```bash sudo systemctl daemon-reload sudo systemctl enable feed-generator sudo systemctl start feed-generator sudo systemctl status feed-generator ``` ### 4. Configure Reverse Proxy The feed generator must be accessible via HTTPS. Configure your reverse proxy: #### Nginx ```nginx server { listen 443 ssl http2; server_name your-domain.com; ssl_certificate /etc/letsencrypt/live/your-domain.com/fullchain.pem; ssl_certificate_key /etc/letsencrypt/live/your-domain.com/privkey.pem; location / { proxy_pass http://127.0.0.1:3000; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; # WebSocket support for Jetstream proxy_http_version 1.1; proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection "upgrade"; } } ``` #### Caddy ```caddyfile your-domain.com { reverse_proxy localhost:3000 } ``` ### 5. Verify Deployment ```bash # Test DID document curl https://your-domain.com/.well-known/did.json # Check if feed endpoint is accessible curl https://your-domain.com/xrpc/app.bsky.feed.getFeedSkeleton ``` ## Publishing Your Feed Once your feed generator is deployed and accessible via HTTPS: ### Method 1: Using the Built-in Publish Command ```bash ./following-no-reposts-feed publish \ --handle your-handle.bsky.social \ --password your-app-password \ --record-name following-no-reposts \ --display-name "Following (No Reposts)" \ --description "See posts from people you follow, without any reposts" \ --avatar ./avatar.png ``` **Note**: Use an [App Password](https://bsky.app/settings/app-passwords), not your main account password! ### Method 2: Manual Publishing 1. **Get your DID**: ```bash curl "https://bsky.social/xrpc/com.atproto.identity.resolveHandle?handle=yourhandle.bsky.social" ``` 2. **Create a session**: ```bash curl -X POST https://bsky.social/xrpc/com.atproto.server.createSession \ -H "Content-Type: application/json" \ -d '{"identifier": "yourhandle.bsky.social", "password": "your-app-password"}' ``` 3. **Publish the feed generator record**: ```bash curl -X POST https://bsky.social/xrpc/com.atproto.repo.putRecord \ -H "Authorization: Bearer YOUR_ACCESS_JWT" \ -H "Content-Type: application/json" \ -d '{ "repo": "your.did", "collection": "app.bsky.feed.generator", "rkey": "following-no-reposts", "record": { "$type": "app.bsky.feed.generator", "did": "did:web:your-domain.com", "displayName": "Following (No Reposts)", "description": "See posts from people you follow, without any reposts", "createdAt": "2025-01-01T00:00:00.000Z" } }' ``` ### Finding Your Feed After publishing, your feed will be available at: ``` https://bsky.app/profile/yourhandle.bsky.social/feed/following-no-reposts ``` ## Architecture ### System Components ``` β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Jetstream │────────▢│ Consumer │────────▢│ Database β”‚ β”‚ (WebSocket) β”‚ Events β”‚ (Async) β”‚ Store β”‚ (SQLite) β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚ Read β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Bluesky │────────▢│ HTTP Server │────────▢│ Feed β”‚ β”‚ App β”‚ JWT β”‚ (Axum) β”‚ Query β”‚ Algorithm β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ ``` ### Code Structure - **`main.rs`**: Application entry point, HTTP server setup, routing - **`jetstream_consumer.rs`**: WebSocket client for Jetstream events - **`database.rs`**: SQLite abstraction layer, queries, and migrations - **`feed_algorithm.rs`**: Feed generation logic (filtering by follows, excluding reposts) - **`auth.rs`**: JWT validation with ES256K signature verification - **`backfill.rs`**: Optional historical data backfilling from firehose - **`publish.rs`**: Feed generator publishing utilities - **`admin_socket.rs`**: Unix socket for admin commands - **`types.rs`**: Shared data structures ### Data Flow 1. **Event Ingestion**: Jetstream sends `commit` events when posts are created or follows happen 2. **Event Processing**: Consumer parses events and extracts relevant data 3. **Database Storage**: Posts and follows are stored in SQLite with TTL 4. **Feed Requests**: Bluesky app requests feed via `getFeedSkeleton` 5. **Authentication**: JWT is validated by resolving DID and verifying signature 6. **Feed Generation**: Algorithm queries posts from followed users, excludes reposts 7. **Response**: Ordered list of post URIs returned with pagination cursor ## API Endpoints ### `GET /.well-known/did.json` Returns the DID document for the feed generator service. **Response**: ```json { "@context": ["https://www.w3.org/ns/did/v1"], "id": "did:web:your-domain.com", "service": [{ "id": "#bsky_fg", "type": "BskyFeedGenerator", "serviceEndpoint": "https://your-domain.com" }] } ``` ### `GET /xrpc/app.bsky.feed.getFeedSkeleton` Returns a personalized feed skeleton for the authenticated user. **Query Parameters**: - `feed` (required): Feed AT-URI (e.g., `at://did:web:your-domain.com/app.bsky.feed.generator/following-no-reposts`) - `limit` (optional): Number of posts (1-100, default: 50) - `cursor` (optional): Pagination cursor **Headers**: - `Authorization`: Bearer JWT token from Bluesky app **Response**: ```json { "feed": [ {"post": "at://did:plc:xxx/app.bsky.feed.post/abc123"}, {"post": "at://did:plc:yyy/app.bsky.feed.post/def456"} ], "cursor": "1234567890" } ``` ## Performance ### Resource Usage - **Memory**: ~50-100 MB (depends on database size) - **CPU**: Minimal (<1% on modern hardware) - **Bandwidth**: ~850 MB/day (Jetstream with compression) - **Storage**: ~100-500 MB (48-hour post retention) ### Scalability The feed generator can handle: - **10,000+** active users - **1M+** posts/day ingestion - **100+** requests/second ### Database Optimization ```sql -- Efficient indexes CREATE INDEX idx_posts_author_did ON posts(author_did); CREATE INDEX idx_posts_indexed_at ON posts(indexed_at); CREATE INDEX idx_follows_follower ON follows(follower_did); -- Automatic cleanup (posts > 48 hours old) DELETE FROM posts WHERE indexed_at < datetime('now', '-2 days'); ``` ## Development ### Running Tests ```bash cargo test ``` ### Database Migrations Create a new migration: ```bash sqlx migrate add your_migration_name ``` Run migrations manually: ```bash cargo install sqlx-cli --no-default-features --features sqlite sqlx migrate run ``` ### Logging Set the `RUST_LOG` environment variable: ```bash # Debug level (verbose) RUST_LOG=debug cargo run # Info level (default) RUST_LOG=info cargo run # Specific module logging RUST_LOG=following_no_reposts_feed::jetstream_consumer=debug cargo run ``` ### Code Quality ```bash # Format code cargo fmt # Run linter cargo clippy --all-targets # Check for security vulnerabilities cargo audit ``` ## Troubleshooting ### Jetstream Connection Issues **Problem**: Cannot connect to Jetstream **Solutions**: - Verify network connectivity: `ping jetstream1.us-east.bsky.network` - Check firewall rules (port 443 outbound) - Try alternative Jetstream servers - Monitor logs for specific error messages ### Database Locked Errors **Problem**: `database is locked` errors **Solutions**: - Ensure only one instance is running - Check for long-running transactions - Increase `busy_timeout` in database configuration - Consider using WAL mode: `PRAGMA journal_mode=WAL;` ### Authentication Failures **Problem**: JWT validation errors **Solutions**: - Verify `FEEDGEN_SERVICE_DID` matches your domain - Check network access to `plc.directory` for DID resolution - Enable debug logging: `RUST_LOG=following_no_reposts_feed::auth=debug` - Verify your domain's HTTPS certificate is valid ### Feed Not Updating **Problem**: Feed shows stale content **Solutions**: - Check Jetstream connection: look for "Connected to Jetstream" in logs - Verify database is being updated: `sqlite3 feed.db "SELECT COUNT(*) FROM posts;"` - Check for errors in logs: `journalctl -u feed-generator -n 100` - Restart the service: `systemctl restart feed-generator` ### High Memory Usage **Problem**: Memory usage growing over time **Solutions**: - Verify automatic cleanup is working: check `posts` table size - Manually trigger cleanup: `DELETE FROM posts WHERE indexed_at < datetime('now', '-2 days');` - Reduce retention period in code if needed - Monitor with: `ps aux | grep following-no-reposts-feed` ### Debug Mode Enable comprehensive debugging: ```bash RUST_LOG=debug,hyper=info,tokio=info cargo run ``` Test endpoints directly: ```bash # Test DID endpoint curl -v https://your-domain.com/.well-known/did.json # Test feed endpoint with authentication curl -v -H "Authorization: Bearer YOUR_JWT" \ "https://your-domain.com/xrpc/app.bsky.feed.getFeedSkeleton?feed=at://did:web:your-domain.com/app.bsky.feed.generator/following-no-reposts&limit=5" ``` ## Contributing Contributions are welcome! Please: 1. Fork the repository 2. Create a feature branch (`git checkout -b feature/amazing-feature`) 3. Make your changes 4. Add tests if applicable 5. Run `cargo fmt` and `cargo clippy` 6. Commit your changes (`git commit -m 'Add amazing feature'`) 7. Push to the branch (`git push origin feature/amazing-feature`) 8. Open a Pull Request ### Development Guidelines - Follow Rust best practices and idioms - Add tests for new functionality - Update documentation for user-facing changes - Keep commits atomic and well-described - Ensure CI passes before submitting PR ## License This project is licensed under the GNU General Public License v3.0 - see the [LICENSE](LICENSE) file for details. This ensures the code remains free and open source. If you modify and distribute this software, you must: - Disclose your source code - License your modifications under GPLv3 - State significant changes made - Include the original copyright notice ## Resources - [AT Protocol Documentation](https://atproto.com) - [Bluesky API Reference](https://docs.bsky.app) - [Jetstream Documentation](https://github.com/bluesky-social/jetstream) - [ATrium Rust Library](https://github.com/sugyan/atrium) - [Feed Generator Guide](https://docs.bsky.app/docs/starter-templates/custom-feeds) ## Acknowledgments - Built with [ATrium](https://github.com/sugyan/atrium) - Rust libraries for AT Protocol - Uses [Jetstream](https://github.com/bluesky-social/jetstream) for efficient event streaming - Inspired by the Bluesky community's work on custom feeds --- **Made with ❀️ for the Bluesky community**