audio streaming app plyr.fm

fix: migrate rate limiting from in-memory to Redis (#1044)

rate limits were tracked per Fly Machine instance, so 2 machines meant
2x the configured limit. use the existing docket Redis for global
counters. falls back to memory:// when DOCKET_URL is not set (local dev).

closes #1043

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

authored by zzstoatzz.io and Claude Opus 4.6, committed by GitHub
67ebb28f 5153d514

+41 -13
+1 -1
backend/src/backend/utilities/rate_limit.py
···
      key_func=get_remote_address,
      enabled=settings.rate_limit.enabled,
      default_limits=[settings.rate_limit.default_limit],
-     storage_uri="memory://",
+     storage_uri=settings.docket.url or "memory://",
  )
+35
backend/tests/test_rate_limit.py
···
+ """Tests for rate limiting configuration."""
+
+ from unittest.mock import patch
+
+ from backend.config import settings
+
+
+ def test_limiter_uses_redis_when_docket_url_set() -> None:
+     """limiter should use docket Redis URL for storage when available."""
+     with patch.object(settings.docket, "url", "redis://localhost:6379/0"):
+         # re-import to pick up patched settings
+         import importlib
+
+         import backend.utilities.rate_limit as rl_module
+
+         importlib.reload(rl_module)
+
+         assert rl_module.limiter._storage_uri == "redis://localhost:6379/0"
+
+     # reload again to restore original state
+     importlib.reload(rl_module)
+
+
+ def test_limiter_falls_back_to_memory_when_no_docket_url() -> None:
+     """limiter should fall back to in-memory storage when DOCKET_URL is empty."""
+     with patch.object(settings.docket, "url", ""):
+         import importlib
+
+         import backend.utilities.rate_limit as rl_module
+
+         importlib.reload(rl_module)
+
+         assert rl_module.limiter._storage_uri == "memory://"
+
+     importlib.reload(rl_module)
+5 -12
docs-internal/rate-limiting.md
···
  | `RATE_LIMIT_ENABLED` | `true` | Enable/disable rate limiting globally. |
  | `RATE_LIMIT_DEFAULT_LIMIT` | `100/minute` | Global limit applied to all endpoints by default. |
  | `RATE_LIMIT_AUTH_LIMIT` | `10/minute` | Strict limit for auth endpoints (`/auth/start`, `/auth/exchange`). |
- | `RATE_LIMIT_UPLOAD_LIMIT` | `5/minute` | Strict limit for file uploads (`/tracks/`). |
+ | `RATE_LIMIT_UPLOAD_LIMIT` | `20/minute` | Strict limit for file uploads (`/tracks/`). |

  ## Architecture

- The current implementation uses **in-memory storage**.
+ The implementation uses **Redis-backed storage** via the existing docket Redis instance (`DOCKET_URL`).

- * **Per-Instance:** Limits are tracked per application instance (Fly Machine).
- * **Scaling:** With multiple replicas (e.g., 2 machines), the **effective global limit** scales linearly.
-   * Example: A limit of `100/minute` with 2 machines results in a total capacity of roughly `200/minute`.
+ * **Global Counters:** Limits are shared across all application instances (Fly Machines). A `100/minute` limit means 100 requests total, regardless of which machine handles them.
  * **Keying:** Limits are applied by **IP address** (`get_remote_address`).
-
- ### Why in-memory?
- For our current scale, in-memory is sufficient and avoids the complexity/cost of a dedicated Redis cluster. This provides effective protection against single-source flooding (DDoS/brute-force) directed at any specific instance.
-
- ### Future State (Redis)
- If strict global synchronization or complex tier-based limiting is required in the future, we will migrate to a Redis-backed limiter. `slowapi` supports Redis out of the box, which would allow maintaining shared counters across all application instances.
+ * **Fallback:** When `DOCKET_URL` is not set (e.g., local dev without Redis), falls back to in-memory storage automatically.

  ## Adding Limits to Endpoints
···
  ## Monitoring

- Rate limit hits return `429 Too Many Requests`. These events are logged and will appear in Logfire traces with the `429` status code.
+ Rate limit hits return `429 Too Many Requests`. These events are logged and will appear in Logfire traces with the `429` status code.
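The before/after difference described in the Architecture section can be seen in a toy fixed-window simulation (illustrative only; slowapi's real windowing lives in the `limits` library, and the IP below is an example address):

```python
from collections import Counter

LIMIT = 100  # requests per window, as in RATE_LIMIT_DEFAULT_LIMIT


def allow(counters: Counter, client_ip: str) -> bool:
    """Fixed-window check: count the request, admit it while under the limit."""
    counters[client_ip] += 1
    return counters[client_ip] <= LIMIT


# old behavior: each Fly Machine keeps its own in-memory counter,
# so round-robin routing across 2 machines doubles the effective limit
per_machine = [Counter(), Counter()]
allowed_before = sum(allow(per_machine[i % 2], "203.0.113.7") for i in range(300))

# new behavior: one Redis-backed counter shared by every machine
shared = Counter()
allowed_after = sum(allow(shared, "203.0.113.7") for _ in range(300))

print(allowed_before, allowed_after)  # 200 100
```

With 300 requests from one IP split across 2 machines, per-instance counters admit 200 of them; the shared counter admits exactly the configured 100.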