audio transcoder service#
overview#
the transcoder is a standalone rust-based HTTP service that handles audio format conversion using ffmpeg. it runs as a separate fly.io app to isolate CPU-intensive transcoding operations from the main backend API.
architecture#
why separate service?#
ffmpeg operations are CPU-intensive and can block the event loop in async python applications. separating the transcoder provides:
- isolation: transcoding doesn't affect API latency
- performance: rust + tokio provides better concurrency for blocking operations
- scalability: can scale transcoder independently from main backend
- resource allocation: dedicated CPU/memory for transcoding work
technology stack#
- rust: high-performance systems language
- axum: async web framework built on tokio
- ffmpeg: industry-standard media processing
- fly.io: deployment platform with auto-scaling
API#
POST /transcode#
convert audio file to target format.
authentication: bearer token via X-Transcoder-Key header
request: multipart/form-data
- file: audio file to transcode
- target (optional query param): target format (default: "mp3")
example:
# quote the URL so the shell does not treat "?" as a glob character
curl -X POST "https://plyr-transcoder.fly.dev/transcode?target=mp3" \
  -H "X-Transcoder-Key: $TRANSCODER_AUTH_TOKEN" \
  -F "file=@input.wav" \
  --output output.mp3
response: transcoded audio file (binary)
headers:
- Content-Type: appropriate media type for target format
- Content-Disposition: attachment with original filename + new extension
supported formats:
- mp3 (MPEG Layer 3)
- m4a (AAC in MP4 container)
- wav (PCM audio)
- flac (lossless compression)
- ogg (Vorbis codec)
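to make the response headers concrete, here is a minimal python sketch of how the two headers could be derived from the upload filename and the target format. the media type mapping and the helper names (`MEDIA_TYPES`, `output_filename`, `response_headers`) are assumptions for illustration; the service itself is written in rust and may choose different values.

```python
from pathlib import PurePosixPath

# Assumed mapping: the docs only promise an "appropriate media type".
MEDIA_TYPES = {
    "mp3": "audio/mpeg",
    "m4a": "audio/mp4",
    "wav": "audio/wav",
    "flac": "audio/flac",
    "ogg": "audio/ogg",
}


def output_filename(original: str, target: str) -> str:
    """Swap the extension of the uploaded filename for the target format."""
    return f"{PurePosixPath(original).stem}.{target}"


def response_headers(original: str, target: str) -> dict:
    """Build the Content-Type and Content-Disposition headers for a response."""
    if target not in MEDIA_TYPES:
        raise ValueError(f"unsupported format: {target}")
    filename = output_filename(original, target)
    return {
        "Content-Type": MEDIA_TYPES[target],
        "Content-Disposition": f'attachment; filename="{filename}"',
    }
```

for example, `response_headers("input.wav", "mp3")` yields an `audio/mpeg` content type and an attachment named `input.mp3`.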
status codes:
- 200: transcoding successful, returns audio file
- 400: invalid input (unsupported format, missing file, etc.)
- 401: missing or invalid authentication token
- 413: file too large (>1GB)
- 500: transcoding failed (ffmpeg error, I/O error, etc.)
GET /health#
health check endpoint (no authentication required).
response:
{
"status": "ok"
}
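because machines are stopped when idle, a caller may want to poll the health endpoint before sending a large upload. the sketch below shows one way to do that with the standard library; `fetch_health` and `wait_until_healthy` are hypothetical helper names, and the retry count and delay are arbitrary assumptions.

```python
import json
import time
import urllib.request


def fetch_health(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.load(resp)


def wait_until_healthy(base_url, attempts=5, delay=1.0,
                       fetch=fetch_health, sleep=time.sleep) -> bool:
    """Poll the health endpoint until it reports {"status": "ok"}."""
    for attempt in range(attempts):
        try:
            if fetch(base_url).get("status") == "ok":
                return True
        except (OSError, ValueError):
            # OSError covers connection failures and timeouts,
            # ValueError covers a body that is not valid JSON.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

`fetch` and `sleep` are injectable so the retry logic can be exercised without a network connection.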
authentication#
bearer token authentication#
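the transcoder uses a simple bearer token scheme: the client sends the token as the raw value of the X-Transcoder-Key header (as in the curl example above), rather than in a standard Authorization: Bearer header.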
configuration:
# set via fly secrets
fly secrets set TRANSCODER_AUTH_TOKEN="your-secret-token-here" -a plyr-transcoder
local development:
# .env file
TRANSCODER_AUTH_TOKEN=dev-token-change-me
# or run without auth (dev mode)
# just run transcoder without setting token
security notes:
- token should be a random, high-entropy string (use openssl rand -base64 32)
- main backend should store token in environment variables
- health endpoint bypasses authentication
- invalid/missing tokens return 401 unauthorized
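the rules above can be sketched in python as follows. this is an illustration of the logic only, not the rust service's code: `generate_token` and `is_authorized` are hypothetical names, and the constant-time comparison is a suggested practice rather than something the source confirms.

```python
import hmac
import secrets
from typing import Optional


def generate_token() -> str:
    """Return a random, high-entropy token (python analogue of openssl rand)."""
    return secrets.token_urlsafe(32)


def is_authorized(provided: Optional[str], expected: Optional[str]) -> bool:
    """Check the X-Transcoder-Key header value against the configured token."""
    if not expected:
        # No token configured: dev mode, authentication is skipped.
        return True
    if provided is None:
        # Header missing: the caller should respond with 401.
        return False
    # Constant-time comparison avoids leaking token contents via timing.
    return hmac.compare_digest(provided.encode(), expected.encode())
```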
transcoding process#
workflow#
- receive upload: client sends audio file via multipart form
- create temp directory: isolated workspace for this request
- save input file: write uploaded bytes to temp file
- determine format: sanitize and validate target format
- run ffmpeg: spawn ffmpeg process with appropriate codec settings
- stream output: return transcoded file directly to client
- cleanup: delete temp directory (automatic)
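the steps above can be sketched in python to show the shape of the request lifecycle. this is not the rust implementation: `sanitize_format`, `run_ffmpeg`, and `transcode_bytes` are hypothetical names, the codec flags are omitted for brevity, and the output is read into memory here rather than streamed.

```python
import subprocess
import tempfile
from pathlib import Path

SUPPORTED = {"mp3", "m4a", "wav", "flac", "ogg"}


def sanitize_format(raw: str) -> str:
    """Normalize the requested target format and reject unknown values."""
    fmt = raw.strip().lower().lstrip(".")
    if fmt not in SUPPORTED:
        raise ValueError(f"unsupported target format: {raw!r}")
    return fmt


def run_ffmpeg(args) -> None:
    """Run ffmpeg and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(list(args), check=True, capture_output=True)


def transcode_bytes(data: bytes, filename: str, target: str,
                    runner=run_ffmpeg) -> bytes:
    fmt = sanitize_format(target)
    # The temp directory is an isolated workspace; it is deleted on exit
    # from the with-block, even if ffmpeg fails.
    with tempfile.TemporaryDirectory() as workdir:
        # Only the suffix of the client filename is used, never its path.
        src = Path(workdir) / f"input{Path(filename).suffix}"
        dst = Path(workdir) / f"output.{fmt}"
        src.write_bytes(data)
        runner(["ffmpeg", "-i", str(src), str(dst)])
        return dst.read_bytes()
```

passing a custom `runner` makes it possible to test the temp-directory handling without ffmpeg installed.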
ffmpeg command#
the service constructs ffmpeg commands based on target format:
# example: convert to MP3
ffmpeg -i input.wav -codec:a libmp3lame -qscale:a 2 -map_metadata 0 output.mp3
# example: convert to M4A (AAC)
ffmpeg -i input.wav -codec:a aac -b:a 192k -map_metadata 0 output.m4a
# example: convert to FLAC (lossless)
ffmpeg -i input.wav -codec:a flac -compression_level 8 -map_metadata 0 output.flac
flags explained:
- -i input.wav: input file
- -codec:a <codec>: audio codec to use
- -qscale:a 2: variable bitrate quality (0-9, lower = better)
- -b:a 192k: constant bitrate (for AAC)
- -map_metadata 0: preserve metadata (artist, title, etc.)
- -compression_level 8: FLAC compression (0-12, higher = smaller file)
codec selection#
| format | codec | container | typical use case |
|---|---|---|---|
| mp3 | libmp3lame | MPEG | universal compatibility |
| m4a | aac | MP4 | modern devices, good compression |
| wav | pcm_s16le | WAV | lossless, uncompressed |
| flac | flac | FLAC | lossless, compressed |
| ogg | libvorbis | OGG | open format, good compression |
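the table and the example commands can be combined into a small command builder. the python sketch below mirrors the flags shown in this document; `CODEC_ARGS` and `build_ffmpeg_command` are hypothetical names, and the wav and ogg entries carry no quality flags because none are documented here.

```python
# Codec arguments per target format, taken from the examples and table above.
CODEC_ARGS = {
    "mp3": ["-codec:a", "libmp3lame", "-qscale:a", "2"],
    "m4a": ["-codec:a", "aac", "-b:a", "192k"],
    "wav": ["-codec:a", "pcm_s16le"],
    "flac": ["-codec:a", "flac", "-compression_level", "8"],
    "ogg": ["-codec:a", "libvorbis"],  # assumption: encoder defaults
}


def build_ffmpeg_command(input_path: str, output_path: str, target: str) -> list:
    """Return the ffmpeg argument list for converting input to target."""
    try:
        codec_args = CODEC_ARGS[target]
    except KeyError:
        raise ValueError(f"unsupported format: {target}") from None
    return [
        "ffmpeg",
        "-i", input_path,
        *codec_args,
        "-map_metadata", "0",  # preserve artist, title, etc.
        output_path,
    ]
```

building the command as a list (rather than a shell string) means filenames never pass through a shell, which avoids quoting problems.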
deployment#
fly.io configuration#
app name: plyr-transcoder
region: iad (us-east, ashburn, virginia)
fly.toml:
app = "plyr-transcoder"
primary_region = "iad"
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = "stop"
auto_start_machines = true
min_machines_running = 0
[[vm]]
cpu_kind = "shared"
cpus = 1
memory = "1gb"
[env]
TRANSCODER_HOST = "0.0.0.0"
TRANSCODER_PORT = "8080"
TRANSCODER_MAX_UPLOAD_BYTES = "1073741824" # 1GB
key settings:
- auto_stop_machines: stops VM when idle (cost optimization)
- auto_start_machines: starts a VM on the first request (the cold start adds a few seconds to that request)
- min_machines_running: 0 (no always-on instances, purely on-demand)
- memory: 1GB (sufficient for transcoding typical audio files)
deployment commands#
# deploy from transcoder directory
cd services/transcoder && fly deploy
# check status
fly status -a plyr-transcoder
# view logs (blocking - use ctrl+c to exit)
fly logs -a plyr-transcoder
# scale up (for high traffic)
fly scale count 2 -a plyr-transcoder
# scale down (back to auto-scale)
fly scale count 1 -a plyr-transcoder
note: deployment is done manually from the transcoder directory, not via main backend CI/CD.
secrets management#
# set authentication token
fly secrets set TRANSCODER_AUTH_TOKEN="$(openssl rand -base64 32)" -a plyr-transcoder
# list secrets (values hidden)
fly secrets list -a plyr-transcoder
# unset secret
fly secrets unset TRANSCODER_AUTH_TOKEN -a plyr-transcoder
integration with main backend#
the transcoder is integrated into the upload pipeline for lossless audio support (AIFF/FLAC). when a user uploads a non-web-playable format, the backend:
- saves the original file to R2 (original_file_id)
- calls the transcoder to convert to MP3
- saves the transcoded file to R2 (file_id)
- ATProto record points to MP3 for browser compatibility
- export returns the original lossless file
this enables "best of both worlds": universal browser playback + lossless export for data portability.
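the decision at the start of this pipeline can be sketched as a small planning function. the names (`UploadPlan`, `plan_upload`, `NEEDS_TRANSCODE`) are hypothetical, and the extension list is an assumption based on the formats named above (AIFF and FLAC); the backend's actual check may differ.

```python
from dataclasses import dataclass
from pathlib import Path

# Assumption: extensions treated as lossless / not web-playable.
NEEDS_TRANSCODE = {".aiff", ".aif", ".flac"}


@dataclass(frozen=True)
class UploadPlan:
    needs_transcode: bool  # True when an MP3 copy must be produced
    playback_format: str   # format the ATProto record points to
    export_format: str     # format returned on export


def plan_upload(filename: str) -> UploadPlan:
    """Decide whether an upload needs an MP3 copy for browser playback."""
    suffix = Path(filename).suffix.lower()
    source_format = suffix.lstrip(".")
    if suffix in NEEDS_TRANSCODE:
        # Original is kept for export, MP3 is served for playback.
        return UploadPlan(True, "mp3", source_format)
    # Web-playable uploads are stored once and served as-is.
    return UploadPlan(False, source_format, source_format)
```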
backend configuration#
transcoder settings in src/backend/config.py:
class TranscoderSettings(AppSettingsSection):
    """Transcoder service configuration for lossless audio conversion."""

    enabled: bool = True  # set to False to reject lossless uploads
    url: str = "https://plyr-transcoder.fly.dev"
    auth_token: str = ""  # set via TRANSCODER_AUTH_TOKEN env var
    timeout: int = 300  # 5 minutes for large files
environment variables:
- TRANSCODER_ENABLED: enable/disable transcoding (default: true)
- TRANSCODER_URL: transcoder service URL
- TRANSCODER_AUTH_TOKEN: bearer token for authentication
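to show how these variables map onto settings, here is a standard-library sketch that reads them from the environment. it is a stand-in for illustration: the real backend uses its own `AppSettingsSection` machinery, and `TranscoderEnv`, `load_transcoder_env`, and the accepted truthy strings are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: which strings count as "enabled".
TRUTHY = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class TranscoderEnv:
    enabled: bool
    url: str
    auth_token: str


def load_transcoder_env(env: Mapping[str, str] = os.environ) -> TranscoderEnv:
    """Read the three TRANSCODER_* variables, applying documented defaults."""
    raw_enabled = env.get("TRANSCODER_ENABLED", "true").strip().lower()
    url = env.get("TRANSCODER_URL", "https://plyr-transcoder.fly.dev")
    return TranscoderEnv(
        enabled=raw_enabled in TRUTHY,
        url=url.rstrip("/"),  # avoid "//transcode" when joining paths
        auth_token=env.get("TRANSCODER_AUTH_TOKEN", ""),
    )
```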
calling from backend#
from typing import BinaryIO

import httpx

# `settings` is the backend's loaded configuration (see TranscoderSettings above)


async def transcode_audio(
    file: BinaryIO,
    target_format: str = "mp3",
) -> bytes:
    """transcode audio file using transcoder service."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{settings.transcoder.url}/transcode",
            params={"target": target_format},
            files={"file": file},
            headers={"X-Transcoder-Key": settings.transcoder.auth_token},
            timeout=300.0,  # 5 minutes for large files
        )
        # raises httpx.HTTPStatusError on 4xx/5xx responses
        response.raise_for_status()
        return response.content
error handling#
try:
    transcoded = await transcode_audio(file, "mp3")
except httpx.HTTPStatusError as e:
    if e.response.status_code == 401:
        logger.error("transcoder authentication failed")
        raise HTTPException(500, "transcoding service unavailable")
    elif e.response.status_code == 413:
        raise HTTPException(413, "file too large for transcoding")
    else:
        logger.error(f"transcoding failed: {e}")
        raise HTTPException(500, "transcoding failed")
except httpx.TimeoutException:
    logger.error("transcoding timed out")
    raise HTTPException(504, "transcoding took too long")
local development#
prerequisites#
- rust toolchain (install via rustup)
- ffmpeg (install via brew install ffmpeg on macOS)
running locally#
# from transcoder directory
cd services/transcoder && cargo run
# with custom port
TRANSCODER_PORT=9000 cargo run
# with debug logging
RUST_LOG=debug cargo run
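note: fly.toml sets TRANSCODER_PORT to 8080 for the deployed service. the curl examples under "testing locally" target port 8082, so make sure TRANSCODER_PORT matches the port you call.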
testing locally#
# start transcoder
just transcoder run
# test health endpoint
curl http://localhost:8082/health
# test transcoding (no auth required in dev mode)
curl -X POST "http://localhost:8082/transcode?target=mp3" \
  -F "file=@test.wav" \
  --output transcoded.mp3
# test with authentication
export TRANSCODER_AUTH_TOKEN="dev-token"
cargo run &
curl -X POST "http://localhost:8082/transcode?target=mp3" \
  -H "X-Transcoder-Key: dev-token" \
  -F "file=@test.wav" \
  --output transcoded.mp3
performance characteristics#
typical transcoding times#
transcoding performance depends on:
- input file size and duration
- source codec complexity
- target codec and quality settings
- available CPU
benchmarks (shared-cpu-1x on fly.io):
- 3-minute MP3 (5MB) → MP3: ~2-3 seconds
- 3-minute WAV (30MB) → MP3: ~4-5 seconds
- 10-minute FLAC (50MB) → MP3: ~10-15 seconds
resource usage#
memory:
- base process: ~20MB
- active transcoding: +100-200MB per request
- 1GB VM supports 4-5 concurrent transcodes
CPU:
- ffmpeg uses 100% of allocated CPU
- single-core sufficient for typical workload
- multi-core would enable parallel processing
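the "4-5 concurrent transcodes" figure follows from simple arithmetic on the memory numbers above. the python sketch below reproduces it; `max_concurrent_transcodes` is a hypothetical helper, and the optional headroom (memory left free for the OS and bursts) is an assumption, not a documented value.

```python
def max_concurrent_transcodes(vm_mb: int = 1024, base_mb: int = 20,
                              per_request_mb: int = 200,
                              headroom_mb: int = 0) -> int:
    """Estimate how many transcodes fit in memory at the same time."""
    usable = vm_mb - base_mb - headroom_mb
    return max(0, usable // per_request_mb)


# Worst case of 200MB per request on a 1GB VM: (1024 - 20) // 200 = 5
print(max_concurrent_transcodes())
# Keeping 200MB free as headroom: (1024 - 20 - 200) // 200 = 4
print(max_concurrent_transcodes(headroom_mb=200))
```

note that this is a memory bound only; with a single shared CPU, concurrent transcodes also compete for CPU time.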
scaling considerations#
when to scale up:
- average response time >30 seconds
- frequent 503 errors (all VMs busy)
- queue depth increasing
scaling options:
- horizontal: increase machine count (fly scale count 2)
- vertical: increase memory/CPU (fly scale vm shared-cpu-2x)
- regional: deploy to multiple regions for geo-distribution
monitoring#
metrics to track#
- transcoding success rate
  - total requests
  - successful transcodes
  - failed transcodes (by error type)
- performance
  - average transcoding time
  - p50, p95, p99 latency
  - throughput (transcodes/minute)
- resource usage
  - CPU utilization
  - memory usage
  - disk I/O (temp files)
- errors
  - authentication failures
  - ffmpeg errors
  - timeout errors
  - 413 file too large
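as an illustration of the success-rate and latency metrics, the sketch below computes them from raw samples using the standard library. `success_rate` and `latency_percentiles` are hypothetical helpers; how samples are collected (logs, a metrics backend) is outside what this document specifies.

```python
import statistics


def success_rate(successful: int, total: int) -> float:
    """Fraction of requests that transcoded successfully (0.0 if no traffic)."""
    return successful / total if total else 0.0


def latency_percentiles(samples_seconds) -> dict:
    """Return p50, p95 and p99 of the observed transcoding times."""
    if len(samples_seconds) < 2:
        raise ValueError("need at least two samples")
    # n=100 yields 99 cut points; index i-1 holds the i-th percentile.
    cuts = statistics.quantiles(samples_seconds, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

the "inclusive" method treats the samples as the full population, so the reported percentiles never fall outside the observed minimum and maximum.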
fly.io metrics#
# view metrics dashboard
fly dashboard -a plyr-transcoder
# check recent requests
fly logs -a plyr-transcoder | grep "POST /transcode"
# monitor resource usage
fly vm status -a plyr-transcoder
troubleshooting#
common issues#
ffmpeg not found:
error: ffmpeg command failed: No such file or directory
solution: ensure ffmpeg is installed in docker image (check Dockerfile)
authentication fails in production:
error: 401 unauthorized
solution: verify TRANSCODER_AUTH_TOKEN is set on both transcoder and backend
timeouts on large files:
error: request timeout after 120s
solution: increase timeout in backend client (timeout=300.0)
413 entity too large:
error: 413 payload too large
solution: increase TRANSCODER_MAX_UPLOAD_BYTES or reject large files earlier
VM not starting automatically:
error: no instances available
solution: check auto_start_machines = true in fly.toml
future enhancements#
potential improvements#
- progress tracking
  - stream ffmpeg progress updates
  - return progress via server-sent events
  - enable client-side progress bar
- format detection
  - auto-detect input format via ffprobe
  - validate format before transcoding
  - reject unsupported formats early
- quality presets
  - high quality (320kbps MP3, 256kbps AAC)
  - standard quality (192kbps)
  - low quality (128kbps for previews)
- metadata preservation
  - extract metadata from input
  - apply metadata to output
  - handle artwork/cover images
- batch processing
  - accept multiple files
  - process in parallel
  - return as zip archive
- caching
  - cache transcoded files by content hash
  - serve cached versions instantly
  - implement LRU eviction
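to make the caching idea concrete, here is a minimal in-memory sketch of a content-hash key with LRU eviction. nothing like this exists in the service today; `cache_key` and `TranscodeCache` are hypothetical, and a real version would more likely store files on disk or in object storage.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


def cache_key(data: bytes, target: str) -> str:
    """Key a transcode by the input's SHA-256 digest plus the target format."""
    return f"{hashlib.sha256(data).hexdigest()}.{target}"


class TranscodeCache:
    """Tiny LRU cache: the least recently used entry is evicted first."""

    def __init__(self, max_entries: int = 128):
        self._entries = OrderedDict()
        self._max_entries = max_entries

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key: str, value: bytes) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # drop the oldest entry
```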
references#
- source code: services/transcoder/src/main.rs
- justfile: services/transcoder/justfile
- fly config: services/transcoder/fly.toml
- dockerfile: services/transcoder/Dockerfile
services/transcoder/Dockerfile - ffmpeg docs: https://ffmpeg.org/documentation.html
- fly.io docs: https://fly.io/docs/