audio transcoder service#

overview#

the transcoder is a standalone rust-based HTTP service that handles audio format conversion using ffmpeg. it runs as a separate fly.io app to isolate CPU-intensive transcoding operations from the main backend API.

architecture#

why separate service?#

ffmpeg operations are CPU-intensive and can block the event loop in async python applications. separating the transcoder provides:

isolation: transcoding doesn't affect API latency
performance: rust + tokio provides better concurrency for blocking operations
scalability: can scale transcoder independently from main backend
resource allocation: dedicated CPU/memory for transcoding work

technology stack#

rust: high-performance systems language
axum: async web framework built on tokio
ffmpeg: industry-standard media processing
fly.io: deployment platform with auto-scaling

API#

POST /transcode#

convert audio file to target format.

authentication: shared secret token sent in the X-Transcoder-Key header

request: multipart/form-data

file: audio file to transcode
target (optional query param): target format (default: "mp3")

example:

curl -X POST "https://plyr-transcoder.fly.dev/transcode?target=mp3" \
  -H "X-Transcoder-Key: $TRANSCODER_AUTH_TOKEN" \
  -F "file=@input.wav" \
  --output output.mp3

response: transcoded audio file (binary)

headers:

Content-Type: appropriate media type for target format
Content-Disposition: attachment with original filename + new extension

supported formats:

mp3 (MPEG Layer 3)
m4a (AAC in MP4 container)
wav (PCM audio)
flac (lossless compression)
ogg (Vorbis codec)
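
as a rough illustration of the response headers described above, the sketch below maps each supported format to a commonly used media type and builds a Content-Disposition value. the media types and the helper name `response_headers` are assumptions for illustration; the values the service actually sends are defined in its source and may differ.

```python
from pathlib import PurePosixPath

# assumed media types for the supported formats (not taken from the service source)
MEDIA_TYPES = {
    "mp3": "audio/mpeg",
    "m4a": "audio/mp4",
    "wav": "audio/wav",
    "flac": "audio/flac",
    "ogg": "audio/ogg",
}


def response_headers(original_filename: str, target: str = "mp3") -> dict[str, str]:
    """hypothetical helper: build the headers a /transcode response might carry."""
    target = target.lower()
    if target not in MEDIA_TYPES:
        raise ValueError(f"unsupported target format: {target}")
    # keep the original base name and swap the extension for the target format
    stem = PurePosixPath(original_filename).stem or "audio"
    return {
        "Content-Type": MEDIA_TYPES[target],
        "Content-Disposition": f'attachment; filename="{stem}.{target}"',
    }
```

for example, `response_headers("input.wav", "mp3")` yields a Content-Type of audio/mpeg and a filename of input.mp3.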

status codes:

200: transcoding successful, returns audio file
400: invalid input (unsupported format, missing file, etc.)
401: missing or invalid authentication token
413: file too large (>1GB)
500: transcoding failed (ffmpeg error, I/O error, etc.)

GET /health#

health check endpoint (no authentication required).

response:

{
  "status": "ok"
}
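
because machines may be stopped when idle, a caller might want to poll the health endpoint before sending work. the following is a minimal sketch of that idea, not part of the service: the function names, retry count, and delay are all assumptions.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(base_url: str) -> str:
    """GET /health and return the raw response body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
        return resp.read().decode("utf-8")


def is_healthy(body: str) -> bool:
    """true when the body matches the documented {"status": "ok"} payload."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def wait_until_healthy(fetch: Callable[[], str], attempts: int = 5, delay: float = 1.0) -> bool:
    """poll until healthy or attempts run out (attempts/delay are arbitrary defaults)."""
    for attempt in range(attempts):
        try:
            if is_healthy(fetch()):
                return True
        except OSError:
            pass  # connection refused or timed out, e.g. while a machine is starting
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

usage might look like `wait_until_healthy(lambda: fetch_health("https://plyr-transcoder.fly.dev"))`.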

authentication#

token authentication#

the transcoder uses a simple shared-token scheme: clients send the token in the X-Transcoder-Key header (a custom header, not the standard Authorization: Bearer form).

configuration:

# set via fly secrets
fly secrets set TRANSCODER_AUTH_TOKEN="your-secret-token-here" -a plyr-transcoder

local development:

# .env file
TRANSCODER_AUTH_TOKEN=dev-token-change-me

# or run without auth (dev mode)
# just run transcoder without setting token

security notes:

token should be a random, high-entropy string (use openssl rand -base64 32)
main backend should store token in environment variables
health endpoint bypasses authentication
invalid/missing tokens return 401 unauthorized
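
the notes above can be sketched in python as follows. this only illustrates the idea; the real check lives in the rust service, and whether it uses a constant-time comparison is not stated here. the function names are hypothetical, and treating an unset token as "auth disabled" is an assumption based on the dev-mode note.

```python
import hmac
import secrets
from typing import Optional


def generate_token() -> str:
    """random URL-safe token derived from 32 bytes of entropy."""
    return secrets.token_urlsafe(32)


def is_authorized(expected_token: Optional[str], header_value: Optional[str]) -> bool:
    """illustrative check for the X-Transcoder-Key header.

    assumption: an unset or empty expected token means dev mode (auth disabled).
    """
    if not expected_token:
        return True  # dev mode: no token configured
    if header_value is None:
        return False  # caller should respond with 401
    # constant-time comparison avoids leaking how many leading characters match
    return hmac.compare_digest(expected_token.encode(), header_value.encode())
```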

transcoding process#

workflow#

receive upload: client sends audio file via multipart form
create temp directory: isolated workspace for this request
save input file: write uploaded bytes to temp file
determine format: sanitize and validate target format
run ffmpeg: spawn ffmpeg process with appropriate codec settings
stream output: return transcoded file directly to client
cleanup: delete temp directory (automatic)
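
the steps above can be sketched in python to show their order. this is not the service's rust implementation: the helper names are hypothetical, codec flags are omitted, and the output is read into memory rather than streamed.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Sequence

SUPPORTED = {"mp3", "m4a", "wav", "flac", "ogg"}


def sanitize_target(raw: str) -> str:
    """normalize the requested format and reject anything unsupported."""
    target = raw.strip().lower().lstrip(".")
    if target not in SUPPORTED:
        raise ValueError(f"unsupported target format: {raw!r}")
    return target


def run_ffmpeg(args: Sequence[str]) -> None:
    """default runner: invoke ffmpeg and raise if it exits non-zero."""
    subprocess.run(list(args), check=True, capture_output=True)


def transcode(
    data: bytes,
    filename: str,
    target: str = "mp3",
    runner: Callable[[Sequence[str]], None] = run_ffmpeg,
) -> bytes:
    target = sanitize_target(target)  # determine format
    # create temp directory; it is deleted when the with-block exits (cleanup)
    with tempfile.TemporaryDirectory() as workdir:
        suffix = Path(filename).suffix or ".bin"
        src = Path(workdir) / f"input{suffix}"
        dst = Path(workdir) / f"output.{target}"
        src.write_bytes(data)  # save input file
        # run ffmpeg; -y overwrites the output path without prompting
        runner(["ffmpeg", "-y", "-i", str(src), str(dst)])
        return dst.read_bytes()  # return output to the caller
```

the `runner` parameter exists only so the flow can be exercised without ffmpeg installed, for example by passing a function that writes placeholder bytes to the output path.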

ffmpeg command#

the service constructs ffmpeg commands based on target format:

# example: convert to MP3
ffmpeg -i input.wav -codec:a libmp3lame -qscale:a 2 -map_metadata 0 output.mp3

# example: convert to M4A (AAC)
ffmpeg -i input.wav -codec:a aac -b:a 192k -map_metadata 0 output.m4a

# example: convert to FLAC (lossless)
ffmpeg -i input.wav -codec:a flac -compression_level 8 -map_metadata 0 output.flac

flags explained:

-i input.wav: input file
-codec:a <codec>: audio codec to use
-qscale:a 2: variable bitrate quality (0-9, lower = better)
-b:a 192k: target audio bitrate (used for AAC)
-map_metadata 0: preserve metadata (artist, title, etc.)
-compression_level 8: FLAC compression (0-12, higher = smaller file)

codec selection#

format	codec	container	typical use case
mp3	libmp3lame	MPEG	universal compatibility
m4a	aac	MP4	modern devices, good compression
wav	pcm_s16le	WAV	lossless, uncompressed
flac	flac	FLAC	lossless, compressed
ogg	libvorbis	OGG	open format, good compression
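
one way to express the codec table and the flags above is a lookup that assembles the ffmpeg argument list. this is a sketch with a hypothetical helper name. the mp3, m4a, and flac flags follow the examples above; wav and ogg use only the codec from the table, since no quality flags are documented for them.

```python
from typing import List

# codec flags per target format, taken from the examples and table above
CODEC_ARGS = {
    "mp3": ["-codec:a", "libmp3lame", "-qscale:a", "2"],
    "m4a": ["-codec:a", "aac", "-b:a", "192k"],
    "flac": ["-codec:a", "flac", "-compression_level", "8"],
    "wav": ["-codec:a", "pcm_s16le"],
    "ogg": ["-codec:a", "libvorbis"],
}


def build_ffmpeg_command(input_path: str, output_path: str, target: str) -> List[str]:
    """hypothetical helper: assemble the ffmpeg argv for a target format."""
    try:
        codec_args = CODEC_ARGS[target]
    except KeyError:
        raise ValueError(f"unsupported target format: {target}") from None
    return ["ffmpeg", "-i", input_path, *codec_args, "-map_metadata", "0", output_path]
```

building an argument list instead of a shell string means filenames with spaces or shell metacharacters need no quoting.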

deployment#

fly.io configuration#

app name: plyr-transcoder
region: iad (us-east, ashburn, virginia)

fly.toml:

app = "plyr-transcoder"
primary_region = "iad"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 0

[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory = "1gb"

[env]
  TRANSCODER_HOST = "0.0.0.0"
  TRANSCODER_PORT = "8080"
  TRANSCODER_MAX_UPLOAD_BYTES = "1073741824"  # 1GB

key settings:

auto_stop_machines: stops VM when idle (cost optimization)
auto_start_machines: starts a stopped VM on the first incoming request (cold start completes within seconds)
min_machines_running: 0 (no always-on instances, purely on-demand)
memory: 1GB (sufficient for transcoding typical audio files)

deployment commands#

# deploy from transcoder directory
cd services/transcoder && fly deploy

# check status
fly status -a plyr-transcoder

# view logs (blocking - use ctrl+c to exit)
fly logs -a plyr-transcoder

# scale up (for high traffic)
fly scale count 2 -a plyr-transcoder

# scale back down to one machine (auto start/stop still applies)
fly scale count 1 -a plyr-transcoder

note: deployment is done manually from the transcoder directory, not via main backend CI/CD.

secrets management#

# set authentication token
fly secrets set TRANSCODER_AUTH_TOKEN="$(openssl rand -base64 32)" -a plyr-transcoder

# list secrets (values hidden)
fly secrets list -a plyr-transcoder

# unset secret
fly secrets unset TRANSCODER_AUTH_TOKEN -a plyr-transcoder

integration with main backend#

the transcoder is integrated into the upload pipeline for lossless audio support (AIFF/FLAC). when a user uploads a non-web-playable format, the backend:

saves the original file to R2 (original_file_id)
calls the transcoder to convert to MP3
saves the transcoded file to R2 (file_id)
ATProto record points to MP3 for browser compatibility
export returns the original lossless file

this enables "best of both worlds": universal browser playback + lossless export for data portability.
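
the upload flow above might be sketched like this. everything here is hypothetical: `save` stands in for the R2 storage call, `transcode` for the transcoder client, and the set of formats needing conversion is assumed from the "(AIFF/FLAC)" note.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# assumption: only these upload formats are routed through the transcoder
NEEDS_TRANSCODE = {"aiff", "flac"}


@dataclass
class StoredTrack:
    file_id: str                     # file the ATProto record points to
    original_file_id: Optional[str]  # lossless original, if one was kept


def process_upload(
    data: bytes,
    extension: str,
    save: Callable[[bytes, str], str],
    transcode: Callable[[bytes, str], bytes],
) -> StoredTrack:
    """hypothetical pipeline: `save` stores bytes and returns an id."""
    ext = extension.lower().lstrip(".")
    if ext not in NEEDS_TRANSCODE:
        # already web-playable: store once, nothing to convert
        return StoredTrack(file_id=save(data, ext), original_file_id=None)
    original_id = save(data, ext)       # keep the lossless original
    mp3_bytes = transcode(data, "mp3")  # convert via the transcoder
    mp3_id = save(mp3_bytes, "mp3")     # store the browser-playable copy
    return StoredTrack(file_id=mp3_id, original_file_id=original_id)
```

an export path would then return the file behind `original_file_id` when it is set, and `file_id` otherwise.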

backend configuration#

transcoder settings in src/backend/config.py:

class TranscoderSettings(AppSettingsSection):
    """Transcoder service configuration for lossless audio conversion."""

    enabled: bool = True  # set to False to reject lossless uploads
    url: str = "https://plyr-transcoder.fly.dev"
    auth_token: str = ""  # set via TRANSCODER_AUTH_TOKEN env var
    timeout: int = 300  # 5 minutes for large files

environment variables:

TRANSCODER_ENABLED: enable/disable transcoding (default: true)
TRANSCODER_URL: transcoder service URL
TRANSCODER_AUTH_TOKEN: shared token sent in the X-Transcoder-Key header

calling from backend#

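from typing import BinaryIO

import httpx

# assumption: `settings` is the backend's loaded config object exposing
# settings.transcoder (TranscoderSettings); adjust the import path to your layout
from backend.config import settings


async def transcode_audio(
    file: BinaryIO,
    target_format: str = "mp3",
) -> bytes:
    """transcode audio file using transcoder service."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{settings.transcoder.url}/transcode",
            params={"target": target_format},
            files={"file": file},
            headers={"X-Transcoder-Key": settings.transcoder.auth_token},
            # reuse the configured timeout (default 300s) instead of hardcoding it
            timeout=settings.transcoder.timeout,
        )
        # raises httpx.HTTPStatusError on 4xx/5xx so callers can map the status
        response.raise_for_status()
        return response.content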

error handling#

try:
    transcoded = await transcode_audio(file, "mp3")
except httpx.HTTPStatusError as e:
    if e.response.status_code == 401:
        logger.error("transcoder authentication failed")
        raise HTTPException(500, "transcoding service unavailable")
    elif e.response.status_code == 413:
        raise HTTPException(413, "file too large for transcoding")
    else:
        logger.error(f"transcoding failed: {e}")
        raise HTTPException(500, "transcoding failed")
except httpx.TimeoutException:
    logger.error("transcoding timed out")
    raise HTTPException(504, "transcoding took too long")

local development#

prerequisites#

rust toolchain (install via rustup)
ffmpeg (install via brew install ffmpeg on macOS)

running locally#

# from transcoder directory
cd services/transcoder && cargo run

# with custom port
TRANSCODER_PORT=9000 cargo run

# with debug logging
RUST_LOG=debug cargo run

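note: fly.toml sets TRANSCODER_PORT to 8080 for the deployed app. the curl examples under "testing locally" use port 8082; substitute whatever port your local instance listens on.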

testing locally#

# start transcoder
just transcoder run

# test health endpoint
curl http://localhost:8082/health

# test transcoding (no auth required in dev mode)
curl -X POST "http://localhost:8082/transcode?target=mp3" \
  -F "file=@test.wav" \
  --output transcoded.mp3

# test with authentication
export TRANSCODER_AUTH_TOKEN="dev-token"
cargo run &

curl -X POST "http://localhost:8082/transcode?target=mp3" \
  -H "X-Transcoder-Key: dev-token" \
  -F "file=@test.wav" \
  --output transcoded.mp3

performance characteristics#

typical transcoding times#

transcoding performance depends on:

input file size and duration
source codec complexity
target codec and quality settings
available CPU

benchmarks (shared-cpu-1x on fly.io):

3-minute MP3 (5MB) → MP3: ~2-3 seconds
3-minute WAV (30MB) → MP3: ~4-5 seconds
10-minute FLAC (50MB) → MP3: ~10-15 seconds

resource usage#

memory:

base process: ~20MB
active transcoding: +100-200MB per request
1GB VM supports 4-5 concurrent transcodes
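
the concurrency estimate follows from simple division, sketched below. the 128MB reserved for the operating system is an assumed figure, used only to show why the estimate lands at 4-5 rather than a flat 5.

```python
def max_concurrent_transcodes(
    vm_mb: int, base_mb: int, per_request_mb: int, reserved_mb: int = 0
) -> int:
    """how many transcodes fit in memory at once, rounding down."""
    available = vm_mb - base_mb - reserved_mb
    return max(available // per_request_mb, 0)


# worst case from the figures above: 200MB per request on a 1GB VM
print(max_concurrent_transcodes(1024, 20, 200))                   # 5
# reserving 128MB for the OS (assumed) lowers the estimate
print(max_concurrent_transcodes(1024, 20, 200, reserved_mb=128))  # 4
```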

CPU:

ffmpeg uses 100% of allocated CPU
single-core sufficient for typical workload
multi-core would enable parallel processing

scaling considerations#

when to scale up:

average response time >30 seconds
frequent 503 errors (all VMs busy)
queue depth increasing

scaling options:

horizontal: increase machine count (fly scale count 2)
vertical: increase memory/CPU (fly scale vm shared-cpu-2x)
regional: deploy to multiple regions for geo-distribution

monitoring#

metrics to track#

transcoding success rate
- total requests
- successful transcodes
- failed transcodes (by error type)
performance
- average transcoding time
- p50, p95, p99 latency
- throughput (transcodes/minute)
resource usage
- CPU utilization
- memory usage
- disk I/O (temp files)
errors
- authentication failures
- ffmpeg errors
- timeout errors
- 413 file too large
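
as an illustration of the success-rate and latency metrics above, the sketch below summarizes a batch of recorded transcode durations. the function names and the nearest-rank percentile method are choices made for this example; nothing here reflects instrumentation the service already has.

```python
import math
from typing import Dict, List, Sequence


def percentile(sorted_values: Sequence[float], pct: float) -> float:
    """nearest-rank percentile of an ascending, non-empty sequence."""
    rank = max(math.ceil(pct / 100 * len(sorted_values)), 1)
    return sorted_values[rank - 1]


def summarize(durations_s: List[float], failures: int) -> Dict[str, float]:
    """durations_s holds successful transcode times in seconds."""
    total = len(durations_s) + failures
    ordered = sorted(durations_s)
    summary = {
        "total": float(total),
        "success_rate": len(durations_s) / total if total else 0.0,
    }
    if ordered:
        summary["avg"] = sum(ordered) / len(ordered)
        for pct in (50, 95, 99):
            summary[f"p{pct}"] = percentile(ordered, pct)
    return summary
```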

fly.io metrics#

# view metrics dashboard
fly dashboard -a plyr-transcoder

# check recent requests
fly logs -a plyr-transcoder | grep "POST /transcode"

# list machines and their current state
fly machine list -a plyr-transcoder

troubleshooting#

common issues#

ffmpeg not found:

error: ffmpeg command failed: No such file or directory

solution: ensure ffmpeg is installed in docker image (check Dockerfile)

authentication fails in production:

error: 401 unauthorized

solution: verify TRANSCODER_AUTH_TOKEN is set on both transcoder and backend

timeouts on large files:

error: request timeout after 120s

solution: increase timeout in backend client (timeout=300.0)

413 entity too large:

error: 413 payload too large

solution: increase TRANSCODER_MAX_UPLOAD_BYTES or reject large files earlier

VM not starting automatically:

error: no instances available

solution: check auto_start_machines = true in fly.toml

future enhancements#

potential improvements#

progress tracking
- stream ffmpeg progress updates
- return progress via server-sent events
- enable client-side progress bar
format detection
- auto-detect input format via ffprobe
- validate format before transcoding
- reject unsupported formats early
quality presets
- high quality (320kbps MP3, 256kbps AAC)
- standard quality (192kbps)
- low quality (128kbps for previews)
metadata preservation
- extract metadata from input
- apply metadata to output
- handle artwork/cover images
batch processing
- accept multiple files
- process in parallel
- return as zip archive
caching
- cache transcoded files by content hash
- serve cached versions instantly
- implement LRU eviction
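
to make the caching idea concrete, here is a minimal in-memory sketch of a content-hash cache with LRU eviction. it is purely illustrative: the class name and size limit are assumptions, and a real deployment would more likely keep cached outputs on disk or in object storage.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


class TranscodeCache:
    """in-memory LRU cache keyed by sha256(input bytes) plus target format."""

    def __init__(self, max_entries: int = 128) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()

    @staticmethod
    def key(data: bytes, target: str) -> str:
        return f"{hashlib.sha256(data).hexdigest()}:{target}"

    def get(self, data: bytes, target: str) -> Optional[bytes]:
        k = self.key(data, target)
        if k not in self._entries:
            return None
        self._entries.move_to_end(k)  # mark as most recently used
        return self._entries[k]

    def put(self, data: bytes, target: str, output: bytes) -> None:
        k = self.key(data, target)
        self._entries[k] = output
        self._entries.move_to_end(k)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

keying on the content hash rather than the filename means identical uploads under different names share one cached result.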

references#

source code: services/transcoder/src/main.rs
justfile: services/transcoder/justfile
fly config: services/transcoder/fly.toml
dockerfile: services/transcoder/Dockerfile
ffmpeg docs: https://ffmpeg.org/documentation.html
fly.io docs: https://fly.io/docs/