audio streaming app plyr.fm

audio transcoder service#

overview#

the transcoder is a standalone rust-based HTTP service that handles audio format conversion using ffmpeg. it runs as a separate fly.io app to isolate CPU-intensive transcoding operations from the main backend API.

architecture#

why separate service?#

ffmpeg operations are CPU-intensive and can block the event loop in async python applications. separating the transcoder provides:

  • isolation: transcoding doesn't affect API latency
  • performance: rust + tokio provides better concurrency for blocking operations
  • scalability: can scale transcoder independently from main backend
  • resource allocation: dedicated CPU/memory for transcoding work

technology stack#

  • rust: high-performance systems language
  • axum: async web framework built on tokio
  • ffmpeg: industry-standard media processing
  • fly.io: deployment platform with auto-scaling

API#

POST /transcode#

convert audio file to target format.

authentication: shared secret token sent in the X-Transcoder-Key header

request: multipart/form-data body plus an optional query parameter

  • file (form field): audio file to transcode
  • target (query parameter, optional): target format (default: "mp3")

example:

curl -X POST "https://plyr-transcoder.fly.dev/transcode?target=mp3" \
  -H "X-Transcoder-Key: $TRANSCODER_AUTH_TOKEN" \
  -F "file=@input.wav" \
  --output output.mp3

response: transcoded audio file (binary)

headers:

  • Content-Type: appropriate media type for target format
  • Content-Disposition: attachment with original filename + new extension

supported formats:

  • mp3 (MPEG Layer 3)
  • m4a (AAC in MP4 container)
  • wav (PCM audio)
  • flac (lossless compression)
  • ogg (Vorbis codec)
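
as an illustration of the response headers described above, here is a minimal python sketch that derives Content-Type and Content-Disposition from the original filename and the target format. the media type mapping and the `response_headers` helper are assumptions for illustration; the rust service may use different values.

```python
from pathlib import PurePosixPath

# assumed media types for each supported target format;
# the transcoder's actual values may differ
MEDIA_TYPES = {
    "mp3": "audio/mpeg",
    "m4a": "audio/mp4",
    "wav": "audio/wav",
    "flac": "audio/flac",
    "ogg": "audio/ogg",
}


def response_headers(original_filename: str, target: str) -> dict:
    """build the two response headers for a transcoded file (hypothetical helper)."""
    if target not in MEDIA_TYPES:
        raise ValueError(f"unsupported target format: {target}")
    # keep the original name, swap the extension for the target format
    stem = PurePosixPath(original_filename).stem or "audio"
    return {
        "Content-Type": MEDIA_TYPES[target],
        "Content-Disposition": f'attachment; filename="{stem}.{target}"',
    }
```

for example, `response_headers("input.wav", "mp3")` yields `audio/mpeg` and `attachment; filename="input.mp3"`. a production version would also need to escape quotes and non-ASCII characters in the filename.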

status codes:

  • 200: transcoding successful, returns audio file
  • 400: invalid input (unsupported format, missing file, etc.)
  • 401: missing or invalid authentication token
  • 413: file too large (exceeds TRANSCODER_MAX_UPLOAD_BYTES, 1GB as deployed)
  • 500: transcoding failed (ffmpeg error, I/O error, etc.)

GET /health#

health check endpoint (no authentication required).

response:

{
  "status": "ok"
}
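
because the deployment can scale to zero machines, a caller may want to poll the health endpoint before sending a large upload. the sketch below is one way to do that with the standard library; `wait_until_healthy`, its retry count, and its delay are illustrative assumptions, not part of the service or the backend.

```python
import json
import time
import urllib.request


def fetch_health(base_url: str, timeout: float = 5.0) -> dict:
    """GET {base_url}/health and parse the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.loads(resp.read())


def wait_until_healthy(base_url, attempts=5, delay=1.0,
                       fetch=fetch_health, sleep=time.sleep) -> bool:
    """return True once /health reports {"status": "ok"}, False after all attempts fail."""
    for attempt in range(attempts):
        try:
            if fetch(base_url).get("status") == "ok":
                return True
        except (OSError, ValueError):
            # connection errors and timeouts are OSError subclasses;
            # a malformed JSON body raises a ValueError subclass
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

`fetch` and `sleep` are parameters so the retry logic can be exercised in tests without a running transcoder.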

authentication#

bearer token authentication#

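the transcoder authenticates requests with a static shared token. despite the "bearer" name, the token is sent in the custom X-Transcoder-Key header rather than in a standard Authorization: Bearer header.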

configuration:

# set via fly secrets
fly secrets set TRANSCODER_AUTH_TOKEN="your-secret-token-here" -a plyr-transcoder

local development:

# .env file
TRANSCODER_AUTH_TOKEN=dev-token-change-me

# or run without auth (dev mode):
# start the transcoder without setting TRANSCODER_AUTH_TOKEN

security notes:

  • token should be a random, high-entropy string (use openssl rand -base64 32)
  • main backend should store token in environment variables
  • health endpoint bypasses authentication
  • invalid/missing tokens return 401 unauthorized
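
the notes above can be sketched in python as follows: generate a high-entropy token, then check the X-Transcoder-Key header with a constant-time comparison. this is an illustrative sketch only; `generate_token` and `is_authorized` are hypothetical names, and the rust service's actual check may differ.

```python
import hmac
import secrets
from typing import Optional


def generate_token() -> str:
    """32 random bytes, url-safe base64 encoded (similar in strength to `openssl rand -base64 32`)."""
    return secrets.token_urlsafe(32)


def is_authorized(headers: dict, expected_token: Optional[str]) -> bool:
    """check the X-Transcoder-Key header against the configured token.

    assumption: an unset or empty token means dev mode, so every request is allowed.
    note: this sketch does a case-sensitive header lookup; real HTTP headers are
    case-insensitive.
    """
    if not expected_token:
        return True
    provided = headers.get("X-Transcoder-Key", "")
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(provided.encode(), expected_token.encode())
```

a request for which `is_authorized` returns False would be answered with 401.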

transcoding process#

workflow#

  1. receive upload: client sends audio file via multipart form
  2. create temp directory: isolated workspace for this request
  3. save input file: write uploaded bytes to temp file
  4. determine format: sanitize and validate target format
  5. run ffmpeg: spawn ffmpeg process with appropriate codec settings
  6. stream output: return transcoded file directly to client
  7. cleanup: delete temp directory (automatic)
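
the steps above can be sketched in python (the real service is written in rust, so this only illustrates the flow). `transcode_bytes` and the injectable `runner` are hypothetical; codec flags are omitted here, and the sketch returns the whole file at once instead of streaming it.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Sequence

SUPPORTED = {"mp3", "m4a", "wav", "flac", "ogg"}


def run_ffmpeg(args: Sequence[str]) -> None:
    """run ffmpeg and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(list(args), check=True, capture_output=True)


def transcode_bytes(data: bytes, target: str = "mp3",
                    runner: Callable[[Sequence[str]], None] = run_ffmpeg) -> bytes:
    # step 2: isolated temp directory for this request
    with tempfile.TemporaryDirectory() as workdir:
        # step 3: write uploaded bytes to a temp file
        src = Path(workdir) / "input"
        src.write_bytes(data)
        # step 4: sanitize and validate the target format
        fmt = target.strip().lower()
        if fmt not in SUPPORTED:
            raise ValueError(f"unsupported target format: {target!r}")
        dst = Path(workdir) / f"output.{fmt}"
        # step 5: run ffmpeg as a child process (argument list, no shell)
        runner(["ffmpeg", "-y", "-i", str(src), str(dst)])
        # step 6: hand the result back to the caller
        return dst.read_bytes()
    # step 7: the temp directory is deleted when the `with` block exits,
    # including when an exception is raised
```

passing `runner` as a parameter keeps the flow testable on machines without ffmpeg installed.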

ffmpeg command#

the service constructs ffmpeg commands based on target format:

# example: convert to MP3
ffmpeg -i input.wav -codec:a libmp3lame -qscale:a 2 -map_metadata 0 output.mp3

# example: convert to M4A (AAC)
ffmpeg -i input.wav -codec:a aac -b:a 192k -map_metadata 0 output.m4a

# example: convert to FLAC (lossless)
ffmpeg -i input.wav -codec:a flac -compression_level 8 -map_metadata 0 output.flac

flags explained:

  • -i input.wav: input file
  • -codec:a <codec>: audio codec to use
  • -qscale:a 2: variable bitrate quality (0-9, lower = better)
  • -b:a 192k: constant bitrate (for AAC)
  • -map_metadata 0: preserve metadata (artist, title, etc.)
  • -compression_level 8: FLAC compression (0-12, higher = smaller file)

codec selection#

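| format | codec      | container | typical use case                 |
|--------|------------|-----------|----------------------------------|
| mp3    | libmp3lame | MPEG      | universal compatibility          |
| m4a    | aac        | MP4       | modern devices, good compression |
| wav    | pcm_s16le  | WAV       | lossless, uncompressed           |
| flac   | flac       | FLAC      | lossless, compressed             |
| ogg    | libvorbis  | OGG       | open format, good compression    |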
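
to make the mapping concrete, here is a small python sketch that builds the ffmpeg argument list for a target format. the mp3, m4a, and flac flags come from the example commands above; no quality flags are documented for wav or ogg, so this sketch passes only the codec for those (an assumption). `build_ffmpeg_args` is a hypothetical helper, not the service's rust code.

```python
CODEC_ARGS = {
    "mp3": ["-codec:a", "libmp3lame", "-qscale:a", "2"],
    "m4a": ["-codec:a", "aac", "-b:a", "192k"],
    "flac": ["-codec:a", "flac", "-compression_level", "8"],
    # assumption: codec only, since no quality flags are documented for these
    "wav": ["-codec:a", "pcm_s16le"],
    "ogg": ["-codec:a", "libvorbis"],
}


def build_ffmpeg_args(input_path: str, output_path: str, target: str) -> list:
    """return the ffmpeg command as an argument list for the given target format."""
    fmt = target.strip().lower()
    if fmt not in CODEC_ARGS:
        raise ValueError(f"unsupported target format: {target!r}")
    return [
        "ffmpeg", "-i", input_path,
        *CODEC_ARGS[fmt],
        "-map_metadata", "0",  # preserve artist, title, etc.
        output_path,
    ]
```

building a list rather than a shell string means file names are never interpreted by a shell.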

deployment#

fly.io configuration#

app name: plyr-transcoder

region: iad (us-east; Ashburn, Virginia, in the Washington DC area)

fly.toml:

app = "plyr-transcoder"
primary_region = "iad"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 0

[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory = "1gb"

[env]
  TRANSCODER_HOST = "0.0.0.0"
  TRANSCODER_PORT = "8080"
  TRANSCODER_MAX_UPLOAD_BYTES = "1073741824"  # 1GB

key settings:

  • auto_stop_machines: stops VM when idle (cost optimization)
  • auto_start_machines: starts VM on first request (expect a cold start of a few seconds)
  • min_machines_running: 0 (no always-on instances, purely on-demand)
  • memory: 1GB (sufficient for transcoding typical audio files)

deployment commands#

# deploy from transcoder directory
cd services/transcoder && fly deploy

# check status
fly status -a plyr-transcoder

# view logs (blocking - use ctrl+c to exit)
fly logs -a plyr-transcoder

# scale up (for high traffic)
fly scale count 2 -a plyr-transcoder

# scale down (back to auto-scale)
fly scale count 1 -a plyr-transcoder

note: deployment is done manually from the transcoder directory, not via main backend CI/CD.

secrets management#

# set authentication token
fly secrets set TRANSCODER_AUTH_TOKEN="$(openssl rand -base64 32)" -a plyr-transcoder

# list secrets (values hidden)
fly secrets list -a plyr-transcoder

# unset secret
fly secrets unset TRANSCODER_AUTH_TOKEN -a plyr-transcoder

integration with main backend#

the transcoder is integrated into the upload pipeline for lossless audio support (AIFF/FLAC). when a user uploads a non-web-playable format, the backend:

  1. saves the original file to R2 (original_file_id)
  2. calls the transcoder to convert to MP3
  3. saves the transcoded file to R2 (file_id)
  4. points the ATProto record at the MP3 for browser compatibility
  5. returns the original lossless file on export

this enables "best of both worlds": universal browser playback + lossless export for data portability.
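
a minimal sketch of that pipeline, with storage and transcoding passed in as plain callables. `StoredTrack`, `store_upload`, `export_file_id`, and the `save` callable are hypothetical stand-ins for the backend's R2 and transcoder clients; only the AIFF/FLAC set and the two file id fields come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# lossless formats named above that are transcoded before playback
NEEDS_TRANSCODING = {"aiff", "flac"}


@dataclass
class StoredTrack:
    file_id: str                     # what playback and the ATProto record point at
    original_file_id: Optional[str]  # lossless original, if one was kept


def store_upload(data: bytes, extension: str,
                 save: Callable[[bytes, str], str],
                 transcode: Callable[[bytes], bytes]) -> StoredTrack:
    ext = extension.lower().lstrip(".")
    if ext not in NEEDS_TRANSCODING:
        # already web-playable: store once, nothing to transcode
        return StoredTrack(file_id=save(data, ext), original_file_id=None)
    original_id = save(data, ext)    # step 1: keep the original
    mp3 = transcode(data)            # step 2: convert to MP3
    playable_id = save(mp3, "mp3")   # step 3: store the MP3
    return StoredTrack(file_id=playable_id, original_file_id=original_id)


def export_file_id(track: StoredTrack) -> str:
    """step 5: export prefers the lossless original when there is one."""
    return track.original_file_id or track.file_id
```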

backend configuration#

transcoder settings in src/backend/config.py:

class TranscoderSettings(AppSettingsSection):
    """Transcoder service configuration for lossless audio conversion."""

    enabled: bool = True  # set to False to reject lossless uploads
    url: str = "https://plyr-transcoder.fly.dev"
    auth_token: str = ""  # set via TRANSCODER_AUTH_TOKEN env var
    timeout: int = 300  # 5 minutes for large files

environment variables:

  • TRANSCODER_ENABLED: enable/disable transcoding (default: true)
  • TRANSCODER_URL: transcoder service URL
  • TRANSCODER_AUTH_TOKEN: bearer token for authentication

calling from backend#

from typing import BinaryIO

import httpx

# `settings` is the backend's settings object (see TranscoderSettings above)


async def transcode_audio(
    file: BinaryIO,
    target_format: str = "mp3",
) -> bytes:
    """transcode audio file using transcoder service."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{settings.transcoder.url}/transcode",
            params={"target": target_format},
            files={"file": file},
            headers={"X-Transcoder-Key": settings.transcoder.auth_token},
            # 300 seconds by default, to allow for large files
            timeout=settings.transcoder.timeout,
        )
        # raises httpx.HTTPStatusError on 4xx/5xx responses
        response.raise_for_status()
        return response.content

error handling#

try:
    transcoded = await transcode_audio(file, "mp3")
except httpx.HTTPStatusError as e:
    if e.response.status_code == 401:
        logger.error("transcoder authentication failed")
        raise HTTPException(500, "transcoding service unavailable")
    elif e.response.status_code == 413:
        raise HTTPException(413, "file too large for transcoding")
    else:
        logger.error(f"transcoding failed: {e}")
        raise HTTPException(500, "transcoding failed")
except httpx.TimeoutException:
    logger.error("transcoding timed out")
    raise HTTPException(504, "transcoding took too long")

local development#

prerequisites#

  • rust toolchain (install via rustup)
  • ffmpeg (install via brew install ffmpeg on macOS)

running locally#

# from transcoder directory
cd services/transcoder && cargo run

# with custom port
TRANSCODER_PORT=9000 cargo run

# with debug logging
RUST_LOG=debug cargo run

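note: the transcoder runs on port 8080 by default (configured in fly.toml). the testing examples below use port 8082; adjust the port in the curl commands to match your local instance.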

testing locally#

# start transcoder
just transcoder run

# test health endpoint
curl http://localhost:8082/health

# test transcoding (no auth required in dev mode)
# the URL is quoted so the shell does not treat "?" as a glob character
curl -X POST "http://localhost:8082/transcode?target=mp3" \
  -F "file=@test.wav" \
  --output transcoded.mp3

# test with authentication (stop the dev-mode instance first so the port is free)
export TRANSCODER_AUTH_TOKEN="dev-token"
cargo run &

curl -X POST "http://localhost:8082/transcode?target=mp3" \
  -H "X-Transcoder-Key: $TRANSCODER_AUTH_TOKEN" \
  -F "file=@test.wav" \
  --output transcoded.mp3

performance characteristics#

typical transcoding times#

transcoding performance depends on:

  • input file size and duration
  • source codec complexity
  • target codec and quality settings
  • available CPU

benchmarks (shared-cpu-1x on fly.io):

  • 3-minute MP3 (5MB) → MP3: ~2-3 seconds
  • 3-minute WAV (30MB) → MP3: ~4-5 seconds
  • 10-minute FLAC (50MB) → MP3: ~10-15 seconds

resource usage#

memory:

  • base process: ~20MB
  • active transcoding: +100-200MB per request
  • 1GB VM supports 4-5 concurrent transcodes
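
the concurrency estimate follows from simple arithmetic on the figures above. a rough sketch, where `max_concurrent_transcodes` and its `headroom` parameter are illustrative assumptions:

```python
def max_concurrent_transcodes(vm_memory_mb: int, base_mb: int = 20,
                              per_request_mb: int = 200,
                              headroom: float = 0.0) -> int:
    """estimate how many transcodes fit in memory at the same time.

    headroom is the fraction of VM memory held back for the OS and spikes.
    """
    usable = vm_memory_mb * (1 - headroom) - base_mb
    return max(0, int(usable // per_request_mb))
```

with 1024MB, a 20MB base process, and the upper figure of 200MB per request, this gives 5; holding back 10% of memory as headroom gives 4, which matches the 4-5 range above. memory is only one constraint: on a single shared CPU, concurrent ffmpeg processes also compete for CPU time.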

CPU:

  • ffmpeg uses 100% of allocated CPU
  • single-core sufficient for typical workload
  • multi-core would enable parallel processing

scaling considerations#

when to scale up:

  • average response time >30 seconds
  • frequent 503 errors (all VMs busy)
  • queue depth increasing

scaling options:

  1. horizontal: increase machine count (fly scale count 2)
  2. vertical: increase memory/CPU (fly scale vm shared-cpu-2x)
  3. regional: deploy to multiple regions for geo-distribution

monitoring#

metrics to track#

  1. transcoding success rate

    • total requests
    • successful transcodes
    • failed transcodes (by error type)
  2. performance

    • average transcoding time
    • p50, p95, p99 latency
    • throughput (transcodes/minute)
  3. resource usage

    • CPU utilization
    • memory usage
    • disk I/O (temp files)
  4. errors

    • authentication failures
    • ffmpeg errors
    • timeout errors
    • 413 file too large
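
as a sketch of how the success rate and latency percentiles above could be computed from recorded samples, the following uses the nearest-rank percentile definition. `percentile` and `summarize` are hypothetical helpers; a real deployment would more likely rely on a metrics backend.

```python
import math


def percentile(samples: list, p: float) -> float:
    """nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(durations_s: list, failures: int) -> dict:
    """durations_s holds the transcoding time of each successful request, in seconds."""
    total = len(durations_s) + failures
    return {
        "total_requests": total,
        "success_rate": len(durations_s) / total if total else 0.0,
        "p50_s": percentile(durations_s, 50),
        "p95_s": percentile(durations_s, 95),
        "p99_s": percentile(durations_s, 99),
    }
```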

fly.io metrics#

# view metrics dashboard
fly dashboard -a plyr-transcoder

# check recent requests
fly logs -a plyr-transcoder | grep "POST /transcode"

# monitor resource usage
fly machine list -a plyr-transcoder

troubleshooting#

common issues#

ffmpeg not found:

error: ffmpeg command failed: No such file or directory

solution: ensure ffmpeg is installed in docker image (check Dockerfile)

authentication fails in production:

error: 401 unauthorized

solution: verify TRANSCODER_AUTH_TOKEN is set on both transcoder and backend

timeouts on large files:

error: request timeout after 120s

solution: increase timeout in backend client (timeout=300.0)

413 entity too large:

error: 413 payload too large

solution: increase TRANSCODER_MAX_UPLOAD_BYTES or reject large files earlier
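
rejecting large files earlier could look like the sketch below, which checks the size before anything is sent to the transcoder. it assumes the backend can read the same TRANSCODER_MAX_UPLOAD_BYTES value that fly.toml sets for the transcoder; `max_upload_bytes` and `check_upload_size` are hypothetical helpers.

```python
import os

# matches the value in fly.toml (1GB)
DEFAULT_MAX_UPLOAD_BYTES = 1_073_741_824


def max_upload_bytes(env=os.environ) -> int:
    """read the limit from the environment, falling back to the 1GB value."""
    return int(env.get("TRANSCODER_MAX_UPLOAD_BYTES", DEFAULT_MAX_UPLOAD_BYTES))


def check_upload_size(size_bytes: int, limit: int) -> None:
    """raise before uploading if the transcoder would answer with 413."""
    if size_bytes > limit:
        raise ValueError(
            f"file is {size_bytes} bytes, transcoder limit is {limit} bytes"
        )
```

the caller can turn the ValueError into its own 413 response without making a network round trip.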

VM not starting automatically:

error: no instances available

solution: check auto_start_machines = true in fly.toml

future enhancements#

potential improvements#

  1. progress tracking

    • stream ffmpeg progress updates
    • return progress via server-sent events
    • enable client-side progress bar
  2. format detection

    • auto-detect input format via ffprobe
    • validate format before transcoding
    • reject unsupported formats early
  3. quality presets

    • high quality (320kbps MP3, 256kbps AAC)
    • standard quality (192kbps)
    • low quality (128kbps for previews)
  4. metadata preservation

    • extract metadata from input
    • apply metadata to output
    • handle artwork/cover images
  5. batch processing

    • accept multiple files
    • process in parallel
    • return as zip archive
  6. caching

    • cache transcoded files by content hash
    • serve cached versions instantly
    • implement LRU eviction
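
to illustrate the caching idea, here is a minimal in-memory sketch keyed by the SHA-256 of the input bytes plus the target format, with least-recently-used eviction. none of this exists in the service today; `cache_key` and `TranscodeCache` are hypothetical, and a real cache would probably store files in object storage instead of memory.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


def cache_key(data: bytes, target: str) -> str:
    """same input bytes and same target format always map to the same key."""
    return f"{hashlib.sha256(data).hexdigest()}:{target}"


class TranscodeCache:
    def __init__(self, max_entries: int = 128) -> None:
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key: str, value: bytes) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```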

references#