genre classification#
ML-based genre classification for tracks using the effnet-discogs model on Replicate.
architecture#
track uploaded (or on-demand request)
|
v
Replicate API
(effnet-discogs, CPU, ~2s)
|
v
genre predictions JSON
(top N labels + confidence scores)
|
v
stored in track.extra["genre_predictions"]
how it works#
- audio URL is sent to Replicate's effnet-discogs model
- model returns genre labels from the Discogs taxonomy with confidence scores
- raw labels use
Genre---Subgenreformat (e.g.,Electronic---Ambient) — we split these into separate tags (electronic,ambient) and deduplicate, keeping the highest confidence score - predictions are stored in
track.extra["genre_predictions"]as JSON
when classification runs#
- on upload: if
REPLICATE_ENABLED=true, classification is scheduled as a docket background task after upload - on demand:
GET /tracks/{id}/recommended-tagsclassifies on the fly if no stored predictions exist - backfill:
scripts/backfill_genres.pyprocesses existing tracks
API#
GET /tracks/{track_id}/recommended-tags?limit=5#
no auth required. returns genre predictions for a track, excluding tags the track already has.
response:
{
"track_id": 668,
"tags": [
{"name": "audiobook non-music", "score": 0.2129},
{"name": "spoken word non-music", "score": 0.1817},
{"name": "monolog non-music", "score": 0.1227}
],
"available": true
}
available: falsewhen Replicate is disabled- empty
tagswithavailable: truemeans the track has no R2 URL or classification returned no results scoreis the model's confidence (0-1)
auto-tag at upload#
users can check "auto-tag with recommended genres" on the upload form. when enabled:
auto_tag: trueis stored intrack.extraduring uploadclassify_genresbackground task runs as usual- after storing predictions, the task checks for the
auto_tagflag - applies top genre tags using ratio-to-top filter: tags scoring >= 50% of the top score, capped at 5
- cleans up the
auto_tagflag fromtrack.extra
auto-tags are additive with manual tags — if the user also typed tags, both appear on the track.
key files: backend/src/backend/api/tracks/uploads.py (form param + UploadContext), backend/src/backend/_internal/tasks/ml.py (apply logic in classify_genres)
auditing#
scripts/ml_audit.py reports which tracks and artists have been processed by genre classification (and other ML features). useful for privacy/ToS auditing.
cd backend && uv run python ../scripts/ml_audit.py --verbose
storage format#
predictions are stored in track.extra["genre_predictions"]:
[
{"name": "electronic", "confidence": 0.1999},
{"name": "ambient", "confidence": 0.1999},
{"name": "experimental", "confidence": 0.1673},
{"name": "synth-pop", "confidence": 0.122}
]
once classified, the predictions are cached — subsequent API requests read from the database without calling Replicate again.
cache invalidation#
predictions are keyed by genre_predictions_file_id (stored alongside predictions in track.extra). when a track's audio file is replaced, the file_id changes and the cached predictions are discarded — the next request triggers reclassification.
{
"genre_predictions": [...],
"genre_predictions_file_id": "a1b2c3d4e5f67890"
}
backfill#
# dry run — shows eligible tracks (missing genre_predictions in extra)
uv run scripts/backfill_genres.py --dry-run
# classify first 5 tracks
uv run scripts/backfill_genres.py --limit 5
# full backfill with custom concurrency
uv run scripts/backfill_genres.py --concurrency 10
requires env vars: DATABASE_URL, REPLICATE_ENABLED=true, REPLICATE_API_TOKEN.
environment variables#
| variable | purpose | default |
|---|---|---|
REPLICATE_ENABLED |
enable genre classification | false |
REPLICATE_API_TOKEN |
Replicate API token | — |
REPLICATE_TOP_N |
number of predictions to keep | 10 |
REPLICATE_TIMEOUT_SECONDS |
request timeout | 120 |
cost#
effnet-discogs runs on CPU at $0.00019/run ($0.11 per 575 tracks). Replicate scales to zero when idle.
model details#
- model: mtg/effnet-discogs (EfficientNet trained on Discogs)
- taxonomy: Discogs genre/subgenre labels (~400 categories)
- inference: CPU, ~2s per track
- SDK note: the
replicatePython SDK is incompatible with Python 3.14 (pydantic v1 dependency). we use httpx directly against the Replicate HTTP API withPrefer: waitfor synchronous predictions.
frontend UX#
when editing a track on the portal page, the edit modal fetches recommended tags and displays them as clickable dashed-border chips below the tag input. clicking a chip adds the tag. uses $derived to reactively hide suggestions that match manually-typed tags.
- loading state:
WaveLoading size="sm"with "suggested" label - silent failures: section hidden if replicate disabled, fetch fails, or no predictions
- implemented in:
frontend/src/routes/portal/+page.svelte
key files#
backend/src/backend/_internal/clients/replicate.py— Replicate HTTP clientbackend/src/backend/_internal/tasks/ml.py—classify_genrestask (+ auto-tag logic)backend/src/backend/api/tracks/uploads.py—auto_tagform param andUploadContextbackend/src/backend/api/tracks/tags.py—recommended-tagsendpointbackend/src/backend/config.py—ReplicateSettingsscripts/backfill_genres.py— batch classification scriptscripts/ml_audit.py— ML processing audit scriptfrontend/src/routes/upload/+page.svelte— auto-tag checkbox on upload formfrontend/src/routes/portal/+page.svelte— suggested tags UI in edit modal