commits
add content_hash (wyhash of title+content) to documents table. on
ingest, skip documents where the same author already has identical
content under a different rkey (cross-platform publishing dedup).
frontend: add date filter (any/week/month/year) with since param,
URL state sync, and active filter bar.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
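A minimal sketch of the ingest-time dedup described in the commit above, written in Python for illustration (the project itself is Zig). Assumptions not stated in the commit: hashlib's blake2b stands in for wyhash, a NUL separator is placed between title and content, and the `seen` map and function names are hypothetical.

```python
import hashlib


def content_hash(title: str, content: str) -> str:
    # Stand-in for wyhash (assumption). The separator keeps ("ab", "c")
    # and ("a", "bc") from hashing to the same value.
    h = hashlib.blake2b(digest_size=8)
    h.update(title.encode("utf-8"))
    h.update(b"\x00")
    h.update(content.encode("utf-8"))
    return h.hexdigest()


def should_skip(seen: dict, did: str, rkey: str, title: str, content: str) -> bool:
    """True if this author already has identical content under a different rkey."""
    key = (did, content_hash(title, content))
    existing = seen.get(key)
    if existing is not None and existing != rkey:
        return True
    seen[key] = rkey  # first sighting, or an update to the same record
    return False
```

Keying on (author, hash) rather than the hash alone means two different authors publishing the same text are both kept.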
- backfill-pds now handles com.whtwnd.blog.entry collection
- extracts markdown content from whitewind's content field
- sets platform to "whitewind", skips visibility:"author" entries
- prefers publishedAt over createdAt for date extraction
- update tangled.sh URLs to tangled.org in build.zig.zon
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- platform filter button in search UI
- homepage platform links
- MCP server Platform type and tool descriptions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The /stats endpoint was missing the started_at field entirely. Also
added diagnostic logging to refreshCachedStats so turso query failures
are visible instead of silently swallowed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WhiteWind blog entries use three visibility values: "public", "url",
and "author". "url" means publicly accessible via link. Our filter
was dropping everything except "public", which meant every WhiteWind
entry with visibility "url" was silently discarded. Now only "author"
entries are skipped.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
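The corrected rule from the commit above, sketched in Python for illustration. The function name is hypothetical, and treating a missing visibility field as public is an assumption.

```python
SKIPPED_VISIBILITY = {"author"}


def is_indexable(record: dict) -> bool:
    # Assumption: a record without a visibility field counts as public.
    return record.get("visibility", "public") not in SKIPPED_VISIBILITY
```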
getStatsLocal() required the stats_buffer cache to be initialized, but
refreshCachedStats was only called at startup (which could fail) and
when search deltas were non-zero. If init failed and no searches
happened, the cache stayed uninitialized and /stats returned all zeros.
- always refresh cache in sync loop (not just when deltas exist)
- getStatsLocal() no longer fails when cache isn't initialized — returns
local counts with 0 for cached fields instead of aborting entirely
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fetchLocal() queried COUNT(*) WHERE embedded_at IS NOT NULL, but the
local schema never had that column. Every dashboard request failed,
falling through to turso batch (which also returned zeros).
- add embedded_at column migration to LocalDb schema
- sync embedded_at from turso in full and incremental sync
- add logfire warnings when fetchLocal fails or turso batch returns no rows
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
root cause: processMessage (which writes to turso via HTTP) ran
synchronously in the websocket readLoop callback. when turso was
slow or hung, the readLoop blocked — no messages read, no ACKs
sent, TAP outbox grew unboundedly (4222 events stuck).
fix: send ACK immediately upon receipt, push message data to a
bounded queue, process in a separate worker thread. readLoop
never blocks on turso. if turso is slow, queue fills and oldest
messages are dropped (already ACK'd, indexing is idempotent via
ON CONFLICT DO UPDATE).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
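A rough sketch of the ACK-first, drop-oldest pattern from the commit above, in Python for illustration. The class and callback names are hypothetical; the real code is Zig and may differ in detail.

```python
import threading
from collections import deque


class DropOldestQueue:
    """Bounded queue that discards the oldest item when full."""

    def __init__(self, capacity: int):
        self._items = deque(maxlen=capacity)  # deque drops from the left when full
        self._cond = threading.Condition()
        self.dropped = 0

    def push(self, item) -> None:
        with self._cond:
            if len(self._items) == self._items.maxlen:
                self.dropped += 1  # the append below evicts the oldest item
            self._items.append(item)
            self._cond.notify()

    def pop(self, timeout=None):
        """Return the next item, or None if nothing arrives within timeout."""
        with self._cond:
            if not self._cond.wait_for(lambda: self._items, timeout):
                return None
            return self._items.popleft()


def on_message(msg_id, payload, send_ack, queue: DropOldestQueue) -> None:
    send_ack(msg_id)     # ACK first, so the read loop never waits on the database
    queue.push(payload)  # a worker thread calls queue.pop() and does the slow write
```

Dropping already-ACK'd messages is only safe here because, as the commit notes, indexing is idempotent.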
Tracks ack_count and no_id_count to determine whether extractMessageId
returns null (no ACK sent) or ACKs are sent but not received by TAP.
Logs first 3 ACK payloads and first 5 no-id messages.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Backend support is in place but no WhiteWind documents have been
indexed yet. Remove the filter button and footer link until we
have actual results to show.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- add com.whtwnd.blog.entry to tap collection filters and document routing
- add content-as-string fallback in extractor (whitewind stores markdown in content field)
- add visibility filter to skip non-public whitewind entries
- add whitewind platform to frontend (filter button, URL pattern, config)
- add stats link to header
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Related items (for the top result) were staying in the DOM when
"load more" was clicked, causing new results to appear below them.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
base_path queries (publication name matches) were bypassing the since
filter, leaking old results. added since-aware turso query variants
and post-fetch date filtering in searchLocal.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
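The post-fetch date filter can be sketched as follows (Python, for illustration). The `created_at` field name is an assumption, and the comparison relies on both values being ISO 8601 strings in the same format, which sort lexicographically.

```python
def filter_since(rows, since):
    # No since param means no date restriction.
    if since is None:
        return list(rows)
    # ISO 8601 strings in one consistent format compare correctly as text.
    return [r for r in rows if r["created_at"] >= since]
```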
- deduplicate search results by (did, title) to collapse cross-platform
duplicates (same content published to multiple ATProto apps)
- add date filter buttons (any/week/month/year) wired to since param
- load more button shows remaining count from v2 total
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
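A minimal sketch of the (did, title) result dedup from the commit above, in Python for illustration. Keeping the first (highest-ranked) occurrence is an assumption; the commit does not say which duplicate survives.

```python
def dedupe_results(results):
    seen = set()
    unique = []
    for r in results:
        key = (r["did"], r["title"])
        if key in seen:
            continue  # same author and title already shown from another platform
        seen.add(key)
        unique.append(r)
    return unique
```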
- content previews for semantic/similar results via local SQLite lookup
- RRF score field in hybrid search results
- opt-in v2 response wrapper (?format=v2) with total/hasMore metadata
- pagination via limit/offset params with "load more" in frontend
- all consumers (frontend, MCP) handle both v1 and v2 formats
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
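One way the v1/v2 response shapes and limit/offset pagination could fit together, sketched in Python. `total` and `hasMore` come from the commit; the `results` key, the default limit, and both function names are assumptions.

```python
def build_response(all_results, limit=20, offset=0, fmt="v1"):
    page = all_results[offset:offset + limit]
    if fmt != "v2":
        return page  # v1: bare list, unchanged for existing consumers
    total = len(all_results)
    return {
        "results": page,
        "total": total,
        "hasMore": offset + len(page) < total,
    }


def unwrap(response):
    """Consumer side: accept either format and return (results, total or None)."""
    if isinstance(response, list):
        return response, None
    return response["results"], response["total"]
```

A "load more" button can then request `offset = number of results already shown` and stop when `hasMore` is false.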
the 0.6 cosine distance cutoff was filtering out all results for
oblique/indirect queries (e.g. "guy from south africa with lots of kids"
→ elon musk). tpuf already returns results sorted by distance, so
natural ordering handles relevance without an arbitrary cutoff.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
measured actual distance distributions across 8 test queries using
scripts/measure-distances. voyage-4-lite 1024d best matches range
0.32-0.51, and the 0.5 threshold completely killed queries like
"community builders" (best=0.506) and "atproto federation" (best=0.505).
0.6 captures all clearly relevant top results while cutting off noise
that starts around 0.61+.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This reverts commit 704eb7427ddae23ee2d8e659df332f790341ae0b.
voyage-4-lite 1024d produces tighter cosine distance ranges than
voyage-3-lite 512d, so the old 0.5 threshold was filtering out all
results for many queries (e.g. "community builders" returned 0 results).
Raise to 0.75 to let tpuf's natural ranking handle quality.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- add URI dedup in searchSemantic() (same doc appeared twice from tpuf)
- rewrite scripts/rebuild-vector-index for tpuf namespace reset workflow
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
voyage-3-lite (512 dims) produced poor semantic search quality — only 4
results for "consciousness" vs 39 on greengale.app. voyage-4-lite was
released Jan 2026 with significantly better retrieval accuracy.
- model: voyage-3-lite → voyage-4-lite
- dims: 512 → 1024
- explicit output_dimension parameter for Matryoshka support
- tpuf namespace deleted, embedded_at cleared for full re-embed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
single-request hybrid mode merges keyword (FTS5) and semantic (voyage +
tpuf) results using Reciprocal Rank Fusion scoring. adds mode toggle to
frontend, source badges on results, per-mode latency tracking, and
embeddings count on dashboard.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
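For reference, a minimal sketch of Reciprocal Rank Fusion in Python. Each document scores the sum of 1 / (k + rank) over the lists it appears in. The constant k = 60 is the value commonly used for RRF and is an assumption here; the commit does not state which constant the project uses.

```python
def rrf_merge(keyword_uris, semantic_uris, k=60):
    """Merge two ranked lists of URIs; returns (uri, score) pairs, best first."""
    scores = {}
    for ranking in (keyword_uris, semantic_uris):
        for rank, uri in enumerate(ranking, start=1):
            scores[uri] = scores.get(uri, 0.0) + 1.0 / (k + rank)
    # Sort by score descending; break ties by URI for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Because RRF uses only ranks, it needs no normalization between FTS5 scores and cosine distances.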
- embedder: skip docs with content < 50 chars or test titles
- searchSemantic: over-fetch 40, filter dist > 0.5 + empty titles, cap at 20
- frontend: remove mode toggle (keep backend support for when quality is ready)
- scripts: add cleanup-vector-index to purge junk vectors from tpuf
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
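The over-fetch, filter, and cap step in searchSemantic could look roughly like this (Python, for illustration). The `dist` and `title` field names are assumptions; the caller is assumed to have requested 40 hits from the vector store.

```python
def filter_semantic(hits, max_distance=0.5, limit=20):
    """hits: up to 40 candidates, already sorted by ascending distance."""
    kept = [
        h for h in hits
        if h["dist"] <= max_distance and h["title"].strip()  # drop far and untitled hits
    ]
    return kept[:limit]
```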
- tpuf.zig: embedQuery() calls Voyage API with input_type="query" for asymmetric search
- search.zig: SearchMode enum, searchSemantic() dispatches to tpuf, keyword path untouched
- server.zig: parse mode query param, pass to search
- site: mode toggle (keyword/semantic/hybrid), hybrid shows keyword instantly + appends semantic
- docs: document mode parameter on /search endpoint
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
getVectorById was using the deprecated include_vectors parameter which
the v2 API rejects, causing /similar to always return empty.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
turbopuffer has a 64-byte ID limit but AT-URIs are 60-96 bytes.
use SHA256 truncated to 128 bits (32 hex chars) as tpuf document ID.
store full URI as metadata attribute for result serialization.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
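A sketch of the ID derivation described above, in Python for illustration. The function names and the `uri` attribute name are assumptions.

```python
import hashlib


def tpuf_id(at_uri: str) -> str:
    # SHA-256 hex digest truncated to 32 hex chars = 128 bits,
    # well under the 64-byte ID limit.
    return hashlib.sha256(at_uri.encode("utf-8")).hexdigest()[:32]


def to_row(at_uri: str, vector: list) -> dict:
    # The full URI travels as metadata so results can be serialized
    # without reversing the hash.
    return {"id": tpuf_id(at_uri), "vector": vector, "uri": at_uri}
```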
strftime in parameterized turso pipeline queries may not evaluate
as expected. generate the ISO timestamp in zig and pass as arg.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
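The idea of computing the timestamp in application code and binding it as a plain argument can be sketched like this (Python, for illustration). The exact timestamp format, the SQL statement, and the function names are assumptions; the commit only says an ISO timestamp is generated in Zig and passed as an arg.

```python
from datetime import datetime, timezone


def iso_now(now=None) -> str:
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def mark_embedded_stmt(uri: str, now=None):
    # The timestamp is bound as an ordinary parameter instead of calling
    # strftime() inside the SQL text.
    return ("UPDATE documents SET embedded_at = ? WHERE uri = ?", [iso_now(now), uri])
```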
queryBatch approach had lifetime/type issues with args.
individual exec() calls are simpler and each gets its own
db.query span for observability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
the loop-local &.{doc.uri} temporaries were dead by the time
queryBatch executed. allocate args array on heap so pointers
remain valid.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
individual exec() calls per doc were 20 sequential HTTP round-trips.
use queryBatch() to send all updates in one request.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- findSimilar now uses tpuf.getVectorById + tpuf.query (ANN, ~100ms vs 2-3s brute-force cosine)
- embedder writes only to turbopuffer, marks docs with embedded_at timestamp
- remove similarity_cache table, cosine queries, and all cache machinery
- add embedded_at column migration (replaces turso embedding column)
- preserve embedded_at in indexer ON CONFLICT clause
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
tpuf.zig: standalone client for turbopuffer REST API (upsert, query, delete).
stores full document metadata alongside vectors so query results can be
returned directly without a DB roundtrip.
embedder: after writing embeddings to turso, also upserts to turbopuffer
when TURBOPUFFER_API_KEY is set. uses zql.Query + fromRow pattern for
the expanded SELECT (metadata needed for tpuf sync).
to backfill: clear turso embeddings, let the embedder re-process everything.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
re-enables the pre-a6660bc setup: background embedder (1 worker, 20 batch,
512-dim voyage-3-lite) and brute-force cosine /similar with two-level cache
(local SQLite + turso). keyword search path is untouched.
cleared stale embeddings in turso before deploy so embedder starts fresh.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
root cause: otel-zig's BatchSpanProcessor held its mutex during HTTP
export (1-2s TLS round-trip), blocking any thread calling span.end().
fixed upstream in otel-zig, propagated through logfire-zig.
other fixes included in this commit:
- add publication_uri index (600ms -> <1ms for base_path joins)
- increase HTTP_BUF_SIZE 8KB -> 64KB (fewer flush syscalls)
- add mmap + cache_size pragmas on sqlite read connection
- add search.iterate.* spans for iteration visibility
- add queue_ms tracking + socket timeout 30s -> 5s
- document the full investigation in docs/performance-saga.md
- add zig fetch --save tip to CLAUDE.md
result: consistent 75-110ms with full logfire observability
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
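The general shape of the root-cause fix (do not hold the mutex across the slow export) can be sketched as below. This is a Python illustration of the pattern under stated assumptions, not otel-zig's actual code; the class and method names are hypothetical.

```python
import threading


class BatchExporter:
    def __init__(self, export):
        self._export = export          # slow call, e.g. an HTTPS POST
        self._lock = threading.Lock()
        self._pending = []

    def on_end(self, span) -> None:
        with self._lock:               # held only for a list append
            self._pending.append(span)

    def flush(self) -> None:
        with self._lock:               # swap the buffer out under the lock
            batch, self._pending = self._pending, []
        if batch:
            self._export(batch)        # lock already released: on_end() is not blocked
```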
server-side search is 0-2.5ms but TTFB is 1-2.5s for some requests.
the new http.request span wraps the full request lifecycle including
receiveHead(), and logs a warning when receive takes >100ms. this
will reveal whether the delay is in Fly proxy, connection queueing,
or request reading.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Search queries were blocked 2-4s waiting on the mutex while sync held
the write lock for batch inserts. WAL mode supports concurrent readers
with separate connections — search now uses a dedicated read-only
connection with no mutex, so it never blocks on writes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
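A small demonstration of the WAL behaviour this commit relies on, using Python's sqlite3 module for illustration (the project uses SQLite from Zig). The helper name and table are hypothetical.

```python
import sqlite3


def open_connections(path: str):
    writer = sqlite3.connect(path)
    writer.execute("PRAGMA journal_mode=WAL")
    # Separate read-only connection: in WAL mode it reads the last
    # committed snapshot and does not wait for the writer.
    reader = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    return writer, reader
```

With this split, a search query on `reader` proceeds even while `writer` has an open batch-insert transaction; it simply does not see the uncommitted rows yet.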
/similar now returns [] immediately. Removed findSimilar, all similarity
cache code, and similarity_cache sync. The brute-force vector_distance_cos
queries were taking 10-15s each and blocking keyword search (30-57s).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
embedder was holding the shared turso client mutex during Voyage API
calls (seconds), blocking all other turso requests including search
fallback and stats_buffer sync. this caused the entire server to hang.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
startup was always doing a full sync on every deploy (3+ min, blocks search).
now checks last_sync timestamp — if local db has data from a previous
run on the persistent volume, marks ready immediately and does a quick
incremental sync. full sync only on first-ever startup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
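The startup decision described above, reduced to a sketch in Python. The function name and return shape are hypothetical; `last_sync` is assumed to be None when the local db has never completed a sync.

```python
def startup_plan(last_sync):
    if last_sync is None:
        # First-ever startup: nothing local to serve from yet.
        return {"ready": False, "sync": "full"}
    # Data survived on the persistent volume: serve immediately, then catch up.
    return {"ready": True, "sync": "incremental", "since": last_sync}
```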
sync and search were sharing one turso client with a mutex, so sync
batch fetches blocked search queries. now sync has its own client —
search queries go through independently even during full sync.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
per-request http.Client caused all turso requests to hang (likely
TLS handshake overhead). restore shared client with mutex — the
sync not-ready fix means search won't hit turso during sync anyway.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
search was blocking on local db mutex during sync batch writes.
now local stays not-ready for the entire sync duration — search
goes straight to turso (which has no mutex). when sync finishes,
local becomes ready and search uses the fast local path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
logs whether search hits local sqlite or falls back to turso,
and why (local not ready, or local query failed with error name).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the shared mutex on Client.zig serialized ALL turso requests across
threads. during full sync, this blocked search queries from reaching
turso even when local sqlite wasn't ready yet.
now each request creates its own http client, so sync and search
can hit turso concurrently.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
full sync was holding the local db mutex for the entire 3+ minute
duration, blocking all search queries. now fetches each batch from
turso without the lock, then briefly locks to write to local sqlite.
search can use the (progressively filling) local db during sync
instead of being blocked for minutes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
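The narrowed lock scope can be sketched as follows (Python, for illustration). `fetch_batch` and `write_batch` are hypothetical callbacks standing in for the turso fetch and the local sqlite write.

```python
import threading


def sync_batches(fetch_batch, write_batch, lock: threading.Lock) -> int:
    """Copy all rows in batches; returns the number of rows written."""
    offset = 0
    while True:
        rows = fetch_batch(offset)  # slow network call, no lock held
        if not rows:
            break
        with lock:                  # held only for the local write
            write_batch(rows)
        offset += len(rows)
    return offset
```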
backend reverted to 6689407 state:
- server.zig: no mode param, just calls search.search()
- search.zig: remove searchSemantic, searchHybrid, embedQuery
- embedder.zig: back to voyage-3-lite, batch 20, 512 dims
frontend: keyword-only search, no semantic phase, no mode selector
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DiskANN index creation times out on turso for 5500+ vectors.
revert to working brute-force vector_distance_cos until resolved.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 concurrent workers writing embeddings block all other queries
(search, dashboard, sync) when there's a large backlog.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CREATE INDEX IF NOT EXISTS for vector indexes hangs on turso when the
index already exists, blocking app startup. the index is managed via
scripts/rebuild-documents-table instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This reverts commit 951e02fca7b4771efc9cb8f6713187d4502d5d5f.
This reverts commit 41cf1dc7d40bf21034353b8802fd98f81201f1ef.
brute-force vector_distance_cos() scans all ~5500 embeddings, taking
~20s on cold queries. switch to vector_top_k() with a DiskANN index
for O(log n) lookups instead of O(n) full scans.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
encodeURIComponent turns did:plc:xxx into did%3Aplc%3Axxx, which
bsky.app doesn't resolve. DIDs are URL-safe; they just need HTML escaping.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
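The difference between the two escapes, shown in Python for illustration (the actual fix is in the frontend JavaScript). `quote(..., safe="")` plays the role of encodeURIComponent here; the `/profile/` URL pattern and the function name are assumptions.

```python
import html
from urllib.parse import quote


def profile_href(did: str) -> str:
    # HTML-escape only: the colons in the DID stay intact.
    return "https://bsky.app/profile/" + html.escape(did, quote=True)


# What the old code effectively produced:
percent_encoded = quote("did:plc:xxx", safe="")
```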