search syntax#
a reference for the query syntax at pub-search.waow.tech.
basics#
terms are OR'd together — a query matches documents containing any of the words. the last word gets prefix matching for a type-ahead feel.
| you type | what runs | why |
|---|---|---|
| `cat dog` | `cat OR dog*` | matches docs with "cat" or "dog" (or "dogs", "dogma", etc.) |
| `crypto` | `crypto*` | prefix match: finds "crypto", "cryptocurrency", etc. |
quoted phrases#
wrap words in double quotes for exact phrase matching — FTS5 requires the words to appear adjacent and in order.
| you type | what runs |
|---|---|
| `"machine learning"` | `"machine learning"` |
| `python "machine learning" tutorial` | `python OR "machine learning" OR tutorial*` |
| `"exact phrase" python` | `"exact phrase" OR python*` |
the last token only gets a prefix * if it's a bare word — phrases are never prefix-expanded.
unclosed quotes are treated as phrases: `"hello world` → `"hello world"`.
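the rewriting rules above (OR between terms, quoted phrases kept intact, unclosed quotes closed, prefix `*` only on a trailing bare word) can be sketched in a few lines of python. this is a minimal illustration, not the site's actual implementation: the function name `build_fts_query` is hypothetical, and the sketch does not escape FTS5 operators or special characters inside bare words.

```python
import re

# a token is either a quoted phrase (closing quote optional) or a bare word
_TOKEN = re.compile(r'"([^"]*)"?|([^\s"]+)')


def build_fts_query(text: str) -> str:
    """Rewrite raw search-box text into an OR'd FTS5-style query (hypothetical helper)."""
    parts = []  # list of (rendered_token, is_phrase)
    for match in _TOKEN.finditer(text):
        phrase, word = match.group(1), match.group(2)
        if phrase is not None:
            phrase = phrase.strip()
            if phrase:  # skip empty quotes like ""
                parts.append((f'"{phrase}"', True))
        else:
            parts.append((word, False))

    if not parts:
        return ""

    rendered = [token for token, _ in parts]
    last_token, last_is_phrase = parts[-1]
    # only a trailing bare word gets prefix matching; phrases never do
    if not last_is_phrase:
        rendered[-1] = last_token + "*"
    return " OR ".join(rendered)


if __name__ == "__main__":
    for query in ['cat dog', 'python "machine learning" tutorial', '"hello world']:
        print(query, "->", build_fts_query(query))
```

the regex handles the unclosed-quote case by making the closing quote optional, so everything after a lone `"` becomes one phrase.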
filters#
beyond the query text, you can filter results by:
- author: type `@handle` in the search box (e.g., `@zat.dev up`). quote to search literally: `"@zat.dev"`.
- platform: leaflet, pckt, offprint, greengale, whitewind, other
- tag: click any tag in the results to filter by it
- date: today, this week, this month, this year
filters combine with the search query — e.g., searching @zat.dev up returns only zat.dev's posts matching "up".
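one way to picture the `@handle` filter: pull the first unquoted `@` token out of the input and pass the rest on as the text query. the sketch below assumes exactly that behavior; the helper name `split_author_filter` is hypothetical, and how the real search box treats multiple handles is not specified here.

```python
import re
from typing import Optional, Tuple

# quoted phrases stay whole, so "@zat.dev" in quotes is never read as a filter
_TOKEN = re.compile(r'"[^"]*"?|[^\s"]+')


def split_author_filter(text: str) -> Tuple[Optional[str], str]:
    """Return (author_handle_or_None, remaining_query_text). Hypothetical helper."""
    author = None
    remaining = []
    for token in _TOKEN.findall(text):
        # assumption: only the first bare @handle acts as the author filter
        if author is None and token.startswith("@") and len(token) > 1:
            author = token[1:]
        else:
            remaining.append(token)
    return author, " ".join(remaining)


if __name__ == "__main__":
    print(split_author_filter("@zat.dev up"))
    print(split_author_filter('"@zat.dev"'))
```

the first call yields the handle `zat.dev` with `up` left as the query; the second leaves the quoted text untouched so it is searched literally.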
search modes#
three modes are available via the toggle below the search box:
- keyword (default): SQLite FTS5 full-text search with BM25 ranking + recency boost. fastest (~9ms).
- semantic: vector similarity via Voyage AI embeddings + turbopuffer. finds conceptually similar content even without shared words (~345ms).
- hybrid: runs both keyword and semantic in parallel, merges via reciprocal rank fusion. best quality, slightly slower (~360ms).
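for readers unfamiliar with reciprocal rank fusion, here is a minimal sketch of the general technique: each document scores `1 / (k + rank)` in every result list it appears in, and the scores are summed. the constant `k = 60` is a commonly used default and an assumption here, as is the function name; the value and tie-breaking the site actually uses are not documented in this reference.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document ids into one list (illustrative sketch)."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        # ranks are 1-based: the top hit contributes 1 / (k + 1)
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # highest fused score first; ties broken by id for a stable order
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]
    semantic_hits = ["c", "a", "d"]
    print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

documents that appear in both lists accumulate two contributions, which is why hybrid results tend to favor items that both modes agree on.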
ranking#
keyword results are ranked by BM25 + recency:
- BM25 scores term frequency and document length (standard IR ranking)
- recency adds a small boost for newer documents: `rank + (days_old / 30)`. FTS5's BM25 rank is lower-is-better, so the added term pushes older documents down.
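a small worked sketch of the recency adjustment, assuming `rank` is the FTS5 BM25 value (where lower, more negative numbers are better matches) and `days_old` is the document's age in fractional days. the function name and the sample values are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def adjusted_rank(bm25_rank: float, published: datetime, now: datetime) -> float:
    """Apply rank + (days_old / 30); lower results sort first (illustrative sketch)."""
    days_old = (now - published).total_seconds() / 86400
    return bm25_rank + days_old / 30


if __name__ == "__main__":
    now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    docs = [
        # (label, bm25 rank, publication time): hypothetical sample data
        ("older, stronger match", -3.0, now - timedelta(days=60)),
        ("newer, weaker match", -2.5, now - timedelta(days=3)),
    ]
    ordered = sorted(docs, key=lambda d: adjusted_rank(d[1], d[2], now))
    for label, rank, published in ordered:
        print(label, round(adjusted_rank(rank, published, now), 2))
```

with these sample values, sixty days of age costs the older document two full points, enough for the slightly weaker but fresher match to sort ahead of it.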
tokenization#
the FTS5 unicode61 tokenizer treats any non-alphanumeric character as a separator. this means:
- `crypto-casino` → matches "crypto" and "casino" separately
- `don't` → matches "don" and "t"
- `foo.bar` → matches "foo" and "bar"
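to get a feel for how input is split, the behavior described above can be approximated with a regex that keeps runs of letters and digits and lowercases them. this is only an approximation written for illustration: the real unicode61 tokenizer uses its own Unicode category tables and also handles diacritics, which this sketch does not attempt to replicate.

```python
import re
from typing import List

# runs of letters/digits; underscore and all punctuation act as separators
_ALNUM_RUN = re.compile(r"[^\W_]+")


def approx_unicode61_tokens(text: str) -> List[str]:
    """Rough stand-in for unicode61 splitting (hypothetical helper, not the real tokenizer)."""
    return [token.lower() for token in _ALNUM_RUN.findall(text)]


if __name__ == "__main__":
    for sample in ["crypto-casino", "don't", "foo.bar"]:
        print(sample, "->", approx_unicode61_tokens(sample))
```

a practical consequence: searching for a hyphenated or dotted term behaves like searching for its pieces, so wrap it in quotes when the pieces must appear adjacent and in order.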