# search syntax

a reference for the query syntax at [pub-search.waow.tech](https://pub-search.waow.tech).

## basics

terms are OR'd together: a query matches documents containing *any* of the words. the last word also gets prefix matching, which gives a type-ahead feel.

| you type | what runs | why |
|----------|-----------|-----|
| `cat dog` | `cat OR dog*` | matches docs with "cat" or "dog" (or "dogs", "dogma", etc.) |
| `crypto` | `crypto*` | prefix match: finds "crypto", "cryptocurrency", etc. |

## quoted phrases

wrap words in double quotes for exact phrase matching: FTS5 requires the words to appear adjacent and in order.

| you type | what runs |
|----------|-----------|
| `"machine learning"` | `"machine learning"` |
| `python "machine learning" tutorial` | `python OR "machine learning" OR tutorial*` |
| `"exact phrase" python` | `"exact phrase" OR python*` |

the last token only gets a prefix `*` if it's a bare word; phrases are never prefix-expanded.

an unclosed quote is closed automatically and treated as a phrase: `"hello world` → `"hello world"`.

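
the rewriting rules above (OR between terms, prefix `*` on a trailing bare word, phrases kept intact, unclosed quotes closed) can be sketched in a few lines of Python. this is a minimal illustration, not the site's actual code: the function name `build_match_query` and the regex are assumptions, and the sketch does not escape FTS5 operators such as `AND` or `NOT` if they appear as bare words.

```python
import re

# Hypothetical tokenizer for the search box (illustrative, not the site's code).
# A token is either a quoted phrase (closing quote optional) or a bare word.
_TOKEN = re.compile(r'"([^"]*)"?|([^\s"]+)')


def build_match_query(text: str) -> str:
    """Turn raw search-box text into an OR'd FTS5-style query string."""
    parts = []
    last_is_bare = False
    for match in _TOKEN.finditer(text):
        phrase, word = match.group(1), match.group(2)
        if phrase is not None:
            phrase = phrase.strip()
            if phrase:  # skip empty quotes like ""
                parts.append(f'"{phrase}"')
                last_is_bare = False
        else:
            parts.append(word)
            last_is_bare = True
    # Only a trailing bare word is prefix-expanded; phrases never are.
    if parts and last_is_bare:
        parts[-1] += "*"
    return " OR ".join(parts)


print(build_match_query('python "machine learning" tutorial'))
```

calling it with the inputs from the tables (`cat dog`, `"exact phrase" python`, `"hello world`) should reproduce the "what runs" column.
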
## filters

beyond the query text, you can filter results by:

- **author**: type `@handle` in the search box (e.g., `@zat.dev up`). to search for the handle as literal text instead, quote it: `"@zat.dev"`.
- **platform**: leaflet, pckt, offprint, greengale, whitewind, other
- **tag**: click any tag in the results to filter by it
- **date**: today, this week, this month, this year

filters combine with the search query: e.g., searching `@zat.dev up` returns only zat.dev's posts matching "up".

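
one way to picture the `@handle` behavior is as a pre-pass that pulls the handle out of the query text before the rest is searched. the sketch below is hypothetical: `split_author_filter` is an invented name, and it assumes that only the first unquoted `@handle` is treated as a filter, which the site may or may not do.

```python
import re

# Hypothetical splitter: quoted chunks stay whole, everything else splits on spaces.
_PART = re.compile(r'"[^"]*"?|[^\s"]+')


def split_author_filter(text: str):
    """Return (handle or None, remaining query text)."""
    author = None
    rest = []
    for part in _PART.findall(text):
        # Assumption: the first bare @handle becomes the author filter.
        # Quoted parts start with '"', so "@zat.dev" in quotes stays literal text.
        if author is None and part.startswith("@") and len(part) > 1:
            author = part[1:]
        else:
            rest.append(part)
    return author, " ".join(rest)


print(split_author_filter("@zat.dev up"))
print(split_author_filter('"@zat.dev"'))
```
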
## search modes

three modes are available via the toggle below the search box:

- **keyword** (default): SQLite FTS5 full-text search with BM25 ranking + recency boost. fastest (~9ms).
- **semantic**: vector similarity via Voyage AI embeddings + turbopuffer. finds conceptually similar content even without shared words (~345ms).
- **hybrid**: runs both keyword and semantic in parallel, merges via reciprocal rank fusion. best quality, slightly slower than semantic alone (~360ms).

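
for readers unfamiliar with reciprocal rank fusion: each document scores `1 / (k + position)` in every list it appears in, and the scores are summed. a minimal sketch follows. the function name is illustrative, and `k=60` is the constant commonly used in the literature, not a value confirmed for this site.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc ids into one list, best first.

    k=60 is a conventional default and an assumption here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + position)
    # Highest fused score first; ties keep first-seen order (stable sort).
    return sorted(scores, key=scores.get, reverse=True)


# Made-up result lists, purely for illustration.
keyword_hits = ["a", "b", "c"]
semantic_hits = ["c", "a", "d"]
print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

documents found by both modes ("a" and "c" here) rise above documents found by only one, which is why hybrid tends to give the best quality.
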
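## ranking

keyword results are ranked by `BM25 + recency`:
- BM25 scores each document on term frequency, how rare the term is across the corpus, and document length (standard IR ranking)
- recency adds a small boost for newer documents: `rank + (days_old / 30)`. assuming `rank` is FTS5's built-in rank, where lower is better, adding `days_old / 30` pushes older documents down
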
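
the recency formula is easiest to see with numbers. the sketch below assumes scores are lower-is-better (as with FTS5's `bm25()`) and that results are sorted ascending; the function name, document ids, scores, and dates are all made up for illustration.

```python
from datetime import date


def keyword_score(bm25_rank: float, published: date, today: date) -> float:
    """rank + (days_old / 30); lower is better (assumed, as in FTS5)."""
    days_old = (today - published).days
    return bm25_rank + days_old / 30


today = date(2024, 6, 30)
# (id, hypothetical BM25 rank, publish date)
docs = [
    ("old-strong", -5.0, date(2024, 1, 1)),   # better text match, 181 days old
    ("new-weaker", -4.0, date(2024, 6, 1)),   # weaker text match, 29 days old
]
ranked = sorted(docs, key=lambda d: keyword_score(d[1], d[2], today))
print([doc_id for doc_id, _, _ in ranked])
```

each 30 days of age costs one point of rank, so in this example the newer document overtakes the older one despite a weaker text match.
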
## tokenization

the FTS5 unicode61 tokenizer treats any non-alphanumeric character as a separator. this means:
- `crypto-casino` → matches "crypto" and "casino" separately
- `don't` → matches "don" and "t"
- `foo.bar` → matches "foo" and "bar"
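
to predict how a query will be split, a rough Python approximation of that rule is enough. this is not the real tokenizer: `approx_unicode61_tokens` is a hypothetical helper that only lowercases and splits on non-alphanumeric characters, and it ignores other details of unicode61 such as diacritic folding.

```python
import re


def approx_unicode61_tokens(text: str) -> list[str]:
    """Approximate unicode61: lowercase, keep runs of letters and digits."""
    # [^\W_] matches word characters except underscore, i.e. letters and digits.
    return re.findall(r"[^\W_]+", text.lower())


for sample in ["crypto-casino", "don't", "foo.bar"]:
    print(sample, "->", approx_unicode61_tokens(sample))
```
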