semantic bufo search find-bufo.com

add academic documentation and update landing page

- add comprehensive module documentation to search.rs explaining the
  multimodal early fusion approach with references to voyage AI research
- update landing page to link "semantic search" directly to our search.rs
  implementation on tangled
- remove outdated "multimodal hybrid search" link to turbopuffer docs
  (no longer accurate since we removed BM25 hybrid search)
- simplify landing page header by removing the secondary info line

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

+31 -19
+23
src/search.rs
···
+ //! multimodal semantic search using early fusion embeddings
+ //!
+ //! this implementation uses voyage AI's multimodal-3 model which employs a
+ //! unified transformer encoder for early fusion of text and image modalities.
+ //!
+ //! ## approach
+ //!
+ //! - filename text (e.g., "bufo-jumping-on-bed" → "bufo jumping on bed") is combined
+ //! with image content in a single embedding request
+ //! - the unified encoder processes both modalities together, creating a single 1024-dim
+ //! vector that captures semantic meaning from both text and visual features
+ //! - vector search against turbopuffer using cosine distance similarity
+ //!
+ //! ## research backing
+ //!
+ //! voyage AI's multimodal-3 demonstrates 41.44% improvement on table/figure retrieval
+ //! tasks when combining text + images vs images alone, validating the early fusion approach.
+ //!
+ //! references:
+ //! - voyage multimodal embeddings: https://docs.voyageai.com/docs/multimodal-embeddings
+ //! - early fusion methodology: text and images are combined in the embedding generation
+ //! phase rather than fusing separate embeddings (late fusion)
+
  use crate::config::Config;
  use crate::embedding::EmbeddingClient;
  use crate::turbopuffer::{QueryRequest, TurbopufferClient};
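the module docs above describe two small mechanical steps: turning a filename into plain text before embedding, and ranking by cosine distance. a minimal sketch of both follows. `filename_to_text` and `cosine_distance` are hypothetical helpers written for illustration, not functions taken from search.rs, and in the real service the distance is presumably computed by turbopuffer rather than locally.

```rust
/// Hypothetical helper: turn an image filename into plain text,
/// e.g. "bufo-jumping-on-bed.png" -> "bufo jumping on bed".
/// The actual search.rs may normalize filenames differently.
fn filename_to_text(filename: &str) -> String {
    // Drop the extension, if there is one (keep names like ".hidden" whole).
    let stem = match filename.rsplit_once('.') {
        Some((stem, _ext)) if !stem.is_empty() => stem,
        _ => filename,
    };
    // Treat '-' and '_' as word separators and skip empty pieces
    // so repeated separators do not produce double spaces.
    stem.split(|c: char| c == '-' || c == '_')
        .filter(|part| !part.is_empty())
        .collect::<Vec<_>>()
        .join(" ")
}

/// Hypothetical helper: cosine distance = 1 - cosine similarity.
/// Returns None for empty inputs, mismatched lengths, or zero-norm vectors.
fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a * norm_b))
}
```

a smaller distance means a closer match: identical directions give a distance near 0, orthogonal vectors give 1. the embedding step itself (text plus image in a single request) is left out here because it depends on the voyage API client.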
+8 -19
static/index.html
···
  margin-bottom: 15px;
  }

- .info {
-   color: rgba(255, 255, 255, 0.85);
-   font-size: 0.9em;
-   text-align: center;
-   margin-top: 10px;
+ .subtitle a {
+   color: inherit;
+   text-decoration: underline;
+   text-decoration-color: rgba(255, 255, 255, 0.4);
+   transition: text-decoration-color 0.2s;
  }

- .info a {
-   color: rgba(255, 255, 255, 0.95);
-   text-decoration: none;
-   border-bottom: 1px solid rgba(255, 255, 255, 0.4);
-   transition: border-color 0.2s;
- }
-
- .info a:hover {
-   border-bottom-color: rgba(255, 255, 255, 0.9);
+ .subtitle a:hover {
+   text-decoration-color: rgba(255, 255, 255, 0.9);
  }

  .search-box {
···
  <div class="container">
  <div class="header">
  <h1>find bufo</h1>
- <p class="subtitle">semantic search for <a href="https://bufo.zone" target="_blank" style="color: inherit; text-decoration: underline;">bufo.zone</a></p>
- <p class="info">
- <a href="https://turbopuffer.com/docs/hybrid-search" target="_blank">multimodal hybrid search</a> ·
- <a href="https://git.tangled.sh/zzstoatzz.io/find-bufo" target="_blank">source</a>
- </p>
+ <p class="subtitle"><a href="https://git.tangled.sh/zzstoatzz.io/find-bufo/src/branch/main/src/search.rs" target="_blank">semantic search</a> for <a href="https://bufo.zone" target="_blank">bufo.zone</a></p>
  </div>

  <div class="search-box">