fix(mcp): handle site.standard.document content structure · zzstoatzz.io/leaflet-search@8ab1789

search for standard sites pub-search.waow.tech

fix(mcp): handle site.standard.document content structure

get_document was only looking for pages[] at the top level, which works
for pub.leaflet.document records. site.standard.document records nest
pages under content.pages[] instead.

now checks both locations to extract plaintext content.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

zzstoatzz.io 1 month ago 8ab17899 64a6377a

+12 -2

1 changed file

mcp/src/pub_search/server.py

@@ -154,9 +154,19 @@
     value = dict(value)
 
     # extract content from leaflet's block structure
-    # pages[].blocks[].block.plaintext
+    # pub.leaflet.document: pages[].blocks[].block.plaintext
+    # site.standard.document: content.pages[].blocks[].block.plaintext
     content_parts = []
-    for page in value.get("pages", []):
+
+    # handle both formats: top-level pages (pub.leaflet.document)
+    # or nested under content (site.standard.document)
+    pages = value.get("pages", [])
+    if not pages:
+        content_obj = value.get("content", {})
+        if isinstance(content_obj, dict):
+            pages = content_obj.get("pages", [])
+
+    for page in pages:
         for block_wrapper in page.get("blocks", []):
             block = block_wrapper.get("block", {})
             plaintext = block.get("plaintext", "")
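
A minimal standalone sketch of the lookup this change introduces, for trying both record shapes outside the server. The helper name `extract_plaintext` and the two sample records are hypothetical. The diff does not show what happens to each `plaintext` value, so skipping empty strings and joining with newlines is an assumption here, not necessarily what `get_document` does.

```python
from typing import Any


def extract_plaintext(value: dict[str, Any]) -> str:
    """Collect block plaintext from either record shape (hypothetical helper)."""
    # pub.leaflet.document keeps pages at the top level
    pages = value.get("pages", [])
    if not pages:
        # site.standard.document nests pages under content;
        # guard against content being something other than a dict
        content_obj = value.get("content", {})
        if isinstance(content_obj, dict):
            pages = content_obj.get("pages", [])

    content_parts = []
    for page in pages:
        for block_wrapper in page.get("blocks", []):
            block = block_wrapper.get("block", {})
            plaintext = block.get("plaintext", "")
            # assumption: empty blocks are skipped and parts are newline-joined
            if plaintext:
                content_parts.append(plaintext)
    return "\n".join(content_parts)


# hypothetical sample records, reduced to the fields the lookup reads
leaflet_record = {
    "pages": [{"blocks": [{"block": {"plaintext": "hello from leaflet"}}]}]
}
standard_record = {
    "content": {
        "pages": [{"blocks": [{"block": {"plaintext": "hello from standard"}}]}]
    }
}

print(extract_plaintext(leaflet_record))
print(extract_plaintext(standard_record))
```

Because the nested lookup only runs when the top-level `pages` is missing or empty, existing pub.leaflet.document records take the same path as before the change.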