fix benchmark harness to capture server errors without blocking
- send server output to DEVNULL during normal benchmark runs so an unread pipe buffer cannot fill up and deadlock the server
- on startup failure, restart the server with PIPE to capture its error output
- display the actual error message instead of the generic "failed to start server"
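
For reviewers, a minimal sketch of the pattern described above, assuming the harness launches the server through Python's `subprocess` module. The function name `start_server` and the `startup_wait` and `retry_timeout` parameters are hypothetical and not taken from the harness itself.

```python
import subprocess


def start_server(cmd, startup_wait=0.5, retry_timeout=5.0):
    """Return (process, None) if the server stays up, else (None, error_text).

    Hypothetical helper: names and timeouts are illustrative assumptions.
    """
    # Normal path: discard output so no pipe buffer can fill and block the server.
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        # If the process exits within the startup window, treat it as a failure.
        proc.wait(timeout=startup_wait)
    except subprocess.TimeoutExpired:
        return proc, None  # still running, so startup is assumed to have succeeded

    # Failure path: rerun once with pipes, only to read the error output.
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
            timeout=retry_timeout,
        )
        message = result.stderr.strip() or result.stdout.strip()
    except subprocess.TimeoutExpired:
        # The retry did not fail the same way; run() has already killed it.
        message = ""
    return None, message or f"server exited with code {proc.returncode}"
```

The caller can then report the returned error text when the first element is `None`. Rerunning the command is only safe when a failed start has no side effects, which is an assumption of this sketch.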
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>