Fix attribute/element name sanitization for stable roundtrips

- Validate attribute names during serialization and skip attributes
  whose names contain control characters, whitespace, quotes, angle
  brackets, slashes, or equals signs, instead of outputting malformed
  HTML
- Sanitize element names strictly so they reparse consistently: keep
  only ASCII bytes in 0x21-0x7E, excluding special HTML characters,
  and default to "span" if nothing remains
- Add (allow_empty) to the html5rw-js package in dune-project, since
  lib/js was removed
- Add test_crash.ml for analyzing fuzz crash files with roundtrip
  debug output
- Add test_pre.ml for testing pre/textarea newline handling
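
For reference, the two name rules above can be sketched roughly as
follows. This is a minimal illustration, not the code in this commit:
the helper names (is_special, is_valid_attr_name,
sanitize_element_name) are hypothetical, the set of "special HTML
chars" is assumed to be the same one excluded from attribute names,
and rejecting an empty attribute name is also an assumption.

```ocaml
(* Hypothetical helpers illustrating the rules in this commit message. *)

(* Assumed set of "special HTML chars": quotes, angle brackets,
   slash and equals. The real implementation may exclude more. *)
let is_special c =
  c = '"' || c = '\'' || c = '<' || c = '>' || c = '/' || c = '='

(* True if [p] holds for at least one byte of [s]. *)
let string_exists p s =
  let rec go i = i < String.length s && (p s.[i] || go (i + 1)) in
  go 0

(* An attribute name is rejected if it contains a control character,
   whitespace, or a special char. Rejecting the empty name is an
   assumption made for this sketch. *)
let is_valid_attr_name name =
  let bad c =
    let code = Char.code c in
    code <= 0x20 || code = 0x7F || is_special c
  in
  name <> "" && not (string_exists bad name)

(* Keep only bytes in 0x21-0x7E that are not special; fall back to
   "span" when nothing is left. *)
let sanitize_element_name name =
  let keep c =
    let code = Char.code c in
    code >= 0x21 && code <= 0x7E && not (is_special c)
  in
  let buf = Buffer.create (String.length name) in
  String.iter (fun c -> if keep c then Buffer.add_char buf c) name;
  if Buffer.length buf = 0 then "span" else Buffer.contents buf

(* Usage: drop attributes with invalid names before serializing. *)
let serializable_attrs attrs =
  List.filter (fun (name, _value) -> is_valid_attr_name name) attrs
```

With these definitions, serializable_attrs [("class", "a"); ("a=b", "x")]
keeps only the "class" pair, and sanitize_element_name "<>" yields
"span".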

Fixes roundtrip instability found by AFL fuzzing. After these fixes,
86 of the 104 crash corpus files pass the roundtrip tests. The
remaining 18 are edge cases involving complex svg+table interactions
in severely malformed input, where HTML5 error recovery produces
non-deterministic DOM structures.
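
One common way to phrase a roundtrip stability check is sketched below.
It is an assumption about what "pass roundtrip tests" means here, not a
copy of the test code: parse and serialize are passed in as parameters
so the sketch does not depend on the actual html5rw API, and the name
roundtrip_stable is hypothetical.

```ocaml
(* Hypothetical check: serializing, reparsing and serializing again
   must give the same output as the first serialization. *)
let roundtrip_stable ~parse ~serialize (input : string) : bool =
  let once = serialize (parse input) in
  let twice = serialize (parse once) in
  String.equal once twice
```

A crash corpus file would count as passing when roundtrip_stable
returns true for its contents.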

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>