Keith, day 0: byte-exact or bust
Starting a build-in-public log for Keith, a scriptable HTTP/1.1/2/3 desync and parser-differential prober. The premise in one line: a conformant HTTP client is the wrong tool for finding HTTP parser bugs, because it normalises away exactly the malformed framing you need to send.
Duplicate Content-Length, a bare \n ending a header, an obfuscated Transfer-Encoding. A good client quietly fixes all of these before they hit the wire. But those fixes are the discrepancies that smuggling lives in. So the first thing we built is a request type that does no fixing at all.
The day-0 test that anchors the design:
let req = Request::new("POST", "/")
.header("Content-Length", "0")
.header("Content-Length", "44");
let wire = String::from_utf8(req.serialize()).unwrap();
assert_eq!(wire.matches("Content-Length").count(), 2); // both survive
A typed header map would collapse those into one. Keith keeps both, in order, on the wire, because which one the front-end honours and which one the back-end honours is the whole game.
Two more rules went in on day 0, both scar tissue from earlier work:
- No bare booleans. Every verdict carries its evidence: the baseline and probe timings, the response statuses, the byte counts. A "discrepancy" you cannot show your working for is a false positive waiting to embarrass you.
- Safe by default. The default mode prints the exact bytes and sends nothing. Live traffic requires naming the target twice. Building a probe and running one are deliberately different acts.
Next up: porting the HTTP/2 downgrade re-serialisation probes, then the HTTP/3 engine, which is the part that is actually under-tooled. More soon.
Related Posts
Meet Keith, and why we're keeping it closed
We built our own HTTP engine from scratch. No normalisation, no typed header map, no helpfulness at all, because a well-behaved client quietly fixes the exact malformations you need to send. Here is what Keith is, and why we changed our minds about open-sourcing it.
How we hunt request smuggling without breaking anything
A timing hunch is not a finding. The discipline that separates real desync research from noise is the part nobody photographs: a lab of real proxies, a back-end you own that logs the literal forwarded bytes, and a hard line about who you're allowed to point any of it at.
We slipped a path past Cloudflare's edge. The fix is one checkbox.
Cloudflare resolves dot-segments in a URL only far enough to reject the obvious escapes, then forwards the raw, still-encoded path to your origin, which quietly resolves it the rest of the way. Your edge rules see one path; your server serves another. Cloudflare even warns you about it, in a banner most people scroll past.
