
Building the Jig: Why the Hard Part of Inline Defence Isn't the Code

NullRabbit Research · 4 min read

One of the hardest parts of building NullRabbit hasn't been writing kernel code. It's been building the tooling.

The XDP logic itself came together reasonably quickly. What followed was weeks of building infrastructure whose sole purpose was to answer a single question: does this actually work on the real internet?

Not in a lab. Not on localhost. Not against replayed pcaps.

In carpentry, you don't build a table by eyeballing the cuts. You build jigs. Security tooling is the same -- except the jig is often harder than the thing you're trying to validate.

What I thought I was building: an inline kernel defence layer. That's it.

What I actually built: a distributed environment capable of generating, routing, shaping, and observing hostile traffic under real-world constraints -- plus the defence layer.

Talk about testing frameworks.

Here's a sample attack:

[Video: basic attack demonstration]

And this is the structure of the attack:

[Diagram: the jig -- the defence code lives in one box; everything else exists to prove it works.]
Component                           Time
Inline XDP logic                    Days
Infrastructure to prove it works    Weeks
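
To make "inline XDP logic" concrete, here's a minimal sketch of the shape such a fast path takes: an XDP program that parses just enough of each packet to make a pass/drop decision in the driver path. The policy shown -- dropping one documentation-range source address -- is purely illustrative, not NullRabbit's actual ruleset.

    /* Minimal XDP pass/drop sketch. Compile with clang -O2 -target bpf
     * and attach with ip link or a libbpf loader. Toy policy only. */
    #include <linux/bpf.h>
    #include <linux/if_ether.h>
    #include <linux/ip.h>
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_endian.h>

    SEC("xdp")
    int inline_filter(struct xdp_md *ctx)
    {
        void *data     = (void *)(long)ctx->data;
        void *data_end = (void *)(long)ctx->data_end;

        /* Every access must be bounds-checked, or the verifier
         * refuses to load the program. */
        struct ethhdr *eth = data;
        if ((void *)(eth + 1) > data_end)
            return XDP_PASS;
        if (eth->h_proto != bpf_htons(ETH_P_IP))
            return XDP_PASS;            /* only inspect IPv4 here */

        struct iphdr *ip = (void *)(eth + 1);
        if ((void *)(ip + 1) > data_end)
            return XDP_PASS;

        /* Stand-in for the real ruleset: drop one hypothetical
         * attacker address (192.0.2.1, a documentation IP). */
        if (ip->saddr == bpf_htonl(0xC0000201))
            return XDP_DROP;

        return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";

The decision logic really is only a few dozen lines like these. Everything that follows in this post exists to prove that lines like these hold up under real traffic.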

That 80/20 inversion matters more than most people realise.

What You Actually Need

You cannot validate an inline network defence by running it on your laptop, sending packets from the same host, or replaying captured traffic. Those approaches tell you whether code executes -- not whether it holds up when exposed.

Real validation requires:

  • Internet-facing hosts you don't fully control
  • Multiple traffic sources with asymmetric routing
  • Kernel behaviour that actively fights you
  • Failure modes you didn't anticipate
  • Tooling that lets you reproduce all of it without guessing what changed

Everything That Went Wrong

Almost nothing fails cleanly. Most failures don't look like crashes -- they look like things quietly not doing what you thought.

AI assistants actively block you. Anything involving traffic generation or kernel hooks triggers refusal modes. Budget time for negotiating with guardrails that don't understand defensive research.

The kernel is not neutral. Modern kernels mitigate the behaviours you're trying to observe. They "helpfully" rewrite or drop traffic. The most dangerous failure mode is believing your defence works because the kernel never allowed the failure to manifest.
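
One concrete instance of this class of behaviour (my choice of example, not a mitigation the project necessarily fought, but a representative one): strict reverse-path filtering silently drops packets that arrive on an interface the kernel wouldn't use to route replies back -- exactly the asymmetric traffic a distributed jig generates. A quick check:

    /* Check Linux reverse-path filtering. With rp_filter=1 (strict),
     * asymmetrically routed packets are dropped before any tooling
     * sees them: no error, no counter you'd think to look at. */
    #include <stdio.h>

    static long read_sysctl(const char *path)
    {
        FILE *f = fopen(path, "r");
        long v = -1;
        if (f) {
            if (fscanf(f, "%ld", &v) != 1)
                v = -1;
            fclose(f);
        }
        return v;
    }

    int main(void)
    {
        long all = read_sysctl("/proc/sys/net/ipv4/conf/all/rp_filter");
        long def = read_sysctl("/proc/sys/net/ipv4/conf/default/rp_filter");
        printf("rp_filter: all=%ld default=%ld\n", all, def);
        if (all == 1 || def == 1)
            fprintf(stderr, "strict reverse-path filtering is on: "
                            "asymmetric test traffic will vanish silently\n");
        return 0;
    }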

Cloud providers assume you're malicious. Testing defensive systems properly looks indistinguishable from abuse. Accounts get flagged. Instances get terminated. Two full account shutdowns later, the lesson stuck: design testing infrastructure assuming it will be interrupted.

Tuning is a chain reaction. Buffers affect timing. Timing affects ordering. Ordering affects detection. Fix one thing, three others shift. Reproducibility matters more than optimisation early on.
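
One way to keep that chain reaction tractable -- a minimal sketch, with an illustrative rather than exhaustive list of tunables -- is to snapshot the knobs that shaped each run into a manifest, so "what changed between run 41 and run 42" is a diff rather than a guess:

    /* Dump selected tunables as key=value lines; redirect the output
     * into each run's artefact directory to get a per-run manifest. */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char *paths[] = {
            "/proc/sys/net/core/rmem_max",
            "/proc/sys/net/core/wmem_max",
            "/proc/sys/net/core/netdev_max_backlog",
            "/proc/sys/net/ipv4/tcp_rmem",
            NULL,
        };
        char buf[256];
        for (int i = 0; paths[i]; i++) {
            FILE *f = fopen(paths[i], "r");
            if (!f) {
                printf("%s=<unreadable>\n", paths[i]);
                continue;
            }
            if (fgets(buf, sizeof(buf), f)) {
                buf[strcspn(buf, "\n")] = '\0'; /* one line per tunable */
                printf("%s=%s\n", paths[i], buf);
            }
            fclose(f);
        }
        return 0;
    }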

Failure signals are subtle. Some of the most valuable failures looked like this: target is fine, latency is normal -- the attack hit the wrong IP. Nothing crashed. Nothing alerted. These generate false confidence.
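
A cheap guard against that failure -- sketched here with libpcap, with a hypothetical capture file and target address -- is to assert after each run that traffic actually reached the host you meant to test:

    /* Post-run assertion: count packets in the target-side capture
     * that were addressed to the intended host. Zero hits with
     * healthy latency is the silent miss described above.
     * Build with -lpcap. */
    #include <stdio.h>
    #include <pcap/pcap.h>

    static void count_cb(u_char *user, const struct pcap_pkthdr *h,
                         const u_char *bytes)
    {
        (void)h; (void)bytes;
        (*(long *)user)++;
    }

    int main(void)
    {
        char errbuf[PCAP_ERRBUF_SIZE];
        pcap_t *p = pcap_open_offline("run42_target.pcap", errbuf);
        if (!p) { fprintf(stderr, "%s\n", errbuf); return 1; }

        struct bpf_program fp;
        if (pcap_compile(p, &fp, "dst host 192.0.2.10", 1,
                         PCAP_NETMASK_UNKNOWN) != 0 ||
            pcap_setfilter(p, &fp) != 0) {
            fprintf(stderr, "%s\n", pcap_geterr(p));
            return 1;
        }

        long hits = 0;
        pcap_loop(p, -1, count_cb, (u_char *)&hits);
        printf("packets that reached the intended target: %ld\n", hits);

        pcap_freecode(&fp);
        pcap_close(p);
        return hits > 0 ? 0 : 1; /* fail the run if the attack missed */
    }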

Bugs you cannot tolerate. Silent misconfiguration. Partial activation. "Looks enabled" states. Finding and eliminating those took longer than writing the fast path.
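
The antidote to "looks enabled" is to ask the kernel, not your deploy logs. A minimal sketch, assuming libbpf 0.8 or newer (which provides bpf_xdp_query_id) and an interface name on the command line:

    /* Verify an XDP program is really attached to an interface.
     * Build with -lbpf. In practice you'd also compare the reported
     * id against the id your loader recorded. */
    #include <stdio.h>
    #include <net/if.h>
    #include <linux/types.h>
    #include <bpf/libbpf.h>

    int main(int argc, char **argv)
    {
        const char *ifname = argc > 1 ? argv[1] : "eth0";
        unsigned int ifindex = if_nametoindex(ifname);
        __u32 prog_id = 0;

        if (!ifindex) { perror("if_nametoindex"); return 1; }

        int err = bpf_xdp_query_id(ifindex, 0, &prog_id);
        if (err) {
            fprintf(stderr, "bpf_xdp_query_id failed: %d\n", err);
            return 1;
        }
        if (prog_id == 0) {
            /* The state that bites: everything "looks enabled",
             * but nothing is actually hooked. */
            fprintf(stderr, "no XDP program attached to %s\n", ifname);
            return 1;
        }
        printf("%s: XDP program id %u attached\n", ifname, prog_id);
        return 0;
    }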

Capture, Replay, Responsible Testing

None of this relies on uncontrolled live attacks.

The pattern -- used by teams like Cloudflare -- separates observation from reproduction: observe hostile behaviour at the edge, record it as inert artefacts, replay in isolated environments, study system behaviour without re-executing attack logic.

That boundary is intentional.
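
The observation half of that split can be as plain as writing edge traffic into an inert pcap artefact -- a minimal libpcap sketch, with an illustrative interface, packet count, and file name -- which a tool like tcpreplay can later feed into an isolated environment without any attack logic being re-executed:

    /* Record traffic into a pcap artefact for later isolated replay.
     * Build with -lpcap; requires capture privileges. */
    #include <stdio.h>
    #include <pcap/pcap.h>

    int main(void)
    {
        char errbuf[PCAP_ERRBUF_SIZE];
        pcap_t *p = pcap_open_live("eth0", 65535, 1, 1000, errbuf);
        if (!p) { fprintf(stderr, "%s\n", errbuf); return 1; }

        pcap_dumper_t *d = pcap_dump_open(p, "edge_observation.pcap");
        if (!d) { fprintf(stderr, "%s\n", pcap_geterr(p)); return 1; }

        /* Capture 10,000 packets as an inert artefact, then stop.
         * pcap_dump itself matches the pcap_handler signature. */
        pcap_loop(p, 10000, pcap_dump, (u_char *)d);

        pcap_dump_close(d);
        pcap_close(p);
        return 0;
    }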

The Point

The sophistication of a security system is capped by the sophistication of its testing infrastructure.

If your jig is naive, your conclusions are naive -- no matter how advanced the code looks.

If building the defence feels harder than breaking it, you're probably doing it right.
