The Lab

Interactive tools and live demos from The Inference Lab. Every experiment that ships as code ships here first.

// LIVE TOOLS

ROI Simulator — Calculate the GPU ROI of optimizing your KV cache hit rate. Input your hardware specs, batch size, and cache hit rate to see real dollar impact.

→ app.inference.am
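
For a feel of the arithmetic behind a tool like this, here is a minimal sketch. It is not the simulator's actual model. It assumes that prefill GPU time scales linearly with the number of prompt tokens that miss the KV cache, and that batch size is already reflected in the throughput you measure. The names `Workload`, `prefill_gpu_hours`, and `daily_savings` are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical inputs; none of these names come from the real simulator."""
    prompt_tokens_per_day: float       # total prompt tokens served per day
    prefill_tokens_per_gpu_sec: float  # measured prefill throughput at your batch size
    gpu_cost_per_hour: float           # USD per GPU-hour


def prefill_gpu_hours(w: Workload, hit_rate: float) -> float:
    """GPU-hours per day spent on prefill, assuming cached tokens cost nothing."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    uncached_tokens = w.prompt_tokens_per_day * (1.0 - hit_rate)
    return uncached_tokens / w.prefill_tokens_per_gpu_sec / 3600.0


def daily_savings(w: Workload, baseline_hit_rate: float, improved_hit_rate: float) -> float:
    """USD saved per day by moving from the baseline to the improved hit rate."""
    saved_hours = prefill_gpu_hours(w, baseline_hit_rate) - prefill_gpu_hours(w, improved_hit_rate)
    return saved_hours * w.gpu_cost_per_hour


if __name__ == "__main__":
    w = Workload(
        prompt_tokens_per_day=3.6e9,
        prefill_tokens_per_gpu_sec=10_000,
        gpu_cost_per_hour=2.0,
    )
    print(f"Daily savings: ${daily_savings(w, 0.25, 0.50):.2f}")
```

A real estimate would also need to account for decode cost, cache memory overhead, and utilization, which this sketch deliberately ignores.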

// COMING SOON

Speculative Decoding Playground — Tune draft model parameters and watch acceptance rates shift in real time.
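
Until the playground ships, a rough sketch shows why acceptance rate matters. It uses the common simplification that each draft token is accepted independently with probability `alpha`, with `gamma` draft tokens proposed per step; under that assumption the expected number of tokens produced per target-model pass is (1 - alpha^(gamma + 1)) / (1 - alpha). The function name is hypothetical, and the playground itself may model this differently.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model pass in speculative decoding.

    Assumes each of the `gamma` draft tokens is accepted independently with
    probability `alpha`. The target model always contributes one token, so
    the result ranges from 1 (nothing accepted) to gamma + 1 (all accepted).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    if alpha == 1.0:
        # Limit of the geometric sum when every draft token is accepted.
        return float(gamma + 1)
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


if __name__ == "__main__":
    for alpha in (0.5, 0.7, 0.9):
        print(alpha, round(expected_tokens_per_step(alpha, gamma=4), 3))
```

The gain flattens quickly at low acceptance rates, which is why longer drafts only pay off when the draft model agrees with the target most of the time.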

Batch Size Optimizer — Find the optimal batch size for your hardware config and latency target.

// SOURCE

Every tool is open source. Runnable code ships with every post.