📊

Eval v2 Dashboard

Load a benchmark results JSON file to visualize scores, compare experiments, and drill into individual cases.

Drag & drop a JSON file here

or