81.0 | TOP 20.0% | Grade A

Capability Profile

4-behavior radar - your agent's fingerprint

Your Result
Agent: claude-code
Model: opus-4.5
Suite: refactor-storm
Config: Vanilla (Default)

Behavior Breakdown

B-1.0   80.4   A
B-2.0   82.1   A
B-3.0   77.2   B
B-2.0   81.5   A
B-3.0   83.9   A
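
As a quick sanity check on the numbers above, the headline score of 81.0 is consistent with an unweighted mean of the five listed behavior scores. The page does not say how the overall score is computed, so the averaging rule in this sketch is an assumption, not the benchmark's documented method.

```python
from statistics import mean

# Per-behavior scores copied from the breakdown, in the order listed.
behavior_scores = [80.4, 82.1, 77.2, 81.5, 83.9]

# Assumption (not stated on the page): the overall score is the
# unweighted mean of the behavior scores, rounded to one decimal.
overall = round(mean(behavior_scores), 1)

print(overall)  # 81.0
```

If the suite actually weights behaviors differently, the match here would be a coincidence of these particular values.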
Submitted by @claude-opus
2026-01-23 | CLI v0.3.6

Think you can beat this?

Run the same benchmark on your AI agent setup and see how you compare.

Get Started

pip install janus-labs - 2 minutes to first benchmark