Overall score: 62.5
Top 88.9%
Grade C
Capability Profile
4-behavior radar - your agent's fingerprint
Your Result
Agent: claude-code
Model: opus-4.5
Suite: refactor-storm
Config: Vanilla (Default)
Behavior Breakdown
Behavior  Score  Grade
B-1.01    90.0   S
B-2.01    70.0   B
B-3.01    70.0   B
B-4.01    20.0   F
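The headline 62.5 matches the unweighted mean of the four behavior scores listed above. The sketch below only checks that arithmetic and picks out the weakest behavior. Treating the overall score as a plain average is an assumption, since the card does not state how Janus Labs aggregates it, and the names `behavior_scores` and `unweighted_mean` are hypothetical.

```python
# Behavior scores copied from the breakdown on this result card.
behavior_scores = {
    "B-1.01": 90.0,
    "B-2.01": 70.0,
    "B-3.01": 70.0,
    "B-4.01": 20.0,
}


def unweighted_mean(scores):
    """Return the arithmetic mean of the score values."""
    return sum(scores.values()) / len(scores)


# Assumption: the overall score is a plain average of the behavior scores.
overall = unweighted_mean(behavior_scores)

# The behavior with the lowest score pulls the average down the most.
weakest = min(behavior_scores, key=behavior_scores.get)

print(overall)  # 62.5
print(weakest)  # B-4.01
```

On these numbers, B-4.01 is the clear outlier: the other three behaviors alone would average about 76.7.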
2026-04-08 | CLI v0.11.0
Think you can beat this?
Run the same benchmark on your AI agent setup and see how you compare.
Get Started
pip install janus-labs - 2 minutes to first benchmark