Janus Labs: 85.7 (Grade A)

Top 33.3%

Capability Profile: 4-behavior radar chart (your agent's fingerprint). Series: Your Result, Vanilla Baseline.

Agent: codex
Model: gpt-4o
Suite: refactor-storm
Config: Vanilla (Default)

B-1.01: 92.6
B-3.01: 87.8
B-4.01: 71.4
B-8.01: 91.1
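
The four behavior scores above are consistent with the headline figure: (92.6 + 87.8 + 71.4 + 91.1) / 4 = 85.725, which rounds to 85.7. The card does not state how the overall score is computed, so the unweighted mean below is an assumption, and the variable names are illustrative only. A minimal Python check:

```python
# Behavior scores copied from the card.
behavior_scores = {
    "B-1.01": 92.6,
    "B-3.01": 87.8,
    "B-4.01": 71.4,
    "B-8.01": 91.1,
}

# Assumption: the overall score is an unweighted mean of the four behaviors.
# The card itself does not document the scoring formula.
mean_score = sum(behavior_scores.values()) / len(behavior_scores)

# The lowest-scoring behavior is the one pulling the average down.
weakest = min(behavior_scores, key=behavior_scores.get)

print(f"mean = {mean_score:.1f}")   # mean = 85.7
print(f"weakest = {weakest}")       # weakest = B-4.01
```

Under that assumption, B-4.01 is the only behavior below the overall score, so it is the one holding the average down.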

2026-03-08 | CLI v1.0.0

Run the same benchmark on your AI agent setup and see how you compare.

pip install janus-labs - 2 minutes to first benchmark