Coding Agent Leaderboard

Compare coding agents across models and harnesses

8 Results · 2 Models · 3 Harnesses · 2 Benchmarks

A Coding Agent is more than a model: it is the combination of a Model and a Harness (the tool or framework that drives the model). This leaderboard tracks how the two components work together, because the same model can score very differently depending on the harness it is paired with.
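The model-plus-harness pairing can be sketched as a small data structure. The snippet below is only an illustration, not the leaderboard's own code: the `AgentResult` class and the `harness_spread` helper are hypothetical names, and the scores are the swe-bench-verified values listed in the leaderboard table.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentResult:
    """One leaderboard entry: a model paired with a harness, plus one score."""
    model: str
    harness: str
    score: float  # swe-bench-verified score as listed on the leaderboard


# swe-bench-verified scores copied from the leaderboard table.
RESULTS = [
    AgentResult("Sonnet 4.6", "Claude Code", 79.6),
    AgentResult("RedHatAI/Qwen3.6-35B-A3B-NVFP4", "Pi", 65.0),
    AgentResult("RedHatAI/Qwen3.6-35B-A3B-NVFP4", "Claude Code", 63.2),
    AgentResult("RedHatAI/Qwen3.6-35B-A3B-NVFP4", "OpenCode", 54.8),
]


def harness_spread(results):
    """Map each model run under 2+ harnesses to (best_harness, worst_harness, gap)."""
    by_model = defaultdict(list)
    for result in results:
        by_model[result.model].append(result)

    spread = {}
    for model, runs in by_model.items():
        if len(runs) < 2:
            continue  # a single harness gives nothing to compare
        best = max(runs, key=lambda r: r.score)
        worst = min(runs, key=lambda r: r.score)
        spread[model] = (best.harness, worst.harness, round(best.score - worst.score, 1))
    return spread


if __name__ == "__main__":
    for model, (best, worst, gap) in harness_spread(RESULTS).items():
        print(f"{model}: {best} beats {worst} by {gap} points")
```

Keying the comparison on the model keeps the harness as the only variable, which is the effect the leaderboard is designed to expose. Models evaluated under a single harness are skipped because they say nothing about harness sensitivity.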

{
  "headers": [
    " ",
    "Model",
    "Harness",
    "Precision",
    "Model License",
    "Harness License",
    "Model Num Params (B)",
    "Avg Score",
    "swe-bench-verified",
    "swe-bench-pro--ansible"
  ],
  "data": [
    [
      "🔶",
      "[Sonnet 4.6](https://www.anthropic.com/news/claude-sonnet-4-6)",
      "[Claude Code](https://github.com/anthropics/claude-code)",
      "bf16",
      "FOSS",
      "Proprietary",
      1000,
      64.8,
      79.6,
      50
    ],
    [
      "🟠",
      "[RedHatAI/Qwen3.6-35B-A3B-NVFP4](https://huggingface.co/RedHatAI/Qwen3.6-35B-A3B-NVFP4)",
      "[Pi](https://github.com/earendil-works/pi/tree/main)",
      "nvfp4",
      "FOSS",
      "FOSS",
      35,
      56.5,
      65,
      47.9
    ],
    [
      "🔶",
      "[RedHatAI/Qwen3.6-35B-A3B-NVFP4](https://huggingface.co/RedHatAI/Qwen3.6-35B-A3B-NVFP4)",
      "[Claude Code](https://github.com/anthropics/claude-code)",
      "nvfp4",
      "FOSS",
      "Proprietary",
      35,
      54.5,
      63.2,
      45.8
    ],
    [
      "🟠",
      "[RedHatAI/Qwen3.6-35B-A3B-NVFP4](https://huggingface.co/RedHatAI/Qwen3.6-35B-A3B-NVFP4)",
      "[OpenCode](https://github.com/anomalyco/opencode)",
      "nvfp4",
      "FOSS",
      "FOSS",
      35,
      46.1,
      54.8,
      37.5
    ]
  ],
  "metadata": null
}
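The payload above pairs a `headers` list with positional `data` rows, so a reader can zip the two into records and sanity-check the `Avg Score` column. The sketch below assumes that `Avg Score` is the unweighted mean of the two benchmark columns. The listed values are consistent with that to one decimal place, but the page does not state the formula. `to_records`, `mean_score`, and `mismatched_averages` are hypothetical helper names, and only the columns needed for the check are copied.

```python
# Columns copied from the payload above (links and unrelated columns dropped).
HEADERS = ["Model", "Harness", "Avg Score", "swe-bench-verified", "swe-bench-pro--ansible"]
ROWS = [
    ["Sonnet 4.6", "Claude Code", 64.8, 79.6, 50],
    ["RedHatAI/Qwen3.6-35B-A3B-NVFP4", "Pi", 56.5, 65, 47.9],
    ["RedHatAI/Qwen3.6-35B-A3B-NVFP4", "Claude Code", 54.5, 63.2, 45.8],
    ["RedHatAI/Qwen3.6-35B-A3B-NVFP4", "OpenCode", 46.1, 54.8, 37.5],
]
BENCHMARKS = ["swe-bench-verified", "swe-bench-pro--ansible"]


def to_records(headers, rows):
    """Turn positional rows into dicts keyed by column header."""
    return [dict(zip(headers, row)) for row in rows]


def mean_score(record, benchmarks=BENCHMARKS):
    """Unweighted mean of the benchmark columns (assumed averaging rule)."""
    return sum(record[name] for name in benchmarks) / len(benchmarks)


def mismatched_averages(records, tolerance=0.051):
    """Return records whose listed Avg Score is not within one-decimal rounding of the mean."""
    return [r for r in records if abs(mean_score(r) - r["Avg Score"]) > tolerance]


if __name__ == "__main__":
    records = to_records(HEADERS, ROWS)
    for record in records:
        print(record["Model"], record["Harness"], mean_score(record), record["Avg Score"])
    print("rows outside tolerance:", len(mismatched_averages(records)))
```

The tolerance is slightly above 0.05 because the listed averages are rounded to one decimal place, so a recomputed mean such as 56.45 should still count as matching a listed 56.5.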