AI is now driving production commits.
84–91% of developers use AI coding tools. 51% of professional developers use them daily, and those daily users merge 60% more pull requests per week than non-users. The dominant pattern has shifted to 'frontier planner + cheap executor' networks (e.g. Opus or GPT-5.5 planning while Sonnet or DeepSeek executes).
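
As a rough illustration of the 'frontier planner + cheap executor' pattern, the sketch below calls one expensive model once to split a task into steps and then hands each step to a cheaper model. Everything here is hypothetical: `ModelFn`, `PlannerExecutor`, and the two stub functions are invented for the example and do not correspond to any vendor's API; real systems would replace the stubs with actual model calls.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type: any function mapping a prompt string to a completion string.
ModelFn = Callable[[str], str]


@dataclass
class PlannerExecutor:
    planner: ModelFn   # expensive frontier model, called once per task
    executor: ModelFn  # cheaper model, called once per planned step

    def run(self, task: str) -> List[str]:
        # Ask the planner for one step per line, then drop blank lines.
        plan = self.planner(f"Break this task into steps, one per line:\n{task}")
        steps = [line.strip() for line in plan.splitlines() if line.strip()]
        # Hand each step to the executor independently.
        return [self.executor(step) for step in steps]


# Stub models (assumptions) so the sketch runs without any API access.
def fake_planner(prompt: str) -> str:
    return "write failing test\nfix the bug\nrun the test suite"


def fake_executor(step: str) -> str:
    return f"done: {step}"


results = PlannerExecutor(fake_planner, fake_executor).run("Fix issue in parser")
```

The cost argument for this split is that the planner runs once per task while the executor runs once per step, so most calls go to the cheaper model.
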
- SWE-bench Pro Leaderboard: Claude Opus 4.8 (69.2%), GPT-5.4 (57.7%), DeepSeek V4 Pro (55.4%), Gemini 3.1 Pro (54.2%). Models score 19-26 percentage points lower on Pro than on SWE-bench Verified.
- SWE-bench Verified Leaderboard: Claude Opus 4.8 (88.6%), Claude Opus 4.6 (80.8%), Gemini 3.1 Pro (80.6%), MiniMax M2.5 (80.2%), GPT-5.4 (~80%).
- Data Contamination: OpenAI has stopped reporting SWE-bench Verified results because of data contamination; it found that 59.4% of hard tasks contained flawed tests.
- Enterprise Adoption: Anthropic's Claude Code reached $2.5B in annualized revenue by February 2026, just nine months after release.
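
The 19-26 point gap between the two leaderboards can be checked directly from the scores listed above. The short sketch below does the subtraction for the three models that appear on both lists; GPT-5.4's Verified score is given only as "~80%", so the 80.0 used here is an approximation, and DeepSeek V4 Pro is left out because no Verified score is listed for it.

```python
# Scores copied from the two leaderboards above, in percent.
# GPT-5.4's Verified score is listed as "~80%", so 80.0 is an approximation.
pro = {"Claude Opus 4.8": 69.2, "GPT-5.4": 57.7, "Gemini 3.1 Pro": 54.2}
verified = {"Claude Opus 4.8": 88.6, "GPT-5.4": 80.0, "Gemini 3.1 Pro": 80.6}

# Gap in percentage points, rounded to one decimal to hide float noise.
gaps = {name: round(verified[name] - pro[name], 1) for name in pro}

for name, gap in gaps.items():
    print(f"{name}: {gap} points lower on Pro")
```

This gives gaps of 19.4, 22.3, and 26.4 points respectively, consistent with the stated range.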