SWE-bench Press

2025-05-21 • PyTorch • PyTorch Expert Exchange: Towards Autonomous Language Model Systems

2025-05-08 • MIT Technology Review • How to build a better AI benchmark

2025-04-17 • Databricks • SWE bench & SWE agent | Data Brew | Episode 44

2025-04-10 • TechCrunch • AI models still struggle to debug software, Microsoft study shows

2025-01-13 • Jay Alammar • SWE-Bench authors reflect on the state of LLM agents at NeurIPS 2024

2025-01-06 • Anthropic • Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet

2024-10-30 • Weaviate Podcast • SWE-bench with John Yang and Carlos E. Jimenez - Weaviate Podcast #107!

2024-08-15 • Weights & Biases • NeurIPS Hacker Cup AI: SWEAgent

2024-07-18 • Wired • I glimpsed the future of coding

2024-07-18 • Wired • The AI-Powered Future of Coding Is Near

2024-04-10 • DeepLearning.AI • Autonomous Coding Agents, Instability at Stability AI, Mamba Mania, What Users Do With GenAI

2024-04-05 • Matthew Berman • AI Agent Automatically Codes WITH TOOLS - SWE-Agent Tutorial

2024-04-02 • Ofir Press • SWE-agent: A deep dive

2024-04-01 • Carlos E. Jimenez • A First Look at SWE-agent

2023-11-03 • Rohan Alexander • John Yang - SWE-bench: Can Language Models Resolve Real-World GitHub Issues?