Press
News and articles about SWE-bench, SWE-agent, and SWE-smith.
2025-05-21 • PyTorch • PyTorch Expert Exchange: Towards Autonomous Language Model Systems
2025-05-08 • MIT Technology Review • How to build a better AI benchmark
2025-04-17 • Databricks • SWE bench & SWE agent | Data Brew | Episode 44
2025-04-10 • TechCrunch • AI models still struggle to debug software, Microsoft study shows
2025-01-13 • Jay Alammar • SWE-Bench authors reflect on the state of LLM agents at Neurips 2024
2025-01-06 • Anthropic • Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet
2024-10-30 • Weaviate Podcast • SWE-bench with John Yang and Carlos E. Jimenez - Weaviate Podcast #107!
2024-08-15 • Weights & Biases • NeurIPS Hacker Cup AI: SWEAgent
2024-07-18 • Wired • I glimpsed the future of coding
2024-07-18 • Wired • The AI-Powered Future of Coding Is Near
2024-04-10 • DeepLearning.AI • Autonomous Coding Agents, Instability at Stability AI, Mamba Mania, What Users Do With GenAI
2024-04-05 • Matthew Berman • AI Agent Automatically Codes WITH TOOLS - SWE-Agent Tutorial
2024-04-02 • Ofir Press • SWE-agent: A deep dive
2024-04-01 • Carlos E. Jimenez • A First Look at SWE-agent
2023-11-03 • Rohan Alexander • John Yang - SWE-bench: Can Language Models Resolve Real-World GitHub Issues?