2026
an archive of posts from this year
| Date | Post |
|---|---|
| Mar 10, 2026 | Evaluating Evolving Agent Systems at Scale with Frontier-CS |
| Feb 26, 2026 | LLM Defeated in Open-ended Problems |
| Feb 10, 2026 | Evaluating the Hardest CS Problems in the Age of LLMs |
| Feb 3, 2026 | Frontier-CS 1.0 Release |