AutoLab has been released: a benchmark for evaluating AI agents on frontier research tasks across system optimization, model development, and algorithmic challenges. See the code, blog, live lab, and X thread.