2026_04_09_clawbench
🌐 Released ClawBench, a benchmark for evaluating whether AI agents can complete everyday online tasks across live websites. Great work by Yuxuan Zhang. I also shared it on X.