The METR evals for Gemini 3.0 and Opus 4.5 are taking incredibly long, while GPT 5.1 Codex Max and others were benchmarked almost instantly. Why is that? - Blankdot