Submit a Model Run

Dispatch the full 60-question benchmark using our private evaluator. Community runs do not change the leaderboard until they are reviewed and approved.

Your email address is used only to send a completion notice. Emails are sent from our system; we never share your address.