Unbiased AI Bench (UAB): glass box for model evals.
Every leaderboard, with receipts.
Scale Labs
Live · updated continuously
SL · benchmark platform

Rubric-heavy frontier evals across agentic coding, visual-language understanding, spoken dialogue, tutoring, and hard reasoning.
Verification status: verified
Last checked: May 1, 2026

Evidence ledger

Modalities: text, code, vision, audio
Cadence: release-based
API: not public
Evaluations: 98
Verification: verified
Verified runtime: 70
Manual verified: 0
Relay / mirrored: 0
Backfilled: 28

Relay sources mirror another provider's public page; manual rows are checked against the cited page; backfilled rows are historical inserts; seeded rows are demo fixtures. Relay rows are supporting evidence, not first-party measurements.
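
As a rough illustration of how the ledger categories relate to the evaluation total, the sketch below tallies rows by provenance and checks that the per-category counts add up to the reported 98 evaluations. The field name `provenance`, the label strings, and the `tally` helper are hypothetical and chosen only for this example; the site's actual schema is not public.

```python
from collections import Counter

# Hypothetical provenance labels, named after the ledger categories.
LABELS = ("verified_runtime", "manual_verified", "relay", "backfilled")


def tally(rows):
    """Count rows per provenance label, reporting 0 for unused labels."""
    counts = Counter(row["provenance"] for row in rows)
    return {label: counts.get(label, 0) for label in LABELS}


# Rebuild the ledger counts reported for this source:
# 70 verified runtime rows and 28 backfilled rows, nothing else.
rows = (
    [{"provenance": "verified_runtime"} for _ in range(70)]
    + [{"provenance": "backfilled"} for _ in range(28)]
)

counts = tally(rows)
total = sum(counts.values())

# The category counts should account for every evaluation.
print(counts)
print(total)
```

Under these assumptions, the four categories sum to the evaluation total (70 + 0 + 0 + 28 = 98), and the relay count of 0 means no row for this source is mirrored supporting evidence.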

Operational state

snapshot · Latest pull · May 1, 2026 · json
parser · Loaded 70 verified benchmark records for scale-labs. · 0.1.0 · ok
verify · scale-labs verification finished with status verified. · May 1, 2026 · verified

Benchmarks from this source

EnigmaEval · Hard reasoning · Pass rate
VISTA · Vision-language understanding · Score
TutorBench · STEM tutoring quality · Score
VTB · Vision-language reasoning · APR
PRBench Legal · Professional legal reasoning · Score
HiL-Bench · Human-in-the-loop software tasks · Success rate
MASK · Hidden-goal honesty and safety · Honesty score
MultiNRC · Multilingual reasoning · Score

Latest change explanation

scale-labs matched scale-labs-20260501T202649Z with no notable change causes detected.