Visible tradeoffs
This is a composite signal, so it bundles multiple ingredients and should not be treated as a single clean primitive.
source: Artificial Analysis
metric: Index (index)
judge: Composite
direction: higher better
group id: aa_intelligence_current
domain: Chat / text
What it measures vs what it misses
✓ Measures
Text-focused benchmark composite performance.
✗ Misses
Multimodal quality unless separate tracks are selected.
Why this counts
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Comparable-group rule
This percentile only compares models inside the exact benchmark/version group shown here. It is not a universal score.
What it misses
It does not prove deeper reasoning, tool use, or enterprise workflow reliability.
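The comparable-group rule can be illustrated with a minimal sketch. The page does not say how the percentile is computed, so the record layout, the model names, the scores, and the "at or below" percentile definition used here are all assumptions made for illustration only.

```python
# Hypothetical records: model names and scores are invented for illustration.
# Only the group id "aa_intelligence_current" comes from the page itself.
RECORDS = [
    {"model": "model-a", "group_id": "aa_intelligence_current", "score": 40.0},
    {"model": "model-b", "group_id": "aa_intelligence_current", "score": 55.0},
    {"model": "model-c", "group_id": "aa_intelligence_current", "score": 70.0},
    {"model": "model-d", "group_id": "aa_intelligence_current", "score": 62.0},
    # Same model name in a different (hypothetical) group: never mixed in.
    {"model": "model-a", "group_id": "some_other_group", "score": 90.0},
]


def percentile_within_group(records, model, group_id):
    """Return the percentage of models in `group_id` scoring at or below `model`.

    Assumes "higher is better" and one score per model per group.
    """
    # Restrict the comparison to the exact group; other groups are ignored.
    scores = {r["model"]: r["score"] for r in records if r["group_id"] == group_id}
    if model not in scores:
        raise KeyError(f"{model!r} has no score in group {group_id!r}")
    target = scores[model]
    at_or_below = sum(1 for s in scores.values() if s <= target)
    return 100.0 * at_or_below / len(scores)


if __name__ == "__main__":
    print(percentile_within_group(RECORDS, "model-b", "aa_intelligence_current"))
```

In this sketch the score of 90.0 that model-a has in the other group plays no part in its percentile inside aa_intelligence_current, which is why the result should not be read as a universal score.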