Visible tradeoffs
This is a human preference signal, so it tells you what people liked side by side, not what is formally correct.
source: Artificial Analysis
metric: Arena rating (rating)
judge: Human
direction: higher better
group id: aa_text_to_video_current
domain: Video generation
What it measures vs what it misses
✓ Measures
Observed user preference over generated videos from the same prompt. Relative motion, coherence, and prompt fit under blind comparisons.
✗ Misses
Objective motion fidelity metrics. End-to-end production ergonomics.
Why this counts
Observed user preference over generated videos from the same prompt.
Comparable-group rule
This percentile only compares models inside the exact benchmark/version group shown here. It is not a universal score.
What it misses
Objective motion fidelity metrics.
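The comparable-group rule can be illustrated with a minimal sketch. It assumes the percentile is the share of models in the same group whose rating is at or below the model's own rating; the page does not state the exact formula, so treat that definition as an assumption. The function name, model names, and ratings below are hypothetical.

```python
from collections import defaultdict


def percentile_within_group(rows):
    """Compute a percentile for each model against its own group only.

    rows: list of (model, group_id, rating) tuples, higher rating is better.
    Returns a dict mapping (model, group_id) to a percentile in (0, 100].
    Assumed definition: percent of same-group ratings at or below this one.
    """
    ratings_by_group = defaultdict(list)
    for _model, group_id, rating in rows:
        ratings_by_group[group_id].append(rating)

    result = {}
    for model, group_id, rating in rows:
        group_ratings = ratings_by_group[group_id]
        at_or_below = sum(1 for r in group_ratings if r <= rating)
        result[(model, group_id)] = 100.0 * at_or_below / len(group_ratings)
    return result


# Hypothetical data: two models in one group, one model in another.
rows = [
    ("model_a", "aa_text_to_video_current", 1200),
    ("model_b", "aa_text_to_video_current", 1100),
    ("model_c", "some_other_group", 900),
]
percentiles = percentile_within_group(rows)
```

In this sketch, model_c ranks at the top of its own group despite having the lowest rating overall, which is why a percentile from one group should not be read against another.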