UAB · Unbiased AI Bench
Glass box for model evals.
Every leaderboard, with receipts.
Vision Arena
Live · updated continuously
Benchmarks · /benchmarks/arena-vision

Blind multimodal preference arena for image understanding tasks.
Source · Arena
Version · arena snapshot 2026-05-01
Scores · 102

Passport

Visible tradeoffs
This is a human preference signal, so it tells you what people liked side by side, not what is formally correct.
source · Arena
metric · Arena rating (rating)
judge · Human
direction · higher better
group id · arena_vision_2026_q2
domain · Vision understanding

What it measures vs what it misses

✓ Measures

Observed user preference on vision-grounded prompts and multimodal comparisons. How often a model wins when people judge image-aware answers head to head.
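To make "how often a model wins head to head" concrete, here is a minimal sketch that turns a rating gap into an expected win rate. The page does not state how the Arena rating is computed, so the Elo-style logistic curve and the 400-point scale below are assumptions, and `expected_win_rate` is a hypothetical helper. The ratings are copied from the leaderboard on this page.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head wins for A over B.

    ASSUMPTION: an Elo-style logistic curve with a 400-point scale.
    The page does not confirm that the Arena rating uses this formula.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings copied from the leaderboard on this page.
opus_47 = 1299       # rank 1
muse_spark = 1294    # rank 2
phi_3_vision = 882   # rank 102

close_gap = expected_win_rate(opus_47, muse_spark)
wide_gap = expected_win_rate(opus_47, phi_3_vision)

print(f"#1 vs #2:   {close_gap:.3f}")
print(f"#1 vs #102: {wide_gap:.3f}")
```

Under this assumed curve, a gap of a few points is close to a coin flip, so adjacent ranks near the top should not be read as a clear preference for one model.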

✗ Misses

Ground-truth visual reasoning accuracy. Task-specific breakdowns such as OCR, charts, or diagram-heavy workflows.

Why this counts
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Comparable-group rule
This percentile only compares models inside the exact benchmark/version group shown here. It is not a universal score.

What it misses
It does not tell you whether the model can generate or edit images well.
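The comparable-group rule can be sketched as a percentile that is looked up inside a single group id and never across groups. The page does not define the percentile formula, so "share of models in the group scoring at or below this model" is an assumption, and `SCORES` and `percentile_in_group` are hypothetical names. The group id and the four scores are copied from this page; a real group would hold all 102 entries.

```python
from typing import Dict

# Hypothetical store: group id -> {model name: score}.
# Only a four-model excerpt of the leaderboard is included here.
SCORES: Dict[str, Dict[str, float]] = {
    "arena_vision_2026_q2": {
        "Claude Opus 4.7": 1299,
        "muse-spark": 1294,
        "Claude Opus 4.6": 1293,
        "Gemini 3 Pro Preview": 1288,
    },
}


def percentile_in_group(
    scores: Dict[str, Dict[str, float]], group_id: str, model: str
) -> float:
    """Percentile of `model` among models in the same group only.

    ASSUMPTION: percentile = percentage of models in the group whose
    score is at or below the model's score. The page does not say
    which definition it uses.
    """
    group = scores[group_id]  # KeyError if the group is unknown
    if model not in group:
        raise KeyError(f"{model!r} has no score in group {group_id!r}")
    target = group[model]
    at_or_below = sum(1 for score in group.values() if score <= target)
    return 100.0 * at_or_below / len(group)


top = percentile_in_group(SCORES, "arena_vision_2026_q2", "Claude Opus 4.7")
fourth = percentile_in_group(SCORES, "arena_vision_2026_q2", "Gemini 3 Pro Preview")
print(top, fourth)
```

Raising an error for a model that is missing from the group, rather than falling back to another benchmark version, keeps the percentile from silently mixing groups.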

Leaderboard · this benchmark version

#1 · Claude Opus 4.7 · AR · May 1, 2026 · 1,299
#2 · muse-spark · AR · May 1, 2026 · 1,294
#3 · Claude Opus 4.6 · AR · May 1, 2026 · 1,293
#4 · Gemini 3 Pro Preview · AR · May 1, 2026 · 1,288
#5 · Gemini 3.1 Pro Preview · AR · May 1, 2026 · 1,277
#6 · GPT-5.5 · AR · May 1, 2026 · 1,274
#7 · Claude Sonnet 4.6 · AR · May 1, 2026 · 1,272
#8 · Kimi K2.6 · AR · May 1, 2026 · 1,261
#9 · Gemini 3 Flash · AR · May 1, 2026 · 1,259
#10 · dola-seed-2.0-pro · AR · May 1, 2026 · 1,259
#11 · Grok 4.3 · AR · May 1, 2026 · 1,249
#12 · kimi-k2.5-thinking · AR · May 1, 2026 · 1,247
#13 · Gemini 2.5 Pro · AR · May 1, 2026 · 1,246
#14 · Grok 4.20 · AR · May 1, 2026 · 1,244
#15 · Qwen3.5 397B A17B · AR · May 1, 2026 · 1,242
#16 · Gemini 3.1 Flash-Lite Preview · AR · May 1, 2026 · 1,239
#17 · kimi-k2.5-instant · AR · May 1, 2026 · 1,239
#18 · GPT-5.1 · AR · May 1, 2026 · 1,237
#19 · GPT-5.2 · AR · May 1, 2026 · 1,230
#20 · glm-5v-turbo · AR · May 1, 2026 · 1,227
#21 · Qwen3.5 27B · AR · May 1, 2026 · 1,226
#22 · GPT-4.5 Preview · AR · May 1, 2026 · 1,226
#23 · Qwen3.5 122B A10B · AR · May 1, 2026 · 1,222
#24 · mimo-v2.5 · AR · May 1, 2026 · 1,218
#25 · ernie-5.0-preview-1220 · AR · May 1, 2026 · 1,218
#26 · o3 · AR · May 1, 2026 · 1,217
#27 · Qwen3 VL 235B A22B · AR · May 1, 2026 · 1,215
#28 · GPT-4.1 · AR · May 1, 2026 · 1,214
#29 · Gemini 2.5 Flash · AR · May 1, 2026 · 1,213
#30 · GPT-5 · AR · May 1, 2026 · 1,211
#31 · GPT-5.4 · AR · May 1, 2026 · 1,211
#32 · MiMo-V2-Omni · AR · May 1, 2026 · 1,208
#33 · GPT-4.1 mini · AR · May 1, 2026 · 1,203
#34 · o4 mini · AR · May 1, 2026 · 1,201
#35 · o1 · AR · May 1, 2026 · 1,193
#36 · qwen3-vl-235b-a22b-thinking · AR · May 1, 2026 · 1,189
#37 · Claude Sonnet 4 · AR · May 1, 2026 · 1,188
#38 · Claude Sonnet 4.5 · AR · May 1, 2026 · 1,188
#39 · Claude Opus 4 · AR · May 1, 2026 · 1,187
#40 · qwen-vl-max-2025-08-13 · AR · May 1, 2026 · 1,186
#41 · Grok 4.1 Fast · AR · May 1, 2026 · 1,186
#42 · GPT-5.4 mini · AR · May 1, 2026 · 1,182
#43 · Grok 4 · AR · May 1, 2026 · 1,182
#44 · Claude Sonnet 3.7 · AR · May 1, 2026 · 1,176
#45 · Gemini 2.5 Flash-Lite · AR · May 1, 2026 · 1,174
#46 · Gemini 2.0 Flash · AR · May 1, 2026 · 1,171
#47 · GLM-4.6V · AR · May 1, 2026 · 1,164
#48 · hunyuan-vision-1.5-thinking · AR · May 1, 2026 · 1,161
#49 · mistral-medium-2508 · AR · May 1, 2026 · 1,159
#50 · gemma-3-27b-it · AR · May 1, 2026 · 1,158
#51 · step-1o-turbo-202506 · AR · May 1, 2026 · 1,158
#52 · mistral-medium-2505 · AR · May 1, 2026 · 1,156
#53 · GLM-4.5V · AR · May 1, 2026 · 1,156
#54 · hunyuan-large-vision · AR · May 1, 2026 · 1,148
#55 · GPT-5.4 nano · AR · May 1, 2026 · 1,147
#56 · llama-4-maverick-17b-128e-instruct · AR · May 1, 2026 · 1,147
#57 · Claude Sonnet 3.5 · AR · May 1, 2026 · 1,146
#58 · step-3 · AR · May 1, 2026 · 1,146
#59 · mistral-small-2506 · AR · May 1, 2026 · 1,141
#60 · Gemini 2.0 Flash-Lite · AR · May 1, 2026 · 1,135
#61 · mistral-small-3.1-24b-instruct-2503 · AR · May 1, 2026 · 1,128
#62 · llama-4-scout-17b-16e-instruct · AR · May 1, 2026 · 1,128
#63 · Claude Haiku 3.5 · AR · May 1, 2026 · 1,127
#64 · Claude Haiku 4.5 · AR · May 1, 2026 · 1,127
#65 · step-1o-vision-32k-highres · AR · May 1, 2026 · 1,126
#66 · qwen2.5-vl-72b-instruct · AR · May 1, 2026 · 1,122
#67 · qwen2.5-vl-32b-instruct · AR · May 1, 2026 · 1,121
#68 · GPT-4o · AR · May 1, 2026 · 1,119
#69 · Gemini 1.5 Pro · AR · May 1, 2026 · 1,119
#70 · GPT-4 Turbo · AR · May 1, 2026 · 1,113
#71 · molmo-2-8b · AR · May 1, 2026 · 1,107
#72 · GPT-4o mini · AR · May 1, 2026 · 1,098
#73 · Pixtral Large · AR · May 1, 2026 · 1,095
#74 · GPT-4.1 nano · AR · May 1, 2026 · 1,089
#75 · qwen-vl-max-1119 · AR · May 1, 2026 · 1,085
#76 · qwen2-vl-72b · AR · May 1, 2026 · 1,085
#77 · Gemini 1.5 Flash 8B · AR · May 1, 2026 · 1,071
#78 · step-1v-32k · AR · May 1, 2026 · 1,064
#79 · Claude Opus 3 · AR · May 1, 2026 · 1,063
#80 · Gemini 1.5 Flash · AR · May 1, 2026 · 1,060
#81 · molmo-72b-0924 · AR · May 1, 2026 · 1,046
#82 · hunyuan-standard-vision-2024-12-31 · AR · May 1, 2026 · 1,043
#83 · llama-3.2-vision-90b-instruct · AR · May 1, 2026 · 1,032
#84 · qwen2-vl-7b-instruct · AR · May 1, 2026 · 1,031
#85 · pixtral-12b-2409 · AR · May 1, 2026 · 1,026
#86 · internvl2-26b · AR · May 1, 2026 · 1,025
#87 · amazon-nova-lite-v1.0 · AR · May 1, 2026 · 1,020
#88 · amazon-nova-pro-v1.0 · AR · May 1, 2026 · 1,019
#89 · Claude Sonnet 3 · AR · May 1, 2026 · 1,018
#90 · yi-vision · AR · May 1, 2026 · 1,003
#91 · Claude Haiku 3 · AR · May 1, 2026 · 1,001
#92 · c4ai-aya-vision-32b · AR · May 1, 2026 · 1,000
#93 · molmo-7b-d-0924 · AR · May 1, 2026 · 995
#94 · llama-3.2-vision-11b-instruct · AR · May 1, 2026 · 992
#95 · nvila-internal-15b-v1 · AR · May 1, 2026 · 986
#96 · llava-onevision-qwen2-72b-ov · AR · May 1, 2026 · 979
#97 · llava-v1.6-34b · AR · May 1, 2026 · 965
#98 · cogvlm2-llama3-chat-19b · AR · May 1, 2026 · 964
#99 · minicpm-v-2_6 · AR · May 1, 2026 · 964
#100 · internvl2-4b · AR · May 1, 2026 · 958
#101 · phi-3.5-vision-instruct · AR · May 1, 2026 · 921
#102 · phi-3-vision-128k-instruct · AR · May 1, 2026 · 882