Text to Image

Artificial Analysis image arena leaderboard for text-to-image generation.

Source · Artificial Analysis
Version · artificial-analysis snapshot 2026-05-01
Scores · 5

Passport

Visible tradeoffs

This is a human preference signal, so it tells you what people liked side by side, not what is formally correct.

source

Artificial Analysis

metric

Arena rating (rating)

judge

Human

direction

higher better

group id

aa_text_to_image_current

domain

Image generation
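
To make the "higher better" direction concrete, here is a minimal sketch of how a gap between two arena ratings could be read as an expected head-to-head preference rate. It assumes the rating sits on an Elo-style logistic scale with a 400-point base; the page does not state the scale, so both the formula and the function name `expected_win_probability` are illustrative assumptions, not Artificial Analysis's published method.

```python
def expected_win_probability(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of blind comparisons in which A is preferred over B.

    ASSUMPTION: an Elo-style logistic curve with a 400-point scale.
    The source page does not say how its arena rating is computed.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    # Ratings taken from the leaderboard on this page (#1 and #2).
    p = expected_win_probability(1335, 1272)
    print(f"Expected preference rate for #1 over #2: {p:.2f}")
```

Under that assumption, equal ratings give 0.5, and a larger positive gap moves the value toward 1. The result describes which image people would pick, not which image is more faithful to the prompt.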

What it measures vs what it misses

✓ Measures

Observed user preference over image generations from the same prompt. Relative appeal and usefulness in blind image comparisons.

✗ Misses

Objective prompt-faithfulness metrics. Latency and cost.

Why this counts

It would matter for text-to-image quality once verified public receipts exist in the catalog.

Comparable-group rule

This percentile only compares models inside the exact benchmark/version group shown here. It is not a universal score.

What it misses

This slice is currently limited because the product does not yet carry first-class image-generation receipts.

Leaderboard · this benchmark version

#1 · GPT Image 2 (high)

AA · May 1, 2026

1,335

#2 · GPT Image 1.5

AA · May 1, 2026

1,272

#3 · Nano Banana 2 (Gemini 3.1 Flash Image Preview)

AA · May 1, 2026

1,261

#4 · Nano Banana Pro (Gemini 3 Pro Image)

AA · May 1, 2026

1,216

#5 · FLUX.2 [max]

AA · May 1, 2026

1,201
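
The comparable-group rule can be sketched as follows: a percentile is computed only against other scores that share the same group id, and scores from any other group are ignored. The percentile definition used here (the share of other models in the group with a strictly lower rating) is an assumption, as are the `Score` type and the `group_percentile` function; the page does not say how its percentile is calculated. The ratings are the five listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    model: str
    group_id: str
    rating: float


GROUP = "aa_text_to_image_current"

# The five scores from the leaderboard above.
SCORES = [
    Score("GPT Image 2 (high)", GROUP, 1335),
    Score("GPT Image 1.5", GROUP, 1272),
    Score("Nano Banana 2 (Gemini 3.1 Flash Image Preview)", GROUP, 1261),
    Score("Nano Banana Pro (Gemini 3 Pro Image)", GROUP, 1216),
    Score("FLUX.2 [max]", GROUP, 1201),
]


def group_percentile(scores: list[Score], model: str, group_id: str) -> float:
    """Percent of *other* models in the same group with a strictly lower rating.

    ASSUMPTION: this is one common percentile definition, chosen for
    illustration. Scores outside `group_id` never enter the comparison.
    """
    group = [s for s in scores if s.group_id == group_id]
    target = next((s for s in group if s.model == model), None)
    if target is None:
        raise KeyError(f"{model!r} has no score in group {group_id!r}")
    others = [s for s in group if s.model != model]
    if not others:
        return 100.0  # a lone model has nothing to be ranked against
    below = sum(1 for s in others if s.rating < target.rating)
    return 100.0 * below / len(others)


if __name__ == "__main__":
    for s in SCORES:
        print(f"{s.model}: {group_percentile(SCORES, s.model, GROUP):.0f}")
```

Because the filter runs before any ranking, a score from a different benchmark or snapshot cannot shift these values, which is why the result is not a universal score.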