Using clip-flant5-xxl for VQAScore

  • Basic Skills

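| Method | Attribute | Scene | Spatial | Action | Part | Avg |
| --- | --- | --- | --- | --- | --- | --- |
| SD v2.1 | 0.75 | 0.79 | 0.73 | 0.73 | 0.71 | 0.75 |
| SD-XL Turbo | 0.81 | 0.82 | 0.78 | 0.79 | 0.78 | 0.80 |
| SD-XL | 0.82 | 0.85 | 0.80 | 0.80 | 0.81 | 0.82 |
| DeepFloyd-IF | 0.82 | 0.83 | 0.80 | 0.81 | 0.81 | 0.82 |
| Midjourney v6 | 0.86 | 0.88 | 0.86 | 0.87 | 0.85 | 0.86 |
| DALL-E 3 | 0.91 | 0.91 | 0.90 | 0.90 | 0.91 | 0.90 |
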
  • Advanced Skills

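| Method | Count | Differ | Compare | Negate | Universal | Avg |
| --- | --- | --- | --- | --- | --- | --- |
| SD v2.1 | 0.66 | 0.64 | 0.65 | 0.51 | 0.63 | 0.60 |
| SD-XL Turbo | 0.71 | 0.68 | 0.69 | 0.52 | 0.66 | 0.63 |
| SD-XL | 0.72 | 0.70 | 0.69 | 0.50 | 0.67 | 0.63 |
| DeepFloyd-IF | 0.70 | 0.70 | 0.71 | 0.50 | 0.65 | 0.63 |
| Midjourney v6 | 0.77 | 0.77 | 0.76 | 0.50 | 0.73 | 0.68 |
| DALL-E 3 | 0.80 | 0.80 | 0.77 | 0.49 | 0.75 | 0.69 |
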
  • Overall Performances

| Model | Basic | Advanced | Overall |
| --- | --- | --- | --- |
| SD v2.1 | 0.75 | 0.60 | 0.67 |
| SD-XL Turbo | 0.80 | 0.63 | 0.71 |
| SD-XL | 0.82 | 0.63 | 0.72 |
| DeepFloyd-IF | 0.82 | 0.63 | 0.71 |
| Midjourney v6 | 0.86 | 0.68 | 0.76 |
| DALL-E 3 | 0.90 | 0.69 | 0.78 |

Using GPT-4o for VQAScore

  • Basic Skills

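| Method | Attribute | Scene | Spatial | Action | Part | Avg |
| --- | --- | --- | --- | --- | --- | --- |
| SD v2.1 | 0.56 | 0.67 | 0.54 | 0.58 | 0.43 | 0.57 |
| SD-XL Turbo | 0.72 | 0.76 | 0.70 | 0.75 | 0.63 | 0.72 |
| SD-XL | 0.65 | 0.73 | 0.60 | 0.65 | 0.57 | 0.65 |
| DeepFloyd-IF | 0.71 | 0.77 | 0.69 | 0.74 | 0.64 | 0.72 |
| Midjourney v6 | 0.76 | 0.81 | 0.75 | 0.80 | 0.71 | 0.76 |
| DALL-E 3 | 0.90 | 0.92 | 0.87 | 0.90 | 0.86 | 0.89 |
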
  • Advanced Skills

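| Method | Count | Differ | Compare | Negate | Universal | Avg |
| --- | --- | --- | --- | --- | --- | --- |
| SD v2.1 | 0.29 | 0.20 | 0.30 | 0.31 | 0.41 | 0.33 |
| SD-XL Turbo | 0.41 | 0.30 | 0.41 | 0.35 | 0.49 | 0.42 |
| SD-XL | 0.32 | 0.25 | 0.28 | 0.29 | 0.45 | 0.34 |
| DeepFloyd-IF | 0.39 | 0.34 | 0.42 | 0.30 | 0.48 | 0.41 |
| Midjourney v6 | 0.50 | 0.46 | 0.53 | 0.29 | 0.57 | 0.48 |
| DALL-E 3 | 0.59 | 0.56 | 0.57 | 0.26 | 0.67 | 0.52 |
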
  • Overall Performances

| Model | Basic | Advanced | Overall |
| --- | --- | --- | --- |
| SD v2.1 | 0.57 | 0.33 | 0.44 |
| SD-XL Turbo | 0.72 | 0.42 | 0.56 |
| SD-XL | 0.65 | 0.34 | 0.48 |
| DeepFloyd-IF | 0.72 | 0.41 | 0.55 |
| Midjourney v6 | 0.76 | 0.48 | 0.61 |
| DALL-E 3 | 0.89 | 0.52 | 0.68 |
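
The two overall tables can also be compared programmatically. The following is a minimal sketch, assuming the overall scores are transcribed by hand into plain dictionaries; the names `CLIP_FLANT5_OVERALL`, `GPT4O_OVERALL`, `rank_models`, and `score_gap` are illustrative and not part of any VQAScore tooling.

```python
# Overall scores transcribed from the two "Overall Performances" tables above.
# The dictionary and function names are hypothetical, chosen for illustration.
CLIP_FLANT5_OVERALL = {
    "SD v2.1": 0.67,
    "SD-XL Turbo": 0.71,
    "SD-XL": 0.72,
    "DeepFloyd-IF": 0.71,
    "Midjourney v6": 0.76,
    "DALL-E 3": 0.78,
}
GPT4O_OVERALL = {
    "SD v2.1": 0.44,
    "SD-XL Turbo": 0.56,
    "SD-XL": 0.48,
    "DeepFloyd-IF": 0.55,
    "Midjourney v6": 0.61,
    "DALL-E 3": 0.68,
}


def rank_models(scores):
    """Return model names sorted from highest to lowest score.

    Ties keep the order in which the models appear in the table.
    """
    return sorted(scores, key=scores.get, reverse=True)


def score_gap(scores):
    """Return the spread between the best and worst overall score."""
    return round(max(scores.values()) - min(scores.values()), 2)


if __name__ == "__main__":
    for name, scores in [
        ("clip-flant5-xxl", CLIP_FLANT5_OVERALL),
        ("GPT-4o", GPT4O_OVERALL),
    ]:
        print(name, "ranking:", " > ".join(rank_models(scores)))
        print(name, "best-to-worst gap:", score_gap(scores))
```

Both scorers place DALL-E 3 and Midjourney v6 first and second, but they disagree on the middle of the ranking: SD-XL scores above SD-XL Turbo under clip-flant5-xxl and below it under GPT-4o. The spread between the best and worst model is also wider under GPT-4o.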