VideoScore

Building automatic metrics for video generation quality from fine-grained human feedback. Covers visual quality, temporal consistency, dynamic degree, text-video alignment, and factual consistency.

VideoScore is an automatic evaluation metric for AI-generated videos, trained on fine-grained human annotations. It covers five quality dimensions: visual quality, temporal consistency, dynamic degree, text-video alignment, and factual consistency.

Key contributions:

  • VideoFeedback: large-scale human-annotated video evaluation dataset
  • VideoScore model trained to predict multi-dimensional quality scores
  • High correlation with human judgments across diverse video generation models
  • Enables scalable automated evaluation without collecting new human ratings for each generated video
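
The claim of high correlation with human judgments can be checked one dimension at a time. The sketch below is illustrative only: it assumes Spearman rank correlation as the agreement measure (a common choice, not confirmed by this summary as the project's own protocol), and the function names and the per-video dict layout are hypothetical, not part of the VideoScore codebase.

```python
from statistics import mean

# The five dimensions named in the description above.
DIMENSIONS = [
    "visual quality",
    "temporal consistency",
    "dynamic degree",
    "text-video alignment",
    "factual consistency",
]


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of equal values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def per_dimension_correlation(predicted, human):
    """Map each dimension to the Spearman correlation over all videos.

    Both arguments are lists of per-video dicts keyed by dimension name
    (a hypothetical layout chosen for this sketch).
    """
    return {
        dim: spearman([p[dim] for p in predicted], [h[dim] for h in human])
        for dim in DIMENSIONS
    }
```

Given model scores and human ratings for the same videos, per_dimension_correlation(predicted, human) returns one coefficient per dimension, so a metric that tracks annotators well on visual quality but poorly on factual consistency is not hidden behind a single averaged number.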

Links: GitHub · Paper · Demo