About us

AIGVE is the website hosting the documentation, tutorials, examples, and latest updates for the AIGVE library.

🚀 What is AIGVE?

AIGVE (AI Generated Video Evaluation Toolkit), developed by the IFM Lab, provides a comprehensive and structured framework for assessing the quality of AI-generated video. It integrates multiple evaluation metrics covering diverse aspects of video evaluation, including neural-network-based assessment, distribution comparison, vision-language alignment, and multi-faceted analysis.

Citing Us

If you find the AIGVE library and the ... papers useful in your work, please cite them as follows:


Library Organization

🧠 Neural Network-Based Evaluation Metrics

These metrics leverage deep learning models to assess AI-generated video quality based on learned representations.

  • GSTVQA: Video Quality Assessment using spatiotemporal deep learning models.
  • ModularBVQA: A modular framework for Blind Video Quality Assessment (BVQA).

📊 Distribution-Based Evaluation Metrics

These metrics assess the quality of generated videos by comparing the distribution of real and generated samples.

  • FID: Fréchet Inception Distance (FID) measures the distance between the feature distributions of real and generated samples.
  • FVD: Fréchet Video Distance (FVD) extends FID with video features to capture temporal coherence.
  • IS: Inception Score (IS) evaluates the diversity and realism of generated content.
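At their core, FID and FVD reduce to a closed-form distance between two Gaussians fit to feature embeddings (Inception features for FID, video-model features for FVD). Below is a minimal sketch of that distance, with random synthetic features standing in for a real feature extractor:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between Gaussians N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Matrix square root of the covariance product; may come back complex
    # due to numerical noise, in which case we keep the real part.
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Toy example: synthetic features in place of Inception/video embeddings.
rng = np.random.default_rng(0)
real = rng.normal(size=(2048, 64))
fake = rng.normal(size=(2048, 64))
fid = frechet_distance(real.mean(0), np.cov(real, rowvar=False),
                       fake.mean(0), np.cov(fake, rowvar=False))
```

In the actual metrics, `real` and `fake` would be feature matrices produced by a pretrained network; the Gaussian-fitting and distance computation are the same.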

🔍 Vision-Language Similarity-Based Evaluation Metrics

These metrics evaluate alignment, similarity, and coherence between visual and textual representations, often using embeddings from models like CLIP and BLIP.

  • CLIPSim: Measures image-text similarity using CLIP embeddings.
  • CLIPTemp: Assesses temporal consistency in video-text alignment.
  • BLIPSim: Evaluates cross-modal similarity and retrieval-based alignment.
  • PickScore: Scores text-image pairs with a model trained on human preference data.

🧠 Vision-Language Understanding-Based Evaluation Metrics

These metrics assess higher-level understanding, reasoning, and factual consistency in vision-language models.

  • VIEScore: Evaluates video grounding and entity-based alignment.
  • TIFA: Measures text-to-video faithfulness via question generation and visual question answering.
  • DSG: Davidsonian Scene Graph; evaluates prompt faithfulness with dependency-structured question sets.
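QA-based metrics in this family follow a common recipe: derive question–answer pairs from the text prompt, answer each question with a VQA model run on the generated frames, and report the fraction answered correctly. A minimal sketch of that scoring step, with a hypothetical `vqa_answer` callable standing in for a real VQA model:

```python
def faithfulness_score(qa_pairs, vqa_answer):
    """Fraction of prompt-derived questions answered correctly.

    qa_pairs: list of (question, expected_answer) tuples derived from the prompt.
    vqa_answer: callable(question) -> answer string; a stand-in for a real
    VQA model queried against the generated video frames.
    """
    if not qa_pairs:
        return 0.0
    correct = sum(1 for q, expected in qa_pairs
                  if vqa_answer(q).strip().lower() == expected.strip().lower())
    return correct / len(qa_pairs)

# Toy run with a hard-coded "model" (hypothetical, for illustration only).
qa = [("Is there a dog?", "yes"), ("What color is the car?", "red")]
stub = {"Is there a dog?": "yes", "What color is the car?": "blue"}.get
score = faithfulness_score(qa, lambda q: stub(q, ""))  # 1 of 2 correct -> 0.5
```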

🔄 Multi-Faceted Evaluation Metrics

These metrics integrate structured, multi-dimensional assessments to provide a holistic benchmarking framework for AI-generated videos.

  • VideoPhy: Evaluates physics-based video understanding and reasoning.
  • VideoScore: Evaluates structured video quality assessment across multiple perceptual and reasoning-based dimensions.

Key Features

  • Multi-Dimensional Evaluation: Covers video coherence, physics, and benchmarking.
  • Open-Source & Customizable: Designed for easy integration.
  • Cutting-Edge AI Assessment: Supports various AI-generated video tasks.

Copyright © 2025 IFM Lab. All rights reserved.

  • AIGVE source code is published under the terms of the MIT License.
  • AIGVE documentation and the ... papers are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).