About us
AIGVE is a website hosting the documentation, tutorials, examples, and the latest updates for the AIGVE library.
🚀 What is AIGVE?
AIGVE (AI Generated Video Evaluation Toolkit), developed by the IFM Lab, provides a comprehensive and structured evaluation framework for assessing the quality of AI-generated videos. It integrates multiple evaluation metrics covering diverse aspects of video evaluation, including neural-network-based assessment, distribution comparison, vision-language alignment, and multi-faceted analysis.
- Official Website: https://www.aigve.org/
- Github Repository: https://github.com/ShaneXiangH/VQA_Toolkit
- IFM Lab: https://www.ifmlab.org/
Citing Us
If you find the AIGVE library and the ... papers useful in your work, please cite the papers as follows:
Library Organization
🧠 Neural Network-Based Evaluation Metrics
These metrics leverage deep learning models to assess AI-generated video quality based on learned representations.
- ✅ GSTVQA: Video Quality Assessment using spatiotemporal deep learning models.
- ✅ ModularBVQA: A modular framework for Blind Video Quality Assessment (BVQA).
📊 Distribution-Based Evaluation Metrics
These metrics assess the quality of generated videos by comparing the distribution of real and generated samples.
- ✅ FID: Fréchet Inception Distance (FID) measures the visual fidelity of generated samples.
- ✅ FVD: Fréchet Video Distance (FVD) extends FID to capture temporal coherence in videos.
- ✅ IS: Inception Score (IS) evaluates the diversity and realism of generated content.
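At their core, FID and FVD both compute the Fréchet distance between two Gaussians fitted to real and generated feature embeddings. A minimal NumPy sketch of that distance is shown below; it assumes features have already been extracted (real FID uses Inception features, FVD features from a video network such as I3D), with plain random arrays standing in here for illustration:

```python
import numpy as np

def frechet_distance(feats_real: np.ndarray, feats_gen: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two feature sets.

    FID and FVD apply this formula to embeddings from a pretrained
    network; the arrays here are placeholders for illustration.
    """
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_g = np.cov(feats_gen, rowvar=False)
    # Tr((Σr Σg)^(1/2)) via the eigenvalues of Σr Σg,
    # which are real and non-negative for PSD covariances.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(sigma_r + sigma_g) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(1024, 8))
fake = rng.normal(0.8, 1.0, size=(1024, 8))  # mean-shifted distribution

print(frechet_distance(real, real))  # identical sets -> ~0
print(frechet_distance(real, fake))  # shifted distribution -> clearly larger
```

Lower is better: identical distributions score near zero, and the score grows as the generated features drift from the real ones.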
🔍 Vision-Language Similarity-Based Evaluation Metrics
These metrics evaluate alignment, similarity, and coherence between visual and textual representations, often using embeddings from models like CLIP and BLIP.
- ✅ CLIPSim: Measures image-text similarity using CLIP embeddings.
- ✅ CLIPTemp: Assesses temporal consistency in video-text alignment.
- ✅ BLIPSim: Evaluates cross-modal similarity and retrieval-based alignment.
- ✅ PickScore: Ranks text-image pairs based on alignment quality.
🧠 Vision-Language Understanding-Based Evaluation Metrics
These metrics assess higher-level understanding, reasoning, and factual consistency in vision-language models.
- ✅ VIEScore: Evaluates video grounding and entity-based alignment.
- ✅ TIFA: Measures textual integrity and factual accuracy in video descriptions.
- ✅ DSG: A deep structured grounding metric for assessing cross-modal comprehension.
🔄 Multi-Faceted Evaluation Metrics
These metrics integrate structured, multi-dimensional assessments to provide a holistic benchmarking framework for AI-generated videos.
- ✅ VideoPhy: Evaluates physics-based video understanding and reasoning.
- ✅ VideoScore: Evaluates structured video quality assessment across multiple perceptual and reasoning-based dimensions.
Key Features
- Multi-Dimensional Evaluation: Covers video coherence, physics, and benchmarking.
- Open-Source & Customizable: Designed for easy integration.
- Cutting-Edge AI Assessment: Supports various AI-generated video tasks.
License & Copyright
Copyright © 2025 IFM Lab. All rights reserved.
AIGVE source code is published under the terms of the MIT License. AIGVE documentation and the ... papers are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).