aigve.utils
LoadVideoFromFile
Bases: BaseTransform
Load a video from file.
Required Keys:
- video_path_pd
Modified Keys:
- video_pd
Parameters:
Name | Type | Description | Default |
---|---|---|---|
height | int | Desired output height of the video; unchanged if -1. | -1 |
width | int | Desired output width of the video; unchanged if -1. | -1 |
Source code in aigve/utils/loading.py
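To make the key contract concrete, here is a minimal sketch built around a hypothetical stand-in class. `LoadVideoStandIn` is not part of aigve and decodes nothing. It only mirrors the documented behaviour: it reads `video_path_pd`, writes `video_pd`, and treats `-1` as "keep the original size". The placeholder record stored under `video_pd` is an assumption made so the example runs without a video file.

```python
from typing import Any, Dict


class LoadVideoStandIn:
    """Hypothetical stand-in that mimics only the documented key contract."""

    def __init__(self, height: int = -1, width: int = -1) -> None:
        # -1 means "leave this dimension unchanged", as in the parameter table.
        self.height = height
        self.width = width

    def transform(self, results: Dict[str, Any]) -> Dict[str, Any]:
        # Required key: the path of the video to load.
        video_path = results["video_path_pd"]
        # The real transform would decode frames here. This sketch stores a
        # placeholder record instead (an assumption for illustration only).
        results["video_pd"] = {
            "source": video_path,
            "height": self.height,
            "width": self.width,
        }
        return results


results = LoadVideoStandIn().transform({"video_path_pd": "clip.mp4"})
```

After the call, `results` still holds the original `video_path_pd` entry and additionally carries `video_pd`.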
transform(results)
Function to load a video. See 'https://github.com/Vchitect/VBench/blob/master/vbench/utils.py#L103' for the implementation it refers to.
The function supports loading video in GIF (.gif), PNG (.png), and MP4 (.mp4) formats. Depending on the format, it processes and extracts frames accordingly.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
results | dict | Result dict from :class: | required |
Returns:
Name | Type | Description |
---|---|---|
dict | Optional[dict] | The dict contains the loaded video in shape (F, C, H, W) and meta information if needed. F is the number of frames, C is the number of channels, H is the height, and W is the width. |
Raises:
Type | Description |
---|---|
NotImplementedError | If the video format is not supported. |
The function first determines the format of the video file from its extension. For GIFs, it iterates over the frames and converts each one to RGB. For PNGs, it reads the single frame and converts it to RGB. For MP4s, it reads the frames using the VideoReader class and converts them to NumPy arrays. If a data_transform is provided, it is applied to the buffer before the buffer is converted to a tensor. Finally, the tensor is permuted to match the expected (F, C, H, W) format.
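The extension check and the final axis reordering can be sketched in isolation. This is an illustrative sketch, not the code in aigve/utils/loading.py: the helper names `detect_video_format` and `permute_shape` are hypothetical, the case-insensitive suffix match is an assumption, and the decoded buffer is assumed to arrive in (F, H, W, C) order before the permute.

```python
from pathlib import Path
from typing import Tuple

# Extensions named in the description above, mapped to a format label.
SUPPORTED_SUFFIXES = {".gif": "gif", ".png": "png", ".mp4": "mp4"}


def detect_video_format(video_path: str) -> str:
    """Return the format label for a path, judged by its extension only."""
    # Lower-casing the suffix is an assumption made for this sketch.
    suffix = Path(video_path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise NotImplementedError(f"Unsupported video format: {suffix!r}")
    return SUPPORTED_SUFFIXES[suffix]


def permute_shape(shape_fhwc: Tuple[int, int, int, int]) -> Tuple[int, int, int, int]:
    """Map an assumed (F, H, W, C) buffer shape to the documented (F, C, H, W)."""
    frames, height, width, channels = shape_fhwc
    # Equivalent to permuting axes in the order (0, 3, 1, 2).
    return (frames, channels, height, width)
```

For example, a 16-frame 320x240 RGB clip decoded as (16, 240, 320, 3) would be reported as (16, 3, 240, 320) after the permute, and a path ending in `.avi` would raise `NotImplementedError`.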
Source code in aigve/utils/loading.py
read_image_detectron2(file_name, format=None)
Read an image into the given format. Will apply rotation and flipping if the image has such exif information.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_name
|
str
|
image file path |
required |
format
|
str
|
one of the supported image modes in PIL, or "BGR" or "YUV-BT.601". |
None
|
Returns:
Name | Type | Description |
---|---|---|
image | ndarray | An HWC image in the given format, which is 0-255, uint8 for supported image modes in PIL or "BGR"; float (0-1 for Y) for YUV-BT.601. |
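The two non-PIL formats can be illustrated per pixel. The sketch below is an assumption-laden illustration rather than the function's actual code: `rgb_to_bgr` and `rgb_to_yuv_bt601` are hypothetical helpers, and the matrix holds the commonly used BT.601 RGB-to-YUV coefficients, which may differ from the exact values used by `read_image_detectron2`.

```python
from typing import Tuple

# Commonly used BT.601 RGB -> YUV coefficients (rows: Y, U, V).
# Assumed for illustration; not confirmed against the aigve source.
BT601_RGB_TO_YUV = (
    (0.299, 0.587, 0.114),
    (-0.14713, -0.28886, 0.436),
    (0.615, -0.51499, -0.10001),
)


def rgb_to_bgr(pixel: Tuple[int, int, int]) -> Tuple[int, int, int]:
    """Reverse the channel order of one uint8 RGB pixel."""
    r, g, b = pixel
    return (b, g, r)


def rgb_to_yuv_bt601(pixel: Tuple[int, int, int]) -> Tuple[float, float, float]:
    """Convert one uint8 RGB pixel to float YUV, with Y in the 0-1 range."""
    # Scale 0-255 integers to 0-1 floats before applying the matrix.
    rgb = [channel / 255.0 for channel in pixel]
    y, u, v = (
        sum(coef * value for coef, value in zip(row, rgb))
        for row in BT601_RGB_TO_YUV
    )
    return (y, u, v)
```

With these coefficients a pure white pixel maps to a Y value of approximately 1 with U and V close to 0, which matches the "float (0-1 for Y)" description of the return value.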