Video Multi-method Assessment Fusion (VMAF)
VMAF is a machine-learning-based model that was trained on Mean Opinion Scores (MOS) and aims to approximate human perception of video quality. The model is trained and tested using the opinion scores obtained through a subjective experiment (the NFLX Video Dataset).
The metric focuses on quality degradation caused by compression and rescaling. VMAF computes several elementary quality metrics and fuses them with a machine-learning algorithm, specifically a Support Vector Machine (SVM) regressor. The three elementary metrics are:
- Visual Information Fidelity (VIF) - metric based on the premise that quality is complementary to the measure of information fidelity loss.
- Detail Loss Metric (DLM) - image quality metric based on the rationale of separately measuring the loss of details that affect content visibility and the redundant impairment that distracts viewer attention.
- Mean Co-Located Pixel Difference (MCPD) - measures the temporal difference between frames on the luminance component.
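Of the three elementary metrics, MCPD is the simplest to illustrate. The following is a minimal Python sketch, not the reference implementation: it assumes the temporal difference is taken as the mean absolute difference between co-located luminance samples of two consecutive frames, and that each frame is given as a list of rows of luma values. The function name `mcpd` is hypothetical, and production implementations may filter the frames before differencing.

```python
def mcpd(prev_luma, curr_luma):
    """Mean absolute difference between co-located luma samples.

    Assumption: frames are lists of rows of numeric luma values
    with identical dimensions. This is an illustrative sketch only.
    """
    if len(prev_luma) != len(curr_luma):
        raise ValueError("frames must have the same number of rows")

    total = 0.0
    count = 0
    for row_prev, row_curr in zip(prev_luma, curr_luma):
        if len(row_prev) != len(row_curr):
            raise ValueError("rows must have the same length")
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)  # difference at the same pixel position
            count += 1

    if count == 0:
        raise ValueError("frames must not be empty")
    return total / count


# Two tiny hypothetical 2x2 luma frames.
frame_a = [[10, 20], [30, 40]]
frame_b = [[12, 18], [30, 44]]
print(mcpd(frame_a, frame_b))  # 2.0
```

A static scene yields a value of 0, and the value grows with the amount of motion or temporal change between the two frames.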
The quality score from VMAF is used directly to calculate BD-Rate, without any conversions.
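To show what using the score directly means in practice, the sketch below computes a simplified BD-Rate-style figure with raw VMAF values as the quality axis. It is an illustration under stated assumptions rather than the standard procedure: it uses piecewise-linear interpolation of log-bitrate over VMAF instead of the cubic fit commonly used for BD-Rate, and the function names and sample rate-quality points are hypothetical.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation of y at x; xs strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the curve's quality range")


def bd_rate_vmaf(anchor, test, steps=1000):
    """Average bitrate difference (percent) of `test` vs `anchor` at equal VMAF.

    Each curve is a list of (bitrate, vmaf) points. VMAF scores are
    used as-is, with no conversion. Simplification: linear
    interpolation replaces the usual cubic fit.
    """
    anchor = sorted(anchor, key=lambda p: p[1])
    test = sorted(test, key=lambda p: p[1])
    qa = [q for _, q in anchor]
    qt = [q for _, q in test]
    la = [math.log(r) for r, _ in anchor]
    lt = [math.log(r) for r, _ in test]

    for qs in (qa, qt):
        if len(qs) < 2 or any(b <= a for a, b in zip(qs, qs[1:])):
            raise ValueError("need at least two points with distinct VMAF")

    # Integrate only over the VMAF range covered by both curves.
    lo, hi = max(qa[0], qt[0]), min(qa[-1], qt[-1])
    if hi <= lo:
        raise ValueError("curves do not overlap in VMAF")

    # Midpoint rule on the difference of log-bitrates.
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        q = lo + (i + 0.5) * h
        total += _interp(q, qt, lt) - _interp(q, qa, la)
    return (math.exp(total / steps) - 1.0) * 100.0


# Hypothetical curves: the test encoder needs half the bitrate at each VMAF.
anchor_curve = [(1000, 70.0), (2000, 80.0), (4000, 90.0), (8000, 95.0)]
test_curve = [(500, 70.0), (1000, 80.0), (2000, 90.0), (4000, 95.0)]
print(round(bd_rate_vmaf(anchor_curve, test_curve), 3))  # -50.0
```

A negative result means the test encoder needs less bitrate than the anchor to reach the same VMAF score.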
For comparing encoders, VMAF offers a special mode called No Enhancement Gain (NEG).
In codec evaluation, it is often desirable to measure the gain achievable from compression without taking into account the gain from image enhancement during pre-processing.
NEG mode can detect the magnitude of the VMAF gain coming from image enhancement, and subtract this effect from the measurement.