Ground Truth

Prediction

SSIM: Structural Similarity Index. Measures pixel-level structural similarity. Range typically 0-1; higher is better (more similar).
0.3596
FVD: Fréchet Video Distance. Measures distribution-level similarity of video features. Lower is better (closer to real).
21.1
LPIPS: Learned Perceptual Image Patch Similarity. Measures perceptual similarity via deep features. Range typically 0-1; lower is better (more similar).
0.2136
Frames: Prediction horizon
100
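
As a rough illustration of what two of the metrics above compute, the following is a minimal pure-Python sketch, not the evaluation code behind the reported numbers. It makes two simplifying assumptions: SSIM is computed over a single global window rather than averaged over local sliding windows, and the Fréchet distance treats the feature covariances as diagonal, whereas FVD is normally computed on features from a pretrained video network with full covariance matrices. The function names `global_ssim` and `frechet_distance_diag` are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _cov(xs, ys):
    # Population covariance (divides by n); passing xs twice gives the variance.
    mx, my = _mean(xs), _mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / len(xs)


def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equal-length flat pixel sequences.

    Simplification: practical SSIM averages this quantity over many local
    windows; here the whole image is treated as one window.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = _mean(x), _mean(y)
    var_x, var_y = _cov(x, x), _cov(y, y)
    cov_xy = _cov(x, y)
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def frechet_distance_diag(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets.

    Assumption: covariances are diagonal, so the trace term reduces to a
    per-dimension sum of var_a + var_b - 2 * sqrt(var_a * var_b).
    """
    dim = len(feats_a[0])
    total = 0.0
    for d in range(dim):
        col_a = [f[d] for f in feats_a]
        col_b = [f[d] for f in feats_b]
        mean_gap = _mean(col_a) - _mean(col_b)
        var_a, var_b = _cov(col_a, col_a), _cov(col_b, col_b)
        total += mean_gap ** 2 + var_a + var_b - 2 * math.sqrt(var_a * var_b)
    return total


if __name__ == "__main__":
    frame = [0.0, 1.0, 0.0, 1.0]
    inverted = [1.0, 0.0, 1.0, 0.0]
    print(global_ssim(frame, frame))     # identical frames
    print(global_ssim(frame, inverted))  # anti-correlated frames
    print(frechet_distance_diag([[0.0], [2.0]], [[3.0], [5.0]]))
```

Identical inputs give an SSIM of 1 and a Fréchet distance of 0. Anti-correlated frames give a negative SSIM, which is why the 0-1 range is best read as the typical case rather than a strict bound.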