We provide benchmark results on the KittiCaltech Pedestrian dataset using the $10\rightarrow 1$ frame prediction setting, following PredNet. For each method, we report the metrics (MSE, MAE, SSIM, PSNR, LPIPS) of the best model across three trials. Parameters (M), FLOPs (G), and inference FPS on a V100 GPU are also reported for all methods. By default, models are trained for 100 epochs with the Adam optimizer and a OneCycle scheduler on a single GPU; some computationally expensive methods (denoted by *) are trained on 4 GPUs.
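
As a rough illustration of how the pixel-level metrics could be computed, here is a minimal NumPy sketch. It assumes that MSE and MAE are summed over each frame's channels and pixels and then averaged over batch and time (the scale of the MSE and MAE columns suggests a sum-based reduction, but the text above does not state it), and that PSNR is derived from the per-pixel mean squared error. The function name `frame_metrics` and the `data_range` parameter are hypothetical, not part of any benchmark code.

```python
import numpy as np

def frame_metrics(pred, true, data_range=255.0):
    """Sketch of frame-level MSE / MAE / PSNR.

    pred, true: arrays of shape (B, T, C, H, W) on the same intensity scale.
    The reductions below are assumptions, not a confirmed benchmark protocol.
    """
    pred = np.asarray(pred, dtype=np.float64)
    true = np.asarray(true, dtype=np.float64)
    diff = pred - true

    # Assumption: sum the error over (C, H, W) per frame, then average over (B, T).
    mse = np.mean(np.sum(diff ** 2, axis=(2, 3, 4)))
    mae = np.mean(np.sum(np.abs(diff), axis=(2, 3, 4)))

    # PSNR uses the per-pixel mean squared error of each frame;
    # the clamp avoids division by zero for identical frames.
    per_pixel_mse = np.maximum(np.mean(diff ** 2, axis=(2, 3, 4)), 1e-12)
    psnr = np.mean(10.0 * np.log10(data_range ** 2 / per_pixel_mse))

    return {"mse": float(mse), "mae": float(mae), "psnr": float(psnr)}
```

For example, `frame_metrics(pred, true, data_range=1.0)` would suit frames normalized to [0, 1]. SSIM and LPIPS are left out because they depend on external implementations.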
STL Benchmarks on KittiCaltech

| Method | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | LPIPS | Download |
|---|---|---|---|---|---|---|---|---|---|---|
| ConvLSTM-S | 100 epoch | 15.0M | 595.0G | 33 | 139.6 | 1583.3 | 0.9345 | 27.46 | 0.08575 | model \| log |
| E3D-LSTM* | 100 epoch | 54.9M | 1004G | 10 | 200.6 | 1946.2 | 0.9047 | 25.45 | 0.12602 | model \| log |
| PredNet | 100 epoch | 12.5M | 42.8G | 94 | 159.8 | 1568.9 | 0.9286 | 27.21 | 0.11289 | model \| log |
| PhyDNet | 100 epoch | 3.1M | 40.4G | 117 | 312.2 | 2754.8 | 0.8615 | 23.26 | 0.32194 | model \| log |
| MAU | 100 epoch | 24.3M | 172.0G | 16 | 177.8 | 1800.4 | 0.9176 | 26.14 | 0.09673 | model \| log |
| MIM | 100 epoch | 49.2M | 1858G | 39 | 125.1 | 1464.0 | 0.9409 | 28.10 | 0.06353 | model \| log |
| PredRNN | 100 epoch | 23.7M | 1216G | 17 | 130.4 | 1525.5 | 0.9374 | 27.81 | 0.07395 | model \| log |
| PredRNN++ | 100 epoch | 38.5M | 1803G | 12 | 125.5 | 1453.2 | 0.9433 | 28.02 | 0.13210 | model \| log |
| PredRNN.V2 | 100 epoch | 23.8M | 1223G | 52 | 147.8 | 1610.5 | 0.9330 | 27.12 | 0.08920 | model \| log |
| DMVFN | 100 epoch | 3.6M | 1.2G | 557 | 183.9 | 1531.1 | 0.9314 | 26.95 | 0.04942 | model \| log |
| SimVP+IncepU | 100 epoch | 8.6M | 60.6G | 57 | 160.2 | 1690.8 | 0.9338 | 26.81 | 0.06755 | model \| log |
| SimVP+gSTA-S | 100 epoch | 15.6M | 96.3G | 40 | 129.7 | 1507.7 | 0.9454 | 27.89 | 0.05736 | model \| log |
| TAU | 100 epoch | 44.7M | 80.0G | 55 | 131.1 | 1507.8 | 0.9456 | 27.83 | 0.05494 | model \| log |

Benchmark of MetaFormers Based on SimVP (MetaVP)

| MetaFormer | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | LPIPS | Download |
|---|---|---|---|---|---|---|---|---|---|---|
| IncepU (SimVPv1) | 100 epoch | 8.6M | 60.6G | 57 | 160.2 | 1690.8 | 0.9338 | 26.81 | 0.06755 | model \| log |
| gSTA (SimVPv2) | 100 epoch | 15.6M | 96.3G | 40 | 129.7 | 1507.7 | 0.9454 | 27.89 | 0.05736 | model \| log |
| ViT* | 100 epoch | 12.7M | 155.0G | 25 | 146.4 | 1615.8 | 0.9379 | 27.43 | 0.06659 | model \| log |
| Swin Transformer | 100 epoch | 15.3M | 95.2G | 49 | 155.2 | 1588.9 | 0.9299 | 27.25 | 0.08113 | model \| log |
| Uniformer* | 100 epoch | 11.8M | 104.0G | 28 | 135.9 | 1534.2 | 0.9393 | 27.66 | 0.06867 | model \| log |
| MLP-Mixer | 100 epoch | 22.2M | 83.5G | 60 | 207.9 | 1835.9 | 0.9133 | 26.29 | 0.07750 | model \| log |
| ConvMixer | 100 epoch | 1.5M | 23.1G | 129 | 174.7 | 1854.3 | 0.9232 | 26.23 | 0.07758 | model \| log |
| Poolformer | 100 epoch | 12.4M | 79.8G | 51 | 153.4 | 1613.5 | 0.9334 | 27.38 | 0.07000 | model \| log |
| ConvNeXt | 100 epoch | 12.5M | 80.2G | 54 | 146.8 | 1630.0 | 0.9336 | 27.19 | 0.06987 | model \| log |
| VAN | 100 epoch | 14.9M | 92.5G | 41 | 127.5 | 1476.5 | 0.9462 | 27.98 | 0.05500 | model \| log |
| HorNet | 100 epoch | 15.3M | 94.4G | 43 | 152.8 | 1637.9 | 0.9365 | 27.09 | 0.06004 | model \| log |
| MogaNet | 100 epoch | 15.6M | 96.2G | 36 | 131.4 | 1512.1 | 0.9442 | 27.79 | 0.05394 | model \| log |
| TAU | 100 epoch | 44.7M | 80.0G | 55 | 131.1 | 1507.8 | 0.9456 | 27.83 | 0.05494 | model \| log |