Similar to Moving MNIST, we further design an advanced version of Moving MNIST with complex backgrounds drawn from CIFAR-10, i.e., the MMNIST-CIFAR benchmark, using the $10\rightarrow 10$ frames prediction setting following PredRNN. Metrics (MSE, MAE, SSIM, PSNR) of the best model over three trials are reported. Parameters (M), FLOPs (G), and inference FPS on a single NVIDIA V100 GPU are also reported for all methods. All methods are trained with the Adam optimizer, a OneCycle learning-rate scheduler, and a single GPU.
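For reference, the pixel-level metrics in the tables below can be computed as sketched here. This is a minimal NumPy illustration, not the benchmark's exact evaluation code: the reduction order (averaging over frames vs. summing over spatial dimensions) and the `data_range` of 255 are assumptions, and SSIM is omitted since it needs a windowed implementation (e.g., `skimage.metrics.structural_similarity`).

```python
import numpy as np

def frame_metrics(pred, true, data_range=255.0):
    """Compute MSE, MAE, and PSNR between predicted and ground-truth frames.

    pred, true: arrays of shape (T, C, H, W) with values in [0, data_range].
    The plain mean reductions here are an assumption; benchmark code may
    instead sum over spatial dims before averaging over frames.
    """
    pred = pred.astype(np.float64)
    true = true.astype(np.float64)
    mse = np.mean((pred - true) ** 2)
    mae = np.mean(np.abs(pred - true))
    # PSNR in dB, derived directly from the MSE.
    psnr = 10.0 * np.log10(data_range ** 2 / mse) if mse > 0 else float("inf")
    return mse, mae, psnr

# Toy example: ground truth plus unit-variance Gaussian noise.
rng = np.random.default_rng(0)
true = rng.uniform(0, 255, size=(10, 3, 32, 32))
pred = true + rng.normal(0.0, 1.0, size=true.shape)
mse, mae, psnr = frame_metrics(pred, true)
```

With unit-variance noise the MSE lands near 1.0 and the PSNR near $10\log_{10}(255^2) \approx 48$ dB, which gives a feel for the scale of the PSNR column below.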
### STL Benchmarks on MMNIST-CIFAR

| Method | Setting | Params (M) | FLOPs (G) | FPS | MSE | MAE | SSIM | PSNR | Download |
|---|---|---|---|---|---|---|---|---|---|
| ConvLSTM-S | 200 epoch | 15.5M | 58.8G | 113 | 73.31 | 338.56 | 0.9204 | 23.09 | model \| log |
| ConvLSTM-L | 200 epoch | 34.4M | 130.0G | 50 | 62.86 | 291.05 | 0.9337 | 23.83 | model \| log |
| PredNet | 200 epoch | 12.5M | 8.6G | 945 | 286.70 | 514.14 | 0.8139 | 17.49 | model \| log |
| PhyDNet | 200 epoch | 3.1M | 15.3G | 182 | 142.54 | 700.37 | 0.8276 | 19.92 | model \| log |
| PredRNN | 200 epoch | 23.8M | 116.0G | 54 | 50.09 | 225.04 | 0.9499 | 24.90 | model \| log |
| PredRNN++ | 200 epoch | 38.6M | 171.7G | 38 | 44.19 | 198.27 | 0.9567 | 25.60 | model \| log |
| MIM | 200 epoch | 38.8M | 183.0G | 37 | 48.63 | 213.44 | 0.9521 | 25.08 | model \| log |
| MAU | 200 epoch | 4.5M | 17.8G | 201 | 58.84 | 255.76 | 0.9408 | 24.19 | model \| log |
| E3D-LSTM | 200 epoch | 52.8M | 306.0G | 18 | 80.79 | 214.86 | 0.9314 | 22.89 | model \| log |
| PredRNN.V2 | 200 epoch | 23.9M | 116.6G | 52 | 57.27 | 252.29 | 0.9419 | 24.24 | model \| log |
| DMVFN | 200 epoch | 3.6M | 0.2G | 960 | 298.73 | 606.92 | 0.7765 | 17.07 | model \| log |
| SimVP+IncepU | 200 epoch | 58.0M | 19.4G | 209 | 59.83 | 214.54 | 0.9414 | 24.15 | model \| log |
| SimVP+gSTA-S | 200 epoch | 46.8M | 16.5G | 282 | 51.13 | 185.13 | 0.9512 | 24.93 | model \| log |
| TAU | 200 epoch | 44.7M | 16.0G | 275 | 48.17 | 177.35 | 0.9539 | 25.21 | model \| log |
### Benchmark of MetaFormers Based on SimVP (MetaVP)

| MetaFormer | Setting | Params (M) | FLOPs (G) | FPS | MSE | MAE | SSIM | PSNR | Download |
|---|---|---|---|---|---|---|---|---|---|
| IncepU (SimVPv1) | 200 epoch | 58.0M | 19.4G | 209 | 59.83 | 214.54 | 0.9414 | 24.15 | model \| log |
| gSTA (SimVPv2) | 200 epoch | 46.8M | 16.5G | 282 | 51.13 | 185.13 | 0.9512 | 24.93 | model \| log |
| ViT | 200 epoch | 46.1M | 16.9G | 290 | 64.94 | 234.01 | 0.9354 | 23.90 | model \| log |
| Swin Transformer | 200 epoch | 46.1M | 16.4G | 294 | 57.11 | 207.45 | 0.9443 | 24.34 | model \| log |
| Uniformer | 200 epoch | 44.8M | 16.5G | 296 | 56.96 | 207.51 | 0.9442 | 24.38 | model \| log |
| MLP-Mixer | 200 epoch | 38.2M | 14.7G | 334 | 57.03 | 206.46 | 0.9446 | 24.34 | model \| log |
| ConvMixer | 200 epoch | 3.9M | 5.5G | 658 | 59.29 | 219.76 | 0.9403 | 24.17 | model \| log |
| Poolformer | 200 epoch | 37.1M | 14.1G | 341 | 60.98 | 219.50 | 0.9399 | 24.16 | model \| log |
| ConvNeXt | 200 epoch | 37.3M | 14.1G | 344 | 51.39 | 187.17 | 0.9503 | 24.89 | model \| log |
| VAN | 200 epoch | 44.5M | 16.0G | 288 | 59.59 | 221.32 | 0.9398 | 25.20 | model \| log |
| HorNet | 200 epoch | 45.7M | 16.3G | 287 | 55.79 | 202.73 | 0.9456 | 24.49 | model \| log |
| MogaNet | 200 epoch | 46.8M | 16.5G | 255 | 49.48 | 184.11 | 0.9521 | 25.07 | model \| log |
| TAU | 200 epoch | 44.7M | 16.0G | 275 | 48.17 | 177.35 | 0.9539 | 25.21 | model \| log |