
Model performance on KTH 10->20 task #1

Open · kevinstan opened this issue May 15, 2019 · 12 comments

kevinstan commented May 15, 2019

Hello, thank you for the paper and for releasing the code. I'm having difficulty reproducing the results for the KTH Action task in section 4.2. I've downloaded the pre-trained weights for KTH Actions (the 200,000-iteration checkpoint) and used them to test the model.

System Info
python 2.7
opencv 4.1.0.25
tensorflow-gpu 1.9.0
CUDA 9.0
GPU:
name: TITAN X (Pascal) major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:03:00.0
totalMemory: 11.91GiB freeMemory: 11.75GiB

script
#!/usr/bin/env bash
cd ..
python -u run.py \
    --is_training False \
    --dataset_name action \
    --train_data_paths data/kth \
    --valid_data_paths data/kth \
    --pretrained_model kth_e3d_lstm_pretrain/model.ckpt-200000 \
    --save_dir checkpoints/_kth_e3d_lstm \
    --gen_frm_dir results/_kth_e3d_lstm \
    --model_name e3d_lstm \
    --allow_gpu_growth True \
    --img_channel 1 \
    --img_width 128 \
    --input_length 10 \
    --total_length 30 \
    --filter_size 5 \
    --num_hidden 64,64,64,64 \
    --patch_size 8 \
    --layer_norm True \
    --reverse_input False \
    --sampling_stop_iter 100000 \
    --sampling_start_value 1.0 \
    --sampling_delta_per_iter 0.00001 \
    --lr 0.001 \
    --batch_size 2 \
    --max_iterations 1 \
    --display_interval 1 \
    --test_interval 1 \
    --snapshot_interval 5000
output
(e3d_lstm_official) kstan@yixing:~/e3d_lstm/scripts$ ./e3d_lstm_kth_test.sh
Initializing models
2019-05-15 14:37:16.852811: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-05-15 14:37:19.055412: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties:
name: TITAN X (Pascal) major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:03:00.0
totalMemory: 11.91GiB freeMemory: 11.75GiB
2019-05-15 14:37:19.055439: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2019-05-15 14:37:19.262277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-15 14:37:19.262310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0
2019-05-15 14:37:19.262318: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N
2019-05-15 14:37:19.262531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11376 MB memory) -> physical GPU (device: 0, name: TITAN X (Pascal), pci bus id: 0000:03:00.0, compute capability: 6.1)
load model: kth_e3d_lstm_pretrain/model.ckpt-200000
begin load datadata/kth
there are 127271 pictures
there are 5200 sequences
begin load datadata/kth
there are 74833 pictures
there are 3167 sequences
2019-05-15 14:39:52 itr: 1
training loss: 16082.05078125
2019-05-15 14:39:52 test...
mse per seq: 1853.1817014023088
96.02373807308271
80.29797137965903
84.68072711946989
83.75463825016179
84.48666421838448
84.61139482557209
85.35639578890967
86.27750272624341
87.66025201745674
89.2119170410002
90.84818150224523
92.64167446828084
94.38503250199183
96.13222195449993
98.02904253614453
99.92525694480216
101.83609684253146
103.8342688265889
105.73710226033657
107.45162212494725
psnr per frame: 23.111416
23.2865
23.752821
23.5958
23.57663
23.51337
23.477915
23.422129
23.364187
23.28756
23.209711
23.131495
23.047438
22.969624
22.893667
22.811342
22.732689
22.653484
22.571104
22.496899
22.43397
ssim per frame: 0.6098243
0.63740635
0.62530535
0.6226238
0.61893517
0.6169444
0.6149846
0.61348057
0.61197215
0.61037815
0.60889727
0.60745543
0.6060252
0.6047545
0.60347193
0.6020237
0.6007725
0.59954363
0.59822935
0.5971006
0.59618074

visual results
(ground-truth frames gt11–gt15 and predicted frames pd11–pd15 attached as images; omitted here)

...

It seems like the results are very different from what's presented in the paper -- what might I be doing wrong here?

Note: I've successfully reproduced the results and achieved the same SSIM and MSE on the Moving MNIST task in section 4.1, so I don't think it's a system/hardware issue. It therefore seems possible that there is a mistake in the downloaded pretrained KTH Action model.

Best,
Kevin

@Fangyh09

Fangyh09 commented May 16, 2019

@kevinstan
I think the model is not the correct one, since the SSIM score is only 0.609.
I retrained the MIM model (another model) and the result is good.
BTW, I find it very slow to train this model. Have you ever trained it?

@kevinstan
Author

@Fangyh09
What is the "MIM" model? Do you mean the Moving MNIST model?
I've tried training the KTH Action model from scratch, and it doesn't take too long. I'm using 4 GPUs, and it should take 2-3 hours for ~200,000 iterations.

@jhhuang96

@kevinstan I am confused about why the Moving MNIST images of shape (64, 64, 1) are usually reshaped to (64 // patch_size, 64 // patch_size, patch_size ** 2). Is it because of GPU memory?
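
For reference, a minimal numpy sketch of the reshape in question (an illustration of the space-to-depth style operation, not the repository's exact helper):

import numpy as np

# A (H, W, C) frame becomes (H // p, W // p, p * p * C): the spatial
# resolution shrinks while the channel dimension grows, so each p x p
# pixel block is handled as a single "super-pixel".
def reshape_to_patches(frame, p):
    h, w, c = frame.shape
    x = frame.reshape(h // p, p, w // p, p, c)     # split H and W into p-sized blocks
    x = x.transpose(0, 2, 1, 3, 4)                 # gather the two block axes
    return x.reshape(h // p, w // p, p * p * c)    # fold each block into channels

frame = np.random.rand(64, 64, 1)                  # a Moving MNIST frame
print(reshape_to_patches(frame, p=4).shape)        # (16, 16, 16)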

@xiaomingdujin

I cannot get the result either.

@kevinstan
I think the model is not the correct one, since the SSIM score is only 0.609.
I retrained the MIM model (another model) and the result is good.
BTW, I find it very slow to train this model. Have you ever trained it?

I also think the pretrained model is not the correct one.

@xiaomingdujin

Maybe I have found the reason: in rnn_cell.py, when calculating the output_gate, new_global_memory should be returned, but the code returns global_memory, which has not been updated. However, even if I return new_global_memory, the result is even worse, so I suspect there is a problem in how the temporal information is transmitted.
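
For reference, a toy sketch of the pattern described above (names and the toy arithmetic are illustrative assumptions, not the repository's cell implementation):

import numpy as np

def toy_cell_step(x, hidden, global_memory, w=0.7, buggy=False):
    """One toy recurrent step with an external global memory."""
    # The global memory is updated from the current input and hidden state.
    new_global_memory = np.tanh(w * (x + hidden) + global_memory)
    # Output gate and new hidden state computed from the updated memory.
    o = 1.0 / (1.0 + np.exp(-(x + hidden + new_global_memory)))
    new_hidden = o * np.tanh(new_global_memory)
    # Suspected issue: the stale memory is handed to the next step instead
    # of the updated one, so the memory never evolves across time steps.
    returned_memory = global_memory if buggy else new_global_memory
    return new_hidden, returned_memory

h, m = 0.0, 0.0
for x in [0.5, -0.2, 0.8]:
    h, m = toy_cell_step(x, h, m, buggy=True)
    print("hidden %.4f, memory carried forward %.4f" % (h, m))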

@wyb15
Collaborator

wyb15 commented Jul 6, 2020

We noticed that there is a bug in the current code related to "global_memory", which may explain the mismatch of the pretrained models on the KTH dataset. Since this code repo was reproduced after the first author left Google, the issue did not exist in our original experiments, and the results reported in the paper are valid. We are working on fixing the issue and refreshing our pre-trained KTH models. We apologize for the inconvenience and thank you for your patience.

@toddwyl

toddwyl commented Jul 16, 2020

We noticed that there is a bug in the current code related to "global_memory", which may explain the mismatch of the pretrained models on the KTH dataset. Since this code repo was reproduced after the first author left Google, the issue did not exist in our original experiments, and the results reported in the paper are valid. We are working on fixing the issue and refreshing our pre-trained KTH models. We apologize for the inconvenience and thank you for your patience.

Is there any progress on this issue? Apart from the error of returning new_global_memory, is there any other issue that causes the mismatch?

@xiaomingdujin


We noticed that there is a bug in the current code related to "global_memory", which may explain the mismatch of the pretrained models on the KTH dataset. Since this code repo was reproduced after the first author left Google, the issue did not exist in our original experiments, and the results reported in the paper are valid. We are working on fixing the issue and refreshing our pre-trained KTH models. We apologize for the inconvenience and thank you for your patience.

Is there any progress on this issue? Apart from the error of returning new_global_memory, is there any other issue that causes the mismatch?

It's been a month and there has been no progress.

@dekaiidea

Is there any progress on this issue? I have been waiting for it for a long time.

@index19950919

We noticed that there is a bug in the current code related to "global_memory", which may explain the mismatch of the pretrained models on the KTH dataset. Since this code repo was reproduced after the first author left Google, the issue did not exist in our original experiments, and the results reported in the paper are valid. We are working on fixing the issue and refreshing our pre-trained KTH models. We apologize for the inconvenience and thank you for your patience.

I also want to cite your paper, but the code cannot run due to a bug. Is it fixed now?

@yifanzhang713

Is there any progress on this issue?

@SherryyHou

I want to know where to download the pretrained model!
