How long does the LfI training take? #1
Hi.
Thank you for your excellent work!
I ran the LfI training script on an A100 40GB GPU, and it takes 6 hours per epoch. Consequently, the full training may take about two and a half days. Is this the expected behavior?
Thanks.
Thank you for your attention.
We conducted our experiments on a V100 128G GPU, and training takes about 2 hours per epoch. The training is indeed somewhat time-consuming, so I am afraid there is something wrong with your configuration.
Best Regards
Thank you for getting back to me. You said "a V100 128GB GPU"; is that a typo for "a V100 32GB GPU"? In addition, could you run the LfI training script after cloning this repository and tell me the training time per epoch?
Sorry for the confusion with another project. We use an "A100 80G GPU" to train LfI. For ViT it takes 2 hours per epoch, but for Swin Transformer it takes about 5 hours. It is slow because the data is loaded online from the simulator.
I understand. I have another question. In https://github.com/lishiqianhugh/LfID/blob/main/LfI/LfI.md, you describe the Prepare dataset section. Should I run the training script after running python dataset/phyreo.py? I'm confused about how to make training faster.

After running phyreo.py, only a .pt dataloader is obtained. Running with this saved .pt dataloader only skips the time of generating it, which does not accelerate training much. We tried to collect all the data we need in a folder to support offline training, but it was too large.
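The caching behavior described above can be sketched as follows. This is a minimal illustration, not the repository's actual phyreo.py: the function name, cache path, and tensor shapes are hypothetical stand-ins for the simulator-generated data.

```python
import os

import torch
from torch.utils.data import DataLoader, TensorDataset

def get_cached_dataset(cache_path="phyre_cache.pt"):
    """Return the training dataset, regenerating it only if no cache exists."""
    if os.path.exists(cache_path):
        blob = torch.load(cache_path)  # fast path: skip data generation
    else:
        # Stand-in for the expensive step (in LfID, rolling out the
        # simulator online to render observations).
        blob = {
            "images": torch.randn(64, 3, 8, 8),
            "labels": torch.randint(0, 2, (64,)),
        }
        torch.save(blob, cache_path)  # cache it so later runs load from disk
    return TensorDataset(blob["images"], blob["labels"])

dataset = get_cached_dataset()
loader = DataLoader(dataset, batch_size=32, shuffle=True)
```

Note that a cache like this only removes the one-time generation cost, not the per-epoch training cost, and it must be rebuilt whenever the data-related configuration changes.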
Hmm... What mini-batch size did you use? Just now, the training script took 5.5 hours per epoch even though I re-ran it with the default config.
We use a batch size of 256.
I updated the mini-batch size from 128 to 256, but the training script still takes 5.5 hours per epoch. Anyway, I also ran the training script with Swin Transformer; it needs 6.5 hours, so there is only a slight difference between ViT and Swin. Can you think of a cause?
Do you use the saved .pt dataloader? Every time you change your configuration, you have to save a new dataloader. If it is not this problem, please consider using more CPU workers to accelerate data loading.

Oh! I understand the cause.
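The suggestion about CPU workers is a generic PyTorch pattern and can be sketched like this; the dummy dataset, batch size, and worker count here are illustrative, not LfID's actual configuration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-in for the real dataset.
dataset = TensorDataset(torch.randn(256, 3, 8, 8), torch.randint(0, 2, (256,)))

# num_workers > 0 prepares batches in parallel subprocesses, so data loading
# overlaps with GPU compute; pin_memory speeds host-to-GPU copies.
loader = DataLoader(
    dataset,
    batch_size=64,
    shuffle=True,
    num_workers=2,            # tune to the number of spare CPU cores
    pin_memory=True,
    persistent_workers=True,  # keep workers alive between epochs
)

num_batches = sum(1 for _ in loader)  # 256 / 64 = 4 batches per epoch
```

When data comes from an online simulator, worker processes mainly help if the per-sample generation releases the GIL or runs in separate processes anyway; profiling one epoch is the quickest way to confirm where the time goes.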