Template in Pre-Training vs. SFT #162

serwansj · 2024-12-16T17:35:19Z

Hi, thank you for your cool work!

After looking at your code, I wanted to know if I understood your use of the conversation template correctly. The templates found in https://github.com/NVlabs/VILA/blob/main/llava/conversation.py are only used during SFT, right? They are not used during the pre-training stage(s)?

Lyken17 · 2025-01-07T19:19:06Z

We apply the template in both the pre-training and SFT stages. Only in the alignment stage (stage 1) do we use the raw text-image pairs.
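To make the distinction above concrete, here is a minimal sketch of the two formatting paths. It is illustrative only and is not taken from the VILA code: the function names, the `<image>` placeholder, and the `USER`/`ASSISTANT` role labels are all assumptions, and the real templates are the ones defined in `llava/conversation.py`.

```python
# Hypothetical illustration; NOT VILA's actual template or data pipeline.
IMAGE_TOKEN = "<image>"  # assumed placeholder for the image position


def format_alignment_sample(caption: str) -> str:
    """Stage 1 (alignment): raw image-text pair, no conversation template."""
    return f"{IMAGE_TOKEN}\n{caption}"


def format_templated_sample(turns, system: str = "", sep: str = "\n") -> str:
    """Pre-training / SFT: wrap (role, message) turns in a conversation template.

    The "ROLE: message" layout is an assumed, simplified template.
    """
    parts = [system] if system else []
    for role, message in turns:
        parts.append(f"{role}: {message}")
    return sep.join(parts)


raw = format_alignment_sample("A cat sitting on a sofa.")
templated = format_templated_sample(
    [
        ("USER", f"{IMAGE_TOKEN}\nDescribe the image."),
        ("ASSISTANT", "A cat sitting on a sofa."),
    ]
)
```

In this sketch the same caption appears either as a bare image-text pair (alignment) or inside role-tagged turns (pre-training and SFT). Which instruction prompt, if any, is used for interleaved data such as MMC4 is not stated in this thread.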

serwansj · 2025-01-09T15:12:39Z

@Lyken17 Thank you for your reply! How are you applying the template in the case of the MMC4 dataset, i.e. what instruction prompt are you using in this case?
