Template in Pre-Training vs. SFT #162

Open
serwansj opened this issue Dec 16, 2024 · 2 comments
Comments

@serwansj

Hi, thank you for your cool work!

After looking at your code, I wanted to know if I understood your use of the conversation template correctly. The templates found in https://github.com/NVlabs/VILA/blob/main/llava/conversation.py are only used during SFT, right? They are not used during the pre-training stage(s)?

@Lyken17
Collaborator

Lyken17 commented Jan 7, 2025

We apply the template in both the pretraining and SFT stages. Only in the alignment stage (stage 1) do we use raw image-text pairs.
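
To illustrate the distinction described in this reply, here is a minimal sketch of stage-dependent formatting. It is not taken from the VILA codebase: the `Sample` class, the `format_sample` helper, the stage names, the `<image>` placeholder, and the `USER:`/`ASSISTANT:` layout are all hypothetical stand-ins, and the real templates in `llava/conversation.py` may look quite different.

```python
from dataclasses import dataclass

# Hypothetical placeholder marking where image features would be inserted.
IMAGE_TOKEN = "<image>"


@dataclass
class Sample:
    prompt: str    # instruction or question (unused for raw pairs)
    response: str  # caption or answer text


def format_sample(sample: Sample, stage: str) -> str:
    """Build the training text for one sample, depending on the stage.

    Stage names and the template layout are illustrative assumptions.
    """
    if stage == "align":
        # Stage 1 (alignment): raw image-text pair, no conversation template.
        return f"{IMAGE_TOKEN}\n{sample.response}"
    if stage in ("pretrain", "sft"):
        # Pre-training and SFT: wrap the sample in a conversation template.
        return (
            f"USER: {IMAGE_TOKEN}\n{sample.prompt} "
            f"ASSISTANT: {sample.response}"
        )
    raise ValueError(f"unknown stage: {stage!r}")


if __name__ == "__main__":
    s = Sample(prompt="Describe the image.", response="A cat on a sofa.")
    for stage in ("align", "pretrain", "sft"):
        print(f"[{stage}]\n{format_sample(s, stage)}\n")
```

The sketch only shows that the template is skipped in the alignment stage and applied in the other two; it says nothing about which prompt is used for any particular dataset.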

@serwansj
Author

serwansj commented Jan 9, 2025

@Lyken17 Thank you for your reply! How do you apply the template to the MMC4 dataset, i.e., what instruction prompt do you use for it?
