Hi @ChenMnZ,

I am seeking clarification and guidance on the process of quantizing the LLaVA 1.6 model with the EfficientQAT repository. Specifically, I would like to confirm the steps involved and understand which components are fine-tuned at each stage of the process.
Queries
1. Applying Block-AP to the LLM: Is the initial step to apply Block-AP quantization to the LLM? If so, are there any specific datasets, configurations, or other considerations required during this step? (A sketch of how I currently picture this stage follows the query list.)
2. Freezing the LLM and Vision Transformer (ViT), and training the projector: After obtaining the Block-AP-quantized LLM, the next step appears to involve freezing both the LLM and the ViT while training the projector. Could you provide details on where the projector training should be performed? Any relevant scripts or functions would be helpful for implementing this step effectively. (My current assumption is in the second sketch below.)
3. End-to-end fine-tuning of the LLM and projector: During the end-to-end fine-tuning stage, do we:
a. Fine-tune only the quantization scales of the LLM?
b. Fine-tune the weights of the projector?
Are there any additional parameters or components involved in this fine-tuning stage that I might be missing? (The third sketch below shows my current guess.)
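To make query 1 concrete, here is a minimal sketch of how I currently picture the Block-AP stage: each transformer block's linear weights are fake-quantized with learnable per-channel scales, and both the weights and the scales are trained so that the quantized block reproduces the full-precision block's output on calibration data (I assume a text calibration set such as RedPajama, as in the EfficientQAT paper, but please correct me if something multimodal is needed here). Everything below is my own placeholder code, not the repository's API.

```python
# Placeholder sketch of block-wise training of all parameters (Block-AP);
# not code from the EfficientQAT repository.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnableUniformQuant(nn.Module):
    """Per-output-channel uniform fake quantizer with a learnable scale."""

    def __init__(self, weight: torch.Tensor, n_bits: int = 4):
        super().__init__()
        self.qmax = 2 ** (n_bits - 1) - 1
        init_scale = weight.detach().abs().amax(dim=1, keepdim=True) / self.qmax
        self.scale = nn.Parameter(init_scale.clamp(min=1e-8))

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        w_div = w / self.scale
        # straight-through estimator: round() acts as identity in the backward pass
        w_int = (torch.round(w_div) - w_div).detach() + w_div
        w_int = torch.clamp(w_int, -self.qmax - 1, self.qmax)
        return w_int * self.scale


class QuantLinear(nn.Module):
    """nn.Linear replacement whose weight is fake-quantized on every forward."""

    def __init__(self, linear: nn.Linear, n_bits: int = 4):
        super().__init__()
        self.weight = nn.Parameter(linear.weight.detach().clone())
        self.bias = None if linear.bias is None else nn.Parameter(linear.bias.detach().clone())
        self.quantizer = LearnableUniformQuant(self.weight, n_bits)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.quantizer(self.weight), self.bias)


def block_ap_one_block(fp_block: nn.Module, q_block: nn.Module, calib_inputs,
                       epochs: int = 2, lr_weight: float = 2e-5, lr_scale: float = 1e-4):
    """Train all parameters of one quantized block (weights and scales) so that
    its output matches the frozen full-precision block on calibration inputs."""
    weights = [p for n, p in q_block.named_parameters() if "quantizer.scale" not in n]
    scales = [p for n, p in q_block.named_parameters() if "quantizer.scale" in n]
    opt = torch.optim.AdamW([{"params": weights, "lr": lr_weight},
                             {"params": scales, "lr": lr_scale}])
    fp_block.requires_grad_(False)
    for _ in range(epochs):
        for x in calib_inputs:  # hidden states cached from the previous block
            with torch.no_grad():
                # note: HF decoder layers return a tuple, so one would take [0]
                target = fp_block(x)
            loss = F.mse_loss(q_block(x), target)
            opt.zero_grad()
            loss.backward()
            opt.step()
```

Here `q_block` would be a copy of the block whose `nn.Linear` layers were swapped for `QuantLinear`, and the blocks would be processed one after another. Is this roughly what the repository's Block-AP entry point does, and which calibration dataset and size would you recommend for LLaVA 1.6?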
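For query 2, my working assumption is that the alignment stage simply freezes the ViT and the Block-AP-quantized LLM and leaves only the multimodal projector trainable. The module name below (`mm_projector`) is a placeholder for whatever the LLaVA 1.6 wrapper actually exposes; I have not verified it against the repo.

```python
# Placeholder sketch for the projector-alignment stage: everything frozen
# except the vision-to-language projector. Module names are assumptions.
import torch.nn as nn


def set_trainable_projector_stage(model: nn.Module, projector_name: str = "mm_projector"):
    """Freeze all parameters, then unfreeze only the multimodal projector."""
    for p in model.parameters():
        p.requires_grad_(False)
    for name, module in model.named_modules():
        if name.endswith(projector_name):  # assumed projector module name
            for p in module.parameters():
                p.requires_grad_(True)
    n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"trainable parameters in this stage: {n_trainable}")
```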
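And for query 3, this is my current guess at what is trainable during the end-to-end stage: only the quantization scales (step sizes) inside the quantized LLM plus the projector weights, with the quantized integer weights and the ViT kept frozen. The name matching below relies on my placeholder naming from the first sketch, so it only illustrates the question rather than the repository's implementation.

```python
# Placeholder sketch for the end-to-end stage: train only the LLM quantization
# scales and the projector weights; keep quantized weights and the ViT frozen.
import torch.nn as nn


def set_trainable_e2e_stage(model: nn.Module, projector_name: str = "mm_projector"):
    """Enable gradients only for quantization scales and the projector."""
    for name, p in model.named_parameters():
        is_quant_scale = "quantizer.scale" in name  # matches the first sketch above
        is_projector = projector_name in name       # assumed projector module name
        p.requires_grad_(is_quant_scale or is_projector)
```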
Could you please clarify the above queries and confirm whether the outlined process aligns with the intended approach for quantizing LLaVA 1.6? Additionally, I would appreciate any guidance on specific code references or best practices for implementing the training and fine-tuning stages.
Looking forward to your insights! Thank you in advance for your support.