Search before asking
I have searched the Multimodal Maestro issues and found no similar feature requests.
Description
I was going through the maestro repo and found that neither the PaliGemma nor the Florence models support 4-bit quantization (i.e. fine-tuning with a QLoRA config).
Use case
Using QLoRA, we could fine-tune vision-language models even on low-end devices without losing much precision. As models grow, we will eventually need QLoRA to keep fine-tuning fast and feasible under memory constraints.
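To give a rough sense of the memory argument, here is a back-of-the-envelope sketch comparing weight storage at 16-bit and 4-bit precision. The 3-billion-parameter model size and the block size of 64 are assumptions chosen for illustration, and the helper `weight_memory_gb` is hypothetical, not part of maestro. The estimate covers frozen base weights only; activations, LoRA adapter weights, and optimizer state are not included.

```python
def weight_memory_gb(num_params: int, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: a hypothetical 3-billion-parameter vision-language model.
NUM_PARAMS = 3_000_000_000

# 16-bit baseline (fp16 / bf16 weights).
fp16_gb = weight_memory_gb(NUM_PARAMS, 16)

# 4-bit weights plus one 32-bit scaling constant per block of 64 weights
# (assumed block size): 4 + 32 / 64 = 4.5 bits per parameter on average.
BLOCK_SIZE = 64
four_bit_bits = 4 + 32 / BLOCK_SIZE
four_bit_gb = weight_memory_gb(NUM_PARAMS, four_bit_bits)

print(f"16-bit weights: {fp16_gb:.2f} GB")
print(f"4-bit weights:  {four_bit_gb:.2f} GB")
print(f"reduction:      {fp16_gb / four_bit_gb:.2f}x")
```

Under these assumptions the base weights shrink by roughly a factor of 3.5, which is the main reason 4-bit loading could make fine-tuning fit on smaller GPUs.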
Additional
I would like to learn your take on implementing quantization.
Are you willing to submit a PR?
Yes I'd like to help by submitting a PR!