fine-tuning via quantization #108

0xD4rky · 2025-01-11T17:49:02Z

Search before asking

I have searched the Multimodal Maestro issues and found no similar feature requests.

Description

I was going through the maestro repo and found that neither the PaliGemma nor the Florence models support 4-bit quantization (i.e. loading with a QLoRA config).
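
For reference, here is a rough sketch of the kind of setup I have in mind, written against the Hugging Face `transformers` and `peft` APIs rather than maestro's own code. The helper name `load_qlora_paligemma`, the checkpoint id, and every hyperparameter value below are my own assumptions for illustration, not anything that exists in maestro today.

```python
# Hypothetical QLoRA settings; none of these values come from the maestro codebase.
BNB_KWARGS = {
    "load_in_4bit": True,               # store base weights in 4-bit
    "bnb_4bit_quant_type": "nf4",       # NormalFloat4 quantization
    "bnb_4bit_use_double_quant": True,  # also quantize the quantization constants
}

LORA_KWARGS = {
    "r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    # Assumed attention projection names; check them against the actual model.
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",
}


def load_qlora_paligemma(model_id="google/paligemma-3b-pt-224"):
    """Load a 4-bit base model and wrap it with trainable LoRA adapters."""
    # Heavy dependencies are imported here so the settings above can be
    # inspected without torch, transformers, peft, or bitsandbytes installed.
    import torch
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import BitsAndBytesConfig, PaliGemmaForConditionalGeneration

    bnb_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16, **BNB_KWARGS
    )
    model = PaliGemmaForConditionalGeneration.from_pretrained(
        model_id, quantization_config=bnb_config, device_map="auto"
    )
    # Freezes the quantized base weights and prepares the model for k-bit training.
    model = prepare_model_for_kbit_training(model)
    return get_peft_model(model, LoraConfig(**LORA_KWARGS))
```

Calling `load_qlora_paligemma()` needs a CUDA GPU with `bitsandbytes` installed and access to the checkpoint. How this would plug into maestro's training loop is exactly what I would like feedback on.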

Use case

Using QLoRA, we could fine-tune vision-language models even on low-end devices without losing much precision. As models grow, we will eventually need QLoRA to keep fine-tuning fast and feasible under memory constraints.
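
To make the memory argument concrete, here is a back-of-the-envelope estimate of weight storage alone. The 3-billion-parameter count is a round illustrative number, and the calculation deliberately ignores activations, optimizer state, the LoRA adapters, and the small overhead of quantization constants, so treat it as a lower bound rather than a measurement.

```python
def weight_memory_gib(num_params: int, bits_per_param: float) -> float:
    """Memory needed to store the weights only, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


NUM_PARAMS = 3_000_000_000  # assumed round figure for a ~3B-parameter model

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
int4_gib = weight_memory_gib(NUM_PARAMS, 4)

print(f"16-bit weights: {fp16_gib:.2f} GiB")  # 5.59 GiB
print(f" 4-bit weights: {int4_gib:.2f} GiB")  # 1.40 GiB
print(f"reduction: {fp16_gib / int4_gib:.0f}x")  # 4x
```

Under these assumptions the base weights shrink by a factor of four, which is the difference between needing a data-center GPU and fitting on a consumer card.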

Additional

I would like to learn your take on implementing quantization.

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

0xD4rky added the enhancement (New feature or request) label on Jan 11, 2025
