
quantized model using AWQ and LoRA weights #2703

Open
shuyuan-wang opened this issue Jan 17, 2025 · 2 comments
Labels: Investigating · Low Precision (issue about lower-bit quantization, including int8, int4, fp8) · triaged (issue has been triaged by maintainers)

Comments

@shuyuan-wang commented Jan 17, 2025

Hello,
Does TensorRT-LLM support a model quantized with AWQ together with LoRA weights trained on top of the quantized weights?

@nv-guomingz added the Low Precision label on Jan 20, 2025
@github-actions bot added the triaged and Investigating labels on Jan 20, 2025
@Tracin (Collaborator) commented Jan 21, 2025

I think we only support full-precision LoRA models for now.

@lodm94 commented Jan 22, 2025

An AutoAWQ checkpoint can be converted to TRT-LLM with LoRA support, which allows inference either with the adapters or with the foundation model via LoRA UIDs. See the LLaMA example; a sketch follows below.
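For reference, here is a minimal sketch of that flow, modeled on the LLaMA and quantization examples in the TensorRT-LLM repo. All paths are placeholders, and the exact flag names (`--qformat int4_awq`, `--lora_plugin`, `--lora_task_uids`, etc.) have shifted between TensorRT-LLM releases, so check the example in your checkout rather than copying this verbatim:

```bash
# 1) Produce an int4-AWQ TensorRT-LLM checkpoint, e.g. via the quantization
#    example script (or start from an AutoAWQ checkpoint converted to the
#    TRT-LLM checkpoint format).
python examples/quantization/quantize.py \
    --model_dir /path/to/llama-hf \
    --qformat int4_awq \
    --output_dir ./ckpt_awq

# 2) Build the engine with the LoRA plugin enabled, pointing at the adapter.
trtllm-build \
    --checkpoint_dir ./ckpt_awq \
    --output_dir ./engine_awq_lora \
    --lora_plugin float16 \
    --max_lora_rank 8 \
    --lora_dir /path/to/lora-adapter

# 3) Run with the adapter (task uid 0), or pass -1 to run the plain
#    foundation model with no LoRA applied.
python examples/run.py \
    --engine_dir ./engine_awq_lora \
    --tokenizer_dir /path/to/llama-hf \
    --input_text "Hello" \
    --max_output_len 50 \
    --lora_task_uids 0
```

The `--lora_task_uids` argument is what the comment above calls "lora uids": each uid selects one of the adapters supplied at build time, and -1 selects the foundation model without any adapter.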
