Support for int2/int3 quantization #2704
Labels:
- Investigating
- Low Precision: issue about lower-bit quantization, including int8, int4, fp8
- triaged: issue has been triaged by maintainers
Hi, do you plan to support INT2/INT3 quantization in the near future, or could it be implemented by modifying some of the existing kernels? I'd like to understand how much effort supporting INT2/INT3 would take. I also noticed that AWQ supports INT3 with a group size of 128 (int3g128), while trt-llm currently only supports INT4. Could you provide more insight into this?
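To make the question concrete, here is a rough sketch of what I understand group-wise INT3 with group size 128 to mean: every 128 consecutive weights share one scale and one offset, and each weight is stored as a 3-bit code (0 to 7). This is plain round-to-nearest min/max quantization in pure Python, written only for illustration. It is not AWQ's activation-aware scaling and not trt-llm code, and the names `quantize_groupwise` and `dequantize_groupwise` are hypothetical.

```python
from typing import List, Tuple


def quantize_groupwise(
    weights: List[float], bits: int = 3, group_size: int = 128
) -> Tuple[List[int], List[float], List[float]]:
    """Round-to-nearest min/max quantization with one (scale, min) pair per group.

    Hypothetical helper for illustration; assumes a flat list of weights whose
    length is a multiple of group_size.
    """
    if len(weights) % group_size != 0:
        raise ValueError("len(weights) must be a multiple of group_size")
    qmax = (1 << bits) - 1  # 7 for 3 bits, 3 for 2 bits
    codes: List[int] = []
    scales: List[float] = []
    mins: List[float] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # A constant group has hi == lo; use a dummy scale to avoid dividing by zero.
        scale = (hi - lo) / qmax if hi > lo else 1.0
        for w in group:
            q = round((w - lo) / scale)
            codes.append(min(qmax, max(0, q)))  # clamp to the valid code range
        scales.append(scale)
        mins.append(lo)
    return codes, scales, mins


def dequantize_groupwise(
    codes: List[int], scales: List[float], mins: List[float], group_size: int = 128
) -> List[float]:
    """Reconstruct approximate weights from the codes and per-group parameters."""
    return [
        mins[i // group_size] + q * scales[i // group_size]
        for i, q in enumerate(codes)
    ]
```

With this scheme the reconstruction error of each weight is at most half of its group's scale. A real kernel would additionally have to pack the 3-bit codes densely (128 codes fit in 48 bytes) and unpack them on the fly, which I assume is where most of the effort for INT3 would go, since 3 bits do not align to byte boundaries the way 4 bits do.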