[feature] Support model compression #48
Post-training quantization for model compression?

Yeah, based on TRT.

So this feature is only for Triton server, supporting INT8 TRT models? Is post-training quantization for PyTorch or TensorFlow not being considered? Or will TRT's KLD calibration, driven by calibration data, be used for models from all frameworks?

The latter, I think.
In the future we will investigate whether we can support TVM or other frameworks.

Thanks for your response.
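For context, the "TRT KLD" calibration mentioned above refers to TensorRT's entropy calibration: it histograms FP32 activations collected on calibration data, then scans candidate clipping thresholds and keeps the one whose re-quantized (128-level, i.e. INT8) distribution has the smallest KL divergence from the original. Below is a minimal NumPy sketch of that idea; the function names `kl_divergence` and `find_int8_threshold` are hypothetical and this is not TensorRT's actual implementation, just an illustration of the algorithm it is based on.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(P || Q) over bins where p > 0; q is floored to avoid div-by-zero."""
    p = p / p.sum()
    q = q / q.sum()
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], 1e-12))))

def find_int8_threshold(hist, bin_edges, num_quant_bins=128):
    """Scan clipping thresholds over an activation histogram and return
    the threshold whose 128-level re-quantized distribution is closest
    (in KL divergence) to the original distribution."""
    best_kl, best_t = np.inf, bin_edges[-1]
    for i in range(num_quant_bins, len(hist) + 1):
        # Reference distribution: first i bins, clipped tail folded into the last bin.
        ref = hist[:i].astype(np.float64).copy()
        ref[-1] += hist[i:].sum()
        # Collapse the i bins into 128 quantization levels, then expand back,
        # spreading each level's mass uniformly over its originally non-empty bins.
        factor = i / num_quant_bins
        q = np.zeros(i)
        for j in range(num_quant_bins):
            start, stop = int(j * factor), int((j + 1) * factor)
            chunk = ref[start:stop]
            nz = chunk > 0
            if nz.any():
                q[start:stop][nz] = chunk.sum() / nz.sum()
        kl = kl_divergence(ref, q)
        if kl < best_kl:
            best_kl, best_t = kl, bin_edges[i]
    return best_t
```

A typical usage would histogram the absolute activation values from a calibration batch (e.g. `np.histogram(np.abs(acts), bins=2048)`) and derive the INT8 scale as `threshold / 127`. The design point is that clipping outliers usually loses less information than stretching the INT8 range to cover them, which is why the KL scan often picks a threshold well below the observed maximum.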