Issues: NVIDIA/TensorRT-Model-Optimizer
FP16 and FP32 show 30% lower accuracy compared to INT8 for the ViT example in ONNX_PTQ (#106, opened Nov 13, 2024 by chjej202)
In the cache_diffusion example, can we use dynamic image shapes and batch sizes? (#101, opened Nov 4, 2024 by wxsms)
Test fails with compile error: AttributeError: _ARRAY_API not found (#87, opened Oct 11, 2024 by braindevices)
Can the comfyui_tensorrt node load the TensorRT plan model generated by the tool? (#85, opened Oct 11, 2024 by blacklong28)
Do TP and PP parameters play a role in the quantization calibration stage? (#84, opened Oct 10, 2024 by hadoop2xu)
Bringing Back Effective Quantization: Using ModelOPT for YOLO and Similar Architectures (#83, opened Oct 9, 2024 by levipereira)
[LLM PTQ] Non-fatal error during eval (UnicodeDecodeError: 'utf-8' codec can't decode byte) (#82, opened Oct 4, 2024 by stas00)
ModelOPT INT8 quantized model runs slower than the FP16 model (#80, opened Oct 4, 2024 by Rajjeshwar)