Optimizing Model Performance: Exploring ONNX Export and Engine Integration with TensorRT and OpenVino #68
Comments
Thanks for your interest. Currently, we haven't explored exporting the models to other formats.
I did some tests, unifying all the attention blocks and building the model with TensorRT v8.5. While this improves memory usage, on a Jetson platform the built TRT engine runs slower than pure PyTorch; on an NVIDIA RTX card the runtime stays about the same (I'm running the engine via the Python API). What do you think about this approach (https://github.com/hkchengrex/Cutie)? Is your object memory version similar?
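For reference, a common way to build a TRT engine from an ONNX file (on Jetson or an RTX card) is the `trtexec` tool that ships with TensorRT. This is a minimal sketch; `model.onnx`, the engine path, and the input name/shape are placeholders that would need to match the actual exported model:

```shell
# Build a TensorRT engine from an exported ONNX model.
# --fp16 enables half-precision kernels where supported;
# --shapes fixes the optimization profile for a static batch.
trtexec --onnx=model.onnx \
        --saveEngine=model.engine \
        --fp16 \
        --shapes=input:1x3x64x64
```

`trtexec` also reports per-layer and end-to-end latency, which is useful for comparing the engine against the pure PyTorch baseline mentioned above.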
There are some torch.compile issues with these models:
Could you please share a sample script to convert the model to ONNX?
Hi, have you explored evaluating the architecture by exporting it to the ONNX format and running it with different inference engines such as TensorRT or OpenVINO?