I am currently using the official TensorRT-LLM v0.11.0 Docker image. However, I encountered the following issues:
1. The inference results of the HuggingFace model in Python do not match the results of the TensorRT-LLM model converted for Triton.
2. Beam search accuracy drops by about 3% compared to the transformers beam search implementation.
Could this be an inherent issue with version 0.11.0?
I would greatly appreciate your guidance or suggestions on how to resolve this. Thank you in advance for your help!
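To make the mismatch concrete, a minimal sketch like the one below can quantify how far the two backends diverge, given token sequences decoded from the same prompts. The helper names (`first_divergence`, `mismatch_rate`) are my own for illustration, not part of either library; in practice you would fill `pairs` with (HuggingFace output, TensorRT-LLM output) token-id lists.

```python
def first_divergence(hf_tokens, trt_tokens):
    """Return the index of the first differing token between two
    decoded sequences, or -1 if they agree up to the shorter length."""
    for i, (a, b) in enumerate(zip(hf_tokens, trt_tokens)):
        if a != b:
            return i
    return -1

def mismatch_rate(pairs):
    """Fraction of samples whose outputs diverge anywhere
    (either a differing token or a differing length)."""
    diverged = sum(
        1
        for hf, trt in pairs
        if first_divergence(hf, trt) != -1 or len(hf) != len(trt)
    )
    return diverged / len(pairs)

# Example: one identical pair, one pair diverging at position 1.
pairs = [([5, 7, 9], [5, 7, 9]), ([5, 7, 9], [5, 8, 9])]
print(mismatch_rate(pairs))  # 0.5
```

Reporting the first divergence index per sample is often more useful than an aggregate score, since an early divergence usually points at logits/kernel differences while a late one points at sampling or length-penalty settings.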
Hi @fclearner, thanks for reporting this issue. Could you please try our latest Docker image to see whether the issue still exists? 0.11 may be too outdated at this point.
Thanks for the advice! I will try the new image and close this issue for now. If the problem persists, I'll reopen it. Appreciate your help!
The inference code is at: https://github.com/k2-fsa/sherpa/tree/master/triton/speech_llm