System Info
It appears trtllm-bench is missing support for moe_ep_size / moe_tp_size.
To evaluate a MoE model's expert parallelism, e.g. as in ref [1], could we get a roadmap update on trtllm-bench support for MoE parallelism?
Alternatively, please clarify how to obtain the tooling / process / scripts used in [1] below.
Thanks
[1]: https://developer.nvidia.com/blog/demystifying-ai-inference-deployments-for-trillion-parameter-large-language-models/
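As a possible interim workaround (this is my assumption based on the checkpoint-conversion flow used by the Mixtral example in the TensorRT-LLM repo, not on trtllm-bench itself), the MoE parallel sizes appear to be configurable at checkpoint-conversion time and the engine can then be built with trtllm-build. A minimal sketch, with paths, dtype, and parallel sizes purely illustrative:

# Workaround sketch (unverified): set MoE parallelism when converting the checkpoint,
# then build the engine with trtllm-build.
python convert_checkpoint.py \   # script from the Mixtral/LLaMA example in the TensorRT-LLM repo; path assumed
  --model_dir ./Mixtral-8x22B-v0.1 \
  --output_dir ./ckpt_mixtral_8gpu \
  --dtype float16 \
  --tp_size 4 \
  --moe_tp_size 1 \
  --moe_ep_size 4   # assuming tp_size must equal moe_tp_size * moe_ep_size
trtllm-build \
  --checkpoint_dir ./ckpt_mixtral_8gpu \
  --output_dir ./engine_mixtral_8gpu \
  --gemm_plugin float16

If trtllm-bench throughput can be pointed at a pre-built engine directory (worth confirming for 0.15.0), this would at least allow measuring the EP/TP split until the build subcommand gains these flags.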
Who can help?
@ncomly-nvidia
Information

Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
trtllm-bench --model mistralai/Mixtral-8x22B-v0.1 build --moe_ep_size 4 --pp_size 2 --quantization FP8 --dataset /home/ubuntu/mistral-8x22b.data
Expected behavior
trtllm-bench supports MoE parallelisms, i.e. the build subcommand accepts --moe_ep_size / --moe_tp_size.
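For concreteness, a hypothetical invocation of the requested interface might look like the following; --moe_tp_size / --moe_ep_size do not exist in the build subcommand today (they are the feature being requested), and the tp_size = moe_tp_size * moe_ep_size relationship is my assumption, mirroring how MoE parallelism is specified elsewhere in TensorRT-LLM:

# Hypothetical desired interface; the MoE flags below are the requested addition, not existing options.
trtllm-bench --model mistralai/Mixtral-8x22B-v0.1 build \
  --tp_size 4 \
  --pp_size 2 \
  --moe_tp_size 1 \
  --moe_ep_size 4 \
  --quantization FP8 \
  --dataset /home/ubuntu/mistral-8x22b.data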
Actual behavior
[TensorRT-LLM] TensorRT-LLM version: 0.15.0
Usage: trtllm-bench build [OPTIONS]
Try 'trtllm-bench build --help' for help.
Error: No such option: --moe_ep_size (Possible options: --max_batch_size, --pp_size, --tp_size)
Additional notes
n/a