Hi @765500005, may I know the command you used to generate the ./tllm_checkpoint_2gpu_tp2 folder?

Hi @nv-guomingz, this is my command:
```shell
python convert_checkpoint.py --model_dir /models/Meta-Llama-3.1-8B-Instruct \
    --output_dir ./tllm_checkpoint_2gpu_tp2 \
    --dtype float16 \
    --tp_size 2
```
```
Name: tensorrt_llm
Version: 0.17.0.dev2024121700
Summary: TensorRT-LLM: A TensorRT Toolbox for Large Language Models
Home-page: https://github.com/NVIDIA/TensorRT-LLM
Author: NVIDIA Corporation
Author-email:
License: Apache License 2.0
Location: /usr/local/lib/python3.12/dist-packages
```
My machine consists of 8 NVIDIA L20s.
By setting --max_batch_size 1, I have built it successfully. Thanks! @nv-guomingz
```shell
trtllm-build --checkpoint_dir ./tllm_checkpoint_2gpu_tp2 \
    --output_dir ./tmp/llama/7B/trt_engines/fp16/2-gpu/ \
    --context_fmha enable \
    --remove_input_padding enable \
    --gpus_per_node 8 \
    --gemm_plugin auto
```
```
[TRT] [E] IBuilder::buildSerializedNetwork: Error Code 4: Internal Error (Internal error: plugin node LLaMAForCausalLM/transformer/layers/0/attention/wrapper_L562/gpt_attention_L5483/PLUGIN_V2_GPTAttention_0 requires 210571452800 bytes of scratch space, but only 47697362944 is available. Try increasing the workspace size with IBuilderConfig::setMemoryPoolLimit().
)
```
I have 8 GPUs with 46 GB of memory each, but this error still occurs. Is this issue solved by increasing the workspace size, and if so, how can I increase it?
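For reference, rather than raising the workspace limit, one way to shrink the plugin's scratch-space requirement is to cap the engine's maximum batch size at build time. A minimal sketch, reusing the flags from the command above and the `--max_batch_size 1` value reported as the fix in this thread (adjust the value to your actual serving needs):

```shell
# Rebuild the engine with a smaller maximum batch size so the
# GPTAttention plugin allocates less scratch space per build.
# All other flags are copied from the failing command above.
trtllm-build --checkpoint_dir ./tllm_checkpoint_2gpu_tp2 \
    --output_dir ./tmp/llama/7B/trt_engines/fp16/2-gpu/ \
    --context_fmha enable \
    --remove_input_padding enable \
    --gpus_per_node 8 \
    --gemm_plugin auto \
    --max_batch_size 1
```

Limiting `--max_batch_size` (and, if needed, the maximum sequence lengths) reduces the activation and attention scratch buffers the builder must reserve, which is why the 210 GB request above drops below the ~47 GB available per GPU.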