Actions: NVIDIA/TensorRT-LLM
140 workflow runs
meta-llama/Llama-3.1-405B-FP8 on 8 x A100 80G
auto-assign #41: Issue #2586 labeled by nv-guomingz
--model_cls_file and --model_cls_name
auto-assign #38: Issue #2430 labeled by nv-guomingz
w8a8 with kv_cache_dtype FP8
auto-assign #25: Issue #2559 labeled by darraghdog