
Failure to build the GPT-J Docker image after successful installation of tensorrt-llm #2022

Open · Bob123Yang opened this issue on Jan 8, 2025 · 2 comments

@Bob123Yang

Hi @arjunsuresh,

When I ran the command below to build the Docker image for GPT-J:

cm run script --tags=run-mlperf,inference,_find-performance,_full,_r5.0-dev \
   --model=gptj-99 \
   --implementation=nvidia \
   --framework=tensorrt \
   --category=edge \
   --scenario=Offline \
   --execution_mode=test \
   --device=cuda \
   --docker --quiet \
   --test_query_count=50

I got the failure below. I'm not sure whether it is related to the existing Docker image (built for ResNet50 several days earlier) or not:

Successfully installed tensorrt-llm

[notice] A new release of pip is available: 23.3.1 -> 24.3.1
[notice] To update, run: python3 -m pip install --upgrade pip
Initializing model from /mnt/models/GPTJ-6B/checkpoint-final
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:16<00:00,  5.48s/it]
[TensorRT-LLM][WARNING] The manually set model data type is torch.float16, but the data type of the HuggingFace model is torch.float32.
Initializing tokenizer from /mnt/models/GPTJ-6B/checkpoint-final
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading calibration dataset
Traceback (most recent call last):
  File "/code/tensorrt_llm/examples/quantization/quantize.py", line 363, in <module>
    main(args)
  File "/code/tensorrt_llm/examples/quantization/quantize.py", line 255, in main
    calib_dataloader = get_calib_dataloader(
  File "/code/tensorrt_llm/examples/quantization/quantize.py", line 187, in get_calib_dataloader
    dataset = load_dataset("cnn_dailymail", name="3.0.0", split="train")
  File "/home/bob1/.local/lib/python3.10/site-packages/datasets/load.py", line 2129, in load_dataset
    builder_instance = load_dataset_builder(
  File "/home/bob1/.local/lib/python3.10/site-packages/datasets/load.py", line 1849, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/home/bob1/.local/lib/python3.10/site-packages/datasets/load.py", line 1731, in dataset_module_factory
    raise e1 from None
  File "/home/bob1/.local/lib/python3.10/site-packages/datasets/load.py", line 1618, in dataset_module_factory
    raise ConnectionError(f"Couldn't reach '{path}' on the Hub ({e.__class__.__name__})") from e
ConnectionError: Couldn't reach 'cnn_dailymail' on the Hub (LocalEntryNotFoundError)
make: *** [Makefile:102: devel_run] Error 1
make: Leaving directory '/home/bob1/CM/repos/local/cache/2479e8f0ba164d4c/repo/docker'

CM error: Portable CM script failed (name = get-ml-model-gptj, return code = 256)


^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Note that it is often a portability issue of a third-party tool or a native script
wrapped and unified by this CM script (automation recipe). Please re-run
this script with --repro flag and report this issue with the original
command line, cm-repro directory and full log here:

https://github.com/mlcommons/cm4mlops/issues

The CM concept is to collaboratively fix such issues inside portable CM scripts
to make existing tools and native scripts more portable, interoperable
and deterministic. Thank you!
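
For reference, the failing call can be reproduced outside CM. In huggingface_hub, a LocalEntryNotFoundError means the Hub could not be reached and no cached copy of the file was found. The sketch below (the dataset name and split are taken from the traceback; everything else is an assumption) both tests connectivity and, when it succeeds, seeds the local cache:

from datasets import load_dataset

# Same call as in quantize.py's get_calib_dataloader (per the traceback).
# A successful run downloads cnn_dailymail into the local Hugging Face
# cache (~/.cache/huggingface/datasets by default), so later runs can
# resolve it without hitting the Hub.
dataset = load_dataset("cnn_dailymail", name="3.0.0", split="train")
print(dataset)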
@arjunsuresh (Contributor)

It looks like a network error, as the code is working fine for me. Maybe retry?

[notice] To update, run: python3 -m pip install --upgrade pip
Initializing model from /mnt/models/GPTJ-6B/checkpoint-final
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:12<00:00,  4.10s/it]
[TensorRT-LLM][WARNING] The manually set model data type is torch.float16, but the data type of the HuggingFace model is torch.float32.
Initializing tokenizer from /mnt/models/GPTJ-6B/checkpoint-final
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading calibration dataset
README.md: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15.6k/15.6k [00:00<00:00, 43.8MB/s]
train-00000-of-00003.parquet: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 257M/257M [00:02<00:00, 86.9MB/s]
train-00001-of-00003.parquet: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 257M/257M [00:02<00:00, 103MB/s]
train-00002-of-00003.parquet: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 259M/259M [00:02<00:00, 99.9MB/s]
validation-00000-of-00001.parquet: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 34.7M/34.7M [00:00<00:00, 49.1MB/s]
test-00000-of-00001.parquet: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 30.0M/30.0M [00:01<00:00, 24.4MB/s]
Generating train split: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 287113/287113 [00:02<00:00, 105507.79 examples/s]
Generating validation split: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 13368/13368 [00:00<00:00, 96769.25 examples/s]
Generating test split: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 11490/11490 [00:00<00:00, 116725.17 examples/s]
{'quant_cfg': {'*weight_quantizer': {'num_bits': (4, 3), 'axis': None}, '*input_quantizer': {'num_bits': (4, 3), 'axis': None}, 'default': {'num_bits': (4, 3), 'axis': None}, '*.query_key_value.output_quantizer': {'num_bits': (4, 3), 'axis': None, 'enable': True}, '*.Wqkv.output_quantizer': {'num_bits': (4, 3), 'axis': None, 'enable': True}, '*.W_pack.output_quantizer': {'num_bits': (4, 3), 'axis': None, 'enable': True}, '*.c_attn.output_quantizer': {'num_bits': (4, 3), 'axis': None, 'enable': True}, '*.k_proj.output_quantizer': {'num_bits': (4, 3), 'axis': None, 'enable': True}, '*.v_proj.output_quantizer': {'num_bits': (4, 3), 'axis': None, 'enable': True}}, 'algorithm': 'max'}
Starting quantization...
Replaced 507 modules to quantized modules
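
If the network inside the container is unreliable, one possible workaround (an assumption, not something verified in this thread) is to populate the Hugging Face cache once and then run the quantization step against it in offline mode:

import os

# Hypothetical workaround: force the `datasets` library to use only the
# local cache. HF_DATASETS_OFFLINE is the library's documented offline
# switch; the cache must already contain cnn_dailymail (e.g. from a prior
# successful download, or a host cache mounted into the container).
# It must be set before `datasets` is imported.
os.environ["HF_DATASETS_OFFLINE"] = "1"

from datasets import load_dataset

# Resolves from the cache and fails fast if the dataset is missing,
# instead of raising ConnectionError after trying the Hub.
dataset = load_dataset("cnn_dailymail", name="3.0.0", split="train")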

@Bob123Yang (Author)

Okay, I will try it later.
