Decode output of nvmlDeviceGetName to avoid JSON serialize issue #240

Merged: 3 commits from KeitaW:patch-1 into huggingface:main on Aug 15, 2024

Conversation

@KeitaW (Contributor) commented Aug 3, 2024

`nvmlDeviceGetName` returns `bytes`, which is not JSON serializable. This causes
```
TypeError: Object of type bytes is not JSON serializable
```
when `hub_utils.py` calls the [save_json method](https://github.com/huggingface/optimum-benchmark/blob/65fa416fd503cfe9a2be7637ee30c70a4a1f96f1/optimum_benchmark/hub_utils.py#L45), since the serialized config contains the output of this function. Example (see the `gpu` key value):
```
bound method PushToHubMixin.to_dict of BenchmarkConfig(name='pytorch_bert', backend=PyTorchConfig(name='pytorch', version='2.4.0a0+3bcc3cddb5.nv24.7', _target_='optimum_benchmark.backends.pytorch.backend.PyTorchBackend', task='fill-mask', library='transformers', model_type='bert', model='bert-base-uncased', processor='bert-base-uncased', device='cuda', device_ids='0', seed=42, inter_op_num_threads=None, intra_op_num_threads=None, model_kwargs={}, processor_kwargs={}, no_weights=True, device_map=None, torch_dtype=None, eval_mode=True, to_bettertransformer=False, low_cpu_mem_usage=None, attn_implementation=None, cache_implementation=None, autocast_enabled=False, autocast_dtype=None, torch_compile=False, torch_compile_target='forward', torch_compile_config={}, quantization_scheme=None, quantization_config={}, deepspeed_inference=False, deepspeed_inference_config={}, peft_type=None, peft_config={}), scenario=InferenceConfig(name='inference', _target_='optimum_benchmark.scenarios.inference.scenario.InferenceScenario', iterations=10, duration=10, warmup_runs=10, input_shapes={'batch_size': 1, 'num_choices': 2, 'sequence_length': 128}, new_tokens=None, memory=True, latency=True, energy=False, forward_kwargs={}, generate_kwargs={}, call_kwargs={}), launcher=ProcessConfig(name='process', _target_='optimum_benchmark.launchers.process.launcher.ProcessLauncher', device_isolation=True, device_isolation_action='warn', numactl=False, numactl_kwargs={}, start_method='spawn'), environment={'cpu': ' AMD EPYC 7R13 Processor', 'cpu_count': 192, 'cpu_ram_mb': 781912.027136, 'system': 'Linux', 'machine': 'x86_64', 'platform': 'Linux-5.15.0-1063-aws-x86_64-with-glibc2.35', 'processor': 'x86_64', 'python_version': '3.10.12', 'gpu': [b'NVIDIA L4', b'NVIDIA L4', b'NVIDIA L4', b'NVIDIA L4', b'NVIDIA L4', b'NVIDIA L4', b'NVIDIA L4', b'NVIDIA L4'], 'gpu_count': 8, 'gpu_vram_mb': 193223196672, 'optimum_benchmark_version': '0.4.0', 'optimum_benchmark_commit': None, 'transformers_version': '4.43.3', 'transformers_commit': None, 'accelerate_version': '0.33.0', 'accelerate_commit': None, 'diffusers_version': None, 'diffusers_commit': None, 'optimum_version': None, 'optimum_commit': None, 'timm_version': None, 'timm_commit': None, 'peft_version': None, 'peft_commit': None})>
```
I believe the issue can be avoided simply by decoding the bytes into a string, as shown in the following example.

```python
from json import dump
import pynvml

# Proposed version
gpus = []
pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    gpus.append(pynvml.nvmlDeviceGetName(handle).decode("utf-8"))
pynvml.nvmlShutdown()
data = {'gpu': gpus}

with open("/tmp/tmp.json", "w") as f:
    dump(data, f)
with open("/tmp/tmp.json", "r") as f: 
    print(f.read())
# {"gpu": ["NVIDIA L4", "NVIDIA L4", "NVIDIA L4", "NVIDIA L4", "NVIDIA L4", "NVIDIA L4", "NVIDIA L4", "NVIDIA L4"]}
```

```python
# Current version
# https://github.com/huggingface/optimum-benchmark/blob/65fa416fd503cfe9a2be7637ee30c70a4a1f96f1/optimum_benchmark/system_utils.py#L95
gpus = []
pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    gpus.append(pynvml.nvmlDeviceGetName(handle))
pynvml.nvmlShutdown()
data = {'gpu': gpus}

with open("/tmp/tmp.json", "w") as f:
    dump(data, f)

#Traceback (most recent call last):
#  File "/workspace/debug/debug-pynvml.py", line 14, in <module>
#    dump(data, f)
#  File "/usr/lib/python3.10/json/__init__.py", line 179, in dump
#    for chunk in iterable:
#  File "/usr/lib/python3.10/json/encoder.py", line 431, in _iterencode
#    yield from _iterencode_dict(o, _current_indent_level)
#  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
#    yield from chunks
#  File "/usr/lib/python3.10/json/encoder.py", line 325, in _iterencode_list
#    yield from chunks
#  File "/usr/lib/python3.10/json/encoder.py", line 438, in _iterencode
#    o = _default(o)
#  File "/usr/lib/python3.10/json/encoder.py", line 179, in default
#    raise TypeError(f'Object of type {o.__class__.__name__} '
#TypeError: Object of type bytes is not JSON serializable
#make: *** [Makefile:25: debug-pynvml] Error 1
```
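
For reference, the failure reproduces without NVML at all: the standard `json` encoder rejects any `bytes` value. A minimal, self-contained sketch of the same error:

```python
import json

# json's default encoder has no handler for bytes, so this raises the
# same TypeError regardless of where the bytes value came from
json.dumps({"gpu": [b"NVIDIA L4"]})
# TypeError: Object of type bytes is not JSON serializable
```
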
@IlyasMoutawwakil (Member) commented

Thanks! I've never had serialization problems when benchmarking on GPU. Is it dependent on the GPU, or on the pynvml version/distribution? (We use the official nvidia-ml-py.)
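
A quick way to check which distribution provides the installed `pynvml` module (a minimal sketch; it assumes the module comes from either the official `nvidia-ml-py` package or the third-party `pynvml` package, both of which expose the same importable name):

```python
# Minimal sketch: report which pynvml-providing distributions are installed
from importlib.metadata import PackageNotFoundError, version

for dist in ("nvidia-ml-py", "pynvml"):
    try:
        print(dist, version(dist))
    except PackageNotFoundError:
        print(dist, "not installed")
```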

@KeitaW (Contributor, Author) commented Aug 3, 2024

Thank you for your reply, @IlyasMoutawwakil.

It seems that the output format differs across pynvml versions. When I used `pynvml == 11.5.0` (the version I got with `pip install nvidia-ml-py` as of today), the function returns strings, not bytes.

```
pip list | grep pynvml
# pynvml                    11.5.0
```

Here is a sample code snippet demonstrating this:

```python
import pynvml

gpus = []
pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    gpus.append(pynvml.nvmlDeviceGetName(handle))
pynvml.nvmlShutdown()
data = {'gpu': gpus}
```

When executed, the output is as follows:

```
In [2]: data
Out[2]:
{'gpu': ['NVIDIA L4',
  'NVIDIA L4',
  'NVIDIA L4',
  'NVIDIA L4',
  'NVIDIA L4',
  'NVIDIA L4',
  'NVIDIA L4',
  'NVIDIA L4']}
```

The environment I am testing in is `nvcr.io/nvidia/pytorch:24.07-py3` (the latest PyTorch container as of today), which includes a slightly older version of pynvml:

```
pynvml                    11.4.1
```

In this version, the output format is bytes, as mentioned in the previous message. Allow me to close this PR, as the issue does not exist in the latest version. Thanks again!

@KeitaW KeitaW closed this Aug 3, 2024
@KeitaW KeitaW deleted the patch-1 branch August 3, 2024 23:11
@IlyasMoutawwakil (Member) commented

We can add a change to that line so the output is decoded or not depending on its type (`str` or `bytes`).

@KeitaW KeitaW restored the patch-1 branch August 5, 2024 08:24
@KeitaW KeitaW reopened this Aug 5, 2024
@KeitaW (Contributor, Author) commented Aug 5, 2024

Sure thing. The updated version should work in both cases, as shown in the following example:

```python
import pynvml

gpus = []
pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    gpu = pynvml.nvmlDeviceGetName(handle)
    gpu = gpu.decode("utf-8") if isinstance(gpu, bytes) else gpu  # older pynvml may return bytes
    gpus.append(gpu)
pynvml.nvmlShutdown()
data = {'gpu': gpus}
print(data)
```

```
$ pip list | grep pynvml && python test_pynvml.py
pynvml                    11.5.0
{'gpu': ['NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4']}
$ pip list | grep pynvml && python test_pynvml.py
pynvml                    11.4.1
{'gpu': ['NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4', 'NVIDIA L4']}
```
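
Worth noting about the guard: calling `.decode()` unconditionally would raise `AttributeError` on newer pynvml releases that already return `str`, so the `isinstance` check keeps both 11.4.1 and 11.5.0 working without pinning the dependency.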

@IlyasMoutawwakil IlyasMoutawwakil merged commit fd1d0f8 into huggingface:main Aug 15, 2024
21 of 26 checks passed