Issue with Loading 6b Model: ZeroDivisionError #11

huoliangyu · 2023-12-14T10:29:32Z

Excellent work!
I've encountered a specific issue when attempting to load the 6b model using the following command:
llm = LLM(model=model_name_or_dir, tensor_parallel_size=num_gpus)
where model_name_or_dir is a local path. Unfortunately, this resulted in a ZeroDivisionError. The detailed error message is as follows:
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/entrypoints/llm.py", line 93, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 231, in from_engine_args
    engine = cls(*engine_configs,
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 113, in __init__
    self._init_cache()
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 193, in _init_cache
    num_blocks = self._run_workers(
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 698, in _run_workers
    all_outputs = ray.get(all_outputs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/auto_init_hook.py", line 24, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/worker.py", line 2547, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ZeroDivisionError): ray::RayWorker.execute_method() (pid=3707, ip=10.33.79.244, actor_id=71ea7c30788a6797b487792c01000000, repr=<vllm.engine.ray_utils.RayWorker object at 0x7fb32ad01030>)
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/engine/ray_utils.py", line 32, in execute_method
    return executor(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/vllm-0.2.0-py3.10-linux-x86_64.egg/vllm/worker/worker.py", line 127, in profile_num_available_blocks
    (total_gpu_memory * gpu_memory_utilization - peak_memory) //
ZeroDivisionError: float floor division by zero
However, when I switch to the 13b model, the model loads without any issues and it works normally. Any guidance or insights into this matter would be greatly appreciated. Thank you for your support and dedication to maintaining this project.
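The last frame of the traceback is cut off right after the `//`, so the divisor is not visible. As a rough sketch of how such a floor division could reach zero, assume the divisor is a per-block cache size derived from the number of key/value heads each GPU receives, and that this count is computed with integer division. Under that assumption, a model with fewer key/value heads than `tensor_parallel_size` would end up with zero heads per GPU and a zero-byte block. All function names and numbers below are hypothetical and are not taken from vLLM:

```python
# Hypothetical sketch: the helper names and all numbers are illustrative
# assumptions, not vLLM's actual code or any real model's configuration.

def per_gpu_kv_heads(total_kv_heads: int, tensor_parallel_size: int) -> int:
    # Integer division rounds down to 0 when there are more GPUs than KV heads.
    return total_kv_heads // tensor_parallel_size


def cache_block_bytes(block_size: int, head_size: int, kv_heads: int,
                      num_layers: int, dtype_bytes: int = 2) -> int:
    # Keys and values (the factor 2) each store
    # block_size * head_size * kv_heads elements per layer.
    return 2 * block_size * head_size * kv_heads * num_layers * dtype_bytes


def num_available_blocks(total_gpu_memory: float, gpu_memory_utilization: float,
                         peak_memory: float, block_bytes: int) -> int:
    # Guard the divisor so the failure names its cause instead of
    # surfacing as a bare ZeroDivisionError.
    if block_bytes == 0:
        raise ValueError(
            "cache block size is 0; check that tensor_parallel_size does not "
            "exceed the model's number of key/value heads"
        )
    usable = total_gpu_memory * gpu_memory_utilization - peak_memory
    return int(usable // block_bytes)


# Example with made-up numbers: 4 KV heads split across 8 GPUs gives 0 per GPU.
heads = per_gpu_kv_heads(total_kv_heads=4, tensor_parallel_size=8)
block = cache_block_bytes(block_size=16, head_size=128, kv_heads=heads, num_layers=32)
```

If this is what is happening, comparing the number of key/value heads in the 6b model's `config.json` with `num_gpus` would confirm it, and a smaller `tensor_parallel_size` would be worth trying.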


StarWalkin · 2023-12-15T06:06:30Z

Hi @huoliangyu! Thanks for your valuable feedback.
We tried to reproduce your issue, but everything works fine on our side. Could you share more details, such as your complete program and your environment?
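For the environment part, a small script along these lines could gather the basics in one go. It is only a sketch: the package list (`vllm`, `torch`, `ray`) is taken from the paths in the traceback, and the `collect_environment` helper is a hypothetical name, not part of any of those libraries.

```python
import platform
import sys
from importlib import metadata


def collect_environment(packages=("vllm", "torch", "ray")):
    """Return the Python version, platform, and installed package versions."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages rather than failing the whole report.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Pasting that output together with the GPU model, the number of GPUs, and the value passed as `num_gpus` should make the report much easier to reproduce.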
