Cannot use GPU with CUDA, only CPU, which is very slow #2083

Open
dadupriv opened this issue Sep 11, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@dadupriv commented Sep 11, 2024

Pre-check

  • I have searched the existing issues and none cover this bug.

Description

Windows OS:
All of CUDA's requirements are installed
gcc++ 14
Running PrivateGPT, but it uses only the CPU, not the GPU
CUDA:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.94                 Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090      WDDM  |   00000000:03:00.0 Off |                  N/A |
|  0%   37C    P8              21W / 350W |      47MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090      WDDM  |   00000000:04:00.0 Off |                  N/A |
|  0%   45C    P8              31W / 350W |     340MiB /  24576MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     14620    C+G   ...crosoft\Edge\Application\msedge.exe      N/A      |
|    1   N/A  N/A      9456    C+G   C:\Windows\explorer.exe                     N/A      |
|    1   N/A  N/A     10884    C+G   ...2txyewy\StartMenuExperienceHost.exe      N/A      |
|    1   N/A  N/A     12132    C+G   ....Search_cw5n1h2txyewy\SearchApp.exe      N/A      |
|    1   N/A  N/A     14668    C+G   ...CBS_cw5n1h2txyewy\TextInputHost.exe      N/A      |
|    1   N/A  N/A     17180    C+G   ...am Files (x86)\VideoLAN\VLC\vlc.exe      N/A      |
|    1   N/A  N/A     18792    C+G   ...5n1h2txyewy\ShellExperienceHost.exe      N/A      |
+-----------------------------------------------------------------------------------------+

I have searched, and I cannot compile llama.cpp with CUDA; the problem is shown below.

Anaconda PowerShell

PS C:\Users\XXXXX>

$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
Collecting llama-cpp-python
Downloading llama_cpp_python-0.2.90.tar.gz (63.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.8/63.8 MB 40.9 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting numpy==1.26.0
Downloading numpy-1.26.0-cp311-cp311-win_amd64.whl.metadata (61 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.1/61.1 kB 3.4 MB/s eta 0:00:00
Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
Downloading typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Collecting diskcache>=5.6.1 (from llama-cpp-python)
Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Collecting jinja2>=2.11.3 (from llama-cpp-python)
Downloading jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting MarkupSafe>=2.0 (from jinja2>=2.11.3->llama-cpp-python)
Downloading MarkupSafe-2.1.5-cp311-cp311-win_amd64.whl.metadata (3.1 kB)
Downloading numpy-1.26.0-cp311-cp311-win_amd64.whl (15.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.8/15.8 MB 38.6 MB/s eta 0:00:00
Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.5/45.5 kB ? eta 0:00:00
Downloading jinja2-3.1.4-py3-none-any.whl (133 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.3/133.3 kB ? eta 0:00:00
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Downloading MarkupSafe-2.1.5-cp311-cp311-win_amd64.whl (17 kB)
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error

× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [31 lines of output]
*** scikit-build-core 0.10.6 using CMake 3.30.3 (wheel)
*** Configuring CMake...
2024-09-11 10:14:43,243 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
loading initial cache file C:\Users\nasdadu\AppData\Local\Temp\tmp2efzwb2l\build\CMakeInit.txt
-- Building for: Visual Studio 17 2022
-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.19045.
-- The C compiler identification is MSVC 19.35.32217.1
-- The CXX compiler identification is MSVC 19.35.32217.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: C:/Users/nasdadu/pinokio/bin/miniconda/Library/bin/git.exe (found version "2.42.0.windows.1")
CMake Error at vendor/llama.cpp/CMakeLists.txt:95 (message):
LLAMA_CUBLAS is deprecated and will be removed in the future.

    Use GGML_CUDA instead

  Call Stack (most recent call first):
    vendor/llama.cpp/CMakeLists.txt:100 (llama_option_depr)


  -- Configuring incomplete, errors occurred!

  *** CMake configuration failed
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

[notice] A new release of pip is available: 23.3.1 -> 24.2
[notice] To update, run: python.exe -m pip install --upgrade pip

Steps to Reproduce

Windows OS:
Run the following command in PowerShell:
$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0

Expected Behavior

Expected BLAS=1 with GPU usage

Actual Behavior

Output shows BLAS=0; only CPU usage
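
A quick way to confirm whether the installed llama-cpp-python wheel was actually compiled with CUDA is the snippet below (a minimal sketch; it assumes a recent llama-cpp-python version, such as the 0.2.90 used here, that exposes llama_supports_gpu_offload in its bindings):

# check_gpu.py -- prints True only if the wheel was built with GPU offload support
import llama_cpp

print(llama_cpp.llama_supports_gpu_offload())

Run it with: poetry run python check_gpu.py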

Environment

Windows 10 19045.4780, RTX 3090

Additional Information

No response

Version

No response

Setup Checklist

  • Confirm that you have followed the installation instructions in the project’s documentation.
  • Check that you are using the latest version of the project.
  • Verify disk space availability for model storage and data processing.
  • Ensure that you have the necessary permissions to run the project.

NVIDIA GPU Setup Checklist

  • Check that all CUDA dependencies are installed and are compatible with your GPU (refer to CUDA's documentation).
  • Ensure an NVIDIA GPU is installed and recognized by the system (run nvidia-smi to verify).
  • Ensure proper permissions are set for accessing GPU resources.
  • Docker users - Verify that the NVIDIA Container Toolkit is configured correctly (e.g. run sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi)
@dadupriv (Author) commented:

$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
Solved! Replace -DLLAMA_CUBLAS=on with -DGGML_CUDA=on
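
With that substitution, the full working command is:

$env:CMAKE_ARGS='-DGGML_CUDA=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0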

@jveronese commented:

Is there a certain way I need to launch this? I launch using https://github.com/zylon-ai/private-gpt/issues/2083 after running '$env:CMAKE_ARGS='-GGML_CUDA=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0', and it still uses my CPU instead of my GPU.
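
Note that the flag as quoted ('-GGML_CUDA=on') is missing the -D prefix CMake requires for cache definitions, so GGML_CUDA would stay unset and the rebuilt wheel would remain CPU-only; the working form is the '-DGGML_CUDA=on' variant shown above. Once the wheel is rebuilt correctly, GPU offload can be confirmed at model load time with a short Python check (a sketch; the model path is a placeholder):

from llama_cpp import Llama

# n_gpu_layers=-1 asks llama.cpp to offload all layers to the GPU;
# verbose=True prints the load log, which should report layers offloaded to CUDA
llm = Llama(model_path="path/to/model.gguf", n_gpu_layers=-1, verbose=True)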
