PI_ERROR_BUILD_PROGRAM_FAILURE error when running Ollama using ipex-llm on 12450H CPU #12597
Comments
Which model are you using?
I am using qwen2.5:7b.
Similar issue: #12598. We are fixing it.
@qadzhang You can update ipex-llm[cpp] to 2.2.0b20241226 tomorrow and try again.
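Before retrying, it may help to confirm which build is actually installed. The following is only a sketch: it assumes the package is distributed under the name `ipex-llm` and that nightly versions follow the `X.Y.ZbYYYYMMDD` pattern seen in this thread. `parse_nightly` and `is_at_least` are hypothetical helper names, not part of ipex-llm.

```python
import re
from importlib import metadata

# Assumed nightly version pattern, e.g. "2.2.0b20241226".
NIGHTLY = re.compile(r"^(\d+)\.(\d+)\.(\d+)b(\d{8})$")


def parse_nightly(version):
    """Split a nightly version string into a comparable tuple of ints."""
    match = NIGHTLY.match(version.strip())
    if match is None:
        raise ValueError(f"not a nightly version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_at_least(installed, required="2.2.0b20241226"):
    """Return True if the installed nightly is the required build or newer."""
    return parse_nightly(installed) >= parse_nightly(required)


if __name__ == "__main__":
    try:
        # "ipex-llm" as the distribution name is an assumption.
        installed = metadata.version("ipex-llm")
        status = "ok" if is_at_least(installed) else "older than required"
        print(installed, status)
    except metadata.PackageNotFoundError:
        print("ipex-llm is not installed in this environment")
    except ValueError as err:
        print(err)
```

Stable releases without a `bYYYYMMDD` suffix are reported as unparseable rather than compared, since the thread only mentions nightly builds.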
Thank you for your efforts. I upgraded the version and then tested qwen2.5:7b, qwen2.5:0.5b, qwen2:0.5b, bge-m3, and gemma2:9b. Of these, qwen2:0.5b and gemma2:9b run normally, while the others report errors.
When running qwen2.5:0.5b and qwen2.5:7b, the error sometimes appears right at the start and sometimes after a few steps, but it always appears eventually:
The program was built for 1 devices
When running bge-m3, an error is reported as soon as the model starts loading:
llama_new_context_with_model: graph splits = 2
(llm-cpp) C:\Users\zc\llama-cpp>ollama serve
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
[GIN-debug] POST /api/pull --> ollama/server.(*Server).PullHandler-fm (5 handlers)
@qadzhang I have reproduced your error; we will look into it.
I tried qwen2.5 and it does work now. However, I still get errors when using the bge-m3 model for embeddings. If you want to experiment, you can run the following program to test it.
import ollama
text="I am learning at the group company's training center today"
(llm-cpp) C:\Users\zc\llama-cpp>ollama serve
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
[GIN-debug] POST /api/pull --> ollama/server.(*Server).PullHandler-fm (5 handlers)
llama_kv_cache_init: SYCL0 KV buffer size = 192.00 MiB
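The test program quoted above is cut off, so for anyone who wants to reproduce the embedding failure, here is a separate minimal sketch rather than a reconstruction of it. It uses only the standard library and Ollama's `/api/embeddings` REST endpoint instead of the `ollama` Python package. It assumes the server is listening on 127.0.0.1:11434 (the address shown in the server log in this issue) and that bge-m3 has already been pulled. `build_embedding_request` and `fetch_embedding` are hypothetical helper names.

```python
import json
import urllib.request

# Address taken from the "Listening on" line in this issue's server log.
OLLAMA_URL = "http://127.0.0.1:11434"


def build_embedding_request(text, model="bge-m3", base_url=OLLAMA_URL):
    """Build (but do not send) a POST request for the embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding(text, model="bge-m3", timeout=120):
    """Send the request and return the embedding vector as a list of floats."""
    request = build_embedding_request(text, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["embedding"]
```

With `ollama serve` running, calling `fetch_embedding("I am learning at the group company's training center today")` should either return a vector or trigger the failure in the server console.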
@qadzhang bge-m3 is a different error. We are fixing it.
Hi @qadzhang, we have fixed the embedding issue. You may install our latest ipex-llm ollama via
BGE-M3 works now as well. Thank you all for your efforts.
Hello,
The CPU is 12450H with driver version 32.0.101.6325.
The installed software is ipex-llm[cpp], and the Ollama version is 0.4.6.
The installation was successful, but an error occurred before inference while loading the model.
time=2024-12-23T23:18:56.511+08:00 level=INFO source=routes.go:1248 msg="Listening on 127.0.0.1:11434 (version 0.4.6-ipexllm-20241223)"
time=2024-12-23T23:18:56.511+08:00 level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners=[ipex_llm]
time=2024-12-23T23:09:28.726+08:00 level=INFO source=server.go:619 msg="llama runner started in 3.77 seconds"
The program was built for 1 devices
Build program log for 'Intel(R) UHD Graphics':
-11 (PI_ERROR_BUILD_PROGRAM_FAILURE)
Exception caught at file:D:/actions-runner/release-cpp-oneapi_2024_2/_work/llm.cpp/llm.cpp/llama-cpp-bigdl/ggml/src/ggml-sycl.cpp, line:3775
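Because the failure sometimes shows up only after several steps, it can be convenient to scan a saved server log for it. The following is a rough sketch that assumes the log lines look like the ones quoted above. `find_build_failure` is a hypothetical helper, not part of Ollama or ipex-llm.

```python
import re

# Matches lines such as "-11 (PI_ERROR_BUILD_PROGRAM_FAILURE)".
ERROR_LINE = re.compile(r"(-?\d+)\s*\((PI_ERROR_[A-Z_]+)\)")
# Matches lines such as "Build program log for 'Intel(R) UHD Graphics':".
DEVICE_LINE = re.compile(r"Build program log for '([^']+)'")


def find_build_failure(log_text):
    """Return the first PI error found in the log, or None if there is none."""
    device = None
    for line in log_text.splitlines():
        device_match = DEVICE_LINE.search(line)
        if device_match:
            device = device_match.group(1)
        error_match = ERROR_LINE.search(line)
        if error_match:
            return {
                "code": int(error_match.group(1)),
                "name": error_match.group(2),
                "device": device,
            }
    return None
```

Passing the captured console output of `ollama serve` to `find_build_failure` returns the error code, error name, and the device named in the preceding build log line, which is useful when attaching details to a report.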