[SYCL] Compression/* E2E tests failing on HIP #15829

Open
uditagarwal97 opened this issue Oct 23, 2024 · 1 comment
Labels: bug (Something isn't working), hip (Issues related to execution on HIP backend.)

@uditagarwal97 (Contributor)

Describe the bug

Example: https://github.com/intel/llvm/actions/runs/11478969082/job/31946092977

FAIL: SYCL :: Compression/compression.cpp (403 of 2238)
******************** TEST 'SYCL :: Compression/compression.cpp' FAILED ********************
Exit Code: -11

Command Output (stdout):
--
# RUN: at line 3
/__w/llvm/llvm/toolchain/bin//clang++  -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa  /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/compression.cpp -O0 -g /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/Inputs/single_kernel.cpp -o /__w/llvm/llvm/build-e2e/Compression/Output/compression.cpp.tmp_not_compress.out
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/compression.cpp -O0 -g /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/Inputs/single_kernel.cpp -o /__w/llvm/llvm/build-e2e/Compression/Output/compression.cpp.tmp_not_compress.out
# note: command had no output on stdout or stderr
# RUN: at line 4
/__w/llvm/llvm/toolchain/bin//clang++  -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa  /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/compression.cpp -O0 -g --offload-compress --offload-compression-level=3 /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/Inputs/single_kernel.cpp -o /__w/llvm/llvm/build-e2e/Compression/Output/compression.cpp.tmp_compress.out
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/compression.cpp -O0 -g --offload-compress --offload-compression-level=3 /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/Inputs/single_kernel.cpp -o /__w/llvm/llvm/build-e2e/Compression/Output/compression.cpp.tmp_compress.out
# note: command had no output on stdout or stderr
# RUN: at line 5
env ONEAPI_DEVICE_SELECTOR=hip:gpu  /__w/llvm/llvm/build-e2e/Compression/Output/compression.cpp.tmp_not_compress.out
# executed command: env ONEAPI_DEVICE_SELECTOR=hip:gpu /__w/llvm/llvm/build-e2e/Compression/Output/compression.cpp.tmp_not_compress.out
# note: command had no output on stdout or stderr
# RUN: at line 6
env ONEAPI_DEVICE_SELECTOR=hip:gpu  /__w/llvm/llvm/build-e2e/Compression/Output/compression.cpp.tmp_compress.out
# executed command: env ONEAPI_DEVICE_SELECTOR=hip:gpu /__w/llvm/llvm/build-e2e/Compression/Output/compression.cpp.tmp_compress.out
# note: command had no output on stdout or stderr
# error: command failed with exit status: -11

--

********************
FAIL: SYCL :: Compression/compression_multiple_tu.cpp (410 of 2238)
******************** TEST 'SYCL :: Compression/compression_multiple_tu.cpp' FAILED ********************
Exit Code: -6

Command Output (stdout):
--
# RUN: at line 5
/__w/llvm/llvm/toolchain/bin//clang++  -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa  /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/compression_multiple_tu.cpp --offload-compress -DENABLE_KERNEL1 -shared -fPIC -o /__w/llvm/llvm/build-e2e/Compression/Output/kernel1.so
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/compression_multiple_tu.cpp --offload-compress -DENABLE_KERNEL1 -shared -fPIC -o /__w/llvm/llvm/build-e2e/Compression/Output/kernel1.so
# note: command had no output on stdout or stderr
# RUN: at line 6
/__w/llvm/llvm/toolchain/bin//clang++  -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa  /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/compression_multiple_tu.cpp -DENABLE_KERNEL2 -shared -fPIC -o /__w/llvm/llvm/build-e2e/Compression/Output/kernel2.so
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/compression_multiple_tu.cpp -DENABLE_KERNEL2 -shared -fPIC -o /__w/llvm/llvm/build-e2e/Compression/Output/kernel2.so
# note: command had no output on stdout or stderr
# RUN: at line 8
/__w/llvm/llvm/toolchain/bin//clang++  -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa  /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/compression_multiple_tu.cpp /__w/llvm/llvm/build-e2e/Compression/Output/kernel1.so /__w/llvm/llvm/build-e2e/Compression/Output/kernel2.so -o /__w/llvm/llvm/build-e2e/Compression/Output/compression_multiple_tu.cpp.tmp_compress.out
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl -fsycl-targets=amdgcn-amd-amdhsa /__w/llvm/llvm/llvm/sycl/test-e2e/Compression/compression_multiple_tu.cpp /__w/llvm/llvm/build-e2e/Compression/Output/kernel1.so /__w/llvm/llvm/build-e2e/Compression/Output/kernel2.so -o /__w/llvm/llvm/build-e2e/Compression/Output/compression_multiple_tu.cpp.tmp_compress.out
# note: command had no output on stdout or stderr
# RUN: at line 9
env ONEAPI_DEVICE_SELECTOR=hip:gpu  /__w/llvm/llvm/build-e2e/Compression/Output/compression_multiple_tu.cpp.tmp_compress.out
# executed command: env ONEAPI_DEVICE_SELECTOR=hip:gpu /__w/llvm/llvm/build-e2e/Compression/Output/compression_multiple_tu.cpp.tmp_compress.out
# .---command stderr------------
# | <HIP>[ERROR]: 
# | UR HIP ERROR:
# | 	Value:           218
# | 	Name:            hipErrorInvalidKernelFile
# | 	Description:     invalid kernel file
# | 	Function:        buildProgram
# | 	Source Location: /__w/llvm/llvm/build/_deps/unified-runtime-src/source/adapters/hip/program.cpp:235
# | 
# | terminate called after throwing an instance of 'sycl::_V1::exception'
# |   what():  The program was built for 1 devices
# | Build program log for 'AMD Radeon RX 6700 XT':
# |  {��
# `-----------------------------
# error: command failed with exit status: -6

To reproduce

Guilty commit: 155fe2a

Environment

  • OS: Linux
  • Target device and vendor: AMD GPU

@uditagarwal97 uditagarwal97 added bug Something isn't working hip Issues related to execution on HIP backend. labels Oct 23, 2024
@uditagarwal97 uditagarwal97 self-assigned this Oct 23, 2024
@uditagarwal97 (Contributor, Author)

Here's why these tests are failing:
--offload-compress is used both by HIP (in clang-offload-bundler) and by SYCL (in clang-offload-wrapper). When --offload-compress is passed for SYCL offloading to HIP, the device image gets compressed twice: once in clang-offload-bundler and again in clang-offload-wrapper.
Here's the clang-driver invocation:

clang++ -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 -fsycl --offload-compress -fsycl-targets=amdgcn-amd-amdhsa ../sycl/test-e2e/Compression/compression.cpp -###


...
 "/localdisk2/uagarwal/llvm/build/bin/llvm-foreach" "--out-ext=hipfb" "--in-file-list=/tmp/compression-gfx1031-0f0fe1.out" "--in-replace=/tmp/compression-gfx1031-0f0fe1.out" "--out-file-list=/tmp/compression-gfx1031-dec388.hipfb" "--out-replace=/tmp/compression-gfx1031-dec388.hipfb" "--" "/localdisk2/uagarwal/llvm/build/bin/clang-offload-bundler" "-type=o" "-bundle-align=4096" "-targets=host-x86_64-unknown-linux,hipv4-amdgcn-amd-amdhsa--gfx1031" "-input=/dev/null" "-input=/tmp/compression-gfx1031-0f0fe1.out" "-output=/tmp/compression-gfx1031-dec388.hipfb" "-compress"
 "/localdisk2/uagarwal/llvm/build/bin/file-table-tform" "-replace=Code,Code" "-o" "/tmp/compression-gfx1031-420739.table" "/tmp/compression-gfx1031-a7319f.bc" "/tmp/compression-gfx1031-dec388.hipfb"
 "/localdisk2/uagarwal/llvm/build/bin/clang-offload-wrapper" "-o=/tmp/wrapper-d311b9.bc" "-host=x86_64-unknown-linux-gnu"  "-offload-compress"  "-compile-opts=--offload-arch=gfx1031" "-target=amdgcn" "-kind=sycl" "-batch" "/tmp/compression-gfx1031-420739.table"

Note the -compress in the clang-offload-bundler invocation and the -offload-compress in the clang-offload-wrapper invocation. Together they cause the device image to be compressed twice.
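
One way to confirm the double compression is to look at the leading magic bytes of a device image at each stage. The sketch below is only an illustration under a few assumptions: that clang-offload-wrapper compresses with zstd (whose frames start with the bytes 28 B5 2F FD), that clang-offload-bundler marks compressed bundles with a "CCOB" header, and that uncompressed bundles start with "__CLANG_OFFLOAD_BUNDLE__". The names `BlobKind`, `classifyBlob` and `toString` are hypothetical and do not exist in the SYCL runtime or the UR HIP adapter.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper types for debugging; not part of any real runtime API.
enum class BlobKind { ZstdFrame, CompressedBundle, PlainBundle, Elf, Unknown };

// Returns true if `blob` begins with the `n` bytes stored at `magic`.
static bool hasPrefix(const std::vector<std::uint8_t> &blob, const void *magic,
                      std::size_t n) {
  return blob.size() >= n && std::memcmp(blob.data(), magic, n) == 0;
}

// Classifies a device image by its leading magic bytes.
// Assumption: the wrapper layer is a zstd frame and the bundler layer is "CCOB".
BlobKind classifyBlob(const std::vector<std::uint8_t> &blob) {
  static const std::uint8_t kZstd[] = {0x28, 0xB5, 0x2F, 0xFD};
  static const std::uint8_t kElf[] = {0x7F, 'E', 'L', 'F'};
  static const char kPlain[] = "__CLANG_OFFLOAD_BUNDLE__";

  if (hasPrefix(blob, kZstd, sizeof(kZstd)))
    return BlobKind::ZstdFrame;
  if (hasPrefix(blob, "CCOB", 4))
    return BlobKind::CompressedBundle;
  if (hasPrefix(blob, kPlain, sizeof(kPlain) - 1)) // skip the trailing NUL
    return BlobKind::PlainBundle;
  if (hasPrefix(blob, kElf, sizeof(kElf)))
    return BlobKind::Elf;
  return BlobKind::Unknown;
}

// Human-readable name for log messages.
const char *toString(BlobKind kind) {
  switch (kind) {
  case BlobKind::ZstdFrame:
    return "zstd frame";
  case BlobKind::CompressedBundle:
    return "compressed offload bundle (CCOB)";
  case BlobKind::PlainBundle:
    return "uncompressed offload bundle";
  case BlobKind::Elf:
    return "ELF object";
  case BlobKind::Unknown:
    break;
  }
  return "unknown";
}
```

If the image classifies as `CompressedBundle` after the wrapper's layer has been decompressed once, that would be consistent with the bundler having compressed it first, which is presumably the state in which HIP rejects it with hipErrorInvalidKernelFile.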
