Build failure with Tensorflow addons 0.20 #2828

Open
npanpaliya opened this issue Apr 10, 2023 · 7 comments

Comments

@npanpaliya
Contributor

npanpaliya commented Apr 10, 2023

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux x86_64
  • TensorFlow version and how it was installed (source or binary): 2.12 via conda package of TF (Built using https://github.com/open-ce/tensorflow-feedstock)
  • TensorFlow-Addons version and how it was installed (source or binary): 0.20 (Built from source)
  • Python version: Python 3.10
  • Is GPU used? (yes/no): yes

Describe the bug
While building TF Addons 0.20 with TF 2.12, CUDA 11.8, and cuDNN 8.8.1, I'm seeing the following build failure:

In file included from /usr/local/cuda-11.8/bin/../targets/x86_64-linux/include/thrust/system/cuda/config.h:33,
                 from /usr/local/cuda-11.8/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/execution_policy.h:35,
                 from /usr/local/cuda-11.8/bin/../targets/x86_64-linux/include/thrust/iterator/detail/device_system_tag.h:23,
                 from /usr/local/cuda-11.8/bin/../targets/x86_64-linux/include/thrust/iterator/detail/iterator_facade_category.h:22,
                 from /usr/local/cuda-11.8/bin/../targets/x86_64-linux/include/thrust/iterator/iterator_facade.h:37,
                 from external/cub_archive/cub/device/../iterator/arg_index_input_iterator.cuh:48,
                 from external/cub_archive/cub/device/device_reduce.cuh:41,
                 from tensorflow_addons/custom_ops/layers/cc/kernels/correlation_cost_op_gpu.cu.cc:20:
/usr/local/cuda-11.8/bin/../targets/x86_64-linux/include/cub/util_namespace.cuh:46:2: error: #error CUB requires a definition of CUB_NS_QUALIFIER when CUB_NS_PREFIX/POSTFIX are defined.
   46 | #error CUB requires a definition of CUB_NS_QUALIFIER when CUB_NS_PREFIX/POSTFIX are defined.
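
One possible reading of the include chain above: headers from two different CUB copies appear in the same translation unit, the Bazel-fetched `external/cub_archive` and the CUB that ships inside the CUDA 11.8 toolkit, and that mix may be what trips the `CUB_NS_QUALIFIER` check. The sketch below is a hypothetical helper for confirming this from a saved build log; the function names `find_cub_roots` and `mixes_cub_copies` are made up for illustration and are not part of Bazel, CUDA, or TensorFlow Addons.

```python
import posixpath
import re

# Captures the directory that precedes "/cub/" in any path to a .cuh header.
_CUB_HEADER = re.compile(r"(\S*?)/cub/\S+?\.cuh")


def find_cub_roots(log_text):
    """Return the sorted, normalized directories that supplied CUB headers."""
    roots = set()
    for line in log_text.splitlines():
        for prefix in _CUB_HEADER.findall(line):
            # Collapse segments such as "bin/../targets" so equal roots compare equal.
            roots.add(posixpath.normpath(prefix))
    return sorted(roots)


def mixes_cub_copies(log_text):
    """True when headers from more than one CUB location appear in the log."""
    return len(find_cub_roots(log_text)) > 1
```

If `mixes_cub_copies` returns `True` for the compiler output, the build is pulling CUB from more than one place, which is a hint (not proof) that the bundled copy and the toolkit copy conflict.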

My .bazelrc looks like

build --action_env TF_HEADER_DIR="/opt/conda/envs/testaddons/lib/python3.10/site-packages/tensorflow/include"
build --action_env TF_SHARED_LIBRARY_DIR="/opt/conda/envs/testaddons/lib/python3.10/site-packages/tensorflow"
build --action_env TF_SHARED_LIBRARY_NAME="libtensorflow_framework.so.2"
build --action_env TF_CXX11_ABI_FLAG="1"
build --action_env TF_CPLUSPLUS_VER="c++17"
build --spawn_strategy=standalone
build --strategy=Genrule=standalone
build --experimental_repo_remote_exec
build -c opt
build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=1"
build --copt=-mavx
build --cxxopt=-std=c++17
build --host_cxxopt=-std=c++17
build --action_env TF_NEED_CUDA="1"
build --action_env CUDA_TOOLKIT_PATH="/usr/local/cuda,/opt/conda/envs/testaddons,/usr/include"
build --action_env CUDNN_INSTALL_PATH="/opt/conda/envs/testaddons"
build --action_env TF_CUDA_VERSION="11"
build --action_env TF_CUDNN_VERSION="8.8"
test --config=cuda
build --config=cuda
build:cuda --define=using_cuda=true --define=using_cuda_nvcc=true
build:cuda [email protected]_manylinux2014-cuda11.8-cudnn8.6-tensorrt8.4_config_cuda//crosstool:toolchain
build --action_env PYTHON_BIN_PATH="/opt/conda/envs/testaddons/bin/python"
build --action_env PYTHON_LIB_PATH="/opt/conda/envs/testaddons/lib/python3.10/site-packages"
build --python_path="/opt/conda/envs/testaddons/bin/python"
build --action_env GCC_HOST_COMPILER_PATH="/opt/conda/envs/testaddons/bin/x86_64-conda-linux-gnu-cc"

Code to reproduce the issue
Build command:
bazel build -s --enable_runfiles build_pip_pkg

Please provide some help to get rid of this build error.

Provide a reproducible test case that is the bare minimum necessary to generate the problem.

Other info / logs

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

@npanpaliya
Contributor Author

@seanpmorgan - Could you please provide some pointer?

@npanpaliya
Contributor Author

Does anyone have any pointers to fix this issue?

@bhack
Contributor

bhack commented Apr 18, 2023

It seems similar to dmlc/xgboost#7378, which was fixed by dmlc/xgboost#7379.

@npanpaliya
Contributor Author

Okay, thanks @bhack. I'll give this a try.

@MrAta

MrAta commented Oct 17, 2023

Running into the same issue when building TF Addons 0.19 with CUDA 11.8. What config should be used in this case?
In my case, removing cub from WORKSPACE, similar to #2821, works. @seanpmorgan, may I know the reason for the cub removal in that PR?
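
Before removing the cub entry, it can help to list every place the checkout refers to it. The sketch below is a minimal, hypothetical search helper. It assumes the repository name is `cub_archive` (taken from the `external/cub_archive` path in the error log) and that references live in files named `WORKSPACE` or `BUILD`, or ending in `.bzl` or `.BUILD`. Both assumptions should be checked against the actual tree.

```python
import os


def find_references(root, needle="cub_archive",
                    names=("WORKSPACE", "BUILD"),
                    suffixes=(".bzl", ".BUILD")):
    """Return sorted (relative_path, line_number, line_text) tuples mentioning needle."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            # Only inspect files that look like Bazel build definitions.
            if name not in names and not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if needle in line:
                        rel = os.path.relpath(path, root)
                        hits.append((rel, lineno, line.strip()))
    return sorted(hits)
```

Calling `find_references(".")` from the repository root returns the lines to review; an empty list means no reference was found under the assumed file names.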

@854768750

I have this issue in another project.
Tried CUDA 10.1 and 12.3. Same issue.
But there is no error with CUDA 11.4

@fuhailin

Same issue with CUDA 10.8
