From d2a2a567c553175695e835797b10fc682992ad57 Mon Sep 17 00:00:00 2001 From: randyh62 Date: Fri, 17 Jan 2025 15:50:38 -0800 Subject: [PATCH] New reorg to remove conflict --- docs/building/build-hipify-perl.rst | 17 + docs/building/building-hipify.rst | 561 +++++++++++++++ docs/hipify-clang.rst | 937 ------------------------- docs/hipify-perl.rst | 57 -- docs/how-to/hipify-clang.rst | 394 +++++++++++ docs/how-to/hipify-perl.rst | 45 ++ docs/index.rst | 28 +- docs/reference/hipify-clang-cmd.rst | 203 ++++++ docs/reference/hipify-perl-cmd.rst | 85 +++ docs/{ => reference}/supported_apis.md | 0 docs/sphinx/_toc.yml.in | 28 +- 11 files changed, 1350 insertions(+), 1005 deletions(-) create mode 100644 docs/building/build-hipify-perl.rst create mode 100644 docs/building/building-hipify.rst delete mode 100644 docs/hipify-clang.rst delete mode 100644 docs/hipify-perl.rst create mode 100644 docs/how-to/hipify-clang.rst create mode 100644 docs/how-to/hipify-perl.rst create mode 100644 docs/reference/hipify-clang-cmd.rst create mode 100644 docs/reference/hipify-perl-cmd.rst rename docs/{ => reference}/supported_apis.md (100%) diff --git a/docs/building/build-hipify-perl.rst b/docs/building/build-hipify-perl.rst new file mode 100644 index 00000000..6518115c --- /dev/null +++ b/docs/building/build-hipify-perl.rst @@ -0,0 +1,17 @@ +.. meta:: + :description: Tools to automatically translate CUDA source code into portable HIP C++ + :keywords: HIPIFY, ROCm, library, tool, CUDA, CUDA2HIP, hipify-clang, hipify-perl + +.. _build-hipify-perl: + +==================== +Building hipify-perl +==================== + +``hipify-perl`` is a Perl script that relies heavily on regular expressions and is automatically generated by ``hipify-clang``. To generate ``hipify-perl``, run: + +.. code-block:: shell + + hipify-clang --perl + +To specify the output directory for the generated ``hipify-perl`` file, use the ``--o-hipify-perl-dir`` option.
diff --git a/docs/building/building-hipify.rst b/docs/building/building-hipify.rst new file mode 100644 index 00000000..7770de41 --- /dev/null +++ b/docs/building/building-hipify.rst @@ -0,0 +1,561 @@ +.. meta:: + :description: Tools to automatically translate CUDA source code into portable HIP C++ + :keywords: HIPIFY, ROCm, library, tool, CUDA, CUDA2HIP, hipify-clang, hipify-perl + +.. _build-hipify-clang: + +************************************************************************** +Building hipify-clang +************************************************************************** + +After cloning the HIPIFY repository (``git clone https://github.com/ROCm/HIPIFY.git``), run the following commands from the HIPIFY root folder. + +.. code-block:: bash + + cd .. && \ + mkdir build dist && \ + cd build + + cmake \ + -DCMAKE_INSTALL_PREFIX=../dist \ + -DCMAKE_BUILD_TYPE=Release \ + ../hipify + + make -j install + +To ensure LLVM is found, or in case of multiple LLVM instances, specify the path to the root folder containing the LLVM distribution: + +.. code-block:: bash + + -DCMAKE_PREFIX_PATH=/usr/llvm/19.1.6/dist + +On Windows, specify the following option for CMake: +``-G "Visual Studio 17 2022"`` + +Build the generated ``hipify-clang.sln`` using ``Visual Studio 17 2022`` instead of ``Make``. See :ref:`Windows testing` for the +supported tools for building. + +The ``Debug`` build type (``-DCMAKE_BUILD_TYPE=Debug``) is also supported and tested; when using it, it is recommended to build ``LLVM+Clang`` +in ``Debug`` mode as well. + +The 64-bit build mode (``-Thost=x64`` on Windows) is also supported; when using it, it is recommended to build ``LLVM+Clang`` in +64-bit mode as well. + +You can find the binary at ``./dist/hipify-clang`` or in the folder specified by the +``-DCMAKE_INSTALL_PREFIX`` option. + +Testing hipify-clang +================================================ + +``hipify-clang`` is equipped with unit tests using LLVM +`lit `_ and `FileCheck `_.
+ +Build ``LLVM+Clang`` from sources, as prebuilt binaries are not sufficient for testing. Before +building, ensure that the +`software required for building `_ +is of an appropriate version. + +LLVM >= 10.0.0 +----------------- + +1. Download `LLVM project `_ sources. + +2. Build `LLVM project `_: + + .. code-block:: bash + + cd .. && \ + mkdir build dist && \ + cd build + + **Linux**: + + .. code-block:: bash + + cmake \ + -DCMAKE_INSTALL_PREFIX=../dist \ + -DLLVM_TARGETS_TO_BUILD="X86" \ + -DLLVM_ENABLE_PROJECTS="clang" \ + -DLLVM_INCLUDE_TESTS=OFF \ + -DCMAKE_BUILD_TYPE=Release \ + ../llvm-project/llvm + make -j install + + **Windows**: + + .. code-block:: shell + + cmake \ + -G "Visual Studio 17 2022" \ + -A x64 \ + -Thost=x64 \ + -DCMAKE_INSTALL_PREFIX=../dist \ + -DLLVM_TARGETS_TO_BUILD="" \ + -DLLVM_ENABLE_PROJECTS="clang" \ + -DLLVM_INCLUDE_TESTS=OFF \ + -DCMAKE_BUILD_TYPE=Release \ + ../llvm-project/llvm + + Run ``Visual Studio 17 2022``, open the generated ``LLVM.sln``, build all, and build the ``INSTALL`` project. + +3. Install `CUDA `_ version 7.0 or + greater. + + * In case of multiple CUDA installations, specify the particular version using the ``CUDA_TOOLKIT_ROOT_DIR`` option: + + **Linux**: + + .. code-block:: bash + + -DCUDA_TOOLKIT_ROOT_DIR=/usr/include + + **Windows**: + + .. code-block:: shell + + -DCUDA_TOOLKIT_ROOT_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6" + + -DCUDA_SDK_ROOT_DIR="C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.6" + +4. [Optional] Install `cuTensor `_: + + * To specify the path to `cuTensor `_, use the ``CUDA_TENSOR_ROOT_DIR`` option: + + **Linux**: + + .. code-block:: bash + + -DCUDA_TENSOR_ROOT_DIR=/usr/include + + **Windows**: + + .. code-block:: shell + + -DCUDA_TENSOR_ROOT_DIR=D:/CUDA/cuTensor/2.0.2.1 + +5. [Optional] Install the `cuDNN `_ version that corresponds + to your CUDA version: + + * To specify the path to `cuDNN `_, use the ``CUDA_DNN_ROOT_DIR`` option: + + **Linux**: + + ..
code-block:: bash + + -DCUDA_DNN_ROOT_DIR=/usr/include + + **Windows**: + + .. code-block:: shell + + -DCUDA_DNN_ROOT_DIR=D:/CUDA/cuDNN/9.6.0 + +6. [Optional] Install `CUB 1.9.8 `_ for ``CUDA < 11.0`` only; + for ``CUDA >= 11.0``, the CUB shipped with CUDA will be used for testing. + + * To specify the path to CUB, use the ``CUDA_CUB_ROOT_DIR`` option (only for ``CUDA < 11.0``): + + **Linux**: + + .. code-block:: bash + + -DCUDA_CUB_ROOT_DIR=/srv/git/CUB + + **Windows**: + + .. code-block:: shell + + -DCUDA_CUB_ROOT_DIR=D:/CUDA/CUB + +7. Install `Python `_ version 3.0 or greater. + +8. Install ``lit`` and ``FileCheck``; these are distributed with LLVM. + + * Install ``lit`` into ``Python``: + + **Linux**: + + .. code-block:: bash + + python /usr/llvm/19.1.6/llvm-project/llvm/utils/lit/setup.py install + + **Windows**: + + .. code-block:: shell + + python D:/LLVM/19.1.6/llvm-project/llvm/utils/lit/setup.py install + + In case of errors similar to ``ModuleNotFoundError: No module named 'setuptools'``, upgrade the ``setuptools`` package: + + .. code-block:: bash + + python -m pip install --upgrade pip setuptools + + * Starting with LLVM 6.0.1, specify the path to the ``llvm-lit`` Python script using the ``LLVM_EXTERNAL_LIT`` option: + + **Linux**: + + .. code-block:: bash + + -DLLVM_EXTERNAL_LIT=/usr/llvm/19.1.6/build/bin/llvm-lit + + **Windows**: + + .. code-block:: shell + + -DLLVM_EXTERNAL_LIT=D:/LLVM/19.1.6/build/Release/bin/llvm-lit.py + + * ``FileCheck``: + + **Linux**: + + Copy from ``/usr/llvm/19.1.6/build/bin/`` to ``CMAKE_INSTALL_PREFIX/dist/bin``. + + **Windows**: + + Copy from ``D:/LLVM/19.1.6/build/Release/bin`` to ``CMAKE_INSTALL_PREFIX/dist/bin``. + + Alternatively, specify the path to ``FileCheck`` in the ``CMAKE_INSTALL_PREFIX`` option. + +9. To run OpenGL tests successfully on: + + **Linux**: + + Install GL headers. + + On Ubuntu, use: ``sudo apt-get install mesa-common-dev`` + + **Windows**: + + No installation required. 
All the required headers are shipped with the Windows SDK. + +10. Set the ``HIPIFY_CLANG_TESTS`` option to ``ON``: ``-DHIPIFY_CLANG_TESTS=ON`` + +11. Build and run tests. + +LLVM <= 9.0.1 +--------------------------------------------------------------------- + +1. Download `LLVM `_ \+ `Clang `_ sources. + +2. Build `LLVM+Clang `_: + + .. code-block:: bash + + cd .. && \ + mkdir build dist && \ + cd build + + **Linux**: + + .. code-block:: bash + + cmake \ + -DCMAKE_INSTALL_PREFIX=../dist \ + -DLLVM_SOURCE_DIR=../llvm \ + -DLLVM_TARGETS_TO_BUILD="X86" \ + -DLLVM_INCLUDE_TESTS=OFF \ + -DCMAKE_BUILD_TYPE=Release \ + ../llvm + make -j install + + **Windows**: + + .. code-block:: shell + + cmake \ + -G "Visual Studio 16 2019" \ + -A x64 \ + -Thost=x64 \ + -DCMAKE_INSTALL_PREFIX=../dist \ + -DLLVM_SOURCE_DIR=../llvm \ + -DLLVM_TARGETS_TO_BUILD="" \ + -DLLVM_INCLUDE_TESTS=OFF \ + -DCMAKE_BUILD_TYPE=Release \ + ../llvm + +3. On Windows, run ``Visual Studio 16 2019``, open the generated ``LLVM.sln``, build all, and build the ``INSTALL`` project. + +Linux testing +====================================================== + +On Linux, the following configurations are tested: + +* Ubuntu 22-23: LLVM 13.0.0 - 19.1.6, CUDA 7.0 - 12.6.3, cuDNN 8.0.5 - 9.6.0, cuTensor 1.0.1.0 - 2.0.2.1 +* Ubuntu 20-21: LLVM 9.0.0 - 19.1.6, CUDA 7.0 - 12.6.3, cuDNN 5.1.10 - 9.6.0, cuTensor 1.0.1.0 - 2.0.2.1 +* Ubuntu 16-19: LLVM 8.0.0 - 14.0.6, CUDA 7.0 - 10.2, cuDNN 5.1.10 - 8.0.5 +* Ubuntu 14: LLVM 4.0.0 - 7.1.0, CUDA 7.0 - 9.0, cuDNN 5.0.5 - 7.6.5 + +Minimum build system requirements for the above configurations: + +* CMake 3.16.8, GNU C/C++ 9.2, Python 3.0. + +Recommended build system requirements: + +* CMake 3.31.2, GNU C/C++ 13.2, Python 3.13.1. + +Here's how to build ``hipify-clang`` with testing support on ``Ubuntu 23.10.01``: + +..
code-block:: bash + + cmake \ + -DHIPIFY_CLANG_TESTS=ON \ + -DCMAKE_BUILD_TYPE=Release \ + -DCMAKE_INSTALL_PREFIX=../dist \ + -DCMAKE_PREFIX_PATH=/usr/llvm/19.1.6/dist \ + -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.6.3 \ + -DCUDA_DNN_ROOT_DIR=/usr/local/cudnn-9.6.0 \ + -DCUDA_TENSOR_ROOT_DIR=/usr/local/cutensor-2.0.2.1 \ + -DLLVM_EXTERNAL_LIT=/usr/llvm/19.1.6/build/bin/llvm-lit \ + ../hipify + +The corresponding successful output is: + +.. code-block:: shell + + -- The C compiler identification is GNU 13.2.0 + -- The CXX compiler identification is GNU 13.2.0 + -- Detecting C compiler ABI info + -- Detecting C compiler ABI info - done + -- Check for working C compiler: /usr/bin/cc - skipped + -- Detecting C compile features + -- Detecting C compile features - done + -- Detecting CXX compiler ABI info + -- Detecting CXX compiler ABI info - done + -- Check for working CXX compiler: /usr/bin/c++ - skipped + -- Detecting CXX compile features + -- Detecting CXX compile features - done + -- HIPIFY config: + -- - Build hipify-clang : ON + -- - Test hipify-clang : ON + -- - Is part of HIP SDK : OFF + -- - Install clang headers : ON + -- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.13") + -- Found LLVM 19.1.6: + -- - CMake module path : /usr/llvm/19.1.6/dist/lib/cmake/llvm + -- - Clang include path : /usr/llvm/19.1.6/dist/include + -- - LLVM Include path : /usr/llvm/19.1.6/dist/include + -- - Binary path : /usr/llvm/19.1.6/dist/bin + -- Linker detection: GNU ld + -- ---- The below configuring for hipify-clang testing only ---- + -- Found Python: /usr/bin/python3.13 (found suitable version "3.13.1", required range is "3.0...3.14") found components: Interpreter + -- Found lit: /usr/local/bin/lit + -- Found FileCheck: /GIT/LLVM/trunk/dist/FileCheck + -- Initial CUDA to configure: + -- - CUDA Toolkit path : /usr/local/cuda-12.6.3 + -- - CUDA Samples path : + -- - cuDNN path : /usr/local/cudnn-9.6.0 + -- - cuTENSOR path : /usr/local/cuTensor/2.0.2.1 + -- - CUB
path : + -- Found CUDAToolkit: /usr/local/cuda-12.6.3/targets/x86_64-linux/include (found version "12.6.85") + -- Performing Test CMAKE_HAVE_LIBC_PTHREAD + -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success + -- Found Threads: TRUE + -- Found CUDA config: + -- - CUDA Toolkit path : /usr/local/cuda-12.6.3 + -- - CUDA Samples path : OFF + -- - cuDNN path : /usr/local/cudnn-9.6.0 + -- - CUB path : /usr/local/cuda-12.6.3/include/cub + -- - cuTENSOR path : /usr/local/cuTensor/2.0.2.1 + -- Configuring done (0.6s) + -- Generating done (0.0s) + -- Build files have been written to: /usr/hipify/build + +.. code-block:: shell + + make test-hipify + +The corresponding successful output is: + +.. code-block:: shell + + Running HIPify regression tests + =============================================================== + CUDA 12.6.85 - will be used for testing + LLVM 19.1.6 - will be used for testing + x86_64 - Platform architecture + Linux 6.5.0-15-generic - Platform OS + 64 - hipify-clang binary bitness + 64 - python 3.13.1 binary bitness + =============================================================== + -- Testing: 106 tests, 12 threads -- + Testing Time: 6.91s + + Total Discovered Tests: 106 + Passed: 106 (100.00%) + +.. _Windows testing: + +Windows testing +===================================================== + +Tested configurations: + +.. 
list-table:: + :header-rows: 1 + + * - LLVM + - CUDA + - cuDNN + - Visual Studio + - CMake + - Python + * - ``4.0.0 - 5.0.2`` + - ``7.0 - 8.0`` + - ``5.1.10 - 7.1.4`` + - ``2015.14.0, 2017.15.5.2`` + - ``3.5.1 - 3.18.0`` + - ``3.6.4 - 3.8.5`` + * - ``6.0.0 - 6.0.1`` + - ``7.0 - 9.0`` + - ``7.0.5 - 7.6.5`` + - ``2015.14.0, 2017.15.5.5`` + - ``3.6.0 - 3.18.0`` + - ``3.7.2 - 3.8.5`` + * - ``7.0.0 - 7.1.0`` + - ``7.0 - 9.2`` + - ``7.0.5 - 7.6.5`` + - ``2017.15.9.11`` + - ``3.13.3 - 3.18.0`` + - ``3.7.3 - 3.8.5`` + * - ``8.0.0 - 8.0.1`` + - ``7.0 - 10.0`` + - ``7.6.5`` + - ``2017.15.9.15`` + - ``3.14.2 - 3.18.0`` + - ``3.7.4 - 3.8.5`` + * - ``9.0.0 - 9.0.1`` + - ``7.0 - 10.1`` + - ``7.6.5`` + - ``2017.15.9.20, 2019.16.4.5`` + - ``3.16.4 - 3.18.0`` + - ``3.8.0 - 3.8.5`` + * - ``10.0.0 - 11.0.0`` + - ``7.0 - 11.1`` + - ``7.6.5 - 8.0.5`` + - ``2017.15.9.30, 2019.16.8.3`` + - ``3.19.2`` + - ``3.9.1`` + * - ``11.0.1 - 11.1.0`` + - ``7.0 - 11.2.2`` + - ``7.6.5 - 8.0.5`` + - ``2017.15.9.31, 2019.16.8.4`` + - ``3.19.3`` + - ``3.9.2`` + * - ``12.0.0 - 13.0.1`` + - ``7.0 - 11.5.1`` + - ``7.6.5 - 8.3.2`` + - ``2017.15.9.43, 2019.16.11.9`` + - ``3.22.2`` + - ``3.10.2`` + * - ``14.0.0 - 14.0.6`` + - ``7.0 - 11.7.1`` + - ``8.0.5 - 8.4.1`` + - ``2017.15.9.57,`` :sup:`5` ``2019.16.11.17, 2022.17.2.6`` + - ``3.24.0`` + - ``3.10.6`` + * - ``15.0.0 - 15.0.7`` + - ``7.0 - 11.8.0`` + - ``8.0.5 - 8.8.1`` + - ``2019.16.11.25, 2022.17.5.2`` + - ``3.26.0`` + - ``3.11.2`` + * - ``16.0.0 - 16.0.6`` + - ``7.0 - 12.2.2`` + - ``8.0.5 - 8.9.5`` + - ``2019.16.11.29, 2022.17.7.1`` + - ``3.27.3`` + - ``3.11.4`` + * - ``17.0.1`` :sup:`6` - ``18.1.8`` :sup:`7` + - ``7.0 - 12.3.2`` + - ``8.0.5 - 9.6.0`` + - ``2019.16.11.42, 2022.17.12.3`` + - ``3.31.2`` + - ``3.13.1`` + * - ``19.1.0 - 19.1.6`` + - ``7.0 - 12.6.3`` + - ``8.0.5 - 9.6.0`` + - ``2019.16.11.42, 2022.17.12.3`` + - ``3.31.2`` + - ``3.13.1`` + +:sup:`5` LLVM 14.x.x is the latest major release supporting Visual Studio 2017. 
+ +To build LLVM 14.x.x correctly using Visual Studio 2017, add ``-DLLVM_FORCE_USE_OLD_TOOLCHAIN=ON`` +to the corresponding CMake command line. + +You can also build LLVM \< 14.x.x correctly using Visual Studio 2017 without the +``LLVM_FORCE_USE_OLD_TOOLCHAIN`` option. + +:sup:`6` Note that LLVM 17.0.0 was withdrawn due to an issue; use 17.0.1 or newer instead. + +:sup:`7` Note that LLVM 18.0.0 has never been released; use 18.1.0 or newer instead. + +Building with testing support using ``Visual Studio 17 2022`` on ``Windows 11``: + +.. code-block:: shell + + cmake \ + -G "Visual Studio 17 2022" \ + -A x64 \ + -Thost=x64 \ + -DHIPIFY_CLANG_TESTS=ON \ + -DCMAKE_BUILD_TYPE=Release \ + -DCMAKE_INSTALL_PREFIX=../dist \ + -DCMAKE_PREFIX_PATH=D:/LLVM/19.1.6/dist \ + -DCUDA_TOOLKIT_ROOT_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6" \ + -DCUDA_SDK_ROOT_DIR="C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5" \ + -DCUDA_DNN_ROOT_DIR=D:/CUDA/cuDNN/9.6.0 \ + -DCUDA_TENSOR_ROOT_DIR=D:/CUDA/cuTensor/2.0.2.1 \ + -DLLVM_EXTERNAL_LIT=D:/LLVM/19.1.6/build/Release/bin/llvm-lit.py \ + ../hipify + +The corresponding successful output is: + +.. code-block:: shell + + -- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.22631.
+ -- The C compiler identification is MSVC 19.42.34435.0 + -- The CXX compiler identification is MSVC 19.42.34435.0 + -- Detecting C compiler ABI info + -- Detecting C compiler ABI info - done + -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe - skipped + -- Detecting C compile features + -- Detecting C compile features - done + -- Detecting CXX compiler ABI info + -- Detecting CXX compiler ABI info - done + -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe - skipped + -- Detecting CXX compile features + -- Detecting CXX compile features - done + -- HIPIFY config: + -- - Build hipify-clang : ON + -- - Test hipify-clang : ON + -- - Is part of HIP SDK : OFF + -- - Install clang headers : ON + -- Found LLVM 19.1.6: + -- - CMake module path : D:/LLVM/19.1.6/dist/lib/cmake/llvm + -- - Clang include path : D:/LLVM/19.1.6/dist/include + -- - LLVM Include path : D:/LLVM/19.1.6/dist/include + -- - Binary path : D:/LLVM/19.1.6/dist/bin + -- ---- The below configuring for hipify-clang testing only ---- + -- Found Python: C:/Users/TT/AppData/Local/Programs/Python/Python313/python.exe (found suitable version "3.13.1", required range is "3.0...3.14") found components: Interpreter + -- Found lit: C:/Users/TT/AppData/Local/Programs/Python/Python313/Scripts/lit.exe + -- Found FileCheck: D:/LLVM/19.1.6/dist/bin/FileCheck.exe + -- Initial CUDA to configure: + -- - CUDA Toolkit path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6 + -- - CUDA Samples path : C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5 + -- - cuDNN path : D:/CUDA/cuDNN/9.6.0 + -- - cuTENSOR path : D:/CUDA/cuTensor/2.0.2.1 + -- - CUB path : + -- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/include (found version "12.6.85") + -- Found CUDA config: + -- - CUDA Toolkit path : C:/Program Files/NVIDIA 
GPU Computing Toolkit/CUDA/v12.6 + -- - CUDA Samples path : C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5 + -- - cuDNN path : D:/CUDA/cuDNN/9.6.0 + -- - cuTENSOR path : D:/CUDA/cuTensor/2.0.2.1 + -- - CUB path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/include/cub + -- Configuring done (4.4s) + -- Generating done (0.1s) + -- Build files have been written to: D:/HIPIFY/build + +Run ``Visual Studio 17 2022``, open the generated ``hipify-clang.sln``, and build the project ``test-hipify``. diff --git a/docs/hipify-clang.rst b/docs/hipify-clang.rst deleted file mode 100644 index d7323d25..00000000 --- a/docs/hipify-clang.rst +++ /dev/null @@ -1,937 +0,0 @@ -.. meta:: - :description: Tools to automatically translate CUDA source code into portable HIP C++ - :keywords: HIPIFY, ROCm, library, tool, CUDA, CUDA2HIP, hipify-clang, hipify-perl - -.. _hipify-clang: - -************************************************************************** -hipify-clang -************************************************************************** - -``hipify-clang`` is a Clang-based tool for translating NVIDIA CUDA sources into HIP sources. - -It translates CUDA source into an Abstract Syntax Tree (AST), which is traversed by transformation -matchers. After applying all the matchers, the output HIP source is produced. - -**Advantages:** - -- ``hipify-clang`` is a translator. It parses complex constructs successfully or else reports an error. -- It supports Clang options such as - `-I `_, - `-D `_, and - `--cuda-path `_. -- The support for new CUDA versions is seamless, as the Clang front-end is statically linked into - ``hipify-clang`` and does all the syntactical parsing of a CUDA source to HIPIFY. -- It is very well supported as a compiler extension. - -**Disadvantages:** - -- You must ensure that the input CUDA code is correct as incorrect code can't be translated to HIP. 
-- You must install CUDA and in case of multiple installations, specify using ``--cuda-path`` option. -- You must provide all the ``includes`` and ``defines`` to successfully translate the code. - -Dependencies -================ - -``hipify-clang`` requires: - -* `LLVM+Clang `_ of at least version - `4.0.0 `_; the latest stable and recommended release: - `19.1.6 `_. - -* `CUDA `_ of at least version - `7.0 `_, the latest supported version is - `12.6.3 `_. - -.. list-table:: - - * - LLVM release version - - Latest supported CUDA version - - Windows - - Linux - * - `3.8.0 `_ :sup:`1`, - `3.8.1 `_ :sup:`1`, - `3.9.0 `_ :sup:`1`, - `3.9.1 `_ :sup:`1` - - `7.5 `_ - - ✅ - - ✅ - * - `4.0.0 `_, - `4.0.1 `_, - `5.0.0 `_, - `5.0.1 `_, - `5.0.2 `_ - - `8.0 `_ - - ✅ - - ✅ - * - `6.0.0 `_, - `6.0.1 `_ - - `9.0 `_ - - ✅ - - ✅ - * - `7.0.0 `_, - `7.0.1 `_, - `7.1.0 `_ - - `9.2 `_ - - Works only with patch due to Clang bug `38811 `_ - |patch for 7.0.0| :sup:`2` - |patch for 7.0.1| :sup:`2` - |patch for 7.1.0| :sup:`2` - - ❌ due to Clang bug `36384 `_ - * - `8.0.0 `_, - `8.0.1 `_ - - `10.0 `_ - - Works only with patch due to Clang bug `38811 `_ - |patch for 8.0.0| :sup:`2` - |patch for 8.0.1| :sup:`2` - - ✅ - * - `9.0.0 `_, - `9.0.1 `_ - - `10.1 `_ - - ✅ - - ✅ - * - `10.0.0 `_, - `10.0.1 `_ - - `11.0.0 `_ - - ✅ - - ✅ - * - `10.0.0 `_, - `10.0.1 `_ - - `11.0.1 `_, - `11.1.0 `_, - `11.1.1 `_ - - Works only with patch due to Clang bug `47332 `_ - |patch for 10.0.0| :sup:`3` - |patch for 10.0.1| :sup:`3` - - Works only with patch due to Clang bug `47332 `_ - |patch for 10.0.0| :sup:`3` - |patch for 10.0.1| :sup:`3` - * - `11.0.0 `_ - - `11.0.0 `_ - - ✅ - - ✅ - * - `11.0.0 `_ - - `11.0.1 `_, - `11.1.0 `_, - `11.1.1 `_ - - Works only with patch due to Clang bug `47332 `_ - |patch for 11.0.0| :sup:`3` - - Works only with patch due to Clang bug `47332 `_ - |patch for 11.0.0| :sup:`3` - * - `11.0.1 `_, - `11.1.0 `_ - - `11.2.2 `_ - - ✅ - - ✅ - * - `12.0.0 `_, - `12.0.1 `_, - `13.0.0 `_, - 
`13.0.1 `_ - - `11.5.1 `_ - - ✅ - - ✅ - * - `14.0.0 `_, - `14.0.1 `_, - `14.0.2 `_, - `14.0.3 `_, - `14.0.4 `_ - - `11.7.1 `_ - - Works only with patch due to Clang bug `54609 `_ - |patch for 14.0.0| :sup:`2` - |patch for 14.0.1| :sup:`2` - |patch for 14.0.2| :sup:`2` - |patch for 14.0.3| :sup:`2` - |patch for 14.0.4| :sup:`2` - - ✅ - * - `14.0.5 `_, - `14.0.6 `_, - `15.0.0 `_, - `15.0.1 `_, - `15.0.2 `_, - `15.0.3 `_, - `15.0.4 `_, - `15.0.5 `_, - `15.0.6 `_, - `15.0.7 `_ - - `11.8.0 `_ - - ✅ - - ✅ - * - `16.0.0 `_, - `16.0.1 `_, - `16.0.2 `_, - `16.0.3 `_, - `16.0.4 `_, - `16.0.5 `_, - `16.0.6 `_ - - `12.2.2 `_ - - ✅ - - ✅ - * - `17.0.1 `_, - `17.0.2 `_, - `17.0.3 `_, - `17.0.4 `_, - `17.0.5 `_, - `17.0.6 `_, - `18.1.0 `_, - `18.1.1 `_, - `18.1.2 `_, - `18.1.3 `_, - `18.1.4 `_, - `18.1.5 `_, - `18.1.6 `_, - `18.1.7 `_, - `18.1.8 `_ - - `12.3.2 `_ - - ✅ - - ✅ - * - `19.1.0 `_, - `19.1.1 `_, - `19.1.2 `_, - `19.1.3 `_, - `19.1.4 `_, - `19.1.5 `_, - `19.1.6 `_:sup:`4` - - `12.6.3 `_:sup:`4` - - **Latest stable config** - - **Latest stable config** - -.. |patch for 7.0.0| replace:: - :download:`patch for 7.0.0 <./data/patches/patch_for_clang_7.0.0_bug_38811.zip>` -.. |patch for 7.0.1| replace:: - :download:`patch for 7.0.1 <./data/patches/patch_for_clang_7.0.1_bug_38811.zip>` -.. |patch for 7.1.0| replace:: - :download:`patch for 7.1.0 <./data/patches/patch_for_clang_7.1.0_bug_38811.zip>` -.. |patch for 8.0.0| replace:: - :download:`patch for 8.0.0 <./data/patches/patch_for_clang_8.0.0_bug_38811.zip>` -.. |patch for 8.0.1| replace:: - :download:`patch for 8.0.1 <./data/patches/patch_for_clang_8.0.1_bug_38811.zip>` -.. |patch for 10.0.0| replace:: - :download:`patch for 10.0.0 <./data/patches/patch_for_clang_10.0.0_bug_47332.zip>` -.. |patch for 10.0.1| replace:: - :download:`patch for 10.0.1 <./data/patches/patch_for_clang_10.0.1_bug_47332.zip>` -.. |patch for 11.0.0| replace:: - :download:`patch for 11.0.0 <./data/patches/patch_for_clang_11.0.0_bug_47332.zip>` -.. 
|patch for 14.0.0| replace:: - :download:`patch for 14.0.0 <./data/patches/patch_for_clang_14.0.0_bug_54609.zip>` -.. |patch for 14.0.1| replace:: - :download:`patch for 14.0.1 <./data/patches/patch_for_clang_14.0.1_bug_54609.zip>` -.. |patch for 14.0.2| replace:: - :download:`patch for 14.0.2 <./data/patches/patch_for_clang_14.0.2_bug_54609.zip>` -.. |patch for 14.0.3| replace:: - :download:`patch for 14.0.3 <./data/patches/patch_for_clang_14.0.3_bug_54609.zip>` -.. |patch for 14.0.4| replace:: - :download:`patch for 14.0.4 <./data/patches/patch_for_clang_14.0.4_bug_54609.zip>` - -:sup:`1` ``LLVM 3.x`` is no longer supported (but might still work). - -:sup:`2` Download the patch and unpack it into your ``LLVM distributive directory``. This overwrites a few header files. You don't need to rebuild ``LLVM``. - -:sup:`3` Download the patch and unpack it into your ``LLVM source directory``. This overwrites the ``Cuda.cpp`` file. You need to rebuild ``LLVM``. - -:sup:`4` Represents the latest supported and recommended configuration. - -In most cases, you can get a suitable version of ``LLVM+Clang`` with your package manager. However, you can also -`download a release archive `_ and build or install it. In case of multiple versions of ``LLVM`` installed, set -`CMAKE_PREFIX_PATH `_ so that -``CMake`` can find the desired version of ``LLVM``. For example, ``-DCMAKE_PREFIX_PATH=D:\LLVM\19.1.6\dist``. - -Usage -============================================================ - -To process a file, ``hipify-clang`` needs access to the same headers that are required to compile it -with ``Clang``: - -.. code:: shell - - ./hipify-clang square.cu --cuda-path=/usr/local/cuda-12.6 -I /usr/local/cuda-12.6/samples/common/inc - -``hipify-clang`` arguments are supplied first, followed by a separator ``--`` and the arguments to be -passed to Clang for compiling the input file: - -.. 
code:: shell - - ./hipify-clang cpp17.cu --cuda-path=/usr/local/cuda-12.6 -- -std=c++17 - -``hipify-clang`` also supports the hipification of multiple files that can be specified in a single -command with absolute or relative paths: - -.. code:: shell - - ./hipify-clang cpp17.cu ../../square.cu /home/user/cuda/intro.cu --cuda-path=/usr/local/cuda-12.6 -- -std=c++17 - -To use a specific version of LLVM during hipification, specify the ``hipify-clang`` option -``--clang-resource-directory=`` to point to the Clang resource directory, which is the -parent directory for the ``include`` folder that contains ``__clang_cuda_runtime_wrapper.h`` and other -header files used during the hipification process: - -.. code:: shell - - ./hipify-clang square.cu --cuda-path=/usr/local/cuda-12.6 --clang-resource-directory=/usr/llvm/19.1.6/dist/lib/clang/19 - -For more information, refer to the `Clang manual for compiling CUDA `_. - -Using JSON compilation database -===================================================== - -For some hipification automation (starting from Clang 8.0.0), you can also provide a -`Compilation Database in JSON format `_ -in the ``compile_commands.json`` file: - -.. code:: bash - - -p or - -p= - -You can provide the compilation database in the ``compile_commands.json`` file or generate using -Clang based on CMake. You can specify multiple source files as well. - -To provide Clang options, use ``compile_commands.json`` file, whereas to provide ``hipify-clang`` options, use ``hipify-clang`` command line. - -.. note:: - - Don't use the options separator ``--`` to avoid compilation error caused due to the ``hipify-clang`` options being - provided before the separator. - -Here's an -`example `_ -demonstrating the ``compile_commands.json`` usage: - -.. 
code:: json - - [ - { - "directory": "", - "command": "hipify-clang \"\" -I./include -v", - "file": "cd_intro.cu" - } - ] - -Hipification statistics -======================================================= - -The options ``--print-stats`` and ``--print-stats-csv`` provide an overview of what is hipified and -what is not, and the hipification statistics: - -.. code:: cpp - - hipify-clang intro.cu -cuda-path="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6" --print-stats - -.. code:: cpp - - [HIPIFY] info: file 'intro.cu' statistics: - CONVERTED refs count: 40 - UNCONVERTED refs count: 0 - CONVERSION %: 100.0 - REPLACED bytes: 604 - [HIPIFY] info: file 'intro.cu' statistics: - CONVERTED refs count: 40 - UNCONVERTED refs count: 0 - CONVERSION %: 100.0 - REPLACED bytes: 604 - TOTAL bytes: 5794 - CHANGED lines of code: 34 - TOTAL lines of code: 174 - CODE CHANGED (in bytes) %: 10.4 - CODE CHANGED (in lines) %: 19.5 - TIME ELAPSED s: 0.41 - [HIPIFY] info: CONVERTED refs by type: - error: 2 - device: 2 - memory: 16 - event: 9 - thread: 1 - include_cuda_main_header: 1 - type: 2 - numeric_literal: 7 - [HIPIFY] info: CONVERTED refs by API: - CUDA Driver API: 1 - CUDA RT API: 39 - [HIPIFY] info: CONVERTED refs by names: - cuda.h: 1 - cudaDeviceReset: 1 - cudaError_t: 1 - cudaEventCreate: 2 - cudaEventElapsedTime: 1 - cudaEventRecord: 3 - cudaEventSynchronize: 3 - cudaEvent_t: 1 - cudaFree: 4 - cudaFreeHost: 3 - cudaGetDeviceCount: 1 - cudaGetErrorString: 1 - cudaGetLastError: 1 - cudaMalloc: 3 - cudaMemcpy: 6 - cudaMemcpyDeviceToHost: 3 - cudaMemcpyHostToDevice: 3 - cudaSuccess: 1 - cudaThreadSynchronize: 1 - -.. code-block:: cpp - - hipify-clang intro.cu -cuda-path="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6" --print-stats-csv - -This generates ``intro.cu.csv`` file with statistics: - -.. image:: ./data/csv_statistics.png - :alt: list of stats - - -In case of multiple source files, the statistics are provided per file and in total. 
- -For a list of ``hipify-clang`` options, run ``hipify-clang --help``. - -Building hipify-clang -===================================== - -After cloning the HIPIFY repository (``git clone https://github.com/ROCm/HIPIFY.git``), run the following commands from the HIPIFY root folder. - -.. code-block:: bash - - cd .. \ - mkdir build dist \ - cd build - - cmake \ - -DCMAKE_INSTALL_PREFIX=../dist \ - -DCMAKE_BUILD_TYPE=Release \ - ../hipify - - make -j install - -To ensure LLVM being found or in case of multiple LLVM instances, specify the path to the root folder containing the LLVM distributive: - -.. code-block:: bash - - -DCMAKE_PREFIX_PATH=/usr/llvm/19.1.6/dist - -On Windows, specify the following option for CMake in the first place: -``-G "Visual Studio 17 2022"``. -Build the generated ``hipify-clang.sln`` using -``Visual Studio 17 2022`` instead of ``Make``. See :ref:`Windows testing` for the -supported tools for building. - -As debug build type ``-DCMAKE_BUILD_TYPE=Debug`` is supported and tested, it is recommended to build ``LLVM+Clang`` -in ``debug`` mode. - -Also, 64-bit build mode (``-Thost=x64`` on Windows) is supported, hence it is recommended to build ``LLVM+Clang`` in -64-bit mode. - -You can find the binary at ``./dist/hipify-clang`` or at the folder specified by the -``-DCMAKE_INSTALL_PREFIX`` option. - -Testing hipify-clang -================================================ - -``hipify-clang`` is equipped with unit tests using LLVM -`lit `_ or `FileCheck `_. - -Build ``LLVM+Clang`` from sources, as prebuilt binaries are not exhaustive for testing. Before -building, ensure that the -`software required for building `_ -belongs to an appropriate version. - -LLVM <= 9.0.1 ---------------------------------------------------------------------- - -1. Download `LLVM `_ \+ `Clang `_ sources - -2. Build `LLVM+Clang `_: - - .. code-block:: bash - - cd .. \ - mkdir build dist \ - cd build - - **Linux**: - - .. 
code-block:: bash - - cmake \ - -DCMAKE_INSTALL_PREFIX=../dist \ - -DLLVM_SOURCE_DIR=../llvm \ - -DLLVM_TARGETS_TO_BUILD="X86" \ - -DLLVM_INCLUDE_TESTS=OFF \ - -DCMAKE_BUILD_TYPE=Release \ - ../llvm - make -j install - - **Windows**: - - .. code-block:: shell - - cmake \ - -G "Visual Studio 16 2019" \ - -A x64 \ - -Thost=x64 \ - -DCMAKE_INSTALL_PREFIX=../dist \ - -DLLVM_SOURCE_DIR=../llvm \ - -DLLVM_TARGETS_TO_BUILD="" \ - -DLLVM_INCLUDE_TESTS=OFF \ - -DCMAKE_BUILD_TYPE=Release \ - ../llvm - -3. Run ``Visual Studio 16 2019``, open the generated ``LLVM.sln``, build all, and build the ``INSTALL`` project. - -LLVM >= 10.0.0 ------------------ - -1. Download `LLVM project `_ sources. - -2. Build `LLVM project `_: - - .. code-block:: bash - - cd .. \ - mkdir build dist \ - cd build - - **Linux**: - - .. code-block:: bash - - cmake \ - -DCMAKE_INSTALL_PREFIX=../dist \ - -DLLVM_TARGETS_TO_BUILD="X86" \ - -DLLVM_ENABLE_PROJECTS="clang" \ - -DLLVM_INCLUDE_TESTS=OFF \ - -DCMAKE_BUILD_TYPE=Release \ - ../llvm-project/llvm - make -j install - - **Windows**: - - .. code-block:: shell - - cmake \ - -G "Visual Studio 17 2022" \ - -A x64 \ - -Thost=x64 \ - -DCMAKE_INSTALL_PREFIX=../dist \ - -DLLVM_TARGETS_TO_BUILD="" \ - -DLLVM_ENABLE_PROJECTS="clang" \ - -DLLVM_INCLUDE_TESTS=OFF \ - -DCMAKE_BUILD_TYPE=Release \ - ../llvm-project/llvm - - Run ``Visual Studio 17 2022``, open the generated ``LLVM.sln``, build all, and build project ``INSTALL``. - -3. Install `CUDA `_ version 7.0 or - greater. - - * In case of multiple CUDA installations, specify the particular version using ``DCUDA_TOOLKIT_ROOT_DIR`` option: - - **Linux**: - - .. code-block:: bash - - -DCUDA_TOOLKIT_ROOT_DIR=/usr/include - - **Windows**: - - .. code-block:: shell - - -DCUDA_TOOLKIT_ROOT_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6" - - -DCUDA_SDK_ROOT_DIR="C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.6" - -4. 
[Optional] Install `cuTensor `_: - - * To specify the path to `cuTensor `_, use the ``CUDA_TENSOR_ROOT_DIR`` option: - - **Linux**: - - .. code-block:: bash - - -DCUDA_TENSOR_ROOT_DIR=/usr/include - - **Windows**: - - .. code-block:: shell - - -DCUDA_TENSOR_ROOT_DIR=D:/CUDA/cuTensor/2.0.2.1 - -5. [Optional] Install `cuDNN `_ belonging to the version corresponding - to the CUDA version: - - * To specify the path to `cuDNN `_, use the ``CUDA_DNN_ROOT_DIR`` option: - - **Linux**: - - .. code-block:: bash - - -DCUDA_DNN_ROOT_DIR=/usr/include - - **Windows**: - - .. code-block:: shell - - -DCUDA_DNN_ROOT_DIR=D:/CUDA/cuDNN/9.6.0 - -6. [Optional] Install `CUB 1.9.8 `_ for ``CUDA < 11.0`` only; - for ``CUDA >= 11.0``, the CUB shipped with CUDA will be used for testing. - - * To specify the path to CUB, use the ``CUDA_CUB_ROOT_DIR`` option (only for ``CUDA < 11.0``): - - **Linux**: - - .. code-block:: bash - - -DCUDA_CUB_ROOT_DIR=/srv/git/CUB - - **Windows**: - - .. code-block:: shell - - -DCUDA_CUB_ROOT_DIR=D:/CUDA/CUB - -7. Install `Python `_ version 3.0 or greater. - -8. Install ``lit`` and ``FileCheck``; these are distributed with LLVM. - - * Install ``lit`` into ``Python``: - - **Linux**: - - .. code-block:: bash - - python /usr/llvm/19.1.6/llvm-project/llvm/utils/lit/setup.py install - - **Windows**: - - .. code-block:: shell - - python D:/LLVM/19.1.6/llvm-project/llvm/utils/lit/setup.py install - - In case of errors similar to ``ModuleNotFoundError: No module named 'setuptools'``, upgrade the ``setuptools`` package: - - .. code-block:: bash - - python -m pip install --upgrade pip setuptools - - * Starting with LLVM 6.0.1, specify the path to the ``llvm-lit`` Python script using the ``LLVM_EXTERNAL_LIT`` option: - - **Linux**: - - .. code-block:: bash - - -DLLVM_EXTERNAL_LIT=/usr/llvm/19.1.6/build/bin/llvm-lit - - **Windows**: - - .. 
code-block:: shell - - -DLLVM_EXTERNAL_LIT=D:/LLVM/19.1.6/build/Release/bin/llvm-lit.py - - * ``FileCheck``: - - **Linux**: - - Copy from ``/usr/llvm/19.1.6/build/bin/`` to ``CMAKE_INSTALL_PREFIX/dist/bin``. - - **Windows**: - - Copy from ``D:/LLVM/19.1.6/build/Release/bin`` to ``CMAKE_INSTALL_PREFIX/dist/bin``. - - Alternatively, specify the path to ``FileCheck`` in the ``CMAKE_INSTALL_PREFIX`` option. - -9. To run OpenGL tests successfully on: - - **Linux**: - - Install GL headers. - - On Ubuntu, use: ``sudo apt-get install mesa-common-dev`` - - **Windows**: - - No installation required. All the required headers are shipped with the Windows SDK. - -10. Set the ``HIPIFY_CLANG_TESTS`` option to ``ON``: ``-DHIPIFY_CLANG_TESTS=ON`` - -11. Build and run tests. - -Linux testing -====================================================== - -On Linux, the following configurations are tested: - -* Ubuntu 14: LLVM 4.0.0 - 7.1.0, CUDA 7.0 - 9.0, cuDNN 5.0.5 - 7.6.5 -* Ubuntu 16-19: LLVM 8.0.0 - 14.0.6, CUDA 7.0 - 10.2, cuDNN 5.1.10 - 8.0.5 -* Ubuntu 20-21: LLVM 9.0.0 - 19.1.6, CUDA 7.0 - 12.6.3, cuDNN 5.1.10 - 9.6.0, cuTensor 1.0.1.0 - 2.0.2.1 -* Ubuntu 22-23: LLVM 13.0.0 - 19.1.6, CUDA 7.0 - 12.6.3, cuDNN 8.0.5 - 9.6.0, cuTensor 1.0.1.0 - 2.0.2.1 - -Minimum build system requirements for the above configurations: - -* CMake 3.16.8, GNU C/C++ 9.2, Python 3.0. - -Recommended build system requirements: - -* CMake 3.31.2, GNU C/C++ 13.2, Python 3.13.1. - -Here's how to build ``hipify-clang`` with testing support on ``Ubuntu 23.10.01``: - -.. code-block:: bash - - cmake - -DHIPIFY_CLANG_TESTS=ON \ - -DCMAKE_BUILD_TYPE=Release \ - -DCMAKE_INSTALL_PREFIX=../dist \ - -DCMAKE_PREFIX_PATH=/usr/llvm/19.1.6/dist \ - -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.6.3 \ - -DCUDA_DNN_ROOT_DIR=/usr/local/cudnn-9.6.0 \ - -DCUDA_TENSOR_ROOT_DIR=/usr/local/cutensor-2.0.2.1 \ - -DLLVM_EXTERNAL_LIT=/usr/llvm/19.1.6/build/bin/llvm-lit \ - ../hipify - -The corresponding successful output is: - -.. 
code-block:: shell - - -- The C compiler identification is GNU 13.2.0 - -- The CXX compiler identification is GNU 13.2.0 - -- Detecting C compiler ABI info - -- Detecting C compiler ABI info - done - -- Check for working C compiler: /usr/bin/cc - skipped - -- Detecting C compile features - -- Detecting C compile features - done - -- Detecting CXX compiler ABI info - -- Detecting CXX compiler ABI info - done - -- Check for working CXX compiler: /usr/bin/c++ - skipped - -- Detecting CXX compile features - -- Detecting CXX compile features - done - -- HIPIFY config: - -- - Build hipify-clang : ON - -- - Test hipify-clang : ON - -- - Is part of HIP SDK : OFF - -- - Install clang headers : ON - -- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.13") - -- Found LLVM 19.1.6: - -- - CMake module path : /usr/llvm/19.1.6/dist/lib/cmake/llvm - -- - Clang include path : /usr/llvm/19.1.6/dist/include - -- - LLVM Include path : /usr/llvm/19.1.6/dist/include - -- - Binary path : /usr/llvm/19.1.6/dist/bin - -- Linker detection: GNU ld - -- ---- The below configuring for hipify-clang testing only ---- - -- Found Python: /usr/bin/python3.13 (found suitable version "3.13.1", required range is "3.0...3.14") found components: Interpreter - -- Found lit: /usr/local/bin/lit - -- Found FileCheck: /GIT/LLVM/trunk/dist/FileCheck - -- Initial CUDA to configure: - -- - CUDA Toolkit path : /usr/local/cuda-12.6.3 - -- - CUDA Samples path : - -- - cuDNN path : /usr/local/cudnn-9.6.0 - -- - cuTENSOR path : /usr/local/cuTensor/2.0.2.1 - -- - CUB path : - -- Found CUDAToolkit: /usr/local/cuda-12.6.3/targets/x86_64-linux/include (found version "12.6.85") - -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success - -- Found Threads: TRUE - -- Found CUDA config: - -- - CUDA Toolkit path : /usr/local/cuda-12.6.3 - -- - CUDA Samples path : OFF - -- - cuDNN path : /usr/local/cudnn-9.6.0 - -- - CUB path : /usr/local/cuda-12.6.3/include/cub - -- - 
cuTENSOR path : /usr/local/cuTensor/2.0.2.1 - -- Configuring done (0.6s) - -- Generating done (0.0s) - -- Build files have been written to: /usr/hipify/build - -.. code-block:: shell - - make test-hipify - -The corresponding successful output is: - -.. code-block:: shell - - Running HIPify regression tests - =============================================================== - CUDA 12.6.85 - will be used for testing - LLVM 19.1.6 - will be used for testing - x86_64 - Platform architecture - Linux 6.5.0-15-generic - Platform OS - 64 - hipify-clang binary bitness - 64 - python 3.13.1 binary bitness - =============================================================== - -- Testing: 106 tests, 12 threads -- - Testing Time: 6.91s - - Total Discovered Tests: 106 - Passed: 106 (100.00%) - -.. _Windows testing: - -Windows testing -===================================================== - -Tested configurations: - -.. list-table:: - :header-rows: 1 - - * - LLVM - - CUDA - - cuDNN - - Visual Studio - - CMake - - Python - * - ``4.0.0 - 5.0.2`` - - ``7.0 - 8.0`` - - ``5.1.10 - 7.1.4`` - - ``2015.14.0, 2017.15.5.2`` - - ``3.5.1 - 3.18.0`` - - ``3.6.4 - 3.8.5`` - * - ``6.0.0 - 6.0.1`` - - ``7.0 - 9.0`` - - ``7.0.5 - 7.6.5`` - - ``2015.14.0, 2017.15.5.5`` - - ``3.6.0 - 3.18.0`` - - ``3.7.2 - 3.8.5`` - * - ``7.0.0 - 7.1.0`` - - ``7.0 - 9.2`` - - ``7.0.5 - 7.6.5`` - - ``2017.15.9.11`` - - ``3.13.3 - 3.18.0`` - - ``3.7.3 - 3.8.5`` - * - ``8.0.0 - 8.0.1`` - - ``7.0 - 10.0`` - - ``7.6.5`` - - ``2017.15.9.15`` - - ``3.14.2 - 3.18.0`` - - ``3.7.4 - 3.8.5`` - * - ``9.0.0 - 9.0.1`` - - ``7.0 - 10.1`` - - ``7.6.5`` - - ``2017.15.9.20, 2019.16.4.5`` - - ``3.16.4 - 3.18.0`` - - ``3.8.0 - 3.8.5`` - * - ``10.0.0 - 11.0.0`` - - ``7.0 - 11.1`` - - ``7.6.5 - 8.0.5`` - - ``2017.15.9.30, 2019.16.8.3`` - - ``3.19.2`` - - ``3.9.1`` - * - ``11.0.1 - 11.1.0`` - - ``7.0 - 11.2.2`` - - ``7.6.5 - 8.0.5`` - - ``2017.15.9.31, 2019.16.8.4`` - - ``3.19.3`` - - ``3.9.2`` - * - ``12.0.0 - 13.0.1`` - - ``7.0 - 11.5.1`` - 
- ``7.6.5 - 8.3.2`` - - ``2017.15.9.43, 2019.16.11.9`` - - ``3.22.2`` - - ``3.10.2`` - * - ``14.0.0 - 14.0.6`` - - ``7.0 - 11.7.1`` - - ``8.0.5 - 8.4.1`` - - ``2017.15.9.57,`` :sup:`5` ``2019.16.11.17, 2022.17.2.6`` - - ``3.24.0`` - - ``3.10.6`` - * - ``15.0.0 - 15.0.7`` - - ``7.0 - 11.8.0`` - - ``8.0.5 - 8.8.1`` - - ``2019.16.11.25, 2022.17.5.2`` - - ``3.26.0`` - - ``3.11.2`` - * - ``16.0.0 - 16.0.6`` - - ``7.0 - 12.2.2`` - - ``8.0.5 - 8.9.5`` - - ``2019.16.11.29, 2022.17.7.1`` - - ``3.27.3`` - - ``3.11.4`` - * - ``17.0.1`` :sup:`6` - ``18.1.8`` :sup:`7` - - ``7.0 - 12.3.2`` - - ``8.0.5 - 9.6.0`` - - ``2019.16.11.42, 2022.17.12.3`` - - ``3.31.2`` - - ``3.13.1`` - * - ``19.1.0 - 19.1.6`` - - ``7.0 - 12.6.3`` - - ``8.0.5 - 9.6.0`` - - ``2019.16.11.42, 2022.17.12.3`` - - ``3.31.2`` - - ``3.13.1`` - -:sup:`5` LLVM 14.x.x is the latest major release supporting Visual Studio 2017. - -To build LLVM 14.x.x correctly using Visual Studio 2017, add ``-DLLVM_FORCE_USE_OLD_TOOLCHAIN=ON`` -to corresponding CMake command line. - -You can also build LLVM \< 14.x.x correctly using Visual Studio 2017 without the -``LLVM_FORCE_USE_OLD_TOOLCHAIN`` option. - -:sup:`6` Note that LLVM 17.0.0 was withdrawn due to an issue; use 17.0.1 or newer instead. - -:sup:`7` Note that LLVM 18.0.0 has never been released; use 18.1.0 or newer instead. - -Building with testing support using ``Visual Studio 17 2022`` on ``Windows 11``: - -.. 
code-block:: shell - - cmake - -G "Visual Studio 17 2022" \ - -A x64 \ - -Thost=x64 \ - -DHIPIFY_CLANG_TESTS=ON \ - -DCMAKE_BUILD_TYPE=Release \ - -DCMAKE_INSTALL_PREFIX=../dist \ - -DCMAKE_PREFIX_PATH=D:/LLVM/19.1.6/dist \ - -DCUDA_TOOLKIT_ROOT_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6" \ - -DCUDA_SDK_ROOT_DIR="C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5" \ - -DCUDA_DNN_ROOT_DIR=D:/CUDA/cuDNN/9.6.0 \ - -DCUDA_TENSOR_ROOT_DIR=D:/CUDA/cuTensor/2.0.2.1 \ - -DLLVM_EXTERNAL_LIT=D:/LLVM/19.1.6/build/Release/bin/llvm-lit.py \ - ../hipify - -The corresponding successful output is: - -.. code-block:: shell - - -- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.22631. - -- The C compiler identification is MSVC 19.42.34435.0 - -- The CXX compiler identification is MSVC 19.42.34435.0 - -- Detecting C compiler ABI info - -- Detecting C compiler ABI info - done - -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe - skipped - -- Detecting C compile features - -- Detecting C compile features - done - -- Detecting CXX compiler ABI info - -- Detecting CXX compiler ABI info - done - -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe - skipped - -- Detecting CXX compile features - -- Detecting CXX compile features - done - -- HIPIFY config: - -- - Build hipify-clang : ON - -- - Test hipify-clang : ON - -- - Is part of HIP SDK : OFF - -- - Install clang headers : ON - -- Found LLVM 19.1.6: - -- - CMake module path : D:/LLVM/19.1.6/dist/lib/cmake/llvm - -- - Clang include path : D:/LLVM/19.1.6/dist/include - -- - LLVM Include path : D:/LLVM/19.1.6/dist/include - -- - Binary path : D:/LLVM/19.1.6/dist/bin - -- ---- The below configuring for hipify-clang testing only ---- - -- Found Python: C:/Users/TT/AppData/Local/Programs/Python/Python313/python.exe (found suitable 
version "3.13.1", required range is "3.0...3.14") found components: Interpreter - -- Found lit: C:/Users/TT/AppData/Local/Programs/Python/Python313/Scripts/lit.exe - -- Found FileCheck: D:/LLVM/19.1.6/dist/bin/FileCheck.exe - -- Initial CUDA to configure: - -- - CUDA Toolkit path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6 - -- - CUDA Samples path : C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5 - -- - cuDNN path : D:/CUDA/cuDNN/9.6.0 - -- - cuTENSOR path : D:/CUDA/cuTensor/2.0.2.1 - -- - CUB path : - -- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/include (found version "12.6.85") - -- Found CUDA config: - -- - CUDA Toolkit path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6 - -- - CUDA Samples path : C:/ProgramData/NVIDIA Corporation/CUDA Samples/v12.5 - -- - cuDNN path : D:/CUDA/cuDNN/9.6.0 - -- - cuTENSOR path : D:/CUDA/cuTensor/2.0.2.1 - -- - CUB path : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/include/cub - -- Configuring done (4.4s) - -- Generating done (0.1s) - -- Build files have been written to: D:/HIPIFY/build - -Run ``Visual Studio 17 2022``, open the generated ``hipify-clang.sln``, and build the project ``test-hipify``. diff --git a/docs/hipify-perl.rst b/docs/hipify-perl.rst deleted file mode 100644 index 60fd1422..00000000 --- a/docs/hipify-perl.rst +++ /dev/null @@ -1,57 +0,0 @@ -.. meta:: - :description: Tools to automatically translate CUDA source code into portable HIP C++ - :keywords: HIPIFY, ROCm, library, tool, CUDA, CUDA2HIP, hipify-clang, hipify-perl - -.. _hipify-perl: - -=================== -hipify-perl -=================== - -``hipify-perl`` is an autogenerated perl-based script that heavily uses regular expressions. 
- -**Advantages:** - -- Ease of use - -- No checks for input source NVIDIA CUDA code for correctness required - -- No dependency on third party tools, including CUDA - -**Disadvantages:** - -- Inability or difficulty in implementing the following constructs: - - - Macros expansion - - - Namespaces: - - - Redefinition of CUDA entities in user namespaces - - - Using directive - - - Templates (some cases) - - - Device or host function calls differentiation - - - Correct injection of header files - - - Parsing complicated argument lists - -Usage ------------ - -.. code-block:: shell - - perl hipify-perl square.cu > square.cu.hip - -Building hipify-perl ---------------------- - -To generate ``hipify-perl``, run - -.. code-block:: shell - - hipify-clang --perl - -You can choose to specify the output directory for the generated ``hipify-perl`` file using ``--o-hipify-perl-dir`` option. diff --git a/docs/how-to/hipify-clang.rst b/docs/how-to/hipify-clang.rst new file mode 100644 index 00000000..4d77a494 --- /dev/null +++ b/docs/how-to/hipify-clang.rst @@ -0,0 +1,394 @@ +.. meta:: + :description: Tools to automatically translate CUDA source code into portable HIP C++ + :keywords: HIPIFY, ROCm, library, tool, CUDA, CUDA2HIP, hipify-clang, hipify-perl + +.. _hipify-clang: + +************************************************************************** +Using hipify-clang +************************************************************************** + +``hipify-clang`` is a Clang-based tool for translating NVIDIA CUDA sources into HIP sources. + +It translates CUDA source into an Abstract Syntax Tree (AST), which is traversed by transformation +matchers. After applying all the matchers, the output HIP source is produced. + +**Advantages:** + +- ``hipify-clang`` is a translator. It parses complex constructs successfully or reports an error. +- It supports Clang options such as + `-I `_, + `-D `_, and + `--cuda-path `_. 
+- The support for new CUDA versions is seamless, as the Clang front-end is statically linked into + ``hipify-clang`` and does all the syntactical parsing of a CUDA source to HIPIFY. +- It is very well supported as a compiler extension. + +**Disadvantages:** + +- You must ensure that the input CUDA code is correct as incorrect code can't be translated to HIP. +- You must install CUDA, and in case of multiple installations specify the needed version using ``--cuda-path`` option. +- You must provide all the ``includes`` and ``defines`` to successfully translate the code. + +Release Dependencies +==================== + +``hipify-clang`` requires: + +* `CUDA `_, the latest supported version is + `12.6.3 `_, but requires at least version + `7.0 `_. + +* `LLVM+Clang `_ version is determined at least partially by + the CUDA version you are using, as shown in the table below. The recommended Clang release + is the latest stable release `19.1.6 `_, + or at least version `4.0.0 `_. + +.. list-table:: + + * - CUDA version + - supported LLVM release versions + - Windows + - Linux + * - `12.6.3 `_:sup:`1` + - `19.1.0 `_, + `19.1.1 `_, + `19.1.2 `_, + `19.1.3 `_, + `19.1.4 `_, + `19.1.5 `_, + `19.1.6 `_:sup:`1` + - ✅ + - ✅ + * - `12.3.2 `_ + - `17.0.1 `_, + `17.0.2 `_, + `17.0.3 `_, + `17.0.4 `_, + `17.0.5 `_, + `17.0.6 `_, + `18.1.0 `_, + `18.1.1 `_, + `18.1.2 `_, + `18.1.3 `_, + `18.1.4 `_, + `18.1.5 `_, + `18.1.6 `_, + `18.1.7 `_, + `18.1.8 `_ + - ✅ + - ✅ + * - `12.2.2 `_ + - `16.0.0 `_, + `16.0.1 `_, + `16.0.2 `_, + `16.0.3 `_, + `16.0.4 `_, + `16.0.5 `_, + `16.0.6 `_ + - ✅ + - ✅ + * - `11.8.0 `_ + - `14.0.5 `_, + `14.0.6 `_, + `15.0.0 `_, + `15.0.1 `_, + `15.0.2 `_, + `15.0.3 `_, + `15.0.4 `_, + `15.0.5 `_, + `15.0.6 `_, + `15.0.7 `_ + - ✅ + - ✅ + * - `11.7.1 `_ + - `14.0.0 `_, + `14.0.1 `_, + `14.0.2 `_, + `14.0.3 `_, + `14.0.4 `_ + - Works only with patch due to Clang bug `54609 `_ + |patch for 14.0.0| :sup:`2` + |patch for 14.0.1| :sup:`2` + |patch for 14.0.2| :sup:`2` + 
|patch for 14.0.3| :sup:`2` + |patch for 14.0.4| :sup:`2` + - ✅ + * - `11.5.1 `_ + - `12.0.0 `_, + `12.0.1 `_, + `13.0.0 `_, + `13.0.1 `_ + - ✅ + - ✅ + * - `11.2.2 `_ + - `11.0.1 `_, + `11.1.0 `_ + - ✅ + - ✅ + * - `11.0.1 `_, + `11.1.0 `_, + `11.1.1 `_ + - `11.0.0 `_ + - Works only with patch due to Clang bug `47332 `_ + |patch for 11.0.0| :sup:`3` + - Works only with patch due to Clang bug `47332 `_ + |patch for 11.0.0| :sup:`3` + * - `11.0.0 `_ + - `11.0.0 `_ + - ✅ + - ✅ + * - `11.0.1 `_, + `11.1.0 `_, + `11.1.1 `_ + - `10.0.0 `_, + `10.0.1 `_ + - Works only with patch due to Clang bug `47332 `_ + |patch for 10.0.0| :sup:`3` + |patch for 10.0.1| :sup:`3` + - Works only with patch due to Clang bug `47332 `_ + |patch for 10.0.0| :sup:`3` + |patch for 10.0.1| :sup:`3` + * - `11.0.0 `_ + - `10.0.0 `_, + `10.0.1 `_ + - ✅ + - ✅ + * - `10.1 `_ + - `9.0.0 `_, + `9.0.1 `_ + - ✅ + - ✅ + * - `10.0 `_ + - `8.0.0 `_, + `8.0.1 `_ + - Works only with patch due to Clang bug `38811 `_ + |patch for 8.0.0| :sup:`2` + |patch for 8.0.1| :sup:`2` + - ✅ + * - `9.2 `_ + - `7.0.0 `_, + `7.0.1 `_, + `7.1.0 `_ + - Works only with patch due to Clang bug `38811 `_ + |patch for 7.0.0| :sup:`2` + |patch for 7.0.1| :sup:`2` + |patch for 7.1.0| :sup:`2` + - ❌ due to Clang bug `36384 `_ + * - `9.0 `_ + - `6.0.0 `_, + `6.0.1 `_ + - ✅ + - ✅ + * - `8.0 `_ + - `4.0.0 `_, + `4.0.1 `_, + `5.0.0 `_, + `5.0.1 `_, + `5.0.2 `_ + - ✅ + - ✅ + * - `7.5 `_ + - `3.8.0 `_ :sup:`4`, + `3.8.1 `_ :sup:`4`, + `3.9.0 `_ :sup:`4`, + `3.9.1 `_ :sup:`4` + - ✅ + - ✅ + +.. |patch for 7.0.0| replace:: + :download:`patch for 7.0.0 <./data/patches/patch_for_clang_7.0.0_bug_38811.zip>` +.. |patch for 7.0.1| replace:: + :download:`patch for 7.0.1 <./data/patches/patch_for_clang_7.0.1_bug_38811.zip>` +.. |patch for 7.1.0| replace:: + :download:`patch for 7.1.0 <./data/patches/patch_for_clang_7.1.0_bug_38811.zip>` +.. 
|patch for 8.0.0| replace:: + :download:`patch for 8.0.0 <./data/patches/patch_for_clang_8.0.0_bug_38811.zip>` +.. |patch for 8.0.1| replace:: + :download:`patch for 8.0.1 <./data/patches/patch_for_clang_8.0.1_bug_38811.zip>` +.. |patch for 10.0.0| replace:: + :download:`patch for 10.0.0 <./data/patches/patch_for_clang_10.0.0_bug_47332.zip>` +.. |patch for 10.0.1| replace:: + :download:`patch for 10.0.1 <./data/patches/patch_for_clang_10.0.1_bug_47332.zip>` +.. |patch for 11.0.0| replace:: + :download:`patch for 11.0.0 <./data/patches/patch_for_clang_11.0.0_bug_47332.zip>` +.. |patch for 14.0.0| replace:: + :download:`patch for 14.0.0 <./data/patches/patch_for_clang_14.0.0_bug_54609.zip>` +.. |patch for 14.0.1| replace:: + :download:`patch for 14.0.1 <./data/patches/patch_for_clang_14.0.1_bug_54609.zip>` +.. |patch for 14.0.2| replace:: + :download:`patch for 14.0.2 <./data/patches/patch_for_clang_14.0.2_bug_54609.zip>` +.. |patch for 14.0.3| replace:: + :download:`patch for 14.0.3 <./data/patches/patch_for_clang_14.0.3_bug_54609.zip>` +.. |patch for 14.0.4| replace:: + :download:`patch for 14.0.4 <./data/patches/patch_for_clang_14.0.4_bug_54609.zip>` + +:sup:`1` Represents the latest supported and recommended configuration. + +:sup:`2` Download the patch and unpack it into your ``LLVM distributive directory``. This overwrites a few header files. You don't need to rebuild ``LLVM``. + +:sup:`3` Download the patch and unpack it into your ``LLVM source directory``. This overwrites the ``Cuda.cpp`` file. You need to rebuild ``LLVM``. + +:sup:`4` ``LLVM 3.x`` is no longer supported (but might still work). + +In most cases, you can get a suitable version of ``LLVM+Clang`` with your package manager. However, you can also +`download a release archive `_ and build or install it. In case of multiple versions of ``LLVM`` installed, set +`CMAKE_PREFIX_PATH `_ so that +``CMake`` can find the desired version of ``LLVM``. For example, ``-DCMAKE_PREFIX_PATH=D:\LLVM\19.1.6\dist``. 
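As a rough sketch of the ``CMAKE_PREFIX_PATH`` advice above, the following shell snippet assembles a configure command that pins one particular LLVM installation. The ``LLVM_DIST`` variable and its default path are assumptions made for illustration, and the command is only printed, not executed.

```shell
# Assumed location of the LLVM distribution to use; override it via the environment.
LLVM_DIST="${LLVM_DIST:-/usr/llvm/19.1.6/dist}"

# Assemble the configure command; CMAKE_PREFIX_PATH tells CMake which LLVM to pick up.
CONFIGURE_CMD="cmake -DCMAKE_PREFIX_PATH=${LLVM_DIST} -DCMAKE_BUILD_TYPE=Release ../hipify"

# Print the command for review instead of running it (dry run).
echo "${CONFIGURE_CMD}"
```

Printing the command first makes it easy to confirm which LLVM tree is selected before starting a real configure run.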
+ +Usage +===== + +.. note:: + For additional details on the following ``hipify-clang`` command options, see :ref:`hipify_clang-command`. + +To process a file, ``hipify-clang`` needs access to the same headers that are required to compile it +with ``Clang``: + +.. code:: shell + + ./hipify-clang square.cu --cuda-path=/usr/local/cuda-12.6 -I /usr/local/cuda-12.6/samples/common/inc + +``hipify-clang`` arguments are supplied first, followed by a separator ``--`` and the arguments to be +passed to Clang for compiling the input file: + +.. code:: shell + + ./hipify-clang cpp17.cu --cuda-path=/usr/local/cuda-12.6 -- -std=c++17 + +``hipify-clang`` also supports the hipification of multiple files that can be specified in a single +command with absolute or relative paths: + +.. code:: shell + + ./hipify-clang cpp17.cu ../../square.cu /home/user/cuda/intro.cu --cuda-path=/usr/local/cuda-12.6 -- -std=c++17 + +To use a specific version of LLVM during hipification, specify the ``hipify-clang`` option +``--clang-resource-directory=`` to point to the Clang resource directory, which is the +parent directory for the ``include`` folder that contains ``__clang_cuda_runtime_wrapper.h`` and other +header files used during the hipification process: + +.. code:: shell + + ./hipify-clang square.cu --cuda-path=/usr/local/cuda-12.6 --clang-resource-directory=/usr/llvm/19.1.6/dist/lib/clang/19 + +For more information, refer to the `Clang manual for compiling CUDA `_. + +.. _hipify-json: + +Using JSON compilation database +=============================== + +To automate hipification (starting from Clang 8.0.0), you can provide a +`Compilation Database in JSON format `_ +in the ``compile_commands.json`` file: + +.. code:: bash + + -p + - or - + -p= + +You can either provide the compilation database in an existing ``compile_commands.json`` file or generate it from a +CMake-based build. You can specify multiple source files as well. 
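To make the ``-p`` workflow more concrete, here is a minimal sketch that writes a one-entry ``compile_commands.json`` into a scratch directory and prints a matching ``hipify-clang`` invocation. The directory names, the ``intro.cu`` entry, and the include path are hypothetical values chosen for illustration; the final command is printed rather than run.

```shell
# Hypothetical layout: a scratch build directory that holds the database.
BUILD_DIR="$(mktemp -d)"
SRC_DIR="/home/user/cuda"   # assumed source directory, for illustration only

# Write a one-entry compilation database; Clang options such as -I belong here.
cat > "${BUILD_DIR}/compile_commands.json" <<EOF
[
  {
    "directory": "${SRC_DIR}",
    "command": "hipify-clang \"${SRC_DIR}/intro.cu\" -I./include -v",
    "file": "intro.cu"
  }
]
EOF

# hipify-clang options stay on the command line; -p points at the database directory.
echo "hipify-clang intro.cu -p ${BUILD_DIR} --print-stats"
```

For a CMake-based project, the same file can typically be produced by configuring with ``-DCMAKE_EXPORT_COMPILE_COMMANDS=ON`` instead of writing it by hand.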
+ +To provide Clang options, use the ``compile_commands.json`` file; to provide ``hipify-clang`` options, use the ``hipify-clang`` command line. + +.. note:: + + Don't use the options separator ``--`` here; ``hipify-clang`` options provided before the separator + cause a compilation error. + +Here's an +`example `_ +demonstrating the ``compile_commands.json`` usage: + +.. code:: json + + [ + { + "directory": "", + "command": "hipify-clang \"\" -I./include -v", + "file": "cd_intro.cu" + } + ] + +.. _hipify-stats: + +Hipification statistics +======================= + +The options ``--print-stats`` and ``--print-stats-csv`` provide an overview of what is hipified and what is not, as well as the hipification statistics. Use the ``--print-stats`` option to print the statistics as text to the terminal, or the ``--print-stats-csv`` option to create a CSV file that you can open in a spreadsheet. + +.. note:: + When multiple source files are specified on the command line, the statistics are provided per file and in total. + +Print statistics +---------------- + +.. code:: shell + + hipify-clang intro.cu --cuda-path="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6" --print-stats + +.. 
code:: cpp + + [HIPIFY] info: file 'intro.cu' statistics: + CONVERTED refs count: 40 + UNCONVERTED refs count: 0 + CONVERSION %: 100.0 + REPLACED bytes: 604 + [HIPIFY] info: file 'intro.cu' statistics: + CONVERTED refs count: 40 + UNCONVERTED refs count: 0 + CONVERSION %: 100.0 + REPLACED bytes: 604 + TOTAL bytes: 5794 + CHANGED lines of code: 34 + TOTAL lines of code: 174 + CODE CHANGED (in bytes) %: 10.4 + CODE CHANGED (in lines) %: 19.5 + TIME ELAPSED s: 0.41 + [HIPIFY] info: CONVERTED refs by type: + error: 2 + device: 2 + memory: 16 + event: 9 + thread: 1 + include_cuda_main_header: 1 + type: 2 + numeric_literal: 7 + [HIPIFY] info: CONVERTED refs by API: + CUDA Driver API: 1 + CUDA RT API: 39 + [HIPIFY] info: CONVERTED refs by names: + cuda.h: 1 + cudaDeviceReset: 1 + cudaError_t: 1 + cudaEventCreate: 2 + cudaEventElapsedTime: 1 + cudaEventRecord: 3 + cudaEventSynchronize: 3 + cudaEvent_t: 1 + cudaFree: 4 + cudaFreeHost: 3 + cudaGetDeviceCount: 1 + cudaGetErrorString: 1 + cudaGetLastError: 1 + cudaMalloc: 3 + cudaMemcpy: 6 + cudaMemcpyDeviceToHost: 3 + cudaMemcpyHostToDevice: 3 + cudaSuccess: 1 + cudaThreadSynchronize: 1 + +Print CSV statistics +-------------------- + +.. code-block:: shell + + hipify-clang intro.cu --cuda-path="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6" --print-stats-csv + +This generates the ``intro.cu.csv`` file with statistics: + +.. image:: ../data/csv_statistics.png + :alt: list of stats diff --git a/docs/how-to/hipify-perl.rst b/docs/how-to/hipify-perl.rst new file mode 100644 index 00000000..ac93e7d3 --- /dev/null +++ b/docs/how-to/hipify-perl.rst @@ -0,0 +1,45 @@ +.. meta:: + :description: Tools to automatically translate CUDA source code into portable HIP C++ + :keywords: HIPIFY, ROCm, library, tool, CUDA, CUDA2HIP, hipify-clang, hipify-perl + +.. 
_hipify-perl: + +=================== +Using hipify-perl +=================== + +``hipify-perl`` is a Perl-based script that relies heavily on regular expressions and is automatically generated from ``hipify-clang``. + +**Advantages:** + +- Ease of use +- No correctness checks of the input NVIDIA CUDA source code required +- No dependency on third-party tools, including CUDA + +**Disadvantages:** + +- Inability or difficulty in implementing the following constructs: + + - Macro expansion + - Namespaces: + + - Redefinition of CUDA entities in user namespaces + - Using directive + + - Templates (some cases) + - Differentiation of device and host function calls + - Correct injection of header files + - Parsing complicated argument lists + +Example +======= + +For additional details on the following ``hipify-perl`` command options, see :ref:`hipify_perl-command`. For more advanced translation needs, use ``hipify-clang``, as it is more comprehensive and accurate. + +Convert a simple CUDA file (``square.cu``) to HIP using ``hipify-perl``: + +.. code-block:: shell + + hipify-perl square.cu -o square.cu.hip + +This command translates the input file and writes the result to ``square.cu.hip``. diff --git a/docs/index.rst b/docs/index.rst index 2f3ebcbf..a2c05f01 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -8,27 +8,41 @@ HIPIFY documentation ===================== -``hipify-clang`` and ``hipify-perl`` are tools that automatically translate NVIDIA CUDA source code into portable HIP C++. +HIPIFY is a ROCm tool that helps developers migrate GPU code from NVIDIA's CUDA language to AMD's HIP C++ programming language for use on AMD GPUs. HIPIFY includes two tools offering different levels of capability: + +• ``hipify-clang``: A Clang-based tool that parses CUDA code and converts it to HIP code. It handles syntax changes, API calls, and kernel launch differences. 
+• ``hipify-perl``: A simpler tool generated from ``hipify-clang`` that replaces CUDA API calls with HIP equivalents for basic code translation needs. ``hipify-perl`` is useful for simple CUDA programs, but offers less error detection when it runs into issues during translation. .. note:: - `hipify_torch `_ is a related tool that also translates CUDA source code into portable HIP C++. It was initially developed as part of the PyTorch project to cater to the project's unique requirements but was found to be useful for PyTorch-related projects and thus was released as an independent utility. + `hipify_torch `_ is a related tool that also translates CUDA source code into portable HIP C++. It was developed as part of the PyTorch project to cater to the project's unique requirements, was found to be useful for PyTorch-related projects, and was released as an independent utility. + +HIPIFY does not seamlessly convert all CUDA code into HIP code. While it is a powerful tool for translating CUDA code to HIP, it has limitations, and manual intervention is often required. HIPIFY can automatically convert many CUDA runtime API calls, kernel launch syntax, standard CUDA library functions where there is a HIP library equivalent, and specific keywords like ``__global__`` and ``__device__``. However, HIP is not a complete replacement for CUDA, and HIPIFY cannot automatically translate all code. CUDA libraries or third-party libraries that have no HIP equivalent cannot be translated. In addition, code that is optimized for performance on NVIDIA GPUs might require additional rework to optimize performance on AMD GPUs. + +After migrating code through HIPIFY, you should perform a code review to ensure functional correctness and replace any unsupported libraries or constructs with HIP or ROCm features. Then debug and test the new HIP program, and optimize its performance on the target AMD GPUs. + -You can access HIPIFY code on our `GitHub repository `_. 
+HIPIFY is open-source and freely available as part of the ROCm ecosystem. You can find the HIPIFY code on AMD's `GitHub HIPIFY repository `_. The documentation is structured as follows: .. grid:: 2 :gutter: 3 - .. grid-item-card:: Conceptual + .. grid-item-card:: Building + + * :ref:`build-hipify-clang` + * :ref:`build-hipify-perl` + + .. grid-item-card:: How to - * :ref:`hipify-clang` - * :ref:`hipify-perl` + * :doc:`Use hipify-clang <./how-to/hipify-clang>` + * :doc:`Use hipify-perl <./how-to/hipify-perl>` .. grid-item-card:: API reference - * :doc:`Supported APIs ` + * :ref:`hipify_clang-command` + * :ref:`hipify_perl-command` + * :doc:`Supported APIs <./reference/supported_apis>` To contribute to the documentation, refer to `Contributing to ROCm `_. diff --git a/docs/reference/hipify-clang-cmd.rst b/docs/reference/hipify-clang-cmd.rst new file mode 100644 index 00000000..6646d209 --- /dev/null +++ b/docs/reference/hipify-clang-cmd.rst @@ -0,0 +1,203 @@ +.. meta:: + :description: Tools to automatically translate CUDA source code into portable HIP C++ + :keywords: HIPIFY, ROCm, library, tool, CUDA, CUDA2HIP, hipify-clang, hipify-perl + +.. _hipify_clang-command: + +************************************************************************** +hipify-clang command +************************************************************************** + +For a list of ``hipify-clang`` options, run: + +.. code-block:: cpp + + hipify-clang --help + +Output: +======= + +Usage +----- + +.. code-block:: cpp + + hipify-clang [options] [... ] + +Options +------- + +.. # COMMENT: The following lines define a break for use in the table below. +.. |br| raw:: html + +
+ +.. list-table:: + :widths: 2 5 + + * - **Options** + - **Description** + + * - ``--`` + - Separator between ``hipify-clang`` and ``clang`` options. Don't specify if there are no ``clang`` options. Not all ``clang`` options are supported by ``hipify-clang`` + + * - ``-D =`` + - Define ```` to ```` or 1 if ```` is omitted + + * - ``-I `` + - Add directory to include search path + + * - ``--amap`` + - Try to hipify as much as possible; ignores ``default-preprocessor`` + + * - ``--clang-resource-directory=`` + - Defines the path to the parent folder for the ``include`` folder, containing ``__clang_cuda_runtime_wrapper.h`` and other header files used on runtime + + * - ``--csv`` + - Generate documentation in CSV format + + * - ``--cuda-gpu-arch=`` + - CUDA GPU architecture (e.g. sm_35); may be specified more than once + + * - ``--cuda-kernel-execution-syntax`` + - Keep CUDA kernel launch syntax (default) + + * - ``--cuda-path=`` + - CUDA installation path. The CUDA path is required for ``hipify-clang`` + + * - ``--default-preprocessor`` + - Enable default preprocessor behavior (synonymous with ``--skip-excluded-preprocessor-conditional-blocks``) + + * - ``--doc-format=`` + - Documentation format: ``full`` (default), ``strict``, or ``compact``. Either the ``--md`` or ``--csv`` option must also be specified to generate the documentation. + + * - ``--doc-roc=`` + - ROC documentation generation: ``skip`` (default), ``separate``, or ``joint``. Either the ``--md`` or ``--csv`` option must also be specified to generate the documentation. 
 + + * - ``--examine`` + - Combine the ``--no-output`` and ``--print-stats`` options + + * - ``--experimental`` + - Hipify HIP APIs that are experimentally supported; otherwise, the corresponding warnings are emitted + + * - ``--extra-arg=<string>`` + - Additional argument to append to the compiler command line + + * - ``--extra-arg-before=<string>`` + - Additional argument to prepend to the compiler command line + + * - ``--help`` + - Display available options (use ``--help-hidden`` to include hidden options) + + * - ``--help-list`` + - Display list of available options (use ``--help-list-hidden`` to include hidden options) + + * - ``--hip-kernel-execution-syntax`` + - Transform CUDA kernel launch syntax to a regular HIP function call (overrides ``--cuda-kernel-execution-syntax``) + + * - ``--inplace`` + - Modify input file in-place. This will overwrite the input file with the hipify output + + * - ``--md`` + - Generate documentation in Markdown format + + * - ``--miopen`` + - Translate to ``miopen`` libraries instead of ``hip`` libraries where possible. Cannot be used with ``--roc`` + + * - ``--no-backup`` + - Don't create a backup file for the hipified source + + * - ``--no-output`` + - Don't write any translated output to stdout + + * - ``--no-undocumented-features`` + - Don't rely on undocumented features in code transformation + + * - ``--no-warnings-on-undocumented-features`` + - Suppress warnings on undocumented features in code transformation + + * - ``-o <filename>`` + - Output filename + + * - ``--o-dir=<directory>`` + - Output directory + + * - ``--o-hipify-perl-dir=<directory>`` + - Output directory for hipify-perl script + + * - ``--o-python-map-dir=<directory>`` + - Output directory for Python map + + * - ``--o-stats=<filename>`` + - Output filename for statistics + + * - ``-p <build-path>`` + - Used to read a compile command database as described in :ref:`hipify-json`. 
For example, it can be a CMake build directory in which a file named ``compile_commands.json`` exists (use the ``-DCMAKE_EXPORT_COMPILE_COMMANDS=ON`` CMake option to get this output). When no build path is specified, a search for ``compile_commands.json`` will be attempted through all parent paths of the first input file. See https://clang.llvm.org/docs/HowToSetupToolingForLLVM.html for an example of setting up Clang Tooling on a source tree + + * - ``--perl`` + - Generate ``hipify-perl`` script. See :ref:`build-hipify-perl` for more information. + + * - ``--print-stats`` + - Print translation statistics. See :ref:`hipify-stats` for more information + + * - ``--print-stats-csv`` + - Print translation statistics in a CSV file. See :ref:`hipify-stats` for more information + + * - ``--python`` + - Generate ``hipify-python`` command + + * - ``--roc`` + - Translate to ``roc`` libraries instead of ``hip`` libraries where possible. Cannot be used with ``--miopen`` + + * - ``--save-temps`` + - Save temporary files + + * - ``--skip-excluded-preprocessor-\`` |br| ``conditional-blocks`` + - Enable default preprocessor behavior by skipping undefined conditional blocks. This has the same effect as ``--default-preprocessor`` + + * - ``--temp-dir=<directory>`` + - Temporary directory + + * - ``--use-hip-data-types`` + - Use ``hipDataType`` instead of ``hipblasDatatype_t`` or ``rocblas_datatype`` + + * - ``-v`` + - Show commands to run and use verbose output + + * - ``--version`` + - Display the version of this program + + * - ``--versions`` + - Display the versions of the supported 3rd-party software + + * - ``<source0> ...`` + - Specify the file paths and names of one or more source files. These paths are looked up in the compile command database. If the path of a file is absolute, it needs to point into CMake's source tree. If the path is relative, the current working directory needs to be in the CMake source tree and the file must be in a subdirectory of the current working directory. 
``./`` prefixes in relative file paths will be automatically removed, but the rest of a relative path must be a suffix of a path in the compile command database + +Option uses: +------------ + +1. Common Options: + + * ``--help``: Displays the help message + * ``-o <filename>``: Specifies the output file for the converted source + * ``-I <directory>``: Adds the specified directory to the include search paths + * ``--cuda-path=<directory>``: Specifies the path to the CUDA installation (required) + * ``--hip-path=``: Specifies the path to the HIP installation (optional; defaults to the ROCm installation path) + +2. Preprocessor and Compilation Options: + + * ``-D``: Defines macros for the preprocessor + * ``-U``: Undefines macros + * ``--save-temps``: Keeps intermediate files generated during processing + +3. Diagnostics and Debugging: + + * ``-v``: Enables verbose output to provide detailed diagnostic information + * ``--version``: Displays the version of ``hipify-clang`` + * ``--show-progress``: Displays progress during the translation process + * ``--print-stats`` | ``--print-stats-csv``: Prints statistics about the translation process (e.g., the number of functions or API calls converted) in either text or CSV form + +4. Include and Exclude Rules: + + * ``--exclude-path=``: Specifies paths to exclude from translation + * ``--include-path=``: Specifies paths to explicitly include during translation diff --git a/docs/reference/hipify-perl-cmd.rst b/docs/reference/hipify-perl-cmd.rst new file mode 100644 index 00000000..62ee0911 --- /dev/null +++ b/docs/reference/hipify-perl-cmd.rst @@ -0,0 +1,85 @@ +.. meta:: + :description: Tools to automatically translate CUDA source code into portable HIP C++ + :keywords: HIPIFY, ROCm, library, tool, CUDA, CUDA2HIP, hipify-clang, hipify-perl + +.. 
_hipify_perl-command: + +************************************************************************** +hipify-perl command +************************************************************************** + +For a list of ``hipify-perl`` options, run: + +.. code-block:: shell + + hipify-perl --help + +Output: +======= + +Usage +----- + +.. code-block:: shell + + hipify-perl [options] <source0> [... <sourceN>] + +Options +------- + +.. # COMMENT: The following lines define a break for use in the table below. +.. |br| raw:: html + + <br />
+ +.. list-table:: + :widths: 2 5 + + * - **Options** + - **Description** + + * - ``-cuda-kernel-execution-syntax`` + - Keep CUDA kernel launch syntax (default) + + * - ``-examine`` + - Combines ``-no-output`` and ``-print-stats`` options + + * - ``-exclude-dirs=`` + - Exclude directories + + * - ``-exclude-files=`` + - Exclude files + + * - ``-experimental`` + - HIPIFY experimentally supported APIs + + * - ``-help`` + - Display available options + + * - ``-hip-kernel-execution-syntax`` + - Transform CUDA kernel launch syntax to a regular HIP function call (overrides ``--cuda-kernel-execution-syntax``) + + * - ``-inplace`` + - Backs up the input file in ``.prehip`` file, and modifies the input file in-place + + * - ``-no-output`` + - Don't write any translated output to stdout + + * - ``-o=`` + - Output filename + + * - ``-print-stats`` + - Print translation statistics as described in :ref:`hipify-stats` + + * - ``-quiet-warnings`` + - Don't print warnings on unknown CUDA identifiers + + * - ``-roc`` + - Translate to ``roc`` libraries instead of ``hip`` libraries where possible + + * - ``-version`` + - The supported HIP version + + * - ``-whitelist=`` + - Whitelist of identifiers + diff --git a/docs/supported_apis.md b/docs/reference/supported_apis.md similarity index 100% rename from docs/supported_apis.md rename to docs/reference/supported_apis.md diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in index f77589bf..bd872f15 100644 --- a/docs/sphinx/_toc.yml.in +++ b/docs/sphinx/_toc.yml.in @@ -4,27 +4,47 @@ defaults: numbered: False root: index subtrees: -- caption: Conceptual +- caption: Building entries: - - file: hipify-clang - - file: hipify-perl + - file: building/building-hipify + - file: building/build-hipify-perl +- caption: How to + entries: + - file: how-to/hipify-clang + title: Use hipify-clang + - file: how-to/hipify-perl + title: Use hipify-perl - caption: API reference entries: - - file: supported_apis + - file: reference/hipify-clang-cmd 
+ - file: reference/hipify-perl-cmd + - file: reference/supported_apis subtrees: - entries: - file: tables/CUDA_Runtime_API_functions_supported_by_HIP + title: CUDA Runtime functions - file: tables/CUDA_Driver_API_functions_supported_by_HIP + title: CUDA Driver functions - file: tables/cuComplex_API_supported_by_HIP + title: cuComplex API - file: tables/CUDA_Device_API_supported_by_HIP + title: CUDA Device API - file: tables/CUDA_RTC_API_supported_by_HIP + title: CUDA RTC API - file: tables/CUBLAS_API_supported_by_HIP + title: cuBLAS API - file: tables/CUSPARSE_API_supported_by_HIP + title: cuSPARSE - file: tables/CUSOLVER_API_supported_by_HIP + title: cuSOLVER API - file: tables/CURAND_API_supported_by_HIP + title: cuRAND API - file: tables/CUFFT_API_supported_by_HIP + title: cuFFT API - file: tables/CUDNN_API_supported_by_HIP + title: cuDNN API - file: tables/CUB_API_supported_by_HIP + title: CUB API - caption: About entries: - file: license