Skip to content

Commit

Permalink
Add initialization and error handling
Browse files Browse the repository at this point in the history
  • Loading branch information
neon60 committed Nov 18, 2024
1 parent fcb2d83 commit 38254cb
Show file tree
Hide file tree
Showing 4 changed files with 243 additions and 0 deletions.
134 changes: 134 additions & 0 deletions docs/how-to/hip_runtime_api/error_handling.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,134 @@
.. meta::
:description: Error Handling
:keywords: AMD, ROCm, HIP, error handling, error

********************************************************************************
Error handling
********************************************************************************

HIP provides functionality to detect, report, and manage errors that occur
during the execution of HIP runtime functions or when launching kernels. Every
HIP runtime function, apart from launching kernels, has :cpp:type:`hipError_t`
as return type. :cpp:func:`hipGetLastError()` and :cpp:func:`hipPeekAtLastError()`
can be used for catching errors from kernel launches, as kernel launches don't
return an error directly. HIP maintains an internal state, that includes the
last error code. :cpp:func:`hipGetLastError` returns and resets that error to
hipSuccess, while :cpp:func:`hipPeekAtLastError` just returns the error without
changing it. To get a human readable version of the errors,
:cpp:func:`hipGetErrorString()` and :cpp:func:`hipGetErrorName()` can be used.

.. note::

:cpp:func:`hipGetLastError` returns the returned error code of the last HIP
runtime API call even if it's hipSuccess, while ``cudaGetLastError`` returns
the error returned by any of the preceding CUDA APIs in the same host thread.
:cpp:func:`hipGetLastError` behavior will be matched with
``cudaGetLastError`` in ROCm release 7.0.

Best practices of HIP error handling:

1. Check errors after each API call - Avoid error propagation.
2. Use macros for error checking - Check :ref:`hip_check_macros`.
3. Handle errors gracefully - Free resources and provide meaningful error
messages to the user.

For more details on the error handling functions, see :ref:`error handling
functions reference page <error_handling_reference>`.

.. _hip_check_macros:

HIP check macros
================================================================================

HIP uses check macros to simplify error checking and reduce code duplication.
The ``HIP_CHECK`` macros are mainly used to detect and report errors. It can
also exit from application with ``exit(1);`` function call after the error
print. The ``HIP_CHECK`` macro example:

.. code-block:: cpp
#define HIP_CHECK(expression) \
{ \
const hipError_t status = expression; \
if(status != hipSuccess){ \
std::cerr << "HIP error " \
<< status << ": " \
<< hipGetErrorString(status) \
<< " at " << __FILE__ << ":" \
<< __LINE__ << std::endl; \
} \
}
Complete example
================================================================================

A complete example to demonstrate the error handling with a simple addition of
two values kernel:

.. code-block:: cpp
#include <hip/hip_runtime.h>
#include <vector>
#include <iostream>
#define HIP_CHECK(expression) \
{ \
const hipError_t status = expression; \
if(status != hipSuccess){ \
std::cerr << "HIP error " \
<< status << ": " \
<< hipGetErrorString(status) \
<< " at " << __FILE__ << ":" \
<< __LINE__ << std::endl; \
} \
}
// Addition of two values.
__global__ void add(int *a, int *b, int *c, size_t size) {
const size_t index = threadIdx.x + blockDim.x * blockIdx.x;
if(index < size) {
c[index] += a[index] + b[index];
}
}
int main() {
constexpr int numOfBlocks = 256;
constexpr int threadsPerBlock = 256;
constexpr size_t arraySize = 1U << 16;
std::vector<int> a(arraySize), b(arraySize), c(arraySize);
int *d_a, *d_b, *d_c;
// Setup input values.
std::fill(a.begin(), a.end(), 1);
std::fill(b.begin(), b.end(), 2);
// Allocate device copies of a, b and c.
HIP_CHECK(hipMalloc(&d_a, arraySize * sizeof(*d_a)));
HIP_CHECK(hipMalloc(&d_b, arraySize * sizeof(*d_b)));
HIP_CHECK(hipMalloc(&d_c, arraySize * sizeof(*d_c)));
// Copy input values to device.
HIP_CHECK(hipMemcpy(d_a, &a, arraySize * sizeof(*d_a), hipMemcpyHostToDevice));
HIP_CHECK(hipMemcpy(d_b, &b, arraySize * sizeof(*d_b), hipMemcpyHostToDevice));
// Launch add() kernel on GPU.
hipLaunchKernelGGL(add, dim3(numOfBlocks), dim3(threadsPerBlock), 0, 0, d_a, d_b, d_c, arraySize);
// Check the kernel launch
HIP_CHECK(hipGetLastError());
// Check for kernel execution error
HIP_CHECK(hipDeviceSynchronize());
// Copy the result back to the host.
HIP_CHECK(hipMemcpy(&c, d_c, arraySize * sizeof(*d_c), hipMemcpyDeviceToHost));
// Cleanup allocated memory.
HIP_CHECK(hipFree(d_a));
HIP_CHECK(hipFree(d_b));
HIP_CHECK(hipFree(d_c));
// Print the result.
std::cout << a[0] << " + " << b[0] << " = " << c[0] << std::endl;
return 0;
}
105 changes: 105 additions & 0 deletions docs/how-to/hip_runtime_api/initialization.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
.. meta::
:description: Initialization.
:keywords: AMD, ROCm, HIP, initialization

********************************************************************************
Initialization
********************************************************************************

The initialization involves setting up the environment and resources needed for
using GPUs. The following steps are covered with the initialization:

- Setting up the HIP runtime

This includes reading the environment variables set during init, setting up
the active or visible devices, loading necessary libraries, setting up
internal buffers for memory copies or cooperative launches, initialize the
compiler as well as HSA runtime and checks any errors due to lack of resources
or no active devices.

- Querying and setting GPUs

Identifying and querying the available GPU devices on the system.

- Setting up contexts

Creating contexts for each GPU device, which are essential for managing
resources and executing kernels. For further details, check the :ref:`context
section <context_driver_api>`.

Initialize the HIP runtime
================================================================================

The HIP runtime is initialized automatically when the first HIP API call is
made. However, you can explicitly initialize it using :cpp:func:`hipInit`,
to be able to control the timing of the initialization. The manual
initialization can be useful to ensure that the GPU is initialized and
ready, or to isolate GPU initialization time from other parts of
your program.

.. note::

You can use :cpp:func:`hipDeviceReset()` to delete all streams created, memory
allocated, kernels running and events created by the current process. Any new
HIP API call initializes the HIP runtime again.

Querying and setting GPUs
================================================================================

If multiple GPUs are available in the system, you can query and select the
desired GPU(s) to use based on device properties, such as size of global memory,
size shared memory per block, support of cooperative launch and support of
managed memory.

Querying GPUs
--------------------------------------------------------------------------------

The properties of a GPU can be queried using :cpp:func:`hipGetDeviceProperties`,
which returns a struct of :cpp:struct:`hipDeviceProp_t`. The properties in the struct can be
used to identify a device or give an overview of hardware characteristics, that
might make one GPU better suited for the task than others.

The :cpp:func:`hipGetDeviceCount` function returns the number of available GPUs,
which can be used to loop over the available GPUs.

Example code of querying GPUs:

.. code-block:: cpp
#include <hip/hip_runtime.h>
#include <iostream>
int main() {
int deviceCount;
if (hipGetDeviceCount(&deviceCount) == hipSuccess){
for (int i = 0; i < deviceCount; ++i){
hipDeviceProp_t prop;
if ( hipGetDeviceProperties(&prop, i) == hipSuccess)
std::cout << "Device" << i << prop.name << std::endl;
}
}
return 0;
}
Setting the GPU
--------------------------------------------------------------------------------

:cpp:func:`hipSetDevice` function select the GPU to be used for subsequent HIP
operations. This function performs several key tasks:

- Context Binding

Binds the current thread to the context of the specified GPU device. This
ensures that all subsequent operations are executed on the selected device.

- Resource Allocation

Prepares the device for resource allocation, such as memory allocation and
stream creation.

- Check device availability

Checks for errors in device selection and returns error if the specified
device is not available or not capable of executing HIP operations.
2 changes: 2 additions & 0 deletions docs/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,9 @@ The HIP documentation is organized into the following categories:
:::{grid-item-card} How to

* {doc}`./how-to/hip_runtime_api`
* {doc}`./how-to/hip_runtime_api/initialization`
* {doc}`./how-to/hip_runtime_api/memory_management`
* {doc}`./how-to/hip_runtime_api/error_handling`
* {doc}`./how-to/hip_runtime_api/cooperative_groups`
* {doc}`./how-to/hip_runtime_api/hipgraph`
* [HIP porting guide](./how-to/hip_porting_guide)
Expand Down
2 changes: 2 additions & 0 deletions docs/sphinx/_toc.yml.in
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ subtrees:
- file: how-to/hip_runtime_api
subtrees:
- entries:
- file: how-to/hip_runtime_api/initialization
- file: how-to/hip_runtime_api/memory_management
subtrees:
- entries:
Expand All @@ -46,6 +47,7 @@ subtrees:
- file: how-to/hip_runtime_api/memory_management/unified_memory
- file: how-to/hip_runtime_api/memory_management/virtual_memory
- file: how-to/hip_runtime_api/memory_management/stream_ordered_allocator
- file: how-to/hip_runtime_api/error_handling
- file: how-to/hip_runtime_api/cooperative_groups
- file: how-to/hip_runtime_api/hipgraph
- file: how-to/hip_porting_guide
Expand Down

0 comments on commit 38254cb

Please sign in to comment.