implement batched serial getrs #2483

Merged
merged 2 commits into from
Jan 20, 2025

Conversation

yasahi-hpc

@yasahi-hpc yasahi-hpc commented Jan 14, 2025

This PR implements the batched serial getrs function.

The following files are added:

  1. KokkosBatched_Getrs_Serial_Impl.hpp: Internal interfaces
  2. KokkosBatched_Getrs_Serial_Internal.hpp: Implementation details
  3. KokkosBatched_Getrs.hpp: APIs
  4. Test_Batched_SerialGetrs.hpp: Unit tests for getrs

Detailed description

It solves a system of linear equations A * x = b with a general N-by-N matrix A, using the LU factorization computed by getrf.
The inputs have the following shapes.

  • A: (batch_count, n, n)
    The N-by-N matrix factorized by getrf, where A = P * L * U; the unit diagonal elements of L are not stored.
  • IPIV: (batch_count, n)
    The pivot indices from getrf; for 0 <= i < n, row i of the matrix was interchanged with row IPIV(i).

The batch can be parallelized as shown below, with one serial solve per batch entry. This is efficient only when
A is given in LayoutLeft for GPUs and LayoutRight for CPUs (parallelized over batch direction).

Kokkos::parallel_for("getrs",
    // One iteration per batch entry (first extent of m_a)
    Kokkos::RangePolicy<execution_space>(0, m_a.extent(0)),
    KOKKOS_LAMBDA(const int k) {
        // Extract the k-th matrix, pivot vector, and right-hand side
        auto aa   = Kokkos::subview(m_a, k, Kokkos::ALL(), Kokkos::ALL());
        auto ipiv = Kokkos::subview(m_ipiv, k, Kokkos::ALL());
        auto bb   = Kokkos::subview(m_b, k, Kokkos::ALL());

        KokkosBatched::SerialGetrs<Trans, AlgoTagType>::invoke(aa, ipiv, bb);
    });
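
The layout remark can be made concrete with the flat offsets of element (k, i, j) in a (batch_count, n, n) array. The sketch below is only an illustration: the function names are hypothetical and padding is ignored. In LayoutLeft the batch index k has stride 1, so neighboring GPU threads (consecutive k) touch adjacent memory; in LayoutRight each n-by-n matrix is contiguous, which suits a CPU thread walking one matrix at a time.

```cpp
#include <cstddef>

// Hypothetical helper: offset of (k, i, j) when the leftmost index (batch)
// varies fastest, as in LayoutLeft.
std::size_t offset_layout_left(std::size_t k, std::size_t i, std::size_t j,
                               std::size_t batch_count, std::size_t n) {
  return k + batch_count * (i + n * j);
}

// Hypothetical helper: offset of (k, i, j) when the rightmost index (column)
// varies fastest, as in LayoutRight.
std::size_t offset_layout_right(std::size_t k, std::size_t i, std::size_t j,
                                std::size_t n) {
  return j + n * (i + n * k);
}
```

With batch_count = 8 and n = 4, stepping k by one moves 1 element in the left layout and 16 elements (one whole matrix) in the right layout.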

Tests

  1. Generate a random matrix A and factorize it into LU with ipiv by getrf.
    Then, solve A * x = b with getrs to get x, while keeping the original b in x_ref. Finally, confirm that A * x is equal to b (=x_ref) using gem.
  2. Simple and small analytical test, i.e. choose A as follows to confirm LU == A.
A: [[1, 1],
    [1, -1]]
b: [2, 0]
x: [1, 1]
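
The check used in both tests, that A * x reproduces b, can be sketched as follows. This is an assumption-laden illustration rather than the test code from the PR: the name `residual_is_small`, the row-major storage, and the absolute tolerance are all hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true when every entry of A * x is within
// tol of the corresponding entry of b. A is row-major, n * n.
bool residual_is_small(const std::vector<double>& a,
                       const std::vector<double>& x,
                       const std::vector<double>& b,
                       double tol) {
  const std::size_t n = x.size();
  for (std::size_t i = 0; i < n; ++i) {
    double ax_i = 0.0;
    for (std::size_t j = 0; j < n; ++j) {
      ax_i += a[i * n + j] * x[j];
    }
    if (std::fabs(ax_i - b[i]) > tol) {
      return false;
    }
  }
  return true;
}
```

For the analytical case above, A = [[1, 1], [1, -1]] and x = [1, 1] give A * x = [2, 0], which matches b.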

@cwpearson cwpearson added the AT2-CI-APPROVAL Approve CI to run at SNL label Jan 14, 2025
@lucbv lucbv added enhancement feature request Cleanup Code maintenance that isn't a bugfix or new feature labels Jan 15, 2025

@lucbv lucbv left a comment


It looks good but I would rewrite the analytical test to only use getrs.

/// \param k [in] Number of superdiagonals or subdiagonals of matrix A
/// \param BlkSize [in] Block size of matrix A
template <typename DeviceType, typename ScalarType, typename LayoutType, typename ParamTagType, typename AlgoTagType>
void impl_test_batched_getrs_analytical(const int N) {

In my opinion it would be better here not to call getrf, but instead to write the LU output directly into A and ipiv and test only getrs. The idea is to isolate where an error could be coming from: that way this unit test will only fail if an issue is found in getrs, not in getrf. For instance, you could have A, lu, ipiv and b set as follows:

A=[[1, 1],
   [1, -1]]
ipiv=[0, 1]
lu=[[1, 1],
    [1, -2]]
b=[[2],
   [0]]

Then call getrs directly and check that the output is x=[[1],[1]].

Signed-off-by: Yuuichi Asahi <[email protected]>
@yasahi-hpc yasahi-hpc force-pushed the implement-batched-serial-getrs branch from 77617ef to 82ea131 Compare January 20, 2025 15:46
@yasahi-hpc yasahi-hpc force-pushed the implement-batched-serial-getrs branch from 82ea131 to 9a81eee Compare January 20, 2025 15:47
@yasahi-hpc yasahi-hpc requested a review from lucbv January 20, 2025 16:57

@lucbv lucbv left a comment


Thanks for the update

@lucbv lucbv merged commit 834f202 into kokkos:develop Jan 20, 2025
18 checks passed
3 participants