Merge branch 'release-candidate-4.2.00' for 4.2.00
Part of Kokkos C++ Performance Portability Programming EcoSystem 4.2
ndellingwood committed Nov 9, 2023
2 parents 25a31f8 + d0c412e commit 912d377
Showing 403 changed files with 19,291 additions and 7,399 deletions.
13 changes: 7 additions & 6 deletions .github/workflows/docs.yml
@@ -11,15 +11,16 @@ permissions:

 jobs:
 docs-check:
-runs-on: ubuntu-latest
+runs-on: [macos-latest]
 steps:
 - name: Install Dependencies
 run: |
-sudo apt-get update
-sudo apt-get install --no-install-recommends doxygen-latex
-pip install sphinx
-pip install breathe
-pip install sphinx-rtd-theme
+brew install doxygen
+python3 -m pip install sphinx -v "sphinx==6.2.1"
+python3 -m pip install breathe
+python3 -m pip install sphinx-rtd-theme
+sphinx-build --version
+doxygen --version
 - name: checkout_kokkos_kernels
 uses: actions/checkout@v3
2 changes: 1 addition & 1 deletion .github/workflows/osx.yml
@@ -111,4 +111,4 @@ jobs:

 - name: test
 working-directory: kokkos-kernels/build
-run: ctest -j2 --output-on-failure --timeout 3600
+run: ctest -j2 --output-on-failure --timeout 7200
121 changes: 121 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,126 @@
# Change Log

## [4.2.00](https://github.com/kokkos/kokkos-kernels/tree/4.2.00) (2023-11-06)
[Full Changelog](https://github.com/kokkos/kokkos-kernels/compare/4.1.00...4.2.00)

### New Features

#### BLAS updates
- Implement BLAS2 syr() and her() functionalities under kokkos-kernels syr() [\#1837](https://github.com/kokkos/kokkos-kernels/pull/1837)

#### LAPACK
- New component added for the implementation of LAPACK algorithms and to support associated TPLs [\#1985](https://github.com/kokkos/kokkos-kernels/pull/1985)
- Fix some issue with unit-test definition for SYCL backend in the new LAPACK component [\#2024](https://github.com/kokkos/kokkos-kernels/pull/2024)

#### Sparse updates
- Extract diagonal blocks from a CRS matrix into separate CRS matrices [\#1947](https://github.com/kokkos/kokkos-kernels/pull/1947)
- Adding exec space instance to spmv [\#1932](https://github.com/kokkos/kokkos-kernels/pull/1932)
- Add merge-based SpMV [\#1911](https://github.com/kokkos/kokkos-kernels/pull/1911)
- Stream support for Gauss-Seidel: Symbolic, Numeric, Apply (PSGS and Team_PSGS) [\#1906](https://github.com/kokkos/kokkos-kernels/pull/1906)
- Add a MergeMatrixDiagonal abstraction to KokkosSparse [\#1780](https://github.com/kokkos/kokkos-kernels/pull/1780)

#### ODE updates
- Newton solver [\#1924](https://github.com/kokkos/kokkos-kernels/pull/1924)

### Enhancements:

#### Sparse
- MDF performance improvements exposing more parallelism in the implementation
- MDF: convert remaining count functor to hierarchical parallelism [\#1894](https://github.com/kokkos/kokkos-kernels/pull/1894)
- MDF: move most expensive kernels over to hierarchical parallelism [\#1893](https://github.com/kokkos/kokkos-kernels/pull/1893)
- Improvements to the Block Crs Matrix-Vector multiplication algorithm
- Improve BSR matrix SpMV Performance [\#1740](https://github.com/kokkos/kokkos-kernels/pull/1740)
- Disallow BsrMatrix tensor-core SpMV on non-scalar types [\#1937](https://github.com/kokkos/kokkos-kernels/pull/1937)
- remove triplicate sanity checks in BsrMatrix [\#1923](https://github.com/kokkos/kokkos-kernels/pull/1923)
- remove duplicate BSR SpMV tests [\#1922](https://github.com/kokkos/kokkos-kernels/pull/1922)
- Only deep_copy from device to host if supernodal sptrsv algorithms are used [\#1993](https://github.com/kokkos/kokkos-kernels/pull/1993)
- Improve KokkosSparse_kk_spmv [\#1979](https://github.com/kokkos/kokkos-kernels/pull/1979)
- Add 5 warm-up calls to get accurate, consistent timing
- Print out the matrix dimensions correctly when loading from disk
- sparse/impl: Make PSGS non-blocking [\#1917](https://github.com/kokkos/kokkos-kernels/pull/1917)

#### ODE
- ODE: changing layout of temp mem in RK algorithms [\#1908](https://github.com/kokkos/kokkos-kernels/pull/1908)
- ODE: adding adaptivity test for RK methods [\#1896](https://github.com/kokkos/kokkos-kernels/pull/1896)

#### Common utilities
- Common: remove half and bhalf implementations (now in Kokkos Core) [\#1981](https://github.com/kokkos/kokkos-kernels/pull/1981)
- KokkosKernels: switching from printf macro to function [\#1977](https://github.com/kokkos/kokkos-kernels/pull/1977)
- OrdinalTraits: constexpr functions [\#1976](https://github.com/kokkos/kokkos-kernels/pull/1976)
- Parallel prefix sum can infer view type [\#1974](https://github.com/kokkos/kokkos-kernels/pull/1974)

#### TPL support
- BSPGEMM: removing cusparse testing for version older than 11.4.0 [\#1996](https://github.com/kokkos/kokkos-kernels/pull/1996)
- Revise KokkosBlas::nrm2 TPL implementation [\#1950](https://github.com/kokkos/kokkos-kernels/pull/1950)
- Add TPL oneMKL GEMV support [\#1912](https://github.com/kokkos/kokkos-kernels/pull/1912)
- oneMKL spmv [\#1882](https://github.com/kokkos/kokkos-kernels/pull/1882)

### Build System:
- CMakeLists.txt: Update Kokkos version to 4.2.99 for version check [\#2003](https://github.com/kokkos/kokkos-kernels/pull/2003)
- CMake: Adding logic to catch bad Kokkos version [\#1990](https://github.com/kokkos/kokkos-kernels/pull/1990)
- Remove calling tribits_exclude_autotools_files() [\#1888](https://github.com/kokkos/kokkos-kernels/pull/1888)

### Documentation and Testing:
- Update create_gs_handle docs [\#1958](https://github.com/kokkos/kokkos-kernels/pull/1958)
- docs: Add testing table [\#1876](https://github.com/kokkos/kokkos-kernels/pull/1876)
- docs: Note which builds have ETI disabled [\#1934](https://github.com/kokkos/kokkos-kernels/pull/1934)
- Generate HTML docs [\#1921](https://github.com/kokkos/kokkos-kernels/pull/1921)
- github/workflows: Pin sphinx version [\#1948](https://github.com/kokkos/kokkos-kernels/pull/1948)
- github/workflows/docs.yml: Use up-to-date doxygen version [\#1941](https://github.com/kokkos/kokkos-kernels/pull/1941)

- Unit-Test: adding specific test for block sparse functions [\#1944](https://github.com/kokkos/kokkos-kernels/pull/1944)
- Update SYCL docker image to Cuda 11.7.1 [\#1939](https://github.com/kokkos/kokkos-kernels/pull/1939)
- Remove printouts from the unit tests of ger() and syr() [\#1933](https://github.com/kokkos/kokkos-kernels/pull/1933)
- update testing scripts [\#1960](https://github.com/kokkos/kokkos-kernels/pull/1960)
- Speed up BSR spmv tests [\#1945](https://github.com/kokkos/kokkos-kernels/pull/1945)
- Test_ODE_Newton: Add template parameters for Kokkos::pair [\#1929](https://github.com/kokkos/kokkos-kernels/pull/1929)
- par_ilut: Update documentation for fill_in_limit [\#2001](https://github.com/kokkos/kokkos-kernels/pull/2001)

### Benchmarks:
- perf_test/sparse: Update GS perf_test for streams [\#1963](https://github.com/kokkos/kokkos-kernels/pull/1963)
- Batched sparse perf_tests: Don't write to source tree during build [\#1904](https://github.com/kokkos/kokkos-kernels/pull/1904)
- ParILUT bench: fix unused IS_GPU warning [\#1900](https://github.com/kokkos/kokkos-kernels/pull/1900)
- BsrMatrix SpMV Google Benchmark [\#1886](https://github.com/kokkos/kokkos-kernels/pull/1886)
- Use extraction timestamps for fetched Google Benchmark files [\#1881](https://github.com/kokkos/kokkos-kernels/pull/1881)
- Improve help text in perf tests [\#1875](https://github.com/kokkos/kokkos-kernels/pull/1875)

### Cleanup:
- iostream clean-up in benchmarks [\#2004](https://github.com/kokkos/kokkos-kernels/pull/2004)
- Rename TestExecSpace to TestDevice [\#1970](https://github.com/kokkos/kokkos-kernels/pull/1970)
- remove Intel 2017 code (no longer supported) [\#1920](https://github.com/kokkos/kokkos-kernels/pull/1920)
- clean-up implementations for move of HIP outside of experimental [\#1999](https://github.com/kokkos/kokkos-kernels/pull/1999)

### Bug Fixes:
- upstream iostream removal fix [\#1991](https://github.com/kokkos/kokkos-kernels/pull/1991), [\#1995](https://github.com/kokkos/kokkos-kernels/pull/1995)
- Test and fix gemv stream interface [\#1987](https://github.com/kokkos/kokkos-kernels/pull/1987)
- Test_Sparse_spmv_bsr.hpp: Workaround cuda 11.2 compiler error [\#1983](https://github.com/kokkos/kokkos-kernels/pull/1983)
- Fix improper use of execution space instances in ODE tests. Better handling of CudaUVMSpaces during build. [\#1973](https://github.com/kokkos/kokkos-kernels/pull/1973)
- Don't assume the default memory space is used [\#1969](https://github.com/kokkos/kokkos-kernels/pull/1969)
- MDF: set default verbosity explicitly to avoid valgrind warnings [\#1968](https://github.com/kokkos/kokkos-kernels/pull/1968)
- Fix sort_and_merge functions for in-place case [\#1966](https://github.com/kokkos/kokkos-kernels/pull/1966)
- SPMV_Struct_Functor: initialize numExterior to 0 [\#1957](https://github.com/kokkos/kokkos-kernels/pull/1957)
- Use rank-1 impl types when rank-2 vector is dynamically rank 1 [\#1953](https://github.com/kokkos/kokkos-kernels/pull/1953)
- BsrMatrix: Check if CUDA is enabled before checking architecture [\#1955](https://github.com/kokkos/kokkos-kernels/pull/1955)
- Avoid enum without fixed underlying type to fix SYCL [\#1940](https://github.com/kokkos/kokkos-kernels/pull/1940)
- Fix SpAdd perf test when offset/ordinal is not int [\#1928](https://github.com/kokkos/kokkos-kernels/pull/1928)
- Add KOKKOSKERNELS_CUDA_INDEPENDENT_THREADS definition for architectures with independent thread scheduling [\#1927](https://github.com/kokkos/kokkos-kernels/pull/1927)
- Fix cm_generate_makefile --boundscheck [\#1926](https://github.com/kokkos/kokkos-kernels/pull/1926)
- Bsr compatibility [\#1925](https://github.com/kokkos/kokkos-kernels/pull/1925)
- BLAS: fix assignable check in gemv and gemm [\#1914](https://github.com/kokkos/kokkos-kernels/pull/1914)
- mdf: fix initial value in select pivot functor [\#1916](https://github.com/kokkos/kokkos-kernels/pull/1916)
- add missing headers, std::vector -> std::vector<...> [\#1909](https://github.com/kokkos/kokkos-kernels/pull/1909)
- Add missing <vector> include to Test_Sparse_MergeMatrix.hpp [\#1907](https://github.com/kokkos/kokkos-kernels/pull/1907)
- Remove non-existant dir from CMake include paths [\#1892](https://github.com/kokkos/kokkos-kernels/pull/1892)
- cusparse 12 spmv: check y vector alignment [\#1889](https://github.com/kokkos/kokkos-kernels/pull/1889)
- Change 'or' to '||' to fix compilation on MSVC [\#1885](https://github.com/kokkos/kokkos-kernels/pull/1885)
- Add missing KokkosKernels_Macros.hpp include [\#1884](https://github.com/kokkos/kokkos-kernels/pull/1884)
- Backward-compatible fix with kokkos@4.0 [\#1874](https://github.com/kokkos/kokkos-kernels/pull/1874)
- Fix for rocblas builds [\#1871](https://github.com/kokkos/kokkos-kernels/pull/1871)
- Correcting 'syr test' bug causing compilation errors with Trilinos [\#1870](https://github.com/kokkos/kokkos-kernels/pull/1870)
- Workaround for spiluk and sptrsv stream tests with OMP_NUM_THREADS of 1, 2, 3 [\#1864](https://github.com/kokkos/kokkos-kernels/pull/1864)
- bhalf_t fix for isnan function [\#2007](https://github.com/kokkos/kokkos-kernels/pull/2007)


## [4.1.00](https://github.com/kokkos/kokkos-kernels/tree/4.1.00) (2023-06-16)
[Full Changelog](https://github.com/kokkos/kokkos-kernels/compare/4.0.01...4.1.00)

25 changes: 20 additions & 5 deletions CMakeLists.txt
@@ -10,7 +10,7 @@ SET(KOKKOSKERNELS_TOP_BUILD_DIR ${CMAKE_CURRENT_BINARY_DIR})
 SET(KOKKOSKERNELS_TOP_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})

 SET(KokkosKernels_VERSION_MAJOR 4)
-SET(KokkosKernels_VERSION_MINOR 1)
+SET(KokkosKernels_VERSION_MINOR 2)
 SET(KokkosKernels_VERSION_PATCH 00)
 SET(KokkosKernels_VERSION "${KokkosKernels_VERSION_MAJOR}.${KokkosKernels_VERSION_MINOR}.${KokkosKernels_VERSION_PATCH}")

@@ -115,6 +115,7 @@ IF (KokkosKernels_INSTALL_TESTING)
 KOKKOSKERNELS_ADD_TEST_DIRECTORIES(batched/dense/unit_test)
 KOKKOSKERNELS_ADD_TEST_DIRECTORIES(batched/sparse/unit_test)
 KOKKOSKERNELS_ADD_TEST_DIRECTORIES(blas/unit_test)
+KOKKOSKERNELS_ADD_TEST_DIRECTORIES(lapack/unit_test)
 KOKKOSKERNELS_ADD_TEST_DIRECTORIES(graph/unit_test)
 KOKKOSKERNELS_ADD_TEST_DIRECTORIES(sparse/unit_test)
 KOKKOSKERNELS_ADD_TEST_DIRECTORIES(ode/unit_test)
@@ -124,9 +125,16 @@ ELSE()
 # Regular build, not install testing
 # Do all the regular option processing
 IF (NOT KOKKOSKERNELS_HAS_TRILINOS AND NOT KOKKOSKERNELS_HAS_PARENT)
-# This is a standalone build
-FIND_PACKAGE(Kokkos REQUIRED)
-MESSAGE(STATUS "Found Kokkos at ${Kokkos_DIR}")
+# This is a standalone build
+FIND_PACKAGE(Kokkos REQUIRED)
+IF((${Kokkos_VERSION} VERSION_EQUAL "4.1.00") OR (${Kokkos_VERSION} VERSION_GREATER_EQUAL "4.2.00"))
+MESSAGE(STATUS "Found Kokkos version ${Kokkos_VERSION} at ${Kokkos_DIR}")
+IF((${Kokkos_VERSION} VERSION_GREATER "4.2.99"))
+MESSAGE(WARNING "Configuring with Kokkos ${Kokkos_VERSION} which is newer than the expected develop branch - version check may need update")
+ENDIF()
+ELSE()
+MESSAGE(FATAL_ERROR "Kokkos Kernels ${KokkosKernels_VERSION} requires 4.1.00, 4.2.00 or develop")
+ENDIF()
 ENDIF()

 INCLUDE(cmake/kokkos_backends.cmake)
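
The version gate in the hunk above accepts Kokkos 4.1.00 exactly, or anything at or above 4.2.00, and only warns once the version exceeds 4.2.99. As a rough illustration (not code from the repository), the same rule can be restated as a small C++ predicate. `Version`, `Check`, and `check_kokkos_version` are hypothetical names introduced here purely for the sketch.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical helper type for this sketch; not part of Kokkos or Kokkos Kernels.
struct Version {
  int major_v;
  int minor_v;
  int patch_v;
};

// Component-wise comparison, mirroring how CMake's VERSION_* operators
// compare dotted version strings.
inline bool operator==(const Version& a, const Version& b) {
  return std::tie(a.major_v, a.minor_v, a.patch_v) ==
         std::tie(b.major_v, b.minor_v, b.patch_v);
}
inline bool operator<(const Version& a, const Version& b) {
  return std::tie(a.major_v, a.minor_v, a.patch_v) <
         std::tie(b.major_v, b.minor_v, b.patch_v);
}

// Outcomes corresponding to MESSAGE(STATUS), MESSAGE(WARNING), MESSAGE(FATAL_ERROR).
enum class Check { Accept, AcceptWithWarning, Reject };

// Restates the CMake condition:
//   (version == 4.1.00) OR (version >= 4.2.00)  -> accepted
//   additionally, version > 4.2.99              -> accepted with a warning
inline Check check_kokkos_version(const Version& v) {
  const Version v4_1_00{4, 1, 0};
  const Version v4_2_00{4, 2, 0};
  const Version v4_2_99{4, 2, 99};

  const bool accepted = (v == v4_1_00) || !(v < v4_2_00);
  if (!accepted) return Check::Reject;
  return (v4_2_99 < v) ? Check::AcceptWithWarning : Check::Accept;
}
```

Under this reading, a 4.1.01 install would be rejected even though 4.1.00 is accepted, because the first clause is an exact match rather than a range.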
@@ -185,7 +193,7 @@ ELSE()
 "ALL"
 STRING
 "A list of components to enable in testing and building"
-VALID_ENTRIES BATCHED BLAS GRAPH SPARSE ALL
+VALID_ENTRIES BATCHED BLAS LAPACK GRAPH SPARSE ALL
 )

 # ==================================================================
@@ -236,6 +244,7 @@ ELSE()
 MESSAGE(" COMMON: ON")
 MESSAGE(" BATCHED: ${KokkosKernels_ENABLE_COMPONENT_BATCHED}")
 MESSAGE(" BLAS: ${KokkosKernels_ENABLE_COMPONENT_BLAS}")
+MESSAGE(" LAPACK: ${KokkosKernels_ENABLE_COMPONENT_LAPACK}")
 MESSAGE(" GRAPH: ${KokkosKernels_ENABLE_COMPONENT_GRAPH}")
 MESSAGE(" SPARSE: ${KokkosKernels_ENABLE_COMPONENT_SPARSE}")
 MESSAGE(" ODE: ${KokkosKernels_ENABLE_COMPONENT_ODE}")
@@ -280,6 +289,9 @@ ELSE()
 IF (KokkosKernels_ENABLE_COMPONENT_BLAS)
 INCLUDE(blas/CMakeLists.txt)
 ENDIF()
+IF (KokkosKernels_ENABLE_COMPONENT_LAPACK)
+INCLUDE(lapack/CMakeLists.txt)
+ENDIF()
 IF (KokkosKernels_ENABLE_COMPONENT_GRAPH)
 INCLUDE(graph/CMakeLists.txt)
 ENDIF()
@@ -398,6 +410,9 @@ ELSE()
 IF (KokkosKernels_ENABLE_COMPONENT_BLAS)
 KOKKOSKERNELS_ADD_TEST_DIRECTORIES(blas/unit_test)
 ENDIF()
+IF (KokkosKernels_ENABLE_COMPONENT_LAPACK)
+KOKKOSKERNELS_ADD_TEST_DIRECTORIES(lapack/unit_test)
+ENDIF()
 IF (KokkosKernels_ENABLE_COMPONENT_GRAPH)
 KOKKOSKERNELS_ADD_TEST_DIRECTORIES(graph/unit_test)
 ENDIF()
2 changes: 1 addition & 1 deletion README.md
@@ -133,7 +133,7 @@ For a complete list of tunable Kokkos options, run
 spack info kokkos
 ````

-#### Settuping a development environment with Spack
+#### Setting up a development environment with Spack
 Spack is generally most useful for installng packages to use.
 If you want to install all *dependencies* of Kokkos Kernels first so that you can actively develop a given Kokkos Kernels source this can still be done. Go to the Kokkos Kernels source code folder and run:
 ````
2 changes: 2 additions & 0 deletions batched/KokkosBatched_Util.hpp
@@ -31,10 +31,12 @@
#include <ctime>

#include <complex>
#include <iostream>

#include "Kokkos_Complex.hpp"

#include "KokkosKernels_config.h"
#include "KokkosKernels_Macros.hpp"
#include "KokkosKernels_SimpleUtils.hpp"
#include "KokkosBlas_util.hpp"

42 changes: 42 additions & 0 deletions batched/dense/impl/KokkosBatched_Axpy_Impl.hpp
@@ -199,17 +199,31 @@ KOKKOS_INLINE_FUNCTION int SerialAxpy::invoke(const alphaViewType& alpha,

// Check compatibility of dimensions at run time.
if (X.extent(0) != Y.extent(0) || X.extent(1) != Y.extent(1)) {
#if KOKKOS_VERSION < 40199
KOKKOS_IMPL_DO_NOT_USE_PRINTF(
"KokkosBatched::axpy: Dimensions of X and Y do not match: X: %d x %d, "
"Y: %d x %d\n",
(int)X.extent(0), (int)X.extent(1), (int)Y.extent(0), (int)Y.extent(1));
#else
Kokkos::printf(
"KokkosBatched::axpy: Dimensions of X and Y do not match: X: %d x %d, "
"Y: %d x %d\n",
(int)X.extent(0), (int)X.extent(1), (int)Y.extent(0), (int)Y.extent(1));
#endif
return 1;
}
if (X.extent(0) != alpha.extent(0)) {
#if KOKKOS_VERSION < 40199
KOKKOS_IMPL_DO_NOT_USE_PRINTF(
"KokkosBatched::axpy: First dimension of X and alpha do not match: X: "
"%d x %d, alpha: %d\n",
(int)X.extent(0), (int)X.extent(1), (int)alpha.extent(0));
#else
Kokkos::printf(
"KokkosBatched::axpy: First dimension of X and alpha do not match: X: "
"%d x %d, alpha: %d\n",
(int)X.extent(0), (int)X.extent(1), (int)alpha.extent(0));
#endif
return 1;
}
#endif
@@ -249,17 +263,31 @@ KOKKOS_INLINE_FUNCTION int TeamAxpy<MemberType>::invoke(

// Check compatibility of dimensions at run time.
if (X.extent(0) != Y.extent(0) || X.extent(1) != Y.extent(1)) {
#if KOKKOS_VERSION < 40199
KOKKOS_IMPL_DO_NOT_USE_PRINTF(
"KokkosBatched::axpy: Dimensions of X and Y do not match: X: %d x %d, "
"Y: %d x %d\n",
(int)X.extent(0), (int)X.extent(1), (int)Y.extent(0), (int)Y.extent(1));
#else
Kokkos::printf(
"KokkosBatched::axpy: Dimensions of X and Y do not match: X: %d x %d, "
"Y: %d x %d\n",
(int)X.extent(0), (int)X.extent(1), (int)Y.extent(0), (int)Y.extent(1));
#endif
return 1;
}
if (X.extent(0) != alpha.extent(0)) {
#if KOKKOS_VERSION < 40199
KOKKOS_IMPL_DO_NOT_USE_PRINTF(
"KokkosBatched::axpy: First dimension of X and alpha do not match: X: "
"%d x %d, alpha: %d\n",
(int)X.extent(0), (int)X.extent(1), (int)alpha.extent(0));
#else
Kokkos::printf(
"KokkosBatched::axpy: First dimension of X and alpha do not match: X: "
"%d x %d, alpha: %d\n",
(int)X.extent(0), (int)X.extent(1), (int)alpha.extent(0));
#endif
return 1;
}
#endif
@@ -304,17 +332,31 @@ KOKKOS_INLINE_FUNCTION int TeamVectorAxpy<MemberType>::invoke(

// Check compatibility of dimensions at run time.
if (X.extent(0) != Y.extent(0) || X.extent(1) != Y.extent(1)) {
#if KOKKOS_VERSION < 40199
KOKKOS_IMPL_DO_NOT_USE_PRINTF(
"KokkosBatched::axpy: Dimensions of X and Y do not match: X: %d x %d, "
"Y: %d x %d\n",
(int)X.extent(0), (int)X.extent(1), (int)Y.extent(0), (int)Y.extent(1));
#else
Kokkos::printf(
"KokkosBatched::axpy: Dimensions of X and Y do not match: X: %d x %d, "
"Y: %d x %d\n",
(int)X.extent(0), (int)X.extent(1), (int)Y.extent(0), (int)Y.extent(1));
#endif
return 1;
}
if (X.extent(0) != alpha.extent(0)) {
#if KOKKOS_VERSION < 40199
KOKKOS_IMPL_DO_NOT_USE_PRINTF(
"KokkosBatched::axpy: First dimension of X and alpha do not match: X: "
"%d x %d, alpha: %d\n",
(int)X.extent(0), (int)X.extent(1), (int)alpha.extent(0));
#else
Kokkos::printf(
"KokkosBatched::axpy: First dimension of X and alpha do not match: X: "
"%d x %d, alpha: %d\n",
(int)X.extent(0), (int)X.extent(1), (int)alpha.extent(0));
#endif
return 1;
}
#endif
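
Each guard added in this file chooses between the legacy `KOKKOS_IMPL_DO_NOT_USE_PRINTF` macro and `Kokkos::printf` by comparing `KOKKOS_VERSION` against 40199. The sketch below shows how that threshold reads, assuming the usual `major*10000 + minor*100 + patch` integer encoding of `KOKKOS_VERSION`. `encode_version` and `uses_legacy_printf_macro` are hypothetical helpers written for illustration and are not Kokkos API.

```cpp
#include <cassert>

// Assumption: KOKKOS_VERSION packs the version as major*10000 + minor*100 + patch,
// so 4.1.99 encodes to 40199.
constexpr int encode_version(int major_v, int minor_v, int patch_v) {
  return major_v * 10000 + minor_v * 100 + patch_v;
}

// Hypothetical stand-in for the preprocessor test `#if KOKKOS_VERSION < 40199`:
// true  -> the legacy printf macro branch is compiled
// false -> the Kokkos::printf branch is compiled
constexpr bool uses_legacy_printf_macro(int kokkos_version) {
  return kokkos_version < 40199;
}

// Compile-time checks of the two release versions accepted by the build system.
static_assert(uses_legacy_printf_macro(encode_version(4, 1, 0)),
              "4.1.00 falls below the 40199 threshold");
static_assert(!uses_legacy_printf_macro(encode_version(4, 2, 0)),
              "4.2.00 is at or above the 40199 threshold");
```

Under that assumption, 4.1.00 (40100) takes the macro branch, while a 4.1.99 development snapshot and the 4.2.00 release both take the `Kokkos::printf` branch.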