Master release 4.3.00 #2163

Merged
merged 262 commits into from
Apr 8, 2024
6ec77de
HIP: since Kokkos has moved it out of experimental we should clean up
lucbv Oct 17, 2023
1537c4f
Applying clang-format
lucbv Oct 17, 2023
ab0a32d
Merge pull request #2001 from jgfouca/jgfouca/fix_par_ilut_docs
lucbv Oct 19, 2023
b9f3d78
Merge pull request #2007 from lucbv/bhalf_t_fix
lucbv Oct 19, 2023
d9a67b9
Merge pull request #1999 from lucbv/experimental_hip_cleanup
lucbv Oct 19, 2023
8db60a2
Merge pull request #1998 from cwpearson/nightly/rocm520
lucbv Oct 19, 2023
c3646ed
Merge pull request #1942 from eeprude/syr2
lucbv Oct 20, 2023
a3dd32b
Sparse: fix cusparse spgemm hang properly
lucbv Oct 20, 2023
6680347
Sparse: fix logic for bad cursparse spgemm version.
lucbv Oct 20, 2023
4c481e1
Improvements on the unification attempt logic for axpby(), including …
eeprude Jun 5, 2023
9b17fcf
Addressing feedbacks from Luc, plus some small changes here and there:
eeprude Aug 1, 2023
ce072b8
Formatting
eeprude Aug 1, 2023
f4c5351
Using 'ifdef HAVE_KOKKOSKERNELS_DEBUG', per Luc's suggestion
eeprude Aug 1, 2023
f767097
Addressing feedbacks from Luc
eeprude Oct 17, 2023
61ac820
Correcting compilation errors in my Mac
eeprude Oct 17, 2023
12d1fd4
Backup
eeprude Oct 17, 2023
3fc73c9
SYR2: fix unit-test type issue
lucbv Oct 23, 2023
6790651
CUDA 11.0.1 / cuSPARSE 11.0.0 changed SpMM enums
cwpearson Oct 23, 2023
c749f8c
SYR2: applying clang-format
lucbv Oct 23, 2023
4b5793b
Merge pull request #2008 from lucbv/bspgemm_cusparse_hang
lucbv Oct 23, 2023
f8aa101
CUDA 11.2.1 / cuSPARSE 11.4.0 changed SpMV
cwpearson Oct 23, 2023
10cbf87
Merge pull request #1895 from eeprude/axpby_improvement
lucbv Oct 23, 2023
647f3b9
Merge pull request #2012 from cwpearson/fix/cusparse-version-guards-2
lucbv Oct 23, 2023
3f98843
KokkosBlas1_axpby: include <iostream> for debug builds
ndellingwood Oct 23, 2023
ced1e90
Merge pull request #2014 from ndellingwood/fix-cout-debug
lucbv Oct 24, 2023
69379cb
Merge pull request #2013 from lucbv/syr2_fix
lucbv Oct 24, 2023
5b5c101
Backup
eeprude Sep 7, 2023
8f49140
Backup
eeprude Sep 7, 2023
845f7f2
Backup
eeprude Sep 7, 2023
05afd00
Backup
eeprude Sep 7, 2023
e455f37
Backup
eeprude Sep 7, 2023
a63d094
Backup
eeprude Sep 7, 2023
8aecf38
Backup
eeprude Sep 7, 2023
872a553
Backup
eeprude Oct 3, 2023
3223af3
Backup
eeprude Oct 10, 2023
b06bf0f
Backup
eeprude Oct 10, 2023
ca4a943
Backup
eeprude Oct 14, 2023
db8d1c0
Backup
eeprude Oct 14, 2023
f648609
Address CI build errors
e10harvey Oct 2, 2023
6d582b6
Some cleanup on current pull request, making it more related to 'just…
eeprude Oct 25, 2023
e8557be
More cleanup
eeprude Oct 25, 2023
7c9ed9e
Re-enabling gesv unit tests under the lapack subdirectory
eeprude Oct 25, 2023
6ac5ba3
Adding BLAS routines back, for backwards compatibility
eeprude Oct 25, 2023
a62d666
Formatting
eeprude Oct 25, 2023
f8cd2cb
Small cleaning
eeprude Oct 25, 2023
edf2dd0
Correcting error in Jenkins
eeprude Oct 25, 2023
9157665
Fixing compilation error on Jenkins when dealing with HIP
eeprude Oct 25, 2023
b1d77bd
Add required rtd conf file
e10harvey Oct 25, 2023
93211aa
README.md: Use correct project slug
e10harvey Oct 25, 2023
ec8a919
docs/requirements.txt: Add sphinx-rtd-theme
e10harvey Oct 25, 2023
699f3b3
Addressing latest feedbacks from Luc.
eeprude Oct 25, 2023
d674964
Formatting
eeprude Oct 25, 2023
851358f
KokkosKernelsConfig.cmake: add all_libs target and necessary aliases
ndellingwood Oct 25, 2023
e7b6c12
Merge pull request #1985 from eeprude/lapackDir
lucbv Oct 26, 2023
2c70f24
Merge pull request #2020 from ndellingwood/export-all_libs-add-aliases
ndellingwood Oct 26, 2023
252c9db
hide native merge-path SpMV behind "native-merge"
cwpearson Oct 26, 2023
8ba47b9
test native-merge algorithm
cwpearson Oct 27, 2023
33255c8
Merge pull request #1982 from e10harvey/sptrsv_solve_overload
lucbv Oct 27, 2023
89df0f9
Merge pull request #2021 from cwpearson/fix/issue-2010
lucbv Oct 27, 2023
f722428
Quick fix for night compilation with Trilinos
eeprude Oct 27, 2023
9caa7ca
Merge pull request #2024 from eeprude/lapackDir_fix
lucbv Oct 28, 2023
96e7fb5
SPTRSV: check if cusparse is available before calling TPL path
lucbv Oct 30, 2023
9107b3e
SpTRSV: more strickly check prerequisites in SptrsvHandle
lucbv Oct 30, 2023
9408e49
SpTRSV: fix some type definition and variable usaged for cuSPARSE
lucbv Oct 30, 2023
daee1b6
SpTRSV: applying clang-format
lucbv Oct 30, 2023
e88c418
SpTRSV: more fixes
lucbv Oct 31, 2023
ef6f19e
SpTRSV: apply clang-format
lucbv Oct 31, 2023
f5d11ce
Merge pull request #2026 from lucbv/sptrsv_check_cusparse
lucbv Oct 31, 2023
f0d7483
SYCL: fix for Trilinos build with MKL
lucbv Nov 5, 2023
a0c9b75
Apply clang-format to non-cmake files
ndellingwood Nov 5, 2023
78455b4
Merge pull request #2029 from lucbv/sycl_mkl_trilinos_fix
ndellingwood Nov 6, 2023
3621432
SYR2: fix issue with bad type in test function
lucbv Nov 5, 2023
196fa44
Update Test_Blas2_syr2.hpp
lucbv Nov 6, 2023
49f4c23
Merge pull request #2030 from lucbv/syr2_fix2
lucbv Nov 7, 2023
24c73c8
LAPACK: adding rocsolver TPL
lucbv Nov 9, 2023
c06b8db
Lapack: change according to Brian's review
lucbv Nov 15, 2023
2c66d29
Merge pull request #2034 from lucbv/tpl_rocsolver
lucbv Nov 15, 2023
18ca910
cmake/Dependencies.cmake: remove ROCSOLVER
ndellingwood Nov 16, 2023
5a36d57
Merge pull request #2037 from ndellingwood/remove-rocsolver-optional-…
lucbv Nov 16, 2023
4f3549d
Lapack: cusolver TPL logic and support for gesv
lucbv Nov 15, 2023
55433b9
Lapack: updating logic in cm_generate_makefile for cusolver
lucbv Nov 16, 2023
af6aeca
Backup
eeprude Nov 20, 2023
5188b71
Backup
eeprude Nov 20, 2023
ee23cf7
Backup
eeprude Nov 20, 2023
931d8b4
Formatting
eeprude Nov 20, 2023
1624ffd
mv_unification tests with double are failing by very small amounts, e…
eeprude Nov 21, 2023
af49d60
Trying one more increment on tolerance
eeprude Nov 21, 2023
091b3ab
Putting pragma's and unrolls properly right before for loops (compila…
eeprude Nov 21, 2023
fc3d24a
Giving it another try to larger tolarance, after fixing the warning o…
eeprude Nov 21, 2023
aed6a46
Lapack: gesv, implementing review commments
lucbv Nov 21, 2023
d232d2b
Adding Changelog for Release 4.2.0 (#2031)
ndellingwood Nov 8, 2023
2fc777b
Merge pull request #2038 from lucbv/tpl_cusolver
lucbv Nov 22, 2023
94238e3
Merge pull request #2043 from ndellingwood/update-changelog-4200
ndellingwood Nov 22, 2023
6a55d79
NRM1: refactoring TPL layer a bit with c++17 if constexpr
lucbv Nov 7, 2023
9007f55
BLAS: Nrm1 implementing Brian's feedback
lucbv Nov 21, 2023
d7f5e8e
Blas: nrm1, fix in tpl spec decl
lucbv Nov 21, 2023
b1cea63
BLAS: nrm1 problems with ExecSpace template and lack of Kokkos::Threads
lucbv Nov 21, 2023
fbaac45
Another attempt while waiting to get access to the solo cluster
eeprude Nov 22, 2023
9285b6a
Formatting
eeprude Nov 22, 2023
88cec7b
Correction error from the last commit
eeprude Nov 22, 2023
5df5171
Merge pull request #2032 from lucbv/nrm1_tpl_refactor
lucbv Nov 22, 2023
4450d20
Fixing the error that was happening only at the solo cluster
eeprude Nov 23, 2023
9105a8a
Increase tolerance a bit more
eeprude Nov 23, 2023
baab6f5
ncreasing tolerances in all 4 locations
eeprude Nov 23, 2023
a80eb91
Merge pull request #2039 from eeprude/axpby_bug_fix
ndellingwood Nov 24, 2023
44d8a26
Backup
eeprude Nov 21, 2023
3ba6ded
Backup
eeprude Nov 22, 2023
168eb0e
Formatting
eeprude Nov 25, 2023
5e714d7
Forgot to add ClusteringAlgorithm:: at some spots
eeprude Nov 25, 2023
4006d80
Formatting
eeprude Nov 25, 2023
d1aa2b0
Lapack: fixing issue with Magma TPL in gesv, trtri, etc...
lucbv Nov 22, 2023
2b023de
Update blas/unit_test/Test_Blas1_swap.hpp
lucbv Nov 27, 2023
c64c7eb
cmake: Add workaround check for CUSOLVER support with Trilinos
ndellingwood Nov 27, 2023
0261159
Addressing Brian Kelley's feedbacks
eeprude Nov 27, 2023
16e327e
Formatting
eeprude Nov 27, 2023
89d149e
Removing 'ClusteringAlgorithm::'
eeprude Nov 27, 2023
0599b37
Lapack: gesv, incorporate Brian's feedback
lucbv Nov 28, 2023
e04272e
Applying clang-format
lucbv Nov 28, 2023
f61df47
Fixing some deprecation warnings/errors for ROCm 6
seanofthemillers Nov 28, 2023
4d1cfe2
BLAS: fix bug in TPL layer of KokkosBlas::swap
lucbv Nov 29, 2023
2ecf675
CMake: fix bugs in deciding KOKKOSKERNELS_TPL_BLAS_RETURN_COMPLEX
jczhang07 Sep 27, 2023
eebdbb2
TPL: revise BLAS1 dot implementation
jczhang07 Aug 22, 2023
f5415f8
Fix compile errors for C-linkage dot functions returning std::complex
jczhang07 Sep 28, 2023
3cd6420
Use a C struct for complex numbers
jczhang07 Oct 5, 2023
d2e7524
Add a workaround by disabling host MKL dot with complex numbers
jczhang07 Nov 6, 2023
745a7b2
Allow KokkosKernels_ENABLE_PERFTESTS=ON to build perf_tests without K…
cwpearson Dec 1, 2023
27082a9
format sparse/tpls/KokkosSparse_spmv_tpl_spec_decl.hpp
cwpearson Dec 1, 2023
1bbf549
cmake: fix tpl check so cusolver can be disabled when needed
ndellingwood Dec 2, 2023
c8b2999
Link std::filesystem for IntelLLVM in perf_test/sparse
cwpearson Dec 4, 2023
a52ba02
gemm3 perf test: user CUDA, SYCL, or HIP device for kokkos:initialize
cwpearson Dec 5, 2023
18c7d83
Merge pull request #2018 from kokkos/fix_rtd_build
lucbv Dec 5, 2023
d1bf499
Fix for rocm_verison header inclusion
seanofthemillers Dec 6, 2023
4bfde66
Merge pull request #2052 from lucbv/swap_bug_fix
lucbv Dec 6, 2023
f4fd2e5
Merge pull request #1949 from jczhang07/2023-08-18/feature-tpl-dot
lucbv Dec 6, 2023
ca0b810
Merge pull request #2048 from lucbv/magma_fixes
lucbv Dec 6, 2023
f0d9835
Merge pull request #2053 from cwpearson/enhancement/perf_tests-withou…
lucbv Dec 6, 2023
ef4e867
Merge pull request #2058 from cwpearson/fix/blas3-gemm-deviceid
lucbv Dec 6, 2023
7353590
Merge pull request #2045 from eeprude/omp_cluster
lucbv Dec 6, 2023
4ce619e
Merge pull request #2049 from ndellingwood/issue-2047-workaround
lucbv Dec 6, 2023
6679f19
Merge pull request #2055 from cwpearson/fix/std-fs-intelllvm
lucbv Dec 8, 2023
a91a1f2
fence Kokkos before timed interations
cwpearson Dec 8, 2023
cd8f77c
Merge pull request #2066 from cwpearson/fix/spmv-benchmark-fence
lucbv Dec 8, 2023
7144284
Deprecate KOKKOSLINALG_OPT_LEVEL
cwpearson Dec 13, 2023
543446d
Add CMake warning message if KokkosKernels_LINALG_OPT_LEVEL is used
cwpearson Dec 13, 2023
2728071
Async matrix release for MKL >= 2023.2
cwpearson Dec 14, 2023
ddf425f
Support CUBLAS_{LIBRARIES,LIBRARY_DIRS,INCLUDE_DIRS,ROOT} and KokkosK…
cwpearson Dec 14, 2023
d53066a
KokkosSparse_spmv_impl_merge.hpp: use capture by reference
ndellingwood Dec 15, 2023
c61f708
KokkosSparse_par_ilut_numeric_impl.hpp: use capture by reference
ndellingwood Dec 15, 2023
49f5a61
Backup
eeprude Dec 16, 2023
d61e64e
Backup
eeprude Dec 16, 2023
fab15a4
Backup
eeprude Dec 16, 2023
9d45602
Backup
eeprude Dec 16, 2023
70b01bf
Formatting
eeprude Dec 16, 2023
04b0599
Correcting compilation error
eeprude Dec 16, 2023
ceb9a87
Typo
eeprude Dec 16, 2023
2febc99
Changes for syr and syr2, to be tested at weaver
eeprude Dec 17, 2023
7c51188
Formatting
eeprude Dec 17, 2023
0c14496
Changes for axpby
eeprude Dec 17, 2023
852afe1
Backup
eeprude Dec 17, 2023
ebee1f1
Formatting
eeprude Dec 17, 2023
b500f96
Just to force new checking tests in github
eeprude Dec 17, 2023
e439733
Merge pull request #2076 from ndellingwood/fix-werror-implicit-this
lucbv Dec 18, 2023
b50ba54
Merge pull request #2074 from cwpearson/performance/mkl-gemv-release
lucbv Dec 18, 2023
c15b51e
Merge pull request #2072 from cwpearson/deprecate/linalg-opt-level
lucbv Dec 18, 2023
e2b240a
Merge pull request #2050 from seanofthemillers/rocm6_deprecation_fixes
lucbv Dec 18, 2023
4e00981
Addressing feedback from Luc.
eeprude Dec 18, 2023
628d630
Merge pull request #2077 from eeprude/blas_noneti_cuda
lucbv Dec 18, 2023
b242927
Merge pull request #2075 from cwpearson/feature/cmake-cublas-options
lucbv Dec 18, 2023
cb24a0d
Don't call optimize_gemv for one-shot spmv
cwpearson Dec 13, 2023
3dafbed
Merge pull request #2073 from cwpearson/performance/no-optimize-gemv
lucbv Dec 20, 2023
5868e99
Add HIPManagedSpace support
brian-kelley Dec 21, 2023
772183b
Backup
eeprude Dec 25, 2023
5b75a1a
Backup
eeprude Dec 25, 2023
11d369b
Backup
eeprude Dec 25, 2023
c573d6e
Minor typo
eeprude Dec 25, 2023
93d4cda
Merge pull request #2081 from eeprude/axpby_less_deep_copy
lucbv Jan 8, 2024
d5c2924
Merge pull request #2079 from brian-kelley/HIPManagedETI
lucbv Jan 8, 2024
66f60e9
Add block support to all SPILUK algorithms (#2064)
jgfouca Jan 11, 2024
c34c6c5
Add CUDA/HIP TPL support for KokkosSparse::spadd (#1962)
jczhang07 Jan 16, 2024
aa12597
Make spiluk_handle::reset backwards compatible (#2087)
jgfouca Jan 17, 2024
0b1d20f
spadd: add APIs without an execution space argument (#2090)
jczhang07 Jan 20, 2024
9fa4a08
Lapack - SVD: adding initial files that do not implement anything (#2…
lucbv Feb 6, 2024
4315f9a
Hands off namespace `Kokkos::Impl` - cleanup couple violations that s…
dalg24 Feb 6, 2024
8a99aaa
Change name of yaml-cpp to yamlcpp
cwschilly Jan 31, 2024
d77f114
Fix macro setting in CMakeLists
cwschilly Feb 5, 2024
d57e4ee
Merge pull request #2099 from bartlettroscoe/tril-12710-yamlcpp
ndellingwood Feb 7, 2024
49c4308
GMRES: Add support for BSR matrices
jgfouca Feb 6, 2024
adb3064
Remove all mentions of HBWSpace
dalg24 Feb 8, 2024
2ae3452
Reintroduce EXECSPACE_(SERIAL,OPENMP,THREADS}_VALID_MEM_SPACES
ndellingwood Feb 8, 2024
b01ea61
Merge pull request #2101 from dalg24/rm_hbw_space
ndellingwood Feb 9, 2024
401f6c2
Lapack: adding svd benchmark
lucbv Feb 6, 2024
c1beeeb
Fix Cuda TPL finding (#2098)
brian-kelley Feb 9, 2024
2ca9cfc
Add support for BSR matrices to some trsv routines (#2104)
jgfouca Feb 14, 2024
0bf3dcf
Lapack - SVD: adding quick return when cuSOLVER is skipped (#2107)
lucbv Feb 14, 2024
5f719e5
Fix build error in trsv on gcc8
jgfouca Feb 15, 2024
0425cc6
Add a workaround for compilation errors with cuda-12.2.0 + gcc-12.3 (…
jczhang07 Feb 15, 2024
264dee2
Lapack - SVD: fix for unit-test when MKL is enabled (#2110)
lucbv Feb 15, 2024
23abac4
Revert "Merge pull request #2037 from ndellingwood/remove-rocsolver-o…
ndellingwood Feb 15, 2024
8ecbcb7
Merge pull request #2111 from jgfouca/jgfouca/fix_trsv_build_err
ndellingwood Feb 16, 2024
e3026de
Fixing missing inclusion in source file
seanofthemillers Feb 16, 2024
39bad0e
Merge pull request #2113 from seanofthemillers/fix_missing_inclusion_…
ndellingwood Feb 17, 2024
fdadc74
BLAS - MKL: fixing HostBlas calls to handle MKL_INT type (#2112)
lucbv Feb 19, 2024
62aecce
Fix weird Trilinos compiler error
jgfouca Feb 20, 2024
706bf97
Update changelog
ndellingwood Jan 17, 2024
06646f4
Update changelog
ndellingwood Jan 25, 2024
b4513cf
Merge pull request #2117 from jgfouca/jgfouca/fix_tril_compl_err
ndellingwood Feb 21, 2024
0ddc744
Block spiluk follow up (#2085)
jgfouca Feb 21, 2024
0fe8461
Merge pull request #2118 from ndellingwood/update-changelog-421
ndellingwood Feb 22, 2024
d3ce803
github workflows: update to v4 (use Node 20)
ndellingwood Feb 22, 2024
f53f7c5
Refactor Test_Sparse_sptrsv (#2102)
jgfouca Feb 22, 2024
6b4bb06
Merge pull request #2119 from ndellingwood/update-ga-node
ndellingwood Feb 22, 2024
934cd7d
CMake: error out in certain case (#2115)
brian-kelley Feb 26, 2024
63cd89e
Wiki examples for BLAS2 functions are added (#2122)
lucbv Feb 29, 2024
4f2a095
Increase tolerance on gesv test (Fix #2123) (#2124)
brian-kelley Feb 29, 2024
80b1a18
Spmv handle (#2126)
brian-kelley Mar 4, 2024
fe65db5
Option to apply RCM reordering to extracted CRS diagonal blocks (#2125)
vqd8a Mar 5, 2024
9d27c1f
cm_test_all_sandia: various updates
ndellingwood Mar 1, 2024
5524ed6
cm_test_all_sandia: drop decommissioned/unavailable machines
ndellingwood Mar 6, 2024
7aca2c0
Merge pull request #2131 from ndellingwood/update-cmtestallsandia
ndellingwood Mar 6, 2024
865d84c
Fix2130 (#2132)
brian-kelley Mar 6, 2024
74f0ed7
Benchmark: modifying spmv benchmark to run range of spmv tests (#2135)
lucbv Mar 7, 2024
8f2945d
Kokkos Kernels: update version guards to drop old version of Kokkos (…
lucbv Mar 7, 2024
519ef7b
ODE: BDF methods (#1930)
lucbv Mar 12, 2024
a19435c
cm_test_all_sandia: update caraway compilers
ndellingwood Mar 13, 2024
4aa0ebd
Sparse MKL: changing the location of the MKL_SAFE_CALL macro (#2134)
lucbv Mar 13, 2024
f4549f2
Merge pull request #2142 from ndellingwood/udpate-cmtestallsandia-car…
ndellingwood Mar 14, 2024
a29e0e8
Fixing missing descriptor for bsr spmv
seanofthemillers Mar 12, 2024
3a5498d
Kokkos Kernels: change the default offset ETI from size_t to int (#2140)
lucbv Mar 14, 2024
5b08244
KokkosSparse_spmv_bsrmatrix_spec: fix Bsr_TC_Precision namespacing
ndellingwood Mar 14, 2024
98d37b5
Drop comment for cleaner clang-format fix
ndellingwood Mar 14, 2024
a3b7568
Merge pull request #2138 from seanofthemillers/rocsparse_fix_missing_…
ndellingwood Mar 14, 2024
f492f59
Fix usage of RAII to set cusparse/rocsparse stream (#2141)
brian-kelley Mar 14, 2024
2f66110
Use execution space operator== (#2136)
brian-kelley Mar 14, 2024
a725850
Merge pull request #2144 from ndellingwood/fix-Bsr_TC_Precision-names…
ndellingwood Mar 15, 2024
acd7141
cm_test_all_sandia: more caraway module updates and cleanup (#2145)
ndellingwood Mar 15, 2024
0c49c21
Spmv perftest improvements (#2146)
brian-kelley Mar 15, 2024
2c3ffb2
Update version to 4.3.0
ndellingwood Mar 15, 2024
329feb2
Revert "Kokkos Kernels: change the default offset ETI from size_t to …
ndellingwood Mar 19, 2024
f86a08b
Fix signed/unsigned comparison warnings (#2150)
brian-kelley Mar 25, 2024
e642d70
SPMV tpl fixes, cusparse workaround (#2152)
brian-kelley Mar 26, 2024
f909de6
Merge pull request #2147 from lucbv/KK_Utils_cleanup
ndellingwood Mar 26, 2024
ec70c73
KokkosBlas1_axpby.hpp: change debug macro guard for printInformation …
ndellingwood Mar 27, 2024
295a083
Update changelog for 4.3.00 (#2148)
ndellingwood Mar 27, 2024
55fe629
Merge branch 'release-candidate-4.3.00' for release 4.3.0
ndellingwood Apr 2, 2024
e5fba49
FIx changelog typo
brian-kelley Apr 2, 2024
e92a255
Fix merge artifacts
ndellingwood Apr 2, 2024
e3a84b5
CMakeLists.txt: fix Kokkos_VERSION check
ndellingwood Apr 3, 2024
ebbf4b7
Merge pull request #2165 from ndellingwood/test-updates
ndellingwood Apr 3, 2024
d8e2b21
Update master_history.txt for 4.3.0
ndellingwood Apr 2, 2024
7eb4994
KokkosLapack_svd_tpl_spec_decl: defer to MKL spec when LAPACK also en…
ndellingwood Apr 5, 2024
4 changes: 2 additions & 2 deletions .github/workflows/docs.yml
@@ -23,12 +23,12 @@ jobs:
 doxygen --version

 - name: checkout_kokkos_kernels
-uses: actions/checkout@v3
+uses: actions/checkout@v4
 with:
 path: kokkos-kernels

 - name: checkout_kokkos
-uses: actions/checkout@v3
+uses: actions/checkout@v4
 with:
 repository: kokkos/kokkos
 ref: develop
2 changes: 1 addition & 1 deletion .github/workflows/format.yml
@@ -13,7 +13,7 @@ jobs:
 clang-format-check:
 runs-on: ubuntu-20.04
 steps:
-- uses: actions/checkout@v3
+- uses: actions/checkout@v4

 - name: Install Dependencies
 run: sudo apt install clang-format-8
4 changes: 2 additions & 2 deletions .github/workflows/osx.yml
@@ -50,12 +50,12 @@ jobs:

 steps:
 - name: checkout_kokkos_kernels
-uses: actions/checkout@v3
+uses: actions/checkout@v4
 with:
 path: kokkos-kernels

 - name: checkout_kokkos
-uses: actions/checkout@v3
+uses: actions/checkout@v4
 with:
 repository: kokkos/kokkos
 ref: ${{ github.base_ref }}
5 changes: 4 additions & 1 deletion .gitignore
@@ -12,4 +12,7 @@ TAGS
 #Clangd indexing
 compile_commands.json
 .cache/
-.vscode/
+.vscode/
+
+#MacOS hidden files
+.DS_Store
35 changes: 35 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,35 @@
# Read the Docs configuration file for Sphinx projects
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.12"
    # You can also specify other tool versions:
    # nodejs: "20"
    # rust: "1.70"
    # golang: "1.20"

# Build documentation in the "docs/" directory with Sphinx
sphinx:
  configuration: docs/conf.py
  # You can configure Sphinx to use a different builder, for instance use the dirhtml builder for simpler URLs
  # builder: "dirhtml"
  # Fail on all warnings to avoid broken references
  # fail_on_warning: true

# Optionally build your docs in additional formats such as PDF and ePub
# formats:
# - pdf
# - epub

# Optional but recommended, declare the Python requirements required
# to build your documentation
# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
  install:
    - requirements: docs/requirements.txt
2 changes: 1 addition & 1 deletion BUILD.md
@@ -227,7 +227,7 @@ endif()
 * KokkosKernels_LAPACK_ROOT: PATH
 * Location of LAPACK install root.
 * Default: None or the value of the environment variable LAPACK_ROOT if set
-* KokkosKernels_LINALG_OPT_LEVEL: BOOL
+* KokkosKernels_LINALG_OPT_LEVEL: BOOL **DEPRECATED**
 * Optimization level for KokkosKernels computational kernels: a nonnegative integer. Higher levels result in better performance that is more uniform for corner cases, but increase build time and library size. The default value is 1, which should give performance within ten percent of optimal on most platforms, for most problems.
 * Default: 1
 * KokkosKernels_MAGMA_ROOT: PATH
94 changes: 94 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,99 @@
# Change Log

## [4.3.00](https://github.com/kokkos/kokkos-kernels/tree/4.3.00) (2024-03-19)
[Full Changelog](https://github.com/kokkos/kokkos-kernels/compare/4.2.01...4.3.00)

### New Features

#### BLAS updates
- Syr2 [\#1942](https://github.com/kokkos/kokkos-kernels/pull/1942)

#### LAPACK updates
- Adding cuSOLVER [\#2038](https://github.com/kokkos/kokkos-kernels/pull/2038)
- Fix for MAGMA with CUDA [\#2044](https://github.com/kokkos/kokkos-kernels/pull/2044)
- Adding rocSOLVER [\#2034](https://github.com/kokkos/kokkos-kernels/pull/2034)
- Fix rocSOLVER issue with Trilinos dependency [\#2037](https://github.com/kokkos/kokkos-kernels/pull/2037)
- Lapack - SVD [\#2092](https://github.com/kokkos/kokkos-kernels/pull/2092)
- Adding benchmark for SVD [\#2103](https://github.com/kokkos/kokkos-kernels/pull/2103)
- Quick return to fix cuSOLVER and improve performance [\#2107](https://github.com/kokkos/kokkos-kernels/pull/2107)
- Fix Intel MKL tolerance for SVD tests [\#2110](https://github.com/kokkos/kokkos-kernels/pull/2110)

#### Sparse updates
- Add block support to all SPILUK algorithms [\#2064](https://github.com/kokkos/kokkos-kernels/pull/2064)
- Block spiluk follow up [\#2085](https://github.com/kokkos/kokkos-kernels/pull/2085)
- Make spiluk_handle::reset backwards compatible [\#2087](https://github.com/kokkos/kokkos-kernels/pull/2087)
- Sptrsv improvements
  - Add sptrsv execution space overloads [\#1982](https://github.com/kokkos/kokkos-kernels/pull/1982)
  - Refactor Test_Sparse_sptrsv [\#2102](https://github.com/kokkos/kokkos-kernels/pull/2102)
  - Add support for BSR matrices to some trsv routines [\#2104](https://github.com/kokkos/kokkos-kernels/pull/2104)
- GMRES: Add support for BSR matrices [\#2097](https://github.com/kokkos/kokkos-kernels/pull/2097)
- Spmv handle [\#2126](https://github.com/kokkos/kokkos-kernels/pull/2126)
- Option to apply RCM reordering to extracted CRS diagonal blocks [\#2125](https://github.com/kokkos/kokkos-kernels/pull/2125)

#### ODE updates
- Adding adaptive BDF methods [\#1930](https://github.com/kokkos/kokkos-kernels/pull/1930)

#### Misc updates
- Add HIPManagedSpace support [\#2079](https://github.com/kokkos/kokkos-kernels/pull/2079)

### Enhancements:

#### BLAS
- Axpby: improvement on unification attempt logic and on the execution of a diversity of situations [\#1895](https://github.com/kokkos/kokkos-kernels/pull/1895)

#### Misc updates
- Use execution space operator== [\#2136](https://github.com/kokkos/kokkos-kernels/pull/2136)

#### TPL support
- Add TPL support for KokkosBlas::dot [\#1949](https://github.com/kokkos/kokkos-kernels/pull/1949)
- Add CUDA/HIP TPL support for KokkosSparse::spadd [\#1962](https://github.com/kokkos/kokkos-kernels/pull/1962)
- Don't call optimize_gemv for one-shot MKL spmv [\#2073](https://github.com/kokkos/kokkos-kernels/pull/2073)
- Async matrix release for MKL >= 2023.2 in SpMV [\#2074](https://github.com/kokkos/kokkos-kernels/pull/2074)
- BLAS - MKL: fixing HostBlas calls to handle MKL_INT type [\#2112](https://github.com/kokkos/kokkos-kernels/pull/2112)

### Build System:
- Support CUBLAS_{LIBRARIES,LIBRARY_DIRS,INCLUDE_DIRS,ROOT} and KokkosKernels_CUBLAS_ROOT CMake options [\#2075](https://github.com/kokkos/kokkos-kernels/pull/2075)
- Link std::filesystem for IntelLLVM in perf_test/sparse [\#2055](https://github.com/kokkos/kokkos-kernels/pull/2055)
- Fix Cuda TPL finding [\#2098](https://github.com/kokkos/kokkos-kernels/pull/2098)
- CMake: error out in certain case [\#2115](https://github.com/kokkos/kokkos-kernels/pull/2115)

### Documentation and Testing:
- par_ilut: Update documentation for fill_in_limit [\#2001](https://github.com/kokkos/kokkos-kernels/pull/2001)
- Wiki examples for BLAS2 functions are added [\#2122](https://github.com/kokkos/kokkos-kernels/pull/2122)
- github workflows: update to v4 (use Node 20) [\#2119](https://github.com/kokkos/kokkos-kernels/pull/2119)

### Benchmarks:
- gemm3 perf test: user CUDA, SYCL, or HIP device for kokkos:initialize [\#2058](https://github.com/kokkos/kokkos-kernels/pull/2058)
- Lapack: adding svd benchmark [\#2103](https://github.com/kokkos/kokkos-kernels/pull/2103)
- Benchmark: modifying spmv benchmark to fix interface and run range of spmv tests [\#2135](https://github.com/kokkos/kokkos-kernels/pull/2135)

### Cleanup:
- Experimental hip cleanup [\#1999](https://github.com/kokkos/kokkos-kernels/pull/1999)
- iostream clean-up in benchmarks [\#2004](https://github.com/kokkos/kokkos-kernels/pull/2004)
- Update: implicit capture of 'this' via '[=]' is deprecated in C++20 warnings [\#2076](https://github.com/kokkos/kokkos-kernels/pull/2076)
- Deprecate KOKKOSLINALG_OPT_LEVEL [\#2072](https://github.com/kokkos/kokkos-kernels/pull/2072)
- Remove all mentions of HBWSpace [\#2101](https://github.com/kokkos/kokkos-kernels/pull/2101)
- Change name of yaml-cpp to yamlcpp (trilinos/Trilinos#12710) [\#2099](https://github.com/kokkos/kokkos-kernels/pull/2099)
- Hands off namespace Kokkos::Impl - cleanup couple violations that snuck in [\#2094](https://github.com/kokkos/kokkos-kernels/pull/2094)
- Kokkos Kernels: update version guards to drop old version of Kokkos [\#2133](https://github.com/kokkos/kokkos-kernels/pull/2133)
- Sparse MKL: changing the location of the MKL_SAFE_CALL macro [\#2134](https://github.com/kokkos/kokkos-kernels/pull/2134)

### Bug Fixes:
- Bspgemm cusparse hang [\#2008](https://github.com/kokkos/kokkos-kernels/pull/2008)
- bhalf_t fix for isnan function [\#2007](https://github.com/kokkos/kokkos-kernels/pull/2007)
- Fence Kokkos before timed iterations [\#2066](https://github.com/kokkos/kokkos-kernels/pull/2066)
- CUDA 11.2.1 / cuSPARSE 11.4.0 changed SpMV enums [\#2011](https://github.com/kokkos/kokkos-kernels/pull/2011)
- Fix the spadd API [\#2090](https://github.com/kokkos/kokkos-kernels/pull/2090)
- Axpby reduce deep copy calls [\#2081](https://github.com/kokkos/kokkos-kernels/pull/2081)
- Correcting BLAS test failures with cuda when ETI_ONLY = OFF (issue #2061) [\#2077](https://github.com/kokkos/kokkos-kernels/pull/2077)
- Fix weird Trilinos compiler error [\#2117](https://github.com/kokkos/kokkos-kernels/pull/2117)
- Fix for missing STL inclusion [\#2113](https://github.com/kokkos/kokkos-kernels/pull/2113)
- Fix build error in trsv on gcc8 [\#2111](https://github.com/kokkos/kokkos-kernels/pull/2111)
- Add a workaround for compilation errors with cuda-12.2.0 + gcc-12.3 [\#2108](https://github.com/kokkos/kokkos-kernels/pull/2108)
- Increase tolerance on gesv test (Fix #2123) [\#2124](https://github.com/kokkos/kokkos-kernels/pull/2124)
- Fix usage of RAII to set cusparse/rocsparse stream [\#2141](https://github.com/kokkos/kokkos-kernels/pull/2141)
- Spmv bsr matrix fix missing matrix descriptor (rocsparse) [\#2138](https://github.com/kokkos/kokkos-kernels/pull/2138)

## [4.2.01](https://github.com/kokkos/kokkos-kernels/tree/4.2.01) (2024-01-17)
[Full Changelog](https://github.com/kokkos/kokkos-kernels/compare/4.2.00...4.2.01)

23 changes: 16 additions & 7 deletions CMakeLists.txt
@@ -10,8 +10,8 @@ SET(KOKKOSKERNELS_TOP_BUILD_DIR ${CMAKE_CURRENT_BINARY_DIR})
 SET(KOKKOSKERNELS_TOP_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})

 SET(KokkosKernels_VERSION_MAJOR 4)
-SET(KokkosKernels_VERSION_MINOR 2)
-SET(KokkosKernels_VERSION_PATCH 1)
+SET(KokkosKernels_VERSION_MINOR 3)
+SET(KokkosKernels_VERSION_PATCH 0)
 SET(KokkosKernels_VERSION "${KokkosKernels_VERSION_MAJOR}.${KokkosKernels_VERSION_MINOR}.${KokkosKernels_VERSION_PATCH}")

 #Set variables for config file
@@ -127,13 +127,13 @@ ELSE()
 IF (NOT KOKKOSKERNELS_HAS_TRILINOS AND NOT KOKKOSKERNELS_HAS_PARENT)
 # This is a standalone build
 FIND_PACKAGE(Kokkos REQUIRED)
-IF((${Kokkos_VERSION} VERSION_EQUAL "4.1.00") OR (${Kokkos_VERSION} VERSION_GREATER_EQUAL "4.2.00"))
+IF((${Kokkos_VERSION} VERSION_GREATER_EQUAL "4.1.0") AND (${Kokkos_VERSION} VERSION_LESS_EQUAL "4.3.0"))
 MESSAGE(STATUS "Found Kokkos version ${Kokkos_VERSION} at ${Kokkos_DIR}")
-IF((${Kokkos_VERSION} VERSION_GREATER "4.2.99"))
+IF((${Kokkos_VERSION} VERSION_GREATER "4.3.99"))
 MESSAGE(WARNING "Configuring with Kokkos ${Kokkos_VERSION} which is newer than the expected develop branch - version check may need update")
 ENDIF()
 ELSE()
-MESSAGE(FATAL_ERROR "Kokkos Kernels ${KokkosKernels_VERSION} requires 4.1.00, 4.2.00, 4.2.01 or develop")
+MESSAGE(FATAL_ERROR "Kokkos Kernels ${KokkosKernels_VERSION} requires Kokkos_VERSION 4.1.0, 4.2.0, 4.2.1 or 4.3.0")
 ENDIF()
 ENDIF()
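
The standalone-build version gate in the hunk above accepts a closed range of Kokkos versions. As a rough illustration only (this is not code from the PR), the sketch below wraps the same range test in a hypothetical helper, `check_kokkos_version`, and runs it over a few made-up version strings in CMake script mode. It relies on CMake's built-in `VERSION_GREATER_EQUAL` and `VERSION_LESS_EQUAL` operators, which compare version components numerically, so `4.1.00` and `4.1.0` are treated as equal.

```cmake
# Sketch only: helper name and sample versions are assumptions, not PR code.
cmake_minimum_required(VERSION 3.16)

# Hypothetical helper: sets <out_var> in the caller's scope to TRUE when
# <version> lies in the closed range [4.1.0, 4.3.0], else to FALSE.
function(check_kokkos_version version out_var)
  if(("${version}" VERSION_GREATER_EQUAL "4.1.0") AND
     ("${version}" VERSION_LESS_EQUAL "4.3.0"))
    set(${out_var} TRUE PARENT_SCOPE)
  else()
    set(${out_var} FALSE PARENT_SCOPE)
  endif()
endfunction()

# Made-up sample versions, including ones just outside each bound.
foreach(v 4.0.1 4.1.00 4.2.1 4.3.0 4.3.1)
  check_kokkos_version("${v}" supported)
  message(STATUS "Kokkos ${v}: supported=${supported}")
endforeach()
```

Saved under a hypothetical name such as `version_check_sketch.cmake`, it can be run with `cmake -P version_check_sketch.cmake`. With these bounds a patch release such as 4.3.1 falls outside the accepted range.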

@@ -156,9 +156,16 @@ ELSE()
 KOKKOSKERNELS_ADD_OPTION_AND_DEFINE(
 LINALG_OPT_LEVEL
 KOKKOSLINALG_OPT_LEVEL
-"Optimization level for KokkosKernels computational kernels: a nonnegative integer. Higher levels result in better performance that is more uniform for corner cases, but increase build time and library size. The default value is 1, which should give performance within ten percent of optimal on most platforms, for most problems. Default: 1"
+"DEPRECATED. Optimization level for KokkosKernels computational kernels: a nonnegative integer. Higher levels result in better performance that is more uniform for corner cases, but increase build time and library size. The default value is 1, which should give performance within ten percent of optimal on most platforms, for most problems. Default: 1"
 "1")

+if (KokkosKernels_LINALG_OPT_LEVEL AND NOT KokkosKernels_LINALG_OPT_LEVEL STREQUAL "1")
+message(WARNING "KokkosKernels_LINALG_OPT_LEVEL is deprecated!")
+endif()
+if(KokkosKernels_KOKKOSLINALG_OPT_LEVEL AND NOT KokkosKernels_KOKKOSLINALG_OPT_LEVEL STREQUAL "1")
+message(WARNING "KokkosKernels_KOKKOSLINALG_OPT_LEVEL is deprecated!")
+endif()
+
 # Enable experimental features of KokkosKernels if set at configure
 # time. Default is no.
 KOKKOSKERNELS_ADD_OPTION_AND_DEFINE(
@@ -375,8 +382,10 @@ ELSE()
KOKKOSKERNELS_LINK_TPL(kokkoskernels PUBLIC MKL)
KOKKOSKERNELS_LINK_TPL(kokkoskernels PUBLIC CUBLAS)
KOKKOSKERNELS_LINK_TPL(kokkoskernels PUBLIC CUSPARSE)
+ KOKKOSKERNELS_LINK_TPL(kokkoskernels PUBLIC CUSOLVER)
KOKKOSKERNELS_LINK_TPL(kokkoskernels PUBLIC ROCBLAS)
KOKKOSKERNELS_LINK_TPL(kokkoskernels PUBLIC ROCSPARSE)
+ KOKKOSKERNELS_LINK_TPL(kokkoskernels PUBLIC ROCSOLVER)
KOKKOSKERNELS_LINK_TPL(kokkoskernels PUBLIC METIS)
KOKKOSKERNELS_LINK_TPL(kokkoskernels PUBLIC ARMPL)
KOKKOSKERNELS_LINK_TPL(kokkoskernels PUBLIC MAGMA)
@@ -425,7 +434,7 @@ ELSE()
IF (KOKKOSKERNELS_ALL_COMPONENTS_ENABLED)
IF (KokkosKernels_ENABLE_PERFTESTS)
MESSAGE(STATUS "Enabling perf tests.")
- KOKKOSKERNELS_ADD_TEST_DIRECTORIES(perf_test)
+ add_subdirectory(perf_test) # doesn't require KokkosKernels_ENABLE_TESTS=ON
ENDIF ()
IF (KokkosKernels_ENABLE_EXAMPLES)
MESSAGE(STATUS "Enabling examples.")
8 changes: 4 additions & 4 deletions CheckHostBlasReturnComplex.cmake
@@ -21,8 +21,8 @@ FUNCTION(CHECK_HOST_BLAS_RETURN_COMPLEX VARNAME)

extern \"C\" {
void F77_BLAS_MANGLE(zdotc,ZDOTC)(
- std::complex<double>* result, const int* n,
- const std::complex<double> x[], const int* incx,
+ std::complex<double>* result, const int* n,
+ const std::complex<double> x[], const int* incx,
const std::complex<double> y[], const int* incy);
}

@@ -49,9 +49,9 @@ int main() {
CHECK_CXX_SOURCE_RUNS("${SOURCE}" KK_BLAS_RESULT_AS_POINTER_ARG)

IF(${KK_BLAS_RESULT_AS_POINTER_ARG})
- SET(VARNAME OFF)
+ SET(${VARNAME} OFF PARENT_SCOPE)
ELSE()
- SET(VARNAME ON)
+ SET(${VARNAME} ON PARENT_SCOPE)
ENDIF()

ENDFUNCTION()
2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
- [![Generic badge](https://readthedocs.org/projects/pip/badge/?version=latest&style=flat)](https://kokkos-kernels.readthedocs.io/en/latest/)
+ [![Generic badge](https://readthedocs.org/projects/kokkos-kernels/badge/?version=latest)](https://kokkos-kernels.readthedocs.io/en/latest/)

![KokkosKernels](https://avatars2.githubusercontent.com/u/10199860?s=200&v=4)

24 changes: 0 additions & 24 deletions batched/KokkosBatched_Util.hpp
@@ -626,18 +626,6 @@ KOKKOS_INLINE_FUNCTION auto subview_wrapper(ViewType v, IdxType1 i1,
const Trans::NoTranspose) {
return subview_wrapper(v, i1, i2, i3, layout_tag);
}
- #if KOKKOS_VERSION < 40099
- template <class ViewType, class IdxType1>
- KOKKOS_INLINE_FUNCTION auto subview_wrapper(ViewType v, IdxType1 i1,
- Kokkos::Impl::ALL_t i2,
- Kokkos::Impl::ALL_t i3,
- const BatchLayout::Left &layout_tag,
- const Trans::Transpose) {
- auto sv_nt = subview_wrapper(v, i1, i3, i2, layout_tag);
-
- return transpose_2d_view(sv_nt, layout_tag);
- }
- #else
template <class ViewType, class IdxType1>
KOKKOS_INLINE_FUNCTION auto subview_wrapper(ViewType v, IdxType1 i1,
Kokkos::ALL_t i2, Kokkos::ALL_t i3,
@@ -647,7 +635,6 @@ KOKKOS_INLINE_FUNCTION auto subview_wrapper(ViewType v, IdxType1 i1,

return transpose_2d_view(sv_nt, layout_tag);
}
- #endif
template <class ViewType, class IdxType1, class IdxType2, class IdxType3>
KOKKOS_INLINE_FUNCTION auto subview_wrapper(ViewType v, IdxType1 i1,
IdxType2 i2, IdxType3 i3,
@@ -671,16 +658,6 @@ KOKKOS_INLINE_FUNCTION auto subview_wrapper(
const BatchLayout::Right &layout_tag, const Trans::NoTranspose &) {
return subview_wrapper(v, i1, i2, i3, layout_tag);
}
- #if KOKKOS_VERSION < 40099
- template <class ViewType, class IdxType1>
- KOKKOS_INLINE_FUNCTION auto subview_wrapper(
- ViewType v, IdxType1 i1, Kokkos::Impl::ALL_t i2, Kokkos::Impl::ALL_t i3,
- const BatchLayout::Right &layout_tag, const Trans::Transpose &) {
- auto sv_nt = subview_wrapper(v, i1, i3, i2, layout_tag);
-
- return transpose_2d_view(sv_nt, layout_tag);
- }
- #else
template <class ViewType, class IdxType1>
KOKKOS_INLINE_FUNCTION auto subview_wrapper(
ViewType v, IdxType1 i1, Kokkos::ALL_t i2, Kokkos::ALL_t i3,
@@ -689,7 +666,6 @@ KOKKOS_INLINE_FUNCTION auto subview_wrapper(

return transpose_2d_view(sv_nt, layout_tag);
}
- #endif
template <class ViewType, class IdxType1, class IdxType2, class IdxType3>
KOKKOS_INLINE_FUNCTION auto subview_wrapper(
ViewType v, IdxType1 i1, IdxType2 i2, IdxType3 i3,