
[RELEASE] cuml v24.08 #6007

Merged
merged 44 commits into from
Aug 8, 2024

Conversation

raydouglass
Member

❄️ Code freeze for branch-24.08 and v24.08 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-24.08 until release (merging of this PR).

What is the purpose of this PR?

  • Update documentation
  • Allow testing for the new release
  • Enable a means to merge branch-24.08 into main for the release

raydouglass and others added 30 commits May 20, 2024 17:42
Forward-merge branch-24.06 into branch-24.08
Forward-merge branch-24.06 into branch-24.08
Fix conflict of forward-merge #5905 of branch-24.06 into branch-24.08
Forward-merge branch-24.06 into branch-24.08
Forward-merge branch-24.06 into branch-24.08
This failure was just a testing failure: the test expected identical pointers for the actual dataframes, as opposed to the wrapped objects.

Contributes to fixing #5876 

cc @betatim

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Divye Gala (https://github.com/divyegala)

URL: #5885
…5882)

The error came from the fact that pandas and cudf convert to NumPy with different default memory orders.

Towards fixing #5876
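
For context, here is a minimal sketch of the kind of order normalization involved (NumPy only; `as_c_ordered` is a hypothetical helper, not cuML code):

```python
import numpy as np

def as_c_ordered(arr):
    # Normalize to C (row-major) order before comparing contents, so the
    # comparison does not depend on which library produced the array.
    return np.ascontiguousarray(arr)

a = np.asfortranarray(np.arange(6, dtype="float32").reshape(2, 3))
b = as_c_ordered(a)
assert a.flags["F_CONTIGUOUS"] and b.flags["C_CONTIGUOUS"]
assert np.array_equal(a, b)  # same values, different memory layout
```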

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Divye Gala (https://github.com/divyegala)
  - Tim Head (https://github.com/betatim)

URL: #5882
This PR removes text builds of the documentation, which we do not currently use for anything. Contributes to rapidsai/build-planning#71.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Jake Awe (https://github.com/AyodeAwe)

URL: #5921
Forward-merge branch-24.06 into branch-24.08
…earn change (#5925)

Nightly jobs are showing a failure caused by a hypothesis strategy that calls sklearn's `make_regression` with `n_samples` equal to zero, which is no longer supported. This PR fixes that.
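
A minimal sketch of the kind of constraint involved (illustrative only, not the actual cuML test code):

```python
from hypothesis import given, strategies as st
from sklearn.datasets import make_regression

# n_samples must be at least 1; newer scikit-learn versions reject 0.
@given(n_samples=st.integers(min_value=1, max_value=100))
def test_make_regression_with_positive_sample_counts(n_samples):
    X, y = make_regression(n_samples=n_samples, n_features=5, random_state=0)
    assert X.shape == (n_samples, 5)
```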

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #5925
… followup (#5928)

Contributes to rapidsai/build-planning#31
Contributes to rapidsai/dependency-file-generator#89

#5804 was one of the earlier `rapids-build-backend` PRs merged across RAPIDS. Since it was merged, we've made some small adjustments to the approach for `rapids-build-backend`. This catches `cuml` up with those changes:

* removes unused constants in `ci/build*` scripts
* uses `--file-key` instead of `--file_key` in `rapids-dependency-file-generator` calls
* uses `--prepend-channel` instead of `--prepend-channels` in `rapids-dependency-file-generator` calls
* ensures `ci/update-version.sh` preserves alpha specs

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)

URL: #5928
Fixed by passing `sample_weight` on to the `.fit()` method inside the `fit_proba()` method of SVC.
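
A minimal usage sketch of the fixed path (assuming `cuml.svm.SVC` with `probability=True`; the data here is synthetic and illustrative):

```python
import numpy as np
from cuml.svm import SVC

X = np.random.rand(100, 4).astype("float32")
y = (X[:, 0] > 0.5).astype("float32")
w = np.random.rand(100).astype("float32")

# With probability=True, SVC fits an internal calibrated model; the fix
# ensures sample_weight is forwarded to that internal .fit() call as well.
clf = SVC(probability=True)
clf.fit(X, y, sample_weight=w)
proba = clf.predict_proba(X)
```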

Authors:
  - Pablo Tanner (https://github.com/pablotanner)
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #5912
Treelite 4.2.1 contains the following improvements:

* Compatibility patch for latest RapidJSON (dmlc/treelite#567)
* Support for NumPy 2.0 (dmlc/treelite#562). Thanks @jameslamb
* Handle certain class of XGBoost models (dmlc/treelite#564)

Authors:
  - Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Ray Douglass (https://github.com/raydouglass)
  - James Lamb (https://github.com/jameslamb)

URL: #5908
…types (#5938)

This PR fixes parameter sweeps in the benchmarks when the swept values have different types, for example:

```
--cuml-param-sweep init=random,scalable-k-means++
```

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Divye Gala (https://github.com/divyegala)

URL: #5938
Contributes to rapidsai/build-planning#80

Adds constraints to avoid pulling in CMake 3.30.0, for the reasons described in that issue.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #5956
…#5937)

Partial solution for #5936 

The issue was that concatenating when a worker holds a single array was causing a memory copy (not always, but often enough). This PR avoids the concatenation when a worker has a single partition of data.

This comes from a CuPy behavior: some testing reveals that it sometimes creates an extra allocation when concatenating a list made up of a single array:

```python
>>> import cupy as cp
>>> a = cp.random.rand(2000000, 250).astype(cp.float32) # Memory occupied: 5936MB
>>> b = [a]
>>> c = cp.concatenate(b) # Memory occupied: 5936 MB <- no memory copy
```

```python
>>> import cupy as cp
>>> a = cp.random.rand(1000000, 250) # Memory occupied: 2120 MB
>>> b = [a]
>>> c = cp.concatenate(b) # Memory occupied: 4028 MB <- memory copy was performed!
```

I'm not sure what exact rules CuPy follows here (we could check), but in general avoiding the concatenate when we have a single partition is an easy fix that does not depend on behavior outside of cuML's code. A sketch of the approach is shown below.
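
A minimal sketch of the idea (`_concat_local_parts` is a hypothetical helper, not the actual cuML code):

```python
import cupy as cp

def _concat_local_parts(parts):
    # When a worker holds a single partition, return it as-is instead of
    # calling cp.concatenate, which can trigger an extra device allocation.
    if len(parts) == 1:
        return parts[0]
    return cp.concatenate(parts)
```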

cc @tfeher @cjnolet

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Artem M. Chirkin (https://github.com/achirkin)
  - Tamas Bela Feher (https://github.com/tfeher)
  - Divye Gala (https://github.com/divyegala)

URL: #5937
With the deployment of rapids-build-backend, we need to make sure our dependencies have alpha specs.

Contributes to rapidsai/build-planning#31

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #5948
Usage of the CUDA math libraries is independent of the CUDA runtime. Make their static/shared status separately controllable.

Contributes to rapidsai/build-planning#35

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)

URL: #5959
Closes #3458

Add PCA embedding initialization to C++ layer and expose it in Python API.
```python
from cuml.manifold import TSNE

tsne = TSNE(
    ...,
    init="pca",  # "random" or "pca"
)
```

Authors:
  - Anupam (https://github.com/aamijar)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Micka (https://github.com/lowener)

Approvers:
  - Micka (https://github.com/lowener)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #5897
This is a step towards adding support for dynamic linking with wheels (splitting the shared libraries out into their own wheels). That's being tracked in rapidsai/build-planning#33

This PR performs a necessary step of moving the cuml folder one level deeper, so that the python folder becomes a parent of multiple full-fledged projects instead of being the top level of one python project. This is split out into its own PR because the change touches so many files; it is easier to review the actual changes for supporting the split wheel when you don't also have to consider these moves.

This change also affects devcontainers, and there will need to be a change similar to rapidsai/devcontainers#283 for cuml.

Authors:
  - Mike Sarahan (https://github.com/msarahan)
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #5944
This PR updates the latest CUDA build/test version from 12.2.2 to 12.5.1.

Contributes to rapidsai/build-planning#73

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #5963
Follow up to PR: #5963
Partially addresses issue: rapidsai/build-planning#73

Renames the `.devcontainer`s for CUDA 12.5

cc @KyleFromNVIDIA @jameslamb @trxcllnt (for awareness)

Authors:
  - https://github.com/jakirkham

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - Paul Taylor (https://github.com/trxcllnt)

URL: #5967
After updating everything to CUDA 12.5.1, use `[email protected]` again.

Contributes to rapidsai/build-planning#73

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - https://github.com/jakirkham

URL: #5970
Treelite 4.3.0 contains the following improvements:

* Support XGBoost 2.1.0, including the UBJSON format (dmlc/treelite#572, dmlc/treelite#578)
* [GTIL] Allow inferencing with FP32 input + FP64 model (dmlc/treelite#574). Related: triton-inference-server/fil_backend#391
* Prevent integer overflow for deep LightGBM trees by using DFS order (dmlc/treelite#570).
* Support building with latest RapidJSON (dmlc/treelite#567)

Authors:
  - Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #5968
dantegd and others added 12 commits July 24, 2024 21:52
Closes #5918 

Correctly gathers the labels of all workers together for the `labels_` attribute of the cuml.dask KMeans estimator.
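
A minimal usage sketch (assuming a dask-CUDA cluster is available; the exact container type of `labels_` may differ):

```python
import cupy as cp
import dask.array as da
from dask.distributed import Client
from dask_cuda import LocalCUDACluster
from cuml.dask.cluster import KMeans

cluster = LocalCUDACluster()
client = Client(cluster)

# Two chunks so the data is spread across multiple partitions/workers.
X = da.random.random((10_000, 16), chunks=(5_000, 16)).map_blocks(cp.asarray)

km = KMeans(n_clusters=8)
km.fit(X)

# labels_ should now hold one label per input row, gathered from all workers.
labels = km.labels_
```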

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Divye Gala (https://github.com/divyegala)

URL: #5931
Contributes to rapidsai/build-planning#31

In short, RAPIDS DLFW builds want to produce wheels with unsuffixed dependencies, e.g. `cudf` depending on `rmm`, not `rmm-cu12`.

This PR is part of a series across all of RAPIDS to try to support that type of build by setting up CUDA-suffixed and CUDA-unsuffixed dependency lists in `dependencies.yaml`.

For more details, see:
* rapidsai/build-planning#31 (comment)
* rapidsai/cudf#16183

## Notes for Reviewers

### Why target 24.08?

This is targeting 24.08 because:

1. it should be very low-risk
2. getting these changes into 24.08 prevents the need to carry around patches for every library in DLFW builds using RAPIDS 24.08

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #5974
This fixes a call to `raft::stats::mean()` in the PCA code by deactivating the sample parameter.

CC @dantegd

Authors:
  - Malte Förster (https://github.com/mfoerste4)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #5980
Noticed while reviewing #5154.

Plus an extra (probably benign) typo-bug while I'm here.

cc @wphicks

Authors:
  - Lawrence Mitchell (https://github.com/wence-)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - William Hicks (https://github.com/wphicks)

URL: #5166
…5973)

Closes #5551

* Replace `np.float32` with `"float32"` so that we don't reference the `np` module. By the time the `__dealloc__` method is called, modules may have already been unloaded (a small sketch of the pattern follows below).
* Improve the user experience by raising a helpful error when the user attempts to predict with an empty forest.
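
A minimal sketch of the pattern (`make_buffer` is an illustrative helper, not cuML code):

```python
import numpy as np

def make_buffer(n):
    # Passing the dtype as a string avoids an attribute lookup on the np
    # module; when __dealloc__ runs during interpreter shutdown, that module
    # object may already have been torn down, so np.float32 can fail.
    return np.zeros(n, dtype="float32")
```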

Authors:
  - Philip Hyunsu Cho (https://github.com/hcho3)
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #5973
Make `ci/run_cuml_dask_pytests.sh` environment-agnostic again. This script is run outside the RAPIDS CI environment, so it should not include calls to utilities only available in that environment. Follow-up to #5761.

Authors:
  - Paul Taylor (https://github.com/trxcllnt)
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #5950
The sparse PCA still densified `X` during the transform step, which defeats the purpose of a sparse PCA in a sense. However,
```python
precomputed_mean_impact = self.mean_ @ self.components_.T
mean_impact = cp.ones((X.shape[0], 1)) @ precomputed_mean_impact.reshape(1, -1)
X_transformed = X.dot(self.components_.T) - mean_impact
```
is the same as
```python
X = X - self.mean_
X_transformed = X.dot(self.components_.T)
```
The new implementation is faster (mainly because we no longer have to rely on cupy's `to_array()`) and uses a lot less memory.

Authors:
  - Severin Dicks (https://github.com/Intron7)
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #5964
This applies some smaller NumPy 2 related fixes. With the (in progress) CuPy 13.2 fix-ups, the single-GPU test suite seems to be doing mostly fine. There is a single test remaining:
```
test_simpl_set.py::test_simplicial_set_embedding
```
is failing with:
```
(Pdb) cp.asarray(cu_embedding)
array([[23067.518, 23067.518],
       [17334.559, 17334.559],
       [22713.598, 22713.598],
       ...,
       [23238.438, 23238.438],
       [25416.912, 25416.912],
       [19748.943, 19748.943]], dtype=float32)
```
being completely different from the reference:
```
array([[5.330462 , 4.3419437],
       [4.1822557, 5.6225405],
       [5.200859 , 4.530094 ],
       ...,
       [4.852359 , 5.0026293],
       [5.361374 , 4.1475334],
       [4.0259256, 5.7187223]], dtype=float32)
```
I am not sure why that might be; I will prod it a bit more, but it may need someone who knows the methods to have a look.

One wrinkle is that hdbscan has not yet been released for NumPy 2, but I guess it is still required even though sklearn has a version? (Probably not a big issue, but my fix-ups in scikit-learn-contrib/hdbscan#644 run into some issues, even though they don't seem NumPy 2 related.)

xref: rapidsai/build-planning#38

Authors:
  - Sebastian Berg (https://github.com/seberg)
  - https://github.com/jakirkham
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #5954
This PR fixes the remaining tests and bugs of the encoders, and other utilities for cudf.pandas.

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Divye Gala (https://github.com/divyegala)

URL: #5990
Closes #4477 

Adds the capability for all estimators to accept any dtype by converting the inputs when needed. Currently, for most estimators, this means converting to float32.
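
A minimal sketch of the conversion idea (`_coerce_dtype` is a hypothetical helper, not cuML's actual input-utils code):

```python
import numpy as np

def _coerce_dtype(X, supported=("float32", "float64")):
    # Cast inputs with unsupported dtypes (e.g. float16, int64) to float32
    # so every estimator can accept them; supported dtypes pass through.
    X = np.asarray(X)
    return X if X.dtype.name in supported else X.astype("float32")

X_int = np.arange(12, dtype="int64").reshape(4, 3)
assert _coerce_dtype(X_int).dtype == np.float32
```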

Todo:

- [x] Add conversion to all methods
- [x] Discuss if defaulting to float32 is the correct default
- [x] Discuss if an option to override that default is needed
- [x] Update docstring generator
- [x] Add tests

cc @beckernick @pentschev @isVoid @divyegala

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Divye Gala (https://github.com/divyegala)

URL: #5888
1. Adds a `build_algo="nn_descent"` option to UMAP.

The user can now choose the knn graph build algorithm between `"brute_force_knn"` and `"nn_descent"`.
It defaults to `"auto"`, which decides whether to run with brute force knn or nn descent depending on the given dataset size.

`"auto"` runs with `brute_force_knn` if either 1) the data has <= 50K rows **OR** 2) the data is sparse; otherwise it runs with `nn_descent`.

The 50K-row threshold was roughly chosen based on the grid search below (runtime in ms); discussed with Corey.
<img width="1038" alt="Screenshot 2024-07-23 at 5 36 34 PM" src="https://github.com/user-attachments/assets/d2ffd7d6-8e94-4ddc-ba76-f301be9bea67">


```
X_embedded_nnd = cuUMAP(n_neighbors=16, build_algo="nn_descent").fit_transform(data)
score_nnd = cuml.metrics.trustworthiness(data, X_embedded_nnd)
```
2. Adds a `data_on_host` option (defaults to `False`) when calling `fit()` or `fit_transform()`.

Note that brute force knn cannot be used with data on host; a usage sketch follows below.
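
A minimal usage sketch combining the two options (illustrative only; the parameter names come from this PR's description):

```python
import numpy as np
from cuml.manifold import UMAP

data = np.random.rand(100_000, 64).astype("float32")

# data_on_host keeps the training data in host memory; per this PR it is
# only supported with the nn_descent build algorithm (not brute force knn).
umap = UMAP(n_neighbors=16, build_algo="nn_descent")
embedding = umap.fit_transform(data, data_on_host=True)
```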

### Running Benchmarks
<img width="962" alt="Screenshot 2024-07-23 at 5 41 19 PM" src="https://github.com/user-attachments/assets/7084b326-50bb-46a9-a012-6979278d871d">

Authors:
  - Jinsol Park (https://github.com/jinsolp)

Approvers:
  - Divye Gala (https://github.com/divyegala)

URL: #5910
@raydouglass raydouglass requested review from a team as code owners August 1, 2024 17:26

copy-pr-bot bot commented Aug 1, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.


@github-actions github-actions bot added the conda, Cython / Python, CMake, CUDA/C++, and ci labels Aug 1, 2024
Closes #6008

---------

Co-authored-by: Dante Gama Dessavre <[email protected]>

@raydouglass raydouglass merged commit 6777bc1 into main Aug 8, 2024
4 of 5 checks passed