-
Notifications
You must be signed in to change notification settings - Fork 908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adding wheel build for libcudf #15483
Conversation
973ce98
to
1ac95d4
Compare
Stuck on dlpack here. The failure is:
https://github.com/rapidsai/cudf/actions/runs/8662725105/job/23755706695?pr=15483#step:9:290 |
I'm not at all sure that I've done this correctly. This is intended to support local development with a devcontainer for the work in rapidsai/cudf#15483
@vyasr the build is passing now, but the tests seem to show that the library (libcudf.so) can't be found. It does seem like libcudf-cuXY is being installed OK, so I'm thinking it has to be some kind of RPATH issue. I don't see how you've addressed this in the RMM PR. Any advice? I've inspected one of the libcudf wheels, and it has both lib and lib64 folders. The lib folder has nvcomp stuff, and lib64 has libcudf.so. |
You won't see this in the rmm PR because rmm is header-only, so there is no library there. The raft PR is a more useful example here. What you want to look at is actually the Python components of the libraft library, specifically the |
5b6bc46
to
580779d
Compare
You have to set the variables before the rapids-cmake clone happens, which in this case is in the |
Co-authored-by: Bradley Dice <[email protected]>
…comp is found at build time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error in the pip devcontainers:
ERROR: Could not find a version that satisfies the requirement libcudf-cu12==24.10.*,>=0.0.0a0 (from versions: none)
ERROR: No matching distribution found for libcudf-cu12==24.10.*,>=0.0.0a0
Is this blocked by rapidsai/devcontainers#271?
Edit: yes. From the description:
We should merge that when this is just about ready, then re-run the failing devcontainers CI jobs here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there. Really only the build.sh change that's a blocker I think.
python -m auditwheel repair -w ${package_dir}/final_dist ${package_dir}/dist/* | ||
python -m auditwheel repair \ | ||
--exclude libcudf.so \ | ||
--exclude libarrow.so.1601 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on merge order this can be removed.
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})" | ||
|
||
mkdir -p ${package_dir}/final_dist | ||
python -m auditwheel repair --exclude libarrow.so.1601 -w ${package_dir}/final_dist ${package_dir}/dist/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, this will be removable soon.
I'm not at all sure that I've done this correctly. This is intended to support local development with a devcontainer for the work in rapidsai/cudf#15483 Co-authored-by: James Lamb <[email protected]>
devcontainers job passed after merging rapidsai/devcontainers#271, https://github.com/rapidsai/cudf/actions/runs/10515765176/job/29173481700?pr=15483 😁 thanks @AyodeAwe Given that + all the approvals here, I'm gonna merge this. Thanks so much for all the help everyone!!! |
/merge |
…lishing (#16650) Follow-up to #15483. Contributes to rapidsai/build-planning#33 Wheel publishing for `libcudf` is failing like this: ```text Error: File "./dist/*.whl" does not exist ``` ([build link](https://github.com/rapidsai/cudf/actions/runs/10528569930/job/29176811683)) Because the `package-type` was not set to `cpp` in the `wheels-publish` CI workflow, and that workflow defaults to `python`. ([shared-workflows code link](https://github.com/rapidsai/shared-workflows/blob/157e9824e6e2181fca9aa5c4bea4defd4cc322b0/.github/workflows/wheels-publish.yaml#L23-L26)). This fixes that, and makes that choice explicit for all wheel publishing jobs. References for this `package-type` argument: * rapidsai/shared-workflows#209 * rapidsai/gha-tools#105 Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Bradley Dice (https://github.com/bdice) URL: #16650
Contributes to rapidsai/build-planning#33. Proposes the following for `cuspatial` wheels: * add build and runtime dependencies on `libcudf` wheels * stop vendoring copies of `libcudf.so`, `libnvcomp.so`, `libnvcomp_bitcomp.so`, and `libnvcomp_gdeflate.so` - *(load `libcudf.so` dynamically at runtime instead)* And other related changes for development/CI: * combine all `pip install` calls into 1 in wheel-testing scripts - *like rapidsai/cudf#16575 - *to improve the chance that packaging issues are discovered in CI* * `dependencies.yaml` changes: - more use of YAML anchors = less duplication - use dedicated `depends_on_librmm` and `depends_on_libcudf` groups * explicitly pass a package type to `gha-tools` wheel uploading/downloading scripts ## Notes for Reviewers ### Benefits of these changes Unblocks CI in this repo (ref: #1444 (comment), #1441 (comment)). Reduces wheel sizes for `cuspatial` wheels by about 125MB 😁 | wheel | size (before) | size (this PR) | |:-----------:|-------------:|---------------:| | `cuspatial` | 146.0M | 21M | | `cuproj ` | 0.9M | 0.9M | |**TOTAL** | **146.9M** | **21.9M** | *NOTES: size = compressed, "before" = 2024-08-21 nightlies (c60bd4d), CUDA = 12, Python = 3.11* <details><summary>how I calculated those (click me)</summary> ```shell # note: 2024-08-21 because that was the most recent date with # successfully-built cuspatial nightlies # docker run \ --rm \ -v $(pwd):/opt/work:ro \ -w /opt/work \ --network host \ --env RAPIDS_NIGHTLY_DATE=2024-08-21 \ --env RAPIDS_NIGHTLY_SHA=c60bd4d \ --env RAPIDS_PR_NUMBER=1447 \ --env RAPIDS_PY_CUDA_SUFFIX=cu12 \ --env RAPIDS_REPOSITORY=rapidsai/cuspatial \ --env WHEEL_DIR_BEFORE=/tmp/wheels-before \ --env WHEEL_DIR_AFTER=/tmp/wheels-after \ -it rapidsai/ci-wheel:cuda12.5.1-rockylinux8-py3.11 \ bash mkdir -p "${WHEEL_DIR_BEFORE}" mkdir -p "${WHEEL_DIR_AFTER}" py_projects=( cuspatial cuproj ) for project in "${py_projects[@]}"; do # before RAPIDS_BUILD_TYPE=nightly \ RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \ RAPIDS_REF_NAME="branch-24.10" \ RAPIDS_SHA=${RAPIDS_NIGHTLY_SHA} \ rapids-download-wheels-from-s3 python "${WHEEL_DIR_BEFORE}" # after RAPIDS_BUILD_TYPE=pull-request \ RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \ RAPIDS_REF_NAME="pull-request/${RAPIDS_PR_NUMBER}" \ rapids-download-wheels-from-s3 python "${WHEEL_DIR_AFTER}" done du -sh ${WHEEL_DIR_BEFORE}/* du -sh ${WHEEL_DIR_BEFORE} du -sh ${WHEEL_DIR_AFTER}/* du -sh ${WHEEL_DIR_AFTER} ``` </details> Reduces the amount of additional work required to start shipping `libcuspatial` wheels. ### Background This is part of ongoing work towards packaging `libcuspatial` as a wheel. relevant prior work: * packaging `libcudf` wheels: rapidsai/cudf#15483 * consolidating `pip install` calls in CI scripts for `cudf`: rapidsai/cudf#16575 * `cudf` dropping its Arrow library dependency: rapidsai/cudf#16640 ### How I tested this Confirmed in local builds and CI logs that `cudf` is being *found*, not *built*, in `cuspatial` builds. ```text -- CPM: Using local package [email protected] ``` ([build link](https://github.com/rapidsai/cuspatial/actions/runs/10602971716/job/29386288614?pr=1447#step:9:23472)) Built `cuspatial` wheels locally and ran all the unit tests, without issue. # Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) - Matthew Roeschke (https://github.com/mroeschke) URL: #1447
Follow-up to #16650 and #15483. `libcudf` wheels are identical (same content, same filename) across Python versions, but due to an oversight in the PRs linked above, we're currently building nightlies of them once per Python version supported by RAPIDS 😭 You can see this on recent runs of the `build` workflow: <img width="752" alt="image" src="https://github.com/user-attachments/assets/ba3a2192-1752-4d32-a79b-6f238fae9f18"> ([build link](https://github.com/rapidsai/cudf/actions/runs/10627299703/job/29460218854)) This PR fixes that by applying the same matrix filter to `libcudf` nightly build jobs as is currently applied to PR jobs: https://github.com/rapidsai/cudf/blob/5e420ff63ba2997a37bf5dfbfaa73c5f05225f9d/.github/workflows/pr.yaml#L195-L200 Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Bradley Dice (https://github.com/bdice) URL: #16714
Follow-up to #15483. Contributes to rapidsai/build-planning#33. Adds a build-time dependency on `libkvikio` wheels for `libcudf` wheels (per #15483 (comment)). With this change, CPM is no longer used to download and install the kvikio headers. Before: ```text -- Found cuFile: /usr/local/cuda/lib64/libcufile.so -- CPM: Adding package [email protected] (branch-24.10) ``` ([recent build link from branch-24.10](https://github.com/rapidsai/cudf/actions/runs/10780576194/job/29896649202#step:9:7673)) After: ```text -- KvikIO: Found cuFile Batch API: TRUE -- KvikIO: Found cuFile Stream API: TRUE -- CPM: Using local package [email protected] ``` ([build link from this PR](https://github.com/rapidsai/cudf/actions/runs/10780504202/job/29896555443?pr=16778#step:9:7754)) ## Notes for Reviewers ### This removes kvikio headers/CMake files from libcudf wheels Cuts around 0.8 MB (23 files) out of `libcudf` wheels. As of this PR, these would no longer be vendored in `libcudf` wheels: ```text 0 09-08-2024 06:17 libcudf/include/kvikio/ 0 09-08-2024 06:17 libcudf/include/kvikio/shim/ 6356 09-08-2024 06:17 libcudf/include/kvikio/batch.hpp 3812 09-08-2024 06:17 libcudf/include/kvikio/buffer.hpp 10499 09-08-2024 06:17 libcudf/include/kvikio/utils.hpp 1399 09-08-2024 06:17 libcudf/include/kvikio/cufile_config.hpp 33385 09-08-2024 06:17 libcudf/include/kvikio/file_handle.hpp 7299 09-08-2024 06:17 libcudf/include/kvikio/driver.hpp 9678 09-08-2024 06:17 libcudf/include/kvikio/defaults.hpp 5352 09-08-2024 06:17 libcudf/include/kvikio/stream.hpp 6002 09-08-2024 06:17 libcudf/include/kvikio/error.hpp 4501 09-08-2024 06:17 libcudf/include/kvikio/bounce_buffer.hpp 3197 09-08-2024 06:17 libcudf/include/kvikio/parallel_operation.hpp 9864 09-08-2024 06:17 libcudf/include/kvikio/posix_io.hpp 717 09-08-2024 06:17 libcudf/include/kvikio/version_config.hpp 4529 09-08-2024 06:17 libcudf/include/kvikio/shim/cuda.hpp 3331 09-08-2024 06:17 libcudf/include/kvikio/shim/utils.hpp 4055 09-08-2024 06:17 libcudf/include/kvikio/shim/cufile_h_wrapper.hpp 2242 09-08-2024 06:17 libcudf/include/kvikio/shim/cuda_h_wrapper.hpp 7510 09-08-2024 06:17 libcudf/include/kvikio/shim/cufile.hpp 0 09-08-2024 06:17 libcudf/lib64/cmake/kvikio/ 5031 09-08-2024 06:17 libcudf/lib64/cmake/kvikio/kvikio-targets.cmake 3681 09-08-2024 06:17 libcudf/lib64/cmake/kvikio/kvikio-config-version.cmake 6915 09-08-2024 06:17 libcudf/lib64/cmake/kvikio/kvikio-config.cmake 1529 09-08-2024 06:17 libcudf/lib64/cmake/kvikio/kvikio-dependencies.cmake 3851 09-08-2024 06:17 libcudf/lib64/cmake/kvikio/FindcuFile.cmake ``` This is safe because kvikio is a PRIVATE dependency of `libcudf`. https://github.com/rapidsai/cudf/blob/150f1b10ed9c702d5283216b746df685e1708716/cpp/CMakeLists.txt#L796-L802 # Authors: - James Lamb (https://github.com/jameslamb) - Bradley Dice (https://github.com/bdice) Approvers: - Bradley Dice (https://github.com/bdice) URL: #16778
This CMake option was removed by #15483. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - James Lamb (https://github.com/jameslamb) URL: #16879
Description
Contributes to rapidsai/build-planning#33
Adds a standalone
libcudf
wheel, containing thelibcudf
C++ shared library.Fixes #16588
Checklist
Notes for Reviewers
Dependency Flows
My (@jameslamb)'s interpretation of the state we want to get to with this PR.
Size changes
libcudf
.pylibcudf
cudf
cudf-polars
dask-cudf
NOTES: size = compressed, "before" = 2024-08-21 nightlies (58799d6)
how I calculated those (click me)
Related work
Corresponding
devcontainers
PR: rapidsai/devcontainers#271We should merge that when this is just about ready, then re-run the failing
devcontainers
CI jobs here.