Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Improvements on bitset class #1877

Merged
merged 10 commits into from
Oct 24, 2023

Conversation

lowener
Copy link
Contributor

@lowener lowener commented Oct 6, 2023

Related to #1866.
This PR adds useful operations on bitsets: count(), any(), ...

@github-actions github-actions bot added the cpp label Oct 6, 2023
@lowener lowener added 3 - Ready for Review improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Oct 10, 2023
@lowener lowener marked this pull request as ready for review October 10, 2023 14:27
@lowener lowener requested a review from a team as a code owner October 10, 2023 14:27
@copy-pr-bot
Copy link

copy-pr-bot bot commented Oct 13, 2023

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@lowener
Copy link
Contributor Author

lowener commented Oct 13, 2023

/ok to test

Copy link
Member

@divyegala divyegala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per suggestions from my last review, pointer access methods should be called data() when they are returning a pointer. data_handle() is specific to mdspan because it may or may not return a pointer

cpp/include/raft/core/bitset.cuh Show resolved Hide resolved
cpp/include/raft/core/bitset.cuh Outdated Show resolved Hide resolved
cpp/include/raft/core/bitset.cuh Outdated Show resolved Hide resolved
},
raft::make_const_mdspan(this_span),
other_span);
return *this;
}

private:
raft::device_uvector<bitset_t> bitset_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what this type is, is this from rmm? Why not use raft::device_vector or rmm::device_uvector?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is buried in the container policy code for RAFT and I'd prefer using RAFT types, even if they are just direct wrappers around thrust or rmm types (it provides us a safe facade to maintain api compatibility even if the underlying impls should somehow change or need to be modified).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raft::device_vector is the way to go here!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, I chose raft::device_uvector because it is resizable. raft::device_vector isn't, and this resizable feature is very helpful for incremental addition to IVF indexes (and soon CAGRA)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own education, won't raft::device_vector initialize the underlying memory? Would it be worth exposing raft::device_uvector more publicly specifically for cases like this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@divyegala You're quite right! Had rmm::device_vector in my mind instead of raft::device_vector. raft::device_vector uses rmm::device_uvector for its container policy, so no initialization.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding my two cents on the resize question, I'd be quite squeamish about using the resources from the constructor. Not only do we not know if that resources object has gone out of scope, but it also makes it much harder to track and ensure stream safety if we are not explicitly passing the stream we want to use when a resize might occur. For example:

auto res1 = device_resources{};  // Some underlying stream
auto arr = device_mdarray{res1, ...};
res1.sync_stream();
auto res2 = device_resources();
// modify underlying data of arr using the stream from res2
res2.sync_stream();
// Is it now safe to copy back data to the host on res2? Maybe. If a resize was triggered in between, we'd need another res1.sync_stream() first

Alternatively, if we sync the stream from res1 in the resize method, we might still have an issue if the data are actively being modified on another stream during the resize.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wphicks thanks for typing that out, you are right and that just clarifies me for again why we didn't end up supporting this in the first place anyway.

raft::device_uvector is supposed to be an internal/detail type and I do not belive it should be used directly. If resize() is needed then we should switch the type to rmm::device_uvector. Thoughts on this @cjnolet @wphicks ?

Copy link
Member

@cjnolet cjnolet Oct 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the original suggestion to use a raft type instead of directly using the rmm type in public API code was from me. To make the point more clear, this is about consistency and safeguarding our implementations from changes that we cannot control. This is in a very similar vein to our diligent wrapping of the CUDA math libraries APIs so that we can centralize those calls and change any underlying details should the rug get pulled out from under us.

Upon closer inspection to Mickael's changes, however, I very much agree that we should not be storing the raft::resources instance as object members and Will's example above is one of the very reasons we want to avoid this. We wouldn't otherwise be doing it in the device_container_policy if it weren't for the fact that we needed the deferred allocation. To Divye'a point, the device_uvector had been kept an implementation detail until recently. I would prefer that we fix that problem at some point soon, rather than to continue using this pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed raft::resources instance from the bitset class.

cpp/include/raft/core/bitset.cuh Outdated Show resolved Hide resolved
Copy link
Member

@divyegala divyegala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, pending decision from reviewers on whether to use raft::device_uvector or rmm::device_uvector

@lowener
Copy link
Contributor Author

lowener commented Oct 21, 2023

/ok to test

@cjnolet
Copy link
Member

cjnolet commented Oct 21, 2023

@wphicks are you happy with the current implementation here? I very much agree that we shouldn't be explicitly storing raft::device_resources off anywhere.

I think we should revisit the current implementation of device_container_policy at some point and try to find a better way to implement the deferred allocation strategy without storing off the resources.

@wphicks
Copy link
Contributor

wphicks commented Oct 23, 2023

@cjnolet Happy and agree 100% on device_container_policy

@cjnolet
Copy link
Member

cjnolet commented Oct 24, 2023

/merge

@rapids-bot rapids-bot bot merged commit 53c2539 into rapidsai:branch-23.12 Oct 24, 2023
61 checks passed
@lowener lowener deleted the 23.12-bitset-fea branch October 24, 2023 10:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review cpp improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
Development

Successfully merging this pull request may close these issues.

4 participants