Add CAGRA support with latest RAFT #175

Closed
wants to merge 5 commits into from

Contributor

@wphicks wphicks commented Nov 4, 2023

This PR brings in the latest features from RAFT and significantly refactors the RAFT integration code. The primary goal of this refactor is to separate three things more clearly: Knowhere code, RAFT integration code, and RAFT itself. This leads to three layers in the updated integration:

  1. knowhere: Code that directly creates, for example, a new IndexNode type in Knowhere. It is implemented so that no RAFT symbols or CUDA calls are exposed to Knowhere headers or other Knowhere code.
  2. raft_knowhere: This namespace is used for code responsible for translating between types, symbols, and concepts in Knowhere and the corresponding types, symbols, and concepts in RAFT.
  3. raft_proto: This namespace is used for features that may ultimately be upstreamed to RAFT but which are immediately useful to Knowhere.
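
As a rough illustration of the translation role described for the raft_knowhere layer, the sketch below maps a Knowhere-style metric name onto a RAFT-style enum. Everything in it is hypothetical: `raft_like::distance_type`, `raft_knowhere_sketch`, and `metric_from_string` are stand-ins invented for this example and are not names taken from this PR or from RAFT.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a RAFT-side distance enum (not the real RAFT type).
namespace raft_like {
enum class distance_type { l2_expanded, inner_product };
}  // namespace raft_like

// Hypothetical translation layer: it is the only place that knows about both
// the Knowhere-style metric string and the RAFT-style enum.
namespace raft_knowhere_sketch {
inline std::optional<raft_like::distance_type> metric_from_string(const std::string& name) {
  if (name == "L2") return raft_like::distance_type::l2_expanded;
  if (name == "IP") return raft_like::distance_type::inner_product;
  return std::nullopt;  // metric not supported by this sketch
}
}  // namespace raft_knowhere_sketch
```

Keeping the mapping in one place means callers on the Knowhere side only ever pass strings and never need to include a header that mentions the GPU library's types.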

CAGRA benchmarks have been substantially simplified in this PR and should run significantly faster. Throughput for batch size 1 is still not as high as CAGRA potentially allows, but it is significantly higher than previous benchmarks. Performance is currently bottlenecked on many small host-to-device transfers, but this can be improved in a follow-up PR. Throughput for larger batch sizes is substantially improved, with a median 17% overhead relative to raw RAFT calls during testing.

Given the significant scope of this PR, I will add some comments in-line, but here is the overall summary of changes:

  • Update to RAFT 23.12
  • Update CAGRA integration to improve performance
  • Avoid post-filtering using RAFT's new filtering feature
  • Use RAFT's new device_resources_manager to simplify and optimize resource initialization
  • Update build infrastructure to build for all supported CUDA architectures
  • Refactor RAFT integration code to more cleanly separate RAFT code from Knowhere code
  • Avoid exposing RAFT symbols in any Knowhere header
  • Simplify CAGRA benchmarking
  • Allow refinement of initial results for all RAFT index types except CAGRA

NOTE: This PR currently points to a fork of RAFT while waiting for rapidsai/raft#1831 to merge. This was impacted by today's GitHub outage. Before merging, we should shift back to the main RAFT repo.

Close #176

Update to RAFT 23.12
Update CAGRA integration to improve performance
Avoid post-filtering using RAFT's new filtering feature
Use RAFT's new device_resources_manager to simplify and optimize
resource initialization
Update build infrastructure to build for all supported CUDA architectures
Refactor RAFT integration code to more cleanly separate RAFT code from
Knowhere code
Avoid exposing RAFT symbols in any Knowhere header

Signed-off-by: William Hicks <[email protected]>
@sre-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wphicks
To complete the pull request process, please assign chasingegg after the PR has been reviewed.
You can assign the PR to them by writing /assign @chasingegg in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot
Collaborator

Welcome @wphicks! It looks like this is your first PR to zilliztech/knowhere 🎉

mergify bot commented Nov 4, 2023

@wphicks 🔍 Important: PR Classification Needed!

For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:

  1. If you're fixing a bug, label it as kind/bug.
  2. For small tweaks (less than 20 lines without altering any functionality), please use kind/improvement.
  3. Significant changes that don't modify existing functionalities should be tagged as kind/enhancement.
  4. Adjusting APIs or changing functionality? Go with kind/feature.

For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”.

Thanks for your efforts and contribution to the community!

Contributor Author

@wphicks wphicks left a comment

I've added some explanatory comments inline where I expect there to be questions about specific changes.

@@ -12,19 +12,28 @@
# License for the specific language governing permissions and limitations under
# the License

-cmake_minimum_required(VERSION 3.23.0 FATAL_ERROR)
-project(knowhere CXX C)
+cmake_minimum_required(VERSION 3.26.4 FATAL_ERROR)
Contributor Author

Required for RAPIDS CMake used in RAFT 23.12.


set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake/modules/")
include(GNUInstallDirs)
include(ExternalProject)
include(cmake/utils/utils.cmake)

knowhere_option(WITH_RAFT "Build with RAFT indexes" OFF)
Contributor Author

Moved this up because CMAKE_CUDA_ARCHITECTURES needs to be filled in before initializing the project.

benchmark/hdf5/benchmark_float_qps.cpp Outdated
test_cagra(const knowhere::Json& cfg) {
auto conf = cfg;

auto find_smallest_max_iters = [&](float expected_recall) -> int32_t {
Contributor Author

Finding the best max_iterations has higher impact than searching over itopk
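
One way such a search could work is a binary search, assuming recall does not decrease as max_iterations grows. That monotonicity is an assumption made for this sketch; the body of the PR's `find_smallest_max_iters` is not shown in this thread. The name `find_smallest_max_iters_sketch`, the `recall_at` callback, and the explicit search bounds are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Returns the smallest value in [lo, hi] for which recall_at(value) reaches
// expected_recall, or -1 if even hi falls short.
// Assumption: recall_at is non-decreasing in its argument.
inline std::int32_t find_smallest_max_iters_sketch(
    const std::function<float(std::int32_t)>& recall_at,
    float expected_recall,
    std::int32_t lo,
    std::int32_t hi) {
  assert(lo <= hi);
  if (recall_at(hi) < expected_recall) {
    return -1;  // target recall is unreachable within the given bounds
  }
  while (lo < hi) {
    const std::int32_t mid = lo + (hi - lo) / 2;
    if (recall_at(mid) >= expected_recall) {
      hi = mid;  // mid is good enough; look for something smaller
    } else {
      lo = mid + 1;  // mid falls short; the answer must be larger
    }
  }
  return lo;
}
```

In a benchmark, `recall_at` would run a search with the given iteration limit and compare the results against ground truth, so each probe is expensive; a binary search keeps the number of probes logarithmic in the size of the range.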

benchmark/hdf5/benchmark_float_qps.cpp Outdated
@@ -0,0 +1,125 @@
/**
Contributor Author

This file is the header actually included elsewhere in Knowhere. It exposes no RAFT symbols and does not require CUDA compilation.
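
A common C++ technique for keeping a public header free of another library's symbols is the pimpl (pointer-to-implementation) idiom. Whether this PR uses pimpl specifically is not stated in this thread, so the following is only a sketch of the general idea. `gpu_index_handle` and its members are hypothetical names, and a plain `std::vector` stands in for whatever GPU-side state the real implementation would hold.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// ---- What a public header might contain: no GPU library includes ----
class gpu_index_handle {
 public:
  gpu_index_handle();
  ~gpu_index_handle();
  gpu_index_handle(gpu_index_handle&&) noexcept;
  gpu_index_handle& operator=(gpu_index_handle&&) noexcept;

  void add(const std::vector<float>& row);
  std::size_t size() const;

 private:
  struct impl;                  // only forward-declared in the header
  std::unique_ptr<impl> pimpl_;  // opaque pointer to the real state
};

// ---- What the implementation file might contain ----
// In a real integration this part would include the GPU library headers
// and be compiled with the CUDA toolchain.
struct gpu_index_handle::impl {
  std::vector<std::vector<float>> rows;  // stand-in for a device-side index
};

gpu_index_handle::gpu_index_handle() : pimpl_{std::make_unique<impl>()} {}
gpu_index_handle::~gpu_index_handle() = default;
gpu_index_handle::gpu_index_handle(gpu_index_handle&&) noexcept = default;
gpu_index_handle& gpu_index_handle::operator=(gpu_index_handle&&) noexcept = default;

void gpu_index_handle::add(const std::vector<float>& row) { pimpl_->rows.push_back(row); }
std::size_t gpu_index_handle::size() const { return pimpl_->rows.size(); }
```

Because `impl` is complete only in the implementation file, code that includes the header can be built with an ordinary host compiler.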

src/common/raft/proto/ivf_to_sample_filter.cuh Outdated
@@ -0,0 +1,292 @@
/**
Contributor Author

This file provides a generic template for Knowhere indexes based on RAFT.

class RaftIvfFlatConfig : public IvfFlatConfig {
public:
struct GpuRaftIvfPqConfig : public IvfPqConfig {
CFG_FLOAT refine_ratio;
Contributor Author

This newly-introduced parameter allows additional refinement after an initial selection of candidates from an index search.
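
To make the idea of refinement concrete, here is a CPU-only sketch under stated assumptions: the approximate search is asked for roughly `k * refine_ratio` candidates, and those candidates are then re-ranked by exact squared L2 distance to keep the best `k`. This interpretation of `refine_ratio`, along with the names `candidate_count`, `squared_l2`, and `refine_sketch`, is an assumption for illustration and not code from this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Number of candidates to request from the approximate search (assumed rule).
inline std::size_t candidate_count(std::size_t k, float refine_ratio) {
  assert(refine_ratio >= 1.0f);
  return static_cast<std::size_t>(std::ceil(static_cast<double>(k) * refine_ratio));
}

// Exact squared Euclidean distance between two vectors of equal length.
inline float squared_l2(const std::vector<float>& a, const std::vector<float>& b) {
  assert(a.size() == b.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const float d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Re-rank candidate row ids by exact distance and keep the k closest.
inline std::vector<std::int64_t> refine_sketch(
    const std::vector<std::vector<float>>& dataset,
    const std::vector<float>& query,
    const std::vector<std::int64_t>& candidates,
    std::size_t k) {
  std::vector<std::pair<float, std::int64_t>> scored;
  scored.reserve(candidates.size());
  for (std::int64_t id : candidates) {
    scored.emplace_back(squared_l2(dataset[static_cast<std::size_t>(id)], query), id);
  }
  const std::size_t keep = std::min(k, scored.size());
  std::partial_sort(scored.begin(),
                    scored.begin() + static_cast<std::ptrdiff_t>(keep),
                    scored.end());
  std::vector<std::int64_t> result;
  result.reserve(keep);
  for (std::size_t i = 0; i < keep; ++i) {
    result.push_back(scored[i].second);
  }
  return result;
}
```

The trade-off is extra exact-distance work per query in exchange for better recall from a compressed or approximate index.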

return json;
};

auto refined_gen = [](auto&& upstream_gen) {
Contributor Author

Helper for generating identical configurations with refinement.
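
A helper of this kind can be pictured as a wrapper that takes an existing configuration generator and returns a new generator whose output is identical except for an added refinement setting. The sketch below shows that shape only; `config_sketch` (a `std::map` standing in for the JSON config), `with_refinement`, and the `"refine_ratio"` value passed in are hypothetical and do not reproduce the PR's `refined_gen`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Stand-in for a JSON configuration object (hypothetical).
using config_sketch = std::map<std::string, double>;

// Wraps an upstream generator: the returned callable produces the same
// configuration as upstream_gen, plus a refine_ratio entry.
template <typename Gen>
auto with_refinement(Gen upstream_gen, double refine_ratio) {
  return [gen = std::move(upstream_gen), refine_ratio]() {
    config_sketch cfg = gen();           // start from the upstream configuration
    cfg["refine_ratio"] = refine_ratio;  // add only the refinement setting
    return cfg;
  };
}
```

Wrapping the generator rather than copying it keeps the refined and unrefined test cases in sync: any change to the base configuration automatically carries over to its refined variant.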

@wphicks wphicks marked this pull request as ready for review November 13, 2023 20:09
@Presburger Presburger closed this Nov 24, 2023
Successfully merging this pull request may close these issues.

Provide support for RAFT CAGRA indexes