Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds pylibcudf Scalar #14055

Closed
wants to merge 40 commits into from
Closed

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented Sep 7, 2023

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added Python Affects Python cuDF API. CMake CMake build issue labels Sep 7, 2023
Comment on lines 46 to 47
# Convert the value to a cudf object via pyarrow
arr = pyarrow.lib.array([value.as_py()], type=value.type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More effort, but I think we can use the C++-level ArrayBuilder interface to get an arrow array from which we can make a table to pass to libcudf. If we push it all onto C++ then we can probably avoid some back and forth too

Copy link
Contributor

@wence- wence- Sep 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (untested) might be a useful starting point:

diff --git a/cpp/src/interop/from_arrow.cu b/cpp/src/interop/from_arrow.cu
index 30cfee97fd..1a1d5a9bc2 100644
--- a/cpp/src/interop/from_arrow.cu
+++ b/cpp/src/interop/from_arrow.cu
@@ -472,4 +472,28 @@ std::unique_ptr<table> from_arrow(arrow::Table const& input_table,
   return detail::from_arrow(input_table, cudf::get_default_stream(), mr);
 }
 
+std::unique_ptr<scalar> from_arrow(arrow::Scalar const& scalar, rmm::mr::device_memory_resource* mr)
+{
+  auto b = arrow::MakeBuilder(scalar.type);
+  if (b.ok()) {
+    auto builder = std::move(b.ValueOrDie());
+    auto status  = builder->AppendScalar(scalar);
+    if (status != arrow::Status::OK()) { CUDF_FAIL("Arrow ArrayBuilder::AppendScalar failed"); }
+    auto array_result = builder->Finish();
+    if (array_result.ok()) {
+      auto array = array_result.ValueOrDie();
+      auto type  = detail::arrow_to_cudf_type(*array->type().get());
+      return detail::get_element(
+        detail::get_column(*array.get(), type, true, cudf::get_default_stream(), mr)->view(),
+        0,
+        cudf::get_default_stream(),
+        mr);
+    } else {
+      CUDF_FAIL("Arrow ArrayBuilder::Finish failed");
+    }
+  } else {
+    CUDF_FAIL("Arrow MakeBuilder failed");
+  }
+}
+
 }  // namespace cudf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping to avoid the C++ API, but at this point I think you're right that the most straightforward solution is to implement at this level. I'll look into it.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Sep 12, 2023
@vyasr
Copy link
Contributor Author

vyasr commented Oct 4, 2023

The core of this PR is in #14121, #14124, and #14133. The only remaining bit that's only on this branch is the beginnings of a scatter implementation, which didn't actually get very far. I'll handle that properly in another PR once #14133 is merged, but at this point this PR can be closed since it was primarily to help demonstrate the approach I was taking before I had a chance to really clean up the code.

@vyasr vyasr closed this Oct 4, 2023
@vyasr vyasr deleted the feat/pylibcudf_scalar branch October 4, 2023 01:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants