Constructorsv2 #249

Open
wants to merge 44 commits into
base: main
Changes from 23 commits
Commits
a1ea4d3
primitve array constructors
DerThorsten Oct 18, 2024
9401544
wip
DerThorsten Oct 22, 2024
c64ba0b
merged
DerThorsten Oct 22, 2024
37bac80
meta
DerThorsten Oct 24, 2024
74fe539
fail on destructor
DerThorsten Oct 24, 2024
78b5bf7
...segfault
DerThorsten Oct 24, 2024
bb48696
megrged
DerThorsten Oct 24, 2024
372ac13
wip
DerThorsten Oct 25, 2024
7009c14
working
DerThorsten Oct 25, 2024
672eb37
working
DerThorsten Oct 25, 2024
deb2347
tuple
DerThorsten Oct 25, 2024
9456fee
more
DerThorsten Oct 25, 2024
f3864bb
more
DerThorsten Oct 25, 2024
b0e7dad
exotic.....
DerThorsten Oct 25, 2024
ca2229d
exotic.....
DerThorsten Oct 25, 2024
135a991
simplify
DerThorsten Oct 25, 2024
03860a0
....the exotic platforms again....
DerThorsten Oct 25, 2024
e758a64
....the exotic platforms again....
DerThorsten Oct 25, 2024
1e005e3
remove useless cast
DerThorsten Oct 25, 2024
c00a77c
i hate the exotic platforms
DerThorsten Oct 25, 2024
45cebd6
i hate the exotic platforms
DerThorsten Oct 25, 2024
158db2d
Merge branch 'main' into constructorsv2
DerThorsten Oct 25, 2024
0bbc89c
merged
DerThorsten Oct 25, 2024
4cf6663
cleanup
DerThorsten Oct 25, 2024
3934c68
cleanup
DerThorsten Oct 25, 2024
19d48eb
another buffer
DerThorsten Oct 27, 2024
ab34b8d
fix extract
DerThorsten Oct 27, 2024
a11c003
fix
DerThorsten Oct 27, 2024
117f536
fix
DerThorsten Oct 27, 2024
6d321fd
fix
DerThorsten Oct 27, 2024
6be7797
..discovered bug
DerThorsten Oct 28, 2024
c1f66ca
workaround
DerThorsten Oct 28, 2024
0b37081
Merge branch 'main' into constructorsv2
DerThorsten Oct 28, 2024
74e2614
clean
DerThorsten Oct 28, 2024
e80420c
clean
DerThorsten Oct 28, 2024
2b8e589
clean
DerThorsten Oct 28, 2024
1bccd1f
Merge branch 'main' into constructorsv2
DerThorsten Oct 28, 2024
fbbf6ff
cycles..
DerThorsten Oct 28, 2024
78f1e6f
merged
DerThorsten Oct 28, 2024
8568f47
test
DerThorsten Oct 28, 2024
75ab2c9
Merge remote-tracking branch 'upstream/main' into constructorsv2
DerThorsten Oct 28, 2024
e074fcd
undid woraround
DerThorsten Oct 28, 2024
6675e29
no dependencies
DerThorsten Oct 28, 2024
9b0b140
added sparrowm deps again
DerThorsten Oct 28, 2024
22 changes: 21 additions & 1 deletion include/sparrow/array.hpp
@@ -1,4 +1,4 @@
// Copyright 2024 Man Group Operations Limited

Check notice on line 1 in include/sparrow/array.hpp (GitHub Actions / build, "Run clang-format on include/sparrow/array.hpp"):
File include/sparrow/array.hpp does not conform to Custom style guidelines. (lines 17, 35, 37, 43, 58, 67, 69, 70, 71, 74)
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -17,6 +17,7 @@
#include "sparrow/c_interface.hpp"
#include "sparrow/config/config.hpp"
#include "sparrow/layout/array_wrapper.hpp"
#include "sparrow/layout/array_access.hpp"
#include "sparrow/layout/nested_value_types.hpp"
#include "sparrow/types/data_traits.hpp"

@@ -29,12 +30,19 @@
using size_type = std::size_t;
using value_type = array_traits::value_type;
using const_reference = array_traits::const_reference;


// array data will be moved into the array object
template<class ARRAY_TYPE>
requires std::is_rvalue_reference_v<ARRAY_TYPE&&>
array(ARRAY_TYPE&& array);

SPARROW_API array() = default;

SPARROW_API array(ArrowArray&& array, ArrowSchema&& schema);
SPARROW_API array(ArrowArray&& array, ArrowSchema* schema);
SPARROW_API array(ArrowArray* array, ArrowSchema* schema);



SPARROW_API bool owns_arrow_array() const;
SPARROW_API array& get_arrow_array(ArrowArray*&);
@@ -47,9 +55,21 @@
SPARROW_API size_type size() const;
SPARROW_API const_reference operator[](size_type) const;

SPARROW_API cloning_ptr<array_wrapper> && extract_array_wrapper() &&;
Collaborator

The return type should be a value type instead of an rvalue reference to prevent dangling references.
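
To illustrate the remark above, here is a minimal sketch contrasting a by-value return with an rvalue-reference return from an rvalue-qualified member function. The `holder` class and both of its `extract` members are hypothetical stand-ins, not sparrow types; the sketch only assumes that the extracted object is movable.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a class owning a resource (not a sparrow type).
class holder
{
public:

    explicit holder(int v)
        : p_value(std::make_unique<int>(v))
    {
    }

    // By value: the result is move-constructed from the member, so it owns
    // the resource independently of *this.
    std::unique_ptr<int> extract() &&
    {
        return std::move(p_value);
    }

    // By rvalue reference: the result still refers to a member of *this.
    // Nothing is moved until the caller constructs an object from it.
    std::unique_ptr<int>&& extract_ref() &&
    {
        return std::move(p_value);
    }

    bool empty() const
    {
        return p_value == nullptr;
    }

private:

    std::unique_ptr<int> p_value;
};

// Safe with the by-value overload: the temporary holder is destroyed at the
// end of the full expression, but ownership has already been transferred.
inline int extract_from_temporary(int v)
{
    auto p = holder(v).extract();
    return *p;
}

// By contrast, the following would leave a dangling reference, because the
// temporary holder (and its member) is gone after the statement:
//     auto&& r = holder(1).extract_ref();
```

With the by-value form the source is always left in its moved-from state after the call, whereas with the reference form the state of the source depends on what the caller does with the result.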


private:

cloning_ptr<array_wrapper> p_array = nullptr;

friend detail::array_access;
};

template<class ARRAY_TYPE>
requires std::is_rvalue_reference_v<ARRAY_TYPE&&>
array::array(ARRAY_TYPE&& arr)
: p_array( new array_wrapper_impl<std::remove_reference_t<ARRAY_TYPE>>(std::move(arr)))
{
}
Collaborator

@JohanMabille JohanMabille Oct 24, 2024

Since ARRAY_TYPE is known at compile time, maybe something like:

    template<class ARRAY_TYPE>
    requires std::is_rvalue_reference_v<ARRAY_TYPE&&>
    array::array(ARRAY_TYPE&& array)
        : p_array(new array_wrapper_impl<ARRAY_TYPE>(std::move(array)))
    {
    }

would be enough? It avoids extracting the proxy and going through the factory.


}

52 changes: 52 additions & 0 deletions include/sparrow/layout/array_access.hpp
@@ -0,0 +1,52 @@
// Copyright 2024 Man Group Operations Limited
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#pragma once

#include "sparrow/arrow_array_schema_proxy.hpp"


namespace sparrow::detail
{

class array_access
{
public:
template<class ARRAY>
static inline const sparrow::arrow_proxy& storage(const ARRAY& array)
{
return array.storage();
}

template<class ARRAY>
static inline sparrow::arrow_proxy& storage(ARRAY& array)
{
return array.storage();
}

template<class ARRAY>
requires(std::is_rvalue_reference_v<ARRAY&&>)
static inline sparrow::arrow_proxy&& extract_arrow_proxy(ARRAY&& array)
Collaborator

Same remark here (note that this only concerns the return type; the implementation is correct).

{
return std::move(array).extract_arrow_proxy();
}

template<class ARRAY>
requires(std::is_rvalue_reference_v<ARRAY&&>)
static inline auto && extract_array_wrapper(ARRAY&& array)
{
return std::move(array).extract_array_wrapper();
}
};
}
17 changes: 15 additions & 2 deletions include/sparrow/layout/array_base.hpp
@@ -23,9 +23,12 @@
#include "sparrow/buffer/dynamic_bitset/dynamic_bitset_view.hpp"
#include "sparrow/layout/layout_iterator.hpp"
#include "sparrow/utils/crtp_base.hpp"
#include "sparrow/layout/array_access.hpp"

#include "sparrow/utils/iterator.hpp"
#include "sparrow/utils/nullable.hpp"


namespace sparrow
{
/**
@@ -108,32 +111,36 @@ namespace sparrow

protected:

array_crtp_base(arrow_proxy);
explicit array_crtp_base(arrow_proxy);

array_crtp_base(const array_crtp_base&) = default;
array_crtp_base& operator=(const array_crtp_base&) = default;

array_crtp_base(array_crtp_base&&) = default;
array_crtp_base& operator=(array_crtp_base&&) = default;

[[nodiscard]] arrow_proxy && extract_arrow_proxy() &&;
[[nodiscard]] arrow_proxy& get_arrow_proxy();
[[nodiscard]] const arrow_proxy& get_arrow_proxy() const;


bitmap_const_reference has_value(size_type i) const;

const_bitmap_iterator bitmap_begin() const;
const_bitmap_iterator bitmap_end() const;

const_bitmap_iterator bitmap_cbegin() const;
const_bitmap_iterator bitmap_cend() const;

private:
arrow_proxy m_proxy;

// friend classes
friend class layout_iterator<iterator_types>;
template <class T>
friend class array_wrapper_impl;

friend class detail::array_access;
};

template <class D>
@@ -212,6 +219,12 @@ namespace sparrow
{
return m_proxy;
}

template <class D>
auto array_crtp_base<D>::extract_arrow_proxy() && -> arrow_proxy&&
{
return std::move(m_proxy);
}

template <class D>
auto array_crtp_base<D>::has_value(size_type i) const -> bitmap_const_reference
19 changes: 19 additions & 0 deletions include/sparrow/layout/array_wrapper.hpp
@@ -69,9 +69,13 @@ namespace sparrow

enum data_type data_type() const;
bool is_dictionary() const;


[[nodiscard]] arrow_proxy&& extract_arrow_proxy() &&;
[[nodiscard]] arrow_proxy& get_arrow_proxy();
[[nodiscard]] const arrow_proxy& get_arrow_proxy() const;


protected:

array_wrapper(enum data_type dt);
@@ -83,6 +87,7 @@ namespace sparrow
virtual bool is_dictionary_impl() const = 0;
virtual arrow_proxy& get_arrow_proxy_impl() = 0;
virtual const arrow_proxy& get_arrow_proxy_impl() const = 0;
virtual arrow_proxy&& extract_arrow_proxy_impl() = 0;
virtual wrapper_ptr clone_impl() const = 0;
};

@@ -110,6 +115,7 @@ namespace sparrow
bool is_dictionary_impl() const override;
arrow_proxy& get_arrow_proxy_impl() override;
const arrow_proxy& get_arrow_proxy_impl() const override;
arrow_proxy&& extract_arrow_proxy_impl() override;
wrapper_ptr clone_impl() const override;

using storage_type = std::variant<value_ptr<T>, std::shared_ptr<T>, T*>;
@@ -146,12 +152,18 @@ namespace sparrow
{
return get_arrow_proxy_impl();
}

inline arrow_proxy&& array_wrapper::extract_arrow_proxy() &&
{
return extract_arrow_proxy_impl();
}

inline const arrow_proxy& array_wrapper::get_arrow_proxy() const
{
return get_arrow_proxy_impl();
}


inline array_wrapper::array_wrapper(enum data_type dt)
: m_data_type(dt)
{
@@ -243,6 +255,13 @@ namespace sparrow
return wrapper_ptr{new array_wrapper_impl<T>(*this)};
}

template <class T>
auto array_wrapper_impl<T>::extract_arrow_proxy_impl() -> arrow_proxy&&
{
return std::move(std::move(*p_array).extract_arrow_proxy());
}


template <class T>
T& unwrap_array(array_wrapper& ar)
{
11 changes: 11 additions & 0 deletions include/sparrow/layout/dictionary_encoded_array.hpp
@@ -25,6 +25,7 @@
#include "sparrow/utils/contracts.hpp"
#include "sparrow/utils/functor_index_iterator.hpp"
#include "sparrow/utils/memory.hpp"
#include "sparrow/layout/array_access.hpp"

namespace sparrow
{
@@ -138,12 +139,16 @@ namespace sparrow
[[nodiscard]] arrow_proxy& get_arrow_proxy();
[[nodiscard]] const arrow_proxy& get_arrow_proxy() const;

arrow_proxy&& extract_arrow_proxy() &&;

arrow_proxy m_proxy;
keys_layout m_keys_layout;
values_layout p_values_layout;

template <class T>
friend class array_wrapper_impl;

friend class detail::array_access;
};

template <class IT>
@@ -309,6 +314,12 @@ namespace sparrow
return m_proxy;
}

template <std::integral IT>
auto dictionary_encoded_array<IT>::extract_arrow_proxy() && -> arrow_proxy&&
{
return std::move(m_proxy);
}

template <std::integral IT>
auto dictionary_encoded_array<IT>::get_arrow_proxy() const -> const arrow_proxy&
{
51 changes: 51 additions & 0 deletions include/sparrow/layout/list_layout/list_array.hpp
@@ -16,6 +16,9 @@

#include <string> // for std::stoull


#include "sparrow/arrow_interface/arrow_array.hpp"
#include "sparrow/arrow_interface/arrow_schema.hpp"
#include "sparrow/array_factory.hpp"
#include "sparrow/layout/array_bitmap_base.hpp"
#include "sparrow/layout/array_wrapper.hpp"
@@ -26,6 +29,7 @@
#include "sparrow/utils/iterator.hpp"
#include "sparrow/utils/memory.hpp"
#include "sparrow/utils/nullable.hpp"
#include "sparrow/array.hpp"

namespace sparrow
{
@@ -255,8 +259,15 @@ namespace sparrow
fixed_sized_list_array(self_type&&) = default;
fixed_sized_list_array& operator=(self_type&&) = default;

template<class ...ARGS>
requires(mpl::excludes_copy_and_move_ctor_v<fixed_sized_list_array, ARGS...>)
fixed_sized_list_array(ARGS&& ...args): self_type(create_proxy(std::forward<ARGS>(args)...))
{}

private:

static arrow_proxy create_proxy(std::uint64_t list_size, array && flat_values);

static uint64_t list_size_from_format(const std::string_view format);
std::pair<offset_type, offset_type> offset_range(size_type i) const;

@@ -491,4 +502,44 @@ namespace sparrow
const auto offset = i * m_list_size;
return std::make_pair(offset, offset + m_list_size);
}


inline arrow_proxy fixed_sized_list_array::create_proxy(std::uint64_t list_size, array && flat_values)
Collaborator

@JohanMabille JohanMabille Oct 25, 2024

I think we should allow passing a range for the null values (similar to what we do here). The same remark applies to the other layouts where it makes sense.

Collaborator Author

agree!

{
const auto size = flat_values.size() / static_cast<std::size_t>(list_size);

auto wrapper = detail::array_access::extract_array_wrapper(std::move(flat_values));
auto storage = std::move(*wrapper).extract_arrow_proxy();
auto flat_schema = storage.extract_schema();
auto flat_arr = storage.extract_array();

const auto bitmap_size = (size + 7 ) / 8;
auto bitmap_ptr = new std::uint8_t[bitmap_size];
std::fill_n(bitmap_ptr, bitmap_size, static_cast<std::uint8_t>(0xFF) /*all bits 1*/);

std::string format = "+w:" + std::to_string(list_size);
ArrowSchema schema = make_arrow_schema(
format,
std::nullopt, // name
std::nullopt, // metadata
std::nullopt, // flags,
1, // n_children
new ArrowSchema*[1]{new ArrowSchema(std::move(flat_schema))}, // children
nullptr // dictionary

);

ArrowArray arr = make_arrow_array(
static_cast<std::int64_t>(size), // length
0, // null_count
0, // offset
std::vector<buffer<std::uint8_t>>{
{bitmap_ptr, static_cast<std::size_t>(bitmap_size)},
},
1, // n_children
new ArrowArray*[1]{new ArrowArray(std::move(flat_arr))}, // children
nullptr // dictionary
);
return arrow_proxy{std::move(arr), std::move(schema)};
}
}
12 changes: 12 additions & 0 deletions include/sparrow/layout/null_array.hpp
@@ -21,6 +21,7 @@
#include "sparrow/utils/contracts.hpp"
#include "sparrow/utils/iterator.hpp"
#include "sparrow/utils/nullable.hpp"
#include "sparrow/layout/array_access.hpp"

namespace sparrow
{
@@ -101,13 +102,19 @@ namespace sparrow

difference_type ssize() const;



[[nodiscard]] arrow_proxy&& extract_arrow_proxy() &&;
[[nodiscard]] arrow_proxy& get_arrow_proxy();
[[nodiscard]] const arrow_proxy& get_arrow_proxy() const;


arrow_proxy m_proxy;

template <class T>
friend class array_wrapper_impl;

friend class detail::array_access;
};

bool operator==(const null_array& lhs, const null_array& rhs);
@@ -241,6 +248,11 @@ namespace sparrow
return m_proxy;
}

inline arrow_proxy&& null_array::extract_arrow_proxy() &&
{
return std::move(m_proxy);
}

inline const arrow_proxy& null_array::get_arrow_proxy() const
{
return m_proxy;