Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Add output_writer_iterator #1575

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

karthikeyann
Copy link
Contributor

@karthikeyann karthikeyann commented Dec 3, 2021

This iterator serves as a generalized version of output iterator.
Typical usage is
output_writer_iterator(index_iterator, [result_begin](auto i, auto v) { result_begin[i]=v;});

Sometimes output iterator has to be complex. This output writer iterator can help achieve writing to complex data structures using binary function object or lambda which accepts an index and a value to write.
Example given in documentation is achieving functionality similar to std::bitset using a thrust::device_vector.

@alliepiper alliepiper added type: enhancement New feature or request. P2: nice to have Desired, but not necessary. labels Dec 13, 2021
@alliepiper alliepiper added this to the 1.16.0 milestone Dec 13, 2021
Copy link
Collaborator

@alliepiper alliepiper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the inline comments, I have some other suggestions for this PR:


Rather than having an output-only iterator, let's update this to support both input and output by replacing BinaryFunction with InputBinaryFunction and OutputBinaryFunction, where the two functions are inverse operations. This will avoid the current mess we have with the transform iterators (we currently provide transform_iterator, transform_output_iterator and transform_input_output_iterator for legacy reasons).


The name writer_iterator isn't very clear IMO, and won't make much sense once this supports input and output. Some alternative ideas:

  • proxy_iterator
  • functor_iterator
  • invoke_iterator
  • arg_iterator

I'm not sure I like these any better, though. Let's think about this a bit more, and maybe see if anyone else has some ideas.

Edit: decorate_iterator or something similar (decorating_, decorated_, etc) might be a good fit, since this effectively implements the "decorator" design pattern over an existing iterator. I've started a discussion on this with the rest of the team to see if we can find something better.


This will need tests and examples covering the input/output/both usecases.


So to summarize:

  • Change to input/output iterator.
  • Let's work out a more meaningful name for this.
  • Add tests and examples of using this iterator for input, output, and both.

thrust/iterator/output_writer_iterator.cuh Outdated Show resolved Hide resolved
thrust/iterator/output_writer_iterator.cuh Outdated Show resolved Hide resolved
thrust/iterator/output_writer_iterator.cuh Outdated Show resolved Hide resolved
thrust/iterator/output_writer_iterator.cuh Outdated Show resolved Hide resolved
thrust/iterator/output_writer_iterator.cuh Outdated Show resolved Hide resolved
@alliepiper
Copy link
Collaborator

alliepiper commented Jan 13, 2022

Updating with some internal conversations:


@harrism suggested indexed_iterator. This seems like the best candidate so far IMO, since the key feature of this iterator is that it provides the current index to the functor(s).


What would the input and input/output cases look like?

Input:

struct adj_diff_functor
{
  const int *input;
​
  __device__
  int operator()(std::size_t i) const
  {
    return input[i + 1] - input[i];
  }
};
​
auto indices = thrust::make_counting_iterator(0);
thrust::device_vector<int> data{0, 1, 4, 7, 9, 12};
auto iter = thrust::make_decorated_input_iterator(indices, adj_diff_functor{data.data()});
auto iter_end = iter + data.size() - 1;
​
// [iter, iter_end) -> {1, 3, 3, 2, 3}

@benbarsdell correctly points out that this could be implemented using existing counting + transform iterators.

Input/output is a more interesting case that cannot be implemented using other existing fancy iterators:

struct input_functor
{
  // Produces ones[i] + 10*tens[i]
  const int *ones; // Always 0-9
  const int *tens;
  
  __device__
  int operator()(std::size_t i) const
  {
    return ones[i] + 10 * tens[i];
  }
};

struct output_functor
{
  // Decomposes input values to perform the inverse of input_functor
  int *ones; // Always 0-9
  int *tens;
  
  __device__
  void operator()(std::size_t i, int val) const
  {
    ones[i] = val % 10;
    tens[i] = val / 10;
  }
};

thrust::device_vector<int> ones = {1, 5, 3, 2, 6};
thrust::device_vector<int> tens = {3, 2, 6, 7, 3};

auto indices = thrust::make_counting_iterator(0);
auto iter = thrust::make_decorated_iterator(indicies,
                                            input_functor{ones.data(), tens.data()},
                                            output_functor{ones.data(), tens.data()});
auto iter_end = iter + ones.size();

// [iter, iter_end) -> {31, 25, 63, 72, 36}

iter[0] = 19;
iter[1] = 42;
iter[2] = 98;
iter[3] = 132;
iter[4] = 15;

// ones -> {9, 2, 8, 2, 5}
// tens -> {1, 4, 9, 13, 1}

@jrhemstad
Copy link
Collaborator

@harrism suggested indexed_iterator. This seems like the best candidate so far IMO, since the key feature of this iterator is that it provides the current index to the functor(s).

Looking to thrust::tabulate for inspiration, how about tabulate_iterator?

@harrism
Copy link
Contributor

harrism commented Jan 14, 2022

@harrism suggested indexed_iterator. This seems like the best candidate so far IMO, since the key feature of this iterator is that it provides the current index to the functor(s).

Looking to thrust::tabulate for inspiration, how about tabulate_iterator?

I like tabulate_iterator for the case where you cannot provide an iterator that generates the indices. But if, as in @karthikeyann 's original example, we enable passing an iterator that generates the indices, these may not just be sequential/counting. In that case indexed_iterator may be a better name. For example you may want to pass a counting transform iterator to generate indices with a stride, or a reverse counting iterator, etc.

@jrhemstad
Copy link
Collaborator

jrhemstad commented Jan 14, 2022

I like tabulate_iterator for the case where you cannot provide an iterator that generates the indices. But if, as in @karthikeyann 's original example, we enable passing an iterator that generates the indices, these may not just be sequential/counting. In that case indexed_iterator may be a better name. For example you may want to pass a counting transform iterator to generate indices with a stride, or a reverse counting iterator, etc.

I've had to start at this iterator for quite a while to understand it's story.

There's nothing here that requires the adapted iterator to be an "index_iterator".

Generally, it looks like a cross between a zip and transform iterator. It binds iterator It0 (value_type is T) and a binary callable void F(T, U). Assigning any U u to this iterator at i results in invoking F(*(It0 + i), u).

Thinking about this in terms of std::bind or std::bind_front this iterator is effectively doing a partial binding of a binary callable, F(T,U), with it's first argument (It0). Assigning to the iterator provides the second argument, u, and invokes the callable.

When viewed through that lens, I'm thinking of a name more like bind_iterator or bound_iterator. Though with that naming it feels like this should be made more general to support a callable with N arguments where [0,N) of the arguments can be bound at iterator construction and the remaining arguments provided at assignment. It may take a bit more finesse, but I think that should be doable.

Thinking about @allisonvacanti's case of having this be both an input and output iterator (I only thought about the output case above), the input case wouldn't be a partial binding but a full binding of a callable with all of its arguments. Dereferencing the iterator will invoke the callable with the associated arguments provided by the bound iterators. (This is basically just a normal transform_iterator).

@alliepiper
Copy link
Collaborator

My omission of the "index" iterator from my later examples definitely clouded things.

With this new perspective, I'm back to liking decorated_iterator. From this summary of the design pattern:

Decorator pattern allows a user to add new functionality to an existing object without altering its structure.

This pattern creates a decorator class which wraps the original class and provides additional functionality keeping class methods signature intact.

This is an iterator that wraps all accesses to an existing iterator with decorator functions.

@jrhemstad
Copy link
Collaborator

Decorator pattern allows a user to add new functionality to an existing object without altering its structure.
This pattern creates a decorator class which wraps the original class and provides additional functionality keeping class methods signature intact.

This is an iterator that wraps all accesses to an existing iterator with decorator functions.

What gives me pause about this is that this pattern/definition applies just as well to transform_iterator.

In fact, the input iterator case for the decorated_iterator is exactly equivalent to a transform_iterator. The only difference is in assignment (or using it as an output iterator).

I'm not sure the name decorated_iterator vs. transform_iterator conveys that difference.

Now I'm thinking that maybe this new iterator can just be a new overload of transform_iterator as opposed to a wholly new iterator.

@alliepiper
Copy link
Collaborator

What gives me pause about this is that this pattern/definition applies just as well to transform_iterator.

It is a pretty generic name, and also very similar to the existing iterator_adaptor. I'd love to find something more specific.

Now I'm thinking that maybe this new iterator can just be a new overload of transform_iterator as opposed to a wholly new iterator.

Hmm, perhaps. Let's compare their user interfaces directly:

{ // Transform:
  T = value_t<Iter>;
  U InFunc(T); // Used for reads
  T OutFunc(U); // Used for writes
  auto thrust::make_transform_iterator(Iter, InFunc, OutFunc);
}

{ // New Iterator
  T = value_t<Iter>;
  U InFunc(T); // Used for reads
  void OutFunc(T, U); // Used for writes
  auto thrust::make_XXX_iterator(Iter, InFunc, OutFunc);
}

The key difference is in OutFunc. While transform will derive a new value from the assignment arg and write it to Iter, the new iterator will never modify Iter's data. Iter's data is only used as an input to modify state captured in the functors.

@alliepiper
Copy link
Collaborator

alliepiper commented Jan 14, 2022

Iter's data is only used as an input to modify state captured in the functors.

Which implies that this iterator is only useful with some external state, which could potentially be abstracted out:

{ // New Iterator with abstracted state:
  using T = value_t<Iter>;
  U InFunc(State, T); // Used for reads
  void OutFunc(State, T, U); // Used for writes
  auto thrust::make_XXX_iterator(Iter, State, InFunc, OutFunc);
}

This could be used to simplify the functors so they don't have to capture the same state in their members:

{ // current
  struct input_functor
  {
    const State *state;  
    U operator()(T) const;
  };

  struct output_functor
  {
    State *state;  
    void operator()(T, U);
  };

  State *state = ...;
  auto iter = thrust::make_XXX_iterator(
    indicies,
    input_functor{state},
    output_functor{state});
}

{ // with state abstraction
  struct input_functor
  {
    U operator()(const State&, T) const;
  };

  struct output_functor
  {
    void operator()(State&, T, U);
  };

  State *state = ...;
  auto iter = thrust::make_XXX_iterator(
    indicies,
    state,
    input_functor{},
    output_functor{});
}

I'm unsure whether this abstraction would be worthwhile, but I think a key point here is that this iterator does require extra state to do anything interesting.

@jrhemstad
Copy link
Collaborator

I think a key point here is that this iterator does require extra state to do anything interesting.

Yeah, I noticed that too. I'm not a fan of the fact that the binary callable doesn't return anything, which means it has to have side-effects to do anything useful.

@alliepiper
Copy link
Collaborator

I'm not a fan of the fact that the binary callable doesn't return anything, which means it has to have side-effects to do anything useful.

I don't think that'll be a problem. For instance, std::for_each uses a functor with no return value and relies on side effects, so it's not uncommon.

Do you see a different way to implement this functionality?

@alliepiper alliepiper modified the milestones: 1.16.0, 1.17.0 Feb 7, 2022
@alliepiper alliepiper modified the milestones: 1.17.0, Backlog Apr 25, 2022
@karthikeyann
Copy link
Contributor Author

The main reason for creating this iterator is to provide a generalized output iterator.
For input iterator, we can create counting_iterator and use a make_transform_iterator to translate it to access any datastructure.
But for creating output writer, AFAIK there is no way right now, except for the data structure to provide itself.

BinaryFunction is present only because first argument is supposed to the index. Instead of providing an pre-defined iterator index, I made it as an argument (which again gives more flexibility). Or else it could be UnaryFunction. If it is unary, tabulate_iterator or indexed_iterator name would make sense. but again it's missing output in the name.

Also, mixing input and output together would make it more than what it's designed for. My vote would be to limit this iterator's functionality to behave like output iterator only, and add new combined input and output iterator seperately.

@jrhemstad
Copy link
Collaborator

The main reason for creating this iterator is to provide a generalized output iterator.
For input iterator, we can create counting_iterator and use a make_transform_iterator to translate it to access any datastructure.
But for creating output writer, AFAIK there is no way right now, except for the data structure to provide itself.

@karthikeyann's description reminded me that I explored an alternative for this approach awhile back in NVIDIA/cccl#792.

However, coming back to this after some time, it occurs to me that we are effectively just trying to indirectly recreate projections via a complicated iterator.

That tells me there is probably a far more natural way to express this.

Curious for @ericniebler 's thoughts.

@karthikeyann
Copy link
Contributor Author

karthikeyann commented May 13, 2022

@jrhemstad In NVIDIA/cccl#792 godbolt example, it still uses f.begin() as starting point for the output iterator. This output iterator is transformed using projections. Instead of f.begin(), can we use a counting_iterator(0), and still get an output iterator?

@karthikeyann karthikeyann requested a review from alliepiper May 16, 2022 05:23
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
P2: nice to have Desired, but not necessary. type: enhancement New feature or request.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants