Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document edge case that causes stream_insert to insert a duplicate. #3596

Closed

Conversation

eadz
Copy link

@eadz eadz commented Dec 21, 2024

I have found an edge case that goes against the current stream_insert documentation.

I found that if you stream a collection on mount, and update an item in your handle_params with stream_insert, you get a duplicate rather than an updated item.

I'm not quite at the level where I would attempt to fix the edge, so I thought I would document it.

I have a repo script here : https://gist.github.com/eadz/5ef6f82944362dcb89386f27be54163f

@eadz
Copy link
Author

eadz commented Dec 22, 2024

Note that the documentation suggests you can stream on mount, then stream_insert on a "callback" - I guess handle_params is a callback?

Then, in a callback such as `handle_info` or `handle_event`, you

@SteffenDE
Copy link
Collaborator

This is basically the same as #2689.

Since in the lifecycle of a LV we initially call mount and handle_params directly after each other, you’ll need to ensure that there are no duplicates. If the handle_params call is a subsequent render, e.g. after a patch, everything is fine.

@eadz
Copy link
Author

eadz commented Dec 22, 2024

The documentation says you can can call stream in 'mount' and then stream_insert in a callback.

It would be good if this edge case is documented, but also it should be able to make this use case work?

Note this basically means you can't call stream_insert in handle params. Handle params doesn't know if mount was just called or not, so maybe stream_insert shouldn't be allowed to be called from handle_params if its functionality can't be guaranteed?

@eadz
Copy link
Author

eadz commented Dec 22, 2024

In terms of "needing to make sure there are no duplicates" - is there any way handle_params has access to the required information of what is already in the stream in order to check for duplicates before calling stream_insert?

@greven
Copy link

greven commented Dec 22, 2024

Don't think there is a way to track that right now in LV but...
You can keep an assign with a map set of ids to track what is already in the steam to accomplish that.

Anyway this could be documented for sure.

@eadz
Copy link
Author

eadz commented Dec 22, 2024

If you set a list of IDs, and check them in handle_params, you still don't know if stream_insert(update) will work or not.

From the functional programming perspective, it seems that calling the same function with the same arguments can have two different results which kind of goes against the functional programming philosophy?

Handle params(args)
Stream_insert(..., updated)
case 1: updates the item
Case 2: inserts a duplicate

I think in order to make handle_params / stream_insert behave in a predictable manner, we should expose something to the function (e.g. just_mounted=true in the socket) so it knows what to do.

@SteffenDE
Copy link
Collaborator

Sorry, I should have been clearer, but it was 6 AM and I was replying from my phone... Anyway!

The documentation says you can can call stream in 'mount' and then stream_insert in a callback.

That is correct. You can call stream in mount and you can then also call stream or stream_insert in any callback (handle_X). The problem is only the initial live mount, because mount and handle_params are called without a render in between.

It would be good if this edge case is documented, but also it should be able to make this use case work?

Yes, we should add something to the docs about this, and I'll send a suggestion to rephrase soon.

Note this basically means you can't call stream_insert in handle params. Handle params doesn't know if mount was just called or not, so maybe stream_insert shouldn't be allowed to be called from handle_params if its functionality can't be guaranteed?

If you indeed want to keep the diff over the wire minimal, you can detect this by assigning a flag in mount and checking for this flag in handle_params. mount is only called once in a LiveView's lifecycle, therefore you basically detect if you call handle_params for the first time or not:

def mount(_params, _session, socket) do
  {:ok, assign(socket, initial_mount: true, focused_item_id: nil)}
end

def handle_params(_params, _url, %{assigns: %{live_action: :index, initial_mount: true}} = socket) do
  # initial mount, list view: stream all items
  socket
  |> stream(:items, get_items(), reset: true)
  |> assign(:initial_mount, false)
  |> then(&{:noreply, &1})
end

def handle_params(%{"item_id" => id, _url, %{assigns: %{initial_mount: true}} = socket) do
  # set initial focus
  items =
    get_items()
    |> Enum.map(fn
      %{id: ^id} = item -> %{item | focused: true}
      item -> item
    end)

  socket
  |> stream(:items, items, reset: true)
  |> assign(:focused_item_id, id)
  |> assign(:initial_mount, false)
  |> then(&{:noreply, &1})
end

def handle_params(_params, _url, %{assigns: %{live_action: :index, focused_item_id: focused_id}} = socket) do
  # subsequent patch to list view, unfocus item
  {:noreply, maybe_unfocus(socket, focused_id)}
end

def handle_params(%{"item_id" => id, _url, %{assigns: %{focused_item_id: focused_id}} = socket) when id != focused_id do
  # subsequent patch to show view, unfocus old item, focus new item
  socket
  |> maybe_unfocus(focused_id)
  |> focus(id)
  |> then(&{:noreply, &1})
end

# nothing to do
def handle_params(_params, _uri, socket) do
  {:noreply, socket}
end

defp maybe_unfocus(socket, nil), do: socket
defp maybe_unfocus(socket, id) do
  unfocused_item = get_item_by_id(focused_id)
  stream_insert(socket, :items, unfocused_item)
end

defp focus(socket, id) do
  focused_item = %{get_item_by_id(id) | focused: true}

  socket
  |> stream_insert(:items, focused_item)
  |> assign(:focused_item_id, id)
end

This is some bookkeeping that is required to keep streams efficient and simple. If you can afford to update the whole stream, you could also work with reset:

def mount(_params, _session, socket) do
  {:ok, socket}
end

def handle_params(_params, _url, %{assigns: %{live_action: :index}} = socket) do
  # initial mount, list view: stream all items
  socket
  |> stream(:items, get_items(), reset: true)
  |> assign(:focused_item_id, nil)
  |> then(&{:noreply, &1})
end

def handle_params(%{"item_id" => id, _url, socket) do
  # set initial focus
  items =
    get_items()
    |> Enum.map(fn
      %{id: ^id} = item -> %{item | focused: true}
      item -> item
    end)

  socket
  |> stream(:items, items, reset: true)
  |> assign(:focused_item_id, id)
  |> then(&{:noreply, &1})
end

but this will always update all items in the stream. Note that streams are about limiting state kept in memory on the server. To achieve this they come with tradeoffs, which sometimes require more code to properly handle the bookkeeping. To keep this simple, people often use the stream reset functionality, which comes with more changes sent over the wire, but less bookkeeping.

From the functional programming perspective, it seems that calling the same function with the same arguments can have two different results which kind of goes against the functional programming philosophy?

The function is not called with the same arguments though. The stream functions (same for assign/3) always work with a socket struct, which contains state.

Finally, we know about the extra bookkeeping that is needed with the current streams. That's why we (Chris, José and me) did talk about streams that do some extra bookkeeping for you in the past and it's something we'll try to explore for future LiveView versions.

Comment on lines +1869 to +1871
One exception to this is if you call `stream` with a collection of items in your `mount` function,
and then `stream_insert` an updated item that was previously streamed in your `handle_params` function.
You will end up with a duplicate item.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
One exception to this is if you call `stream` with a collection of items in your `mount` function,
and then `stream_insert` an updated item that was previously streamed in your `handle_params` function.
You will end up with a duplicate item.
> #### A note on duplicates, streams and the LiveView life-cycle {: .info}
> LiveView does not support updating an already streamed item during in same render cycle.
> This can affect you in the following ways:
>
> * If your list of items passed to `stream/4` already contains duplicates,
> these duplicates will end up on the page. You'll need to deduplicate them before calling `stream/4`.
> * Due to the life-cycle of a LiveView, if you call `stream/4` with a list of items in
> your `c:mount/3` callback, you cannot use `stream_insert/4` in `c:handle_params/3`
> to update an item that was already inserted in `c:mount/3`. As there is no render in between
> those two callbacks on the initial mount, you would end up with a duplicate item in the stream.
> You can work-around this limitation by setting an "initial_mounted" assign in `c:mount/3`
> and pattern-match on this flag in `c:handle_params/3` to handle the different cases.

cc @josevalim

@josevalim
Copy link
Member

@SteffenDE I think we should allow it to be called twice and the later override the first one. This will be very useful if we want to add features later, such as postponing rendering (which we discussed in the past) as well as #3551.

If we are worried about the performance (it would make inserts O(n)), we could keep a map set of DOM IDs for the pending items. If there is an overlap, we append and do a pass to remove the past one.

Thoughts?

@SteffenDE
Copy link
Collaborator

@josevalim I‘d like to document the current behavior and revisit this when we implement stream bookkeeping.

@josevalim
Copy link
Member

@SteffenDE I think they are two separate issues, no? We don't need book-keeping to address it today. After all, we will purge the MapSet after every render.

@SteffenDE
Copy link
Collaborator

They are. It’s just: are we fine with a quick doc change, or wait until I find the time to implement it. I’ll add it to my todo list :)

@josevalim
Copy link
Member

I am fine either way :)

@eadz eadz mentioned this pull request Dec 22, 2024
@eadz
Copy link
Author

eadz commented Dec 22, 2024

Hi,

Just to be clear, this line here is the issue:

https://github.com/phoenixframework/phoenix_live_view/blob/main/lib/phoenix_live_view/live_stream.ex#L64

When I inspect the socket.assigns.streams from within handle_params on an initial render, the stream array is there from the previous stream call in mount.

If I call stream_insert, I get this in my stream.assigns

streams: %{
      items: %Phoenix.LiveView.LiveStream{
        name: :items,
        dom_id: #Function<3.83605031/1 in Phoenix.LiveView.LiveStream.new/4>,
        ref: "0",
        inserts: [
          {"items-0", -1, %{id: "0", name: "item0", focused: false}, nil},
          {"items-1", -1, %{id: "1", name: "item1", focused: false}, nil},
          {"items-2", -1, %{id: "2", name: "item2", focused: false}, nil},
          {"items-3", -1, %{id: "3", name: "item3", focused: false}, nil},
          {"items-1", -1, %{id: "1", name: "item1", focused: true}, nil}
        ],
        deletes: [],
        reset?: false,
        consumable?: false
      },
      __changed__: MapSet.new([:items]),
      __configured__: %{},
      __ref__: 1
    }

Given that it's appending to the list which is already O(n) is there any issue with changing insert_item to "insert or replace" rather than just append?

I have a PR here which fixes the issue on my end #3598

changing the current implementation ( O(n) )

 %{stream | inserts: stream.inserts ++ [{item_id, at, item, limit}]} 

to ( O(n) )

  updated_inserts =
    stream.inserts
    |> Enum.reject(fn {id, _at, _item, _limit} -> id == item_id end)
    |> Kernel.++([{item_id, at, item, limit}]) # O(n)

  %{stream | inserts: updated_inserts}

The time complexity stays the same.

SteffenDE added a commit that referenced this pull request Dec 22, 2024
Also adjusts stream_insert to prepend the insert instead of expensively
appending it.

Fixes #2689.
Closes #3596.
Closes #3598.
@SteffenDE
Copy link
Collaborator

The time complexity stays the same.

While that's true when talking about O(n), as O(2n) = O(n), in reality it is still traversing the list twice, which is at least suboptimal.

But you're completely right that the current stream_insert append behavior is problematic. I remember that I wanted to change this already in the past, but somehow forgot, so I created #3599 now to address both.

@eadz
Copy link
Author

eadz commented Dec 22, 2024

Great, thank you. 🎉

I think this is quite a tricky thing to come across so removing this edge case will make stream_insert behave how most people would expect it to behave reading the documentation.

@SteffenDE
Copy link
Collaborator

Thank you for getting this started! Will be fixed in the next release :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants