Missing docs or examples for how to use ICE restart #416

morajabi · 2023-12-12T10:11:49Z

Given that there's no failed/closed ICE state, how should we use ICE restart?

https://github.com/algesten/str0m/blob/main/src/ice/agent.rs#L109

    /// Connection failed. This is a less stringent test than `failed` and may trigger
    /// intermittently and resolve just as spontaneously on less reliable networks,
    /// or during temporary disconnections. When the problem resolves, the connection
    /// may return to the connected state.
    Disconnected,
    //
    // NB: The failed and closed state doesn't really have a mapping in this implementation.
    //     We never end trickle ice and it's always possible to "come back" if more remote
    //     candidates are added.
    //
    // The ICE candidate has checked all candidates pairs against one another and has
    // failed to find compatible matches.
    // Failed,
    // The ICE agent has shut down and is no longer handling requests.
    // Closed,

k0nserv · 2023-12-12T10:24:43Z

Quoting from the specification:

> Performing an ICE restart is recommended when iceConnectionState transitions to "failed". An application may additionally choose to listen for the iceConnectionState transition to "disconnected" and then use other sources of information (such as using getStats to measure if the number of bytes sent or received over the next couple of seconds increases) to determine whether an ICE restart is advisable.

For example, if a peer transitions from WiFi to cellular, their selected candidate will no longer be usable and no media data will reach the other peer. This causes a disconnect event, and the lack of media data can be used as a signal to trigger an ICE restart, which will gather new candidates for the cellular connection.

However, a disconnect event can also be caused by packet loss, since STUN binding requests or responses can be lost. In that case, if media is still flowing, the ICE connection state should recover.
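As a rough illustration of the "disconnected, and no media arriving" heuristic, here is a minimal sketch. It is not str0m API: `IceState`, `MediaFlowProbe` and `should_restart` are hypothetical names, and the cumulative byte counter is assumed to come from whatever stats the application already collects.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the ICE connection states discussed here
/// (not the str0m type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IceState {
    Checking,
    Connected,
    Disconnected,
}

/// Decides whether an ICE restart looks advisable: the state has been
/// `Disconnected` for `window` and no new media bytes arrived meanwhile.
pub struct MediaFlowProbe {
    window: Duration,
    /// (time, cumulative bytes received) at the last point media was seen
    /// to progress, or at the moment we first saw `Disconnected`.
    baseline: Option<(Instant, u64)>,
}

impl MediaFlowProbe {
    pub fn new(window: Duration) -> Self {
        Self { window, baseline: None }
    }

    /// Feed the current state and the cumulative received byte count.
    pub fn should_restart(&mut self, now: Instant, state: IceState, bytes_received: u64) -> bool {
        if state != IceState::Disconnected {
            // Connected (or still checking): nothing to decide, reset.
            self.baseline = None;
            return false;
        }
        let (since, bytes_at_baseline) = match self.baseline {
            Some(b) => b,
            None => {
                // First observation of Disconnected: start the window.
                self.baseline = Some((now, bytes_received));
                (now, bytes_received)
            }
        };
        if bytes_received > bytes_at_baseline {
            // Media is still flowing, so this is probably just lost STUN
            // packets. Restart the window from here.
            self.baseline = Some((now, bytes_received));
            return false;
        }
        now.duration_since(since) >= self.window
    }
}
```

The window length is an application choice. A few seconds matches the "next couple of seconds" wording in the quoted specification text.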

xnorpx · 2023-12-12T13:19:49Z

I do think we are missing a timeout for the case where signaling is set up but, for some reason, the STUN packets are not reaching the str0m instance. str0m will sit there and wait forever; it should probably have a connection timeout.

algesten · 2023-12-12T14:02:16Z

> I do think we are missing a timeout for the case where signaling is set up but, for some reason, the STUN packets are not reaching the str0m instance. str0m will sit there and wait forever; it should probably have a connection timeout.

I don't agree. str0m has the Disconnected state, which is enough given that we don't have an "end of trickle ICE candidates" indication.

The ICE spec says this about a "checklist" (we only have one in str0m):

https://www.rfc-editor.org/rfc/rfc8838.html#name-performing-connectivity-che

> regular ICE agents would set the state of a checklist to Failed if both of the following two conditions are satisfied:
>
> - all of the pairs in the checklist are in either the Failed state or the Succeeded state; and
> - there is not a pair in the valid list for each component of the data stream.

Trickle ICE also adds the conditions:

> - all candidate gathering has completed, and the agent is not expecting to discover any new local candidates;
> - and the remote agent has conveyed an end-of-candidates indication

To put that together:

Scenario 1: Without trickle ICE, fail when all pairs on the checklist fail.
Scenario 2: With trickle ICE, fail when all pairs on the checklist fail and we have been notified that there are no more trickled ICE candidates.
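The two scenarios can be written down as predicates. This is only a sketch of the conditions quoted above, not how str0m represents its checklist: `PairState`, `Checklist` and all field names are hypothetical.

```rust
/// Hypothetical candidate pair states, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PairState {
    Frozen,
    Waiting,
    InProgress,
    Succeeded,
    Failed,
}

/// Hypothetical checklist summary; field names are assumptions.
pub struct Checklist {
    pub pairs: Vec<PairState>,
    /// True if the valid list holds a pair for each component.
    pub has_valid_pair_for_each_component: bool,
    /// True once local gathering is done and no new local candidates are expected.
    pub local_gathering_complete: bool,
    /// True once the remote side has sent an end-of-candidates indication.
    pub remote_end_of_candidates: bool,
}

impl Checklist {
    /// The two conditions shared by both scenarios.
    fn base_conditions(&self) -> bool {
        let all_finished = self
            .pairs
            .iter()
            .all(|p| matches!(p, PairState::Succeeded | PairState::Failed));
        all_finished && !self.has_valid_pair_for_each_component
    }

    /// Scenario 1: regular (non-trickle) ICE.
    pub fn failed_regular(&self) -> bool {
        self.base_conditions()
    }

    /// Scenario 2: trickle ICE adds the two "no more candidates" conditions.
    pub fn failed_trickle(&self) -> bool {
        self.base_conditions() && self.local_gathering_complete && self.remote_end_of_candidates
    }
}
```

Without an end-of-candidates indication, `failed_trickle` can never become true, which is the situation the rest of this comment argues from.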

Scenario 1 isn't relevant for str0m because we don't have an API for enumerating and providing all ICE candidates before we start the Rtc instance. Why would we have that API? Add ICE candidates whenever you want!

Ergo we only support Scenario 2: Only trickle ICE, even if we don't expect to do any trickling. The remaining question is "what about indicating end-of-candidates"?

The stance I have taken here is "why?"

You can always come back from "Disconnected" state, by providing more candidates.
Trickle ICE is providing more candidates.

What good does it do to notify str0m that there will never be more candidates? As far as I can see, the only reason is to encode some timeout for going into a "Failed" state. It appears to be a state/code complication only to fulfill the spec. A user of str0m can simply have a timer for the Disconnected state and decide themselves whether to expect more candidates or remove the connection – why encode that logic in str0m?

pthatcher · 2023-12-18T15:46:35Z

I agree with the latest comment. As someone who worked on defining WebRTC's "failed" state, I've always thought of it as mostly useless, especially when trickle ICE is used (as it should be). The rule for ICE restarts is pretty simple: if you have been disconnected more than X seconds, try an ICE restart. You pick the X.
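A minimal sketch of that rule, kept independent of str0m: the application records when it entered Disconnected and polls a small watchdog from its own loop. `IceState`, `Action` and `DisconnectWatchdog` are hypothetical names, and both thresholds are values the application picks. How the restart itself is signaled to the peer is left to the application.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the ICE connection state (not the str0m type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IceState {
    Checking,
    Connected,
    Disconnected,
}

/// What the application should do next.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    None,
    IceRestart,
    RemoveConnection,
}

pub struct DisconnectWatchdog {
    restart_after: Duration,
    give_up_after: Duration,
    disconnected_since: Option<Instant>,
    restart_requested: bool,
}

impl DisconnectWatchdog {
    pub fn new(restart_after: Duration, give_up_after: Duration) -> Self {
        Self {
            restart_after,
            give_up_after,
            disconnected_since: None,
            restart_requested: false,
        }
    }

    /// Call whenever the ICE connection state changes.
    pub fn on_state_change(&mut self, now: Instant, state: IceState) {
        if state == IceState::Disconnected {
            // Keep the earliest timestamp if we are already disconnected.
            if self.disconnected_since.is_none() {
                self.disconnected_since = Some(now);
            }
        } else {
            self.disconnected_since = None;
            self.restart_requested = false;
        }
    }

    /// Call periodically from the application's own timer or run loop.
    pub fn poll(&mut self, now: Instant) -> Action {
        let since = match self.disconnected_since {
            Some(t) => t,
            None => return Action::None,
        };
        let elapsed = now.duration_since(since);
        if elapsed >= self.give_up_after {
            return Action::RemoveConnection;
        }
        if elapsed >= self.restart_after && !self.restart_requested {
            // Ask for an ICE restart only once per disconnected period.
            self.restart_requested = true;
            return Action::IceRestart;
        }
        Action::None
    }
}
```

Keeping a second, longer threshold for removing the connection covers the case where no new candidates ever arrive.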

xnorpx · 2024-01-24T15:45:26Z

Ok I added the 5 lines of timeout handling code in our server and I agree :)

algesten changed the title ~~When should trigger ICE restart given that there's no failed/closed ICE state~~ Missing docs or examples for how to use ICE restart Dec 12, 2023

algesten mentioned this issue Mar 21, 2024

Adding new candidates and invalidating all previous ones flaps instantly to disconnected #486

Closed
