Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update docs & comments for the behavior of the RecordSubscribe service, #148

Merged
merged 1 commit into from
Jan 30, 2024

Conversation

haussli
Copy link
Contributor

@haussli haussli commented Jan 30, 2024

removing the keepalive concept and mentioning the behavior the server can adopt to timeout a dead client.

Addresses issue #137

removing the keepalive concept and mentioning the behavior the server
can adopt to timeout a dead client.
@morrowc morrowc merged commit c71f52c into openconfig:main Jan 30, 2024
2 checks passed
// order to signal its liveliness to the system. Failure to send a
// RecordRequest for more than 120 seconds will cause the connection to be
// reset.
// The stream continues ad infinitum, until the gNSI session is severed.
Copy link
Contributor

@nmahabaleshwar nmahabaleshwar Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also update the comment at line 56 ?

If this results in no records to send, the server should return an OK error and zero records.

I don't think an OK error should be sent to the collector if no records need to be sent, the connection stays open and RecordResponses will be sent for future events.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should it return?

If the subscription succeeds, but either the history is empty or a time is requested that both does not correspond with a record in the history and there are no records after that time in the history, then the current text seems reasonable. For example, the client might request now() or a future time (T+60s, which the server should probably be permitted to assume to be now()).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we return an OK error in those scenarios, the stream will be closed which we don't want right ?

In all those scenarios, we want the stream to be open and the collector should get RecordResponses when accounting events occur after the given timestamp in the original RecordRequest message.

If the server wants to explicitly signal the client that I don't have anything for you now, but will send records in the future, maybe we create a new message or add a new field in the RecordResponse which indicates that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that I must misunderstand/misremember how the streaming works due to my limited use of it. I thought that the return was the result of the subscription; ie: RecordSubscribe() received, parsed, and started successfully. After which, records would be sent asynchronously, allowing the client to send and receive other messages as it wishes, until the session is closed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, perhaps: "if there's an error on the RPC the connection is closed"

is the missing part here in the conversation?
Generally, Error on a streaming RPC means: "oops, bad things, please restart your process"
so we should say that no error is sent, but that also no records are sent (because there are none).

Perhaps REALLY in the text about the startup the reply is something like:

"device returns as many records as will satisfy the request, or none if there are no records which satisfy the request. The connection remains open such that future records may be streamed to the client as they are available."

Or something along those lines, at any rate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a fair point, we COULD have the server send back a status: ok (this is nirajan's text mostly I think)
but really .... we connect, we send a 'hope you have mail for me!, but I'm hanging out here anyway for later.'
and that's ok. Whether we get 0, 1, tons of records on our first connect is a little unimportant.

There IS a 'did you hear me???' problem, but that might be solve either in the metrics/alerting pipeline ("that device is oddly silent on acctz.. whats up?") or with a simple: "if nothing heard in N mins, disconnect and reconnect"

that model (disco/reco) is ALSO problematic, because it doesn't really solve anything and never surfaces the 'that device is oddly silent...' to the alerting pipeline.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops... and.. nirajan, if you think there's wording fixes / etc please don't hesitate to send along :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ja, thanks for the explanation, morrow. Please suggest the text nirajan, or suggestion for the text to consider is here: haussli@33b831a

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raised #151 for this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants