Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make event data views more resilient of errors #2884

Closed
kschiffer opened this issue Jul 11, 2020 · 7 comments · Fixed by #3267
Closed

Make event data views more resilient of errors #2884

kschiffer opened this issue Jul 11, 2020 · 7 comments · Fixed by #3267
Assignees
Labels
c/console This is related to the Console c/sdk/js This is related to the JavaScript SDK in progress We're working on it needs/discussion We need to discuss this technical debt Not necessarily broken, but could be done better/cleaner ui/web This is related to a web interface

Comments

@kschiffer
Copy link
Contributor

Summary

Our event views are currently crashing altogether when there is any kind of error related to the event stream. In many situations, errors are related to the parsing of one single event. In these situations we can safely keep the overall event stream running.

Why do we need this?

Unnecessary crashing of the entire event data view.

What is already there? What do you see now?

We already make use of error boundaries that should help catching errors that result from faulty handling of event data on the frontend side.

What is missing? What do you want to see?

Before #2883, the SDK event streams would escalate some errors far too much in a way that caused the entire stream to be closed when it would actually not be necessary. Respectively, the Console also treats any error coming from the SDK stream as fatal and closes the event stream.

We need to treat event errors as non-fatal, since they mostly relate to the parsing of one specific data chunk. A failure does not warrant closing the event stream completely.

Environment

v3.8.5

How do you propose to implement this?

My idea is

  1. Refactor event store logic to handle errors in the SDK event stream when chunk parsing fails, e.g.
    notify(listeners[EVENTS.ERROR], new Error('Dismissed unparseable message chunk'))
  2. Handle such errors by adding synthetic (frontend defined) "meta" events to the console event store
  3. Distinguish such events accordingly by our event components to display a custom error message for them
    isSyntheticErrorEvent(event) and <SyntheticErrorEvent />
  4. Remove GET_EVENT_MESSAGE_FAILURE action type from the list of actions that close the event stream

@bafonins please let me know if you approve of such a solution.

How do you propose to test this?

We still need some ideas about how to properly end-to-end test event data views in the Console. This has already bitten us too often.

Can you do this yourself and submit a Pull Request?

Yes.

@kschiffer kschiffer added c/console This is related to the Console needs/discussion We need to discuss this technical debt Not necessarily broken, but could be done better/cleaner c/sdk/js This is related to the JavaScript SDK ui/web This is related to a web interface labels Jul 11, 2020
@kschiffer kschiffer added this to the Next Up milestone Jul 11, 2020
@kschiffer kschiffer self-assigned this Jul 11, 2020
@bafonins
Copy link
Contributor

Adding custom events is a good idea. I was thinking about adding events for pause, resume and clear events when implementing #2477, but wasnt sure about the UI. Not sure about proposed implementation, but can take a look myself when working on #2231

@kschiffer
Copy link
Contributor Author

Not sure about proposed implementation, but can take a look myself when working on #2231

I have a WIP with that solution. What exactly makes you not sure about it? Just want to reassure before I continue working on this.

@bafonins
Copy link
Contributor

I dont know how the final UI will look and Im not sure how helpful this new Error('Dismissed unparseable message chunk') error can be to the user. When we display a custom error event for me it reads like "hey, something went wrong here. we dont really know whether event handling in the app is broken or grpc-gateway chunk stream is not functioning properly". There is nothing the user can do with this error.

new Error('Dismissed unparseable message chunk') - I dont think this error should be a string, an object with some sort of code will be better.

Remove GET_EVENT_MESSAGE_FAILURE action type from the list of actions that close the event stream

We still need an action that closes the stream because of connection issues or because the user left the view with the events component.

It seems that Im missing something and if it is the case, please correct me. Looking forward to your solution

@johanstokking
Copy link
Member

Is this related to #2825?

@kschiffer
Copy link
Contributor Author

kschiffer commented Aug 24, 2020

Sort of. If the console loses internet connection, it will eventually receive an error event from the stream (via JS SDK), which will cause the whole event view to crash, even though it would be enough to display a message that the event stream has disconnected. It should still be possible to inspect the events that have arrived until then.

It's still good to have this as a separate issue since the specifics for the event streams are a bit special.

@sentry-io
Copy link

sentry-io bot commented Oct 1, 2020

Sentry issue: TTI-CLOUD-HOSTED-26

@sentry-io
Copy link

sentry-io bot commented Oct 1, 2020

Sentry issue: TTI-CLOUD-HOSTED-30

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/console This is related to the Console c/sdk/js This is related to the JavaScript SDK in progress We're working on it needs/discussion We need to discuss this technical debt Not necessarily broken, but could be done better/cleaner ui/web This is related to a web interface
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants