Handling plugin extraction errors #125

Open
jasondellaluce opened this issue Nov 9, 2021 · 28 comments
Labels
help wanted Extra attention is needed kind/feature New feature or request

@jasondellaluce
Contributor

Motivation
Currently, errors are ignored during field extraction in the plugin framework. In theory, a plugin might fail to extract a field for two main reasons:

  • The field is not present in the given event, in which case the ss_plugin_extract_field.field_present flag is set to false.
  • The extract_fields exported plugin function encounters some error and returns a code other than SS_PLUGIN_SUCCESS.

In the current implementation, in both cases the filtercheck returns a NULL pointer, which is interpreted as a not-available field. This is visible here 👇🏼
    return false;

    return NULL;

Although this is semantically correct, the two failure paths have quite different meanings. In the second case, the plugin returns a failure code and the framework silently ignores it to keep the extraction flow non-blocking. This makes error handling efforts useless for plugin developers, and generally makes it harder to debug plugins at runtime.
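
To make the two failure paths concrete, here is a minimal, self-contained sketch of the behavior described above. It is not the actual libs code: `plugin_rc`, `extract_field`, `extract_fn`, and `extract_current` are hypothetical stand-ins for the plugin API types and the filtercheck logic, kept only detailed enough to show that both paths end up as the same null result.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-ins for the plugin API; NOT the real definitions.
enum plugin_rc { RC_SUCCESS = 0, RC_FAILURE = 1 };

struct extract_field {
    bool field_present = false;  // mirrors the field_present flag
    std::string value;           // extracted value, valid only if present
};

// Stand-in for the plugin's exported extract_fields function.
using extract_fn = std::function<plugin_rc(extract_field&)>;

// Both failure paths collapse into the same nullptr result, so the caller
// cannot tell "field absent" apart from "plugin reported an error".
const std::string* extract_current(const extract_fn& plugin_extract,
                                   extract_field& field) {
    if (plugin_extract(field) != RC_SUCCESS) {
        return nullptr;  // plugin error: the failure code is dropped here
    }
    if (!field.field_present) {
        return nullptr;  // legitimate "not available" field
    }
    return &field.value;
}
```

With this shape, a caller that receives `nullptr` has no way to recover the failure code, which is the debugging problem described above.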

Feature
I propose to catch the error and make it visible somehow.

I agree that keeping field extraction non-blocking might be a priority here, so throwing an exception might not be a viable option. We can consider some weaker error propagation methods, or maybe logging to stderr. At the bare minimum, we might log the error if a debug mode is enabled.

Alternatives
Keep things as they are, and just ignore plugin failures for extract_fields.

@jasondellaluce jasondellaluce added the kind/feature New feature or request label Nov 9, 2021
@jasondellaluce
Contributor Author

@leogr @mstemm

@leogr
Member

leogr commented Nov 10, 2021

Good catch. Not sure what's the best option atm, for sure I will take a look.

@poiana
Contributor

poiana commented Feb 8, 2022

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@leogr
Member

leogr commented Feb 8, 2022

/remove-lifecycle stale

@poiana
Contributor

poiana commented May 9, 2022

/lifecycle stale

@FedeDP
Contributor

FedeDP commented May 11, 2022

/remove-lifecycle stale

@poiana
Contributor

poiana commented Aug 9, 2022

/lifecycle stale

@jasondellaluce
Contributor Author

/remove-lifecycle stale

@poiana
Contributor

poiana commented Nov 8, 2022

/lifecycle stale

@jasondellaluce
Contributor Author

/remove-lifecycle stale

leogr pushed a commit to leogr/libs that referenced this issue Jan 5, 2023
falcosecurity#125)

Various sinsp logic sets m_tid_to_remove, to record the fact that a thread has
been identified as ready-to-be-removed from the threadtable.  And if automatic
threadtable purging is configured, sinsp::next() takes care of removing these
threads by calling remove_thread(), then clearing m_tid_to_remove.

But remove_thread() may itself set m_tid_to_remove in certain situations.
The current sinsp::next() logic loses track of this request; as a result, these
threads will languish in the threadtable until the next remove_inactive_threads()
interval, default value 20 minutes.

This fix changes the sinsp::next() logic to recognize and handle the case where
remove_thread() records a thread for removal.

Signed-off-by: Joseph Pittman <[email protected]>
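
The commit message above describes the problem well enough to model it in a few lines. The following is a toy sketch, not the real sinsp code: only the names `m_tid_to_remove` and `remove_thread` come from the commit message, while `toy_inspector`, `NO_TID`, `cascade`, and `purge_pending` are hypothetical, and looping until no request is pending is just one way the described fix could be realized.

```cpp
#include <cstdint>
#include <functional>
#include <set>

// Hypothetical sentinel meaning "no thread is pending removal".
constexpr int64_t NO_TID = -1;

// Toy model of the thread table; not the real sinsp class.
struct toy_inspector {
    int64_t m_tid_to_remove = NO_TID;
    std::set<int64_t> threads;

    // Hypothetical hook: removing one thread may mark another for removal.
    std::function<int64_t(int64_t)> cascade;

    void remove_thread(int64_t tid) {
        threads.erase(tid);
        if (cascade) {
            int64_t next = cascade(tid);
            if (next != NO_TID) {
                m_tid_to_remove = next;  // new request raised during removal
            }
        }
    }

    // Clear the pending request BEFORE calling remove_thread(), so that a
    // request recorded during the call is not overwritten afterwards, and
    // keep going until nothing is pending.
    void purge_pending() {
        while (m_tid_to_remove != NO_TID) {
            int64_t tid = m_tid_to_remove;
            m_tid_to_remove = NO_TID;
            remove_thread(tid);
        }
    }
};
```

In the lossy variant the commit describes, the request would be cleared after `remove_thread()` returns, which discards any tid recorded during that call.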
@poiana
Contributor

poiana commented Feb 6, 2023

/lifecycle stale

@jasondellaluce
Contributor Author

/remove-lifecycle stale

@jasondellaluce
Contributor Author

/milestone 0.11.0

@poiana poiana added this to the 0.11.0 milestone Mar 20, 2023
@FedeDP
Contributor

FedeDP commented Apr 27, 2023

/milestone 0.12.0

@poiana poiana modified the milestones: 0.11.0, 0.12.0 Apr 27, 2023
@leogr leogr added this to the 0.13.0 milestone May 3, 2023
@Andreagit97 Andreagit97 modified the milestones: 0.13.0, 0.12.0, libs-backlog Jun 7, 2023
@incertum
Contributor

We have had lots of plugin refactors; is this still relevant?

@leogr
Member

leogr commented Aug 24, 2023

I believe this is still relevant. @jasondellaluce to confirm.

However, I think it would be best to extend this discussion to the plugin domain and all field extraction mechanisms, including those built into sinsp. I know Jason has some thoughts in this regard, so I'm eager to hear from him.

That being said, I don't think this is a top priority. However, it is still an improvement that is worth tackling.

Just my 2 cents

cc @Andreagit97 @FedeDP

@poiana
Contributor

poiana commented Nov 22, 2023

/lifecycle stale

@leogr
Member

leogr commented Nov 22, 2023

/remove-lifecycle stale

@poiana
Contributor

poiana commented Feb 20, 2024

/lifecycle stale

@leogr
Member

leogr commented Feb 20, 2024

/remove-lifecycle stale

@leogr
Member

leogr commented Feb 20, 2024

/help

@poiana
Contributor

poiana commented Feb 20, 2024

@leogr:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help


@poiana poiana added the help wanted Extra attention is needed label Feb 20, 2024
@poiana
Contributor

poiana commented May 20, 2024

/lifecycle stale

@leogr
Member

leogr commented May 21, 2024

/remove-lifecycle stale

@poiana
Contributor

poiana commented Aug 19, 2024

/lifecycle stale

@Andreagit97
Member

/remove-lifecycle stale

@poiana
Contributor

poiana commented Nov 17, 2024

/lifecycle stale

@leogr
Member

leogr commented Nov 18, 2024

/remove-lifecycle stale
