Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle hard time limits on tasks #2290

Closed
nforro opened this issue Dec 20, 2023 · 3 comments
Closed

Handle hard time limits on tasks #2290

nforro opened this issue Dec 20, 2023 · 3 comments
Assignees
Labels
area/general Related to whole service, not a specific part/integration. complexity/single-task Regular task, should be done within days. gain/high This brings a lot of value to (not strictly a lot of) users. impact/high This issue impacts multiple/lot of users. kind/internal Doesn't affect users directly, may be e.g. infrastructure, DB related.

Comments

@nforro
Copy link
Member

nforro commented Dec 20, 2023

A Celery task is killed after a timeout which is 900 seconds. If that happens, job statuses are no longer updated (so they usually stay in pending/running for good) and logs can be lost (for example logs from SRPM build in Sandcastle). Ideally in such case job status should be set to failed and logs preserved, if possible.

@lbarcziova lbarcziova added impact/high This issue impacts multiple/lot of users. area/general Related to whole service, not a specific part/integration. gain/high This brings a lot of value to (not strictly a lot of) users. kind/internal Doesn't affect users directly, may be e.g. infrastructure, DB related. complexity/single-task Regular task, should be done within days. labels Jan 2, 2024
@lbarcziova lbarcziova moved this from new to ready-to-refine in Packit Kanban Board Jan 2, 2024
@lbarcziova lbarcziova moved this from ready-to-refine to refined in Packit Kanban Board Jan 4, 2024
@lbarcziova lbarcziova self-assigned this Jan 11, 2024
@lbarcziova lbarcziova moved this from refined to in-progress in Packit Kanban Board Jan 11, 2024
@lbarcziova
Copy link
Member

So Celery enables to configure also soft_time_limit, that can be set to a little bit lower value than the hard time limit, that raises an exception, so that some cleanup can be done. But as for setting the logs and status, I still think this would be quite complicated to implement as the exception is raised from the task itself so we do not know the stage of the task when the exc happened.
I would be probably more inclined to handling the root causes of this issue separately - as for downstream jobs, this should not be happening anymore after implementing packit/packit#2186 ( CC @nforro ) - looking into the related Sentry issue and its occurrences.

So this now happens for process_message, where setting the status/logs doesn't really apply (when I checked the occurrences, these tasks get blocked in early stages, so the effect in the end is no feedback for user at all for some action - e.g. currently for events from gitlab.gnome.org) . Here I would maybe suggest for process_message to lower the hard time limit as there are no possible actions running and no time-demanding things should happen.

@nforro
Copy link
Member Author

nforro commented Jan 30, 2024

Here I would maybe suggest for process_message to lower the hard time limit as there are no possible actions running and no time-demanding things should happen.

Sounds good to me (if there is a way to configure the limit per task type).

Otherwise I think this issue can be closed in favor of #2323.

lbarcziova added a commit to lbarcziova/packit-service that referenced this issue Jan 31, 2024
This task should be exetuced much faster, it doesn't execute any
user defined actions etc., therefore set the time limit lower
than the global one (900s).
Related to packit#2290
@lbarcziova lbarcziova moved this from in-progress to in-review in Packit Kanban Board Jan 31, 2024
lbarcziova added a commit to lbarcziova/packit-service that referenced this issue Feb 2, 2024
This task should be exetuced much faster, it doesn't execute any
user defined actions etc., therefore set the time limit lower
than the global one (900s).
Related to packit#2290
softwarefactory-project-zuul bot added a commit that referenced this issue Feb 2, 2024
Process message task improvements

Fixes #2323
Related to #2290

RELEASE NOTES BEGIN
N/A
RELEASE NOTES END

Reviewed-by: Maja Massarini
Reviewed-by: Nikola Forró
@lbarcziova
Copy link
Member

I implemented both #2323 and lowering the limits in #2326, therefore closing this one.

@github-project-automation github-project-automation bot moved this from in-review to done in Packit Kanban Board Feb 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/general Related to whole service, not a specific part/integration. complexity/single-task Regular task, should be done within days. gain/high This brings a lot of value to (not strictly a lot of) users. impact/high This issue impacts multiple/lot of users. kind/internal Doesn't affect users directly, may be e.g. infrastructure, DB related.
Projects
Archived in project
Development

No branches or pull requests

2 participants