Add timeout for node execution #66
Codecov Report
@@ Coverage Diff @@
## main #66 +/- ##
============================================
+ Coverage 86.93% 91.68% +4.74%
- Complexity 163 167 +4
============================================
Files 13 13
Lines 574 565 -9
Branches 75 78 +3
============================================
+ Hits 499 518 +19
+ Misses 52 24 -28
Partials 23 23
Signed-off-by: Daniel Widdis <[email protected]>
Looks good overall
src/main/java/org/opensearch/flowframework/workflow/WorkflowProcessSorter.java
src/test/java/org/opensearch/flowframework/FlowFrameworkPluginTests.java
I am aware that we are trying to model node timeouts on REST handler timeouts, but we should think about the right place for them rather than taking them as input from the user. For now, this change looks good.
Thanks Dan for the changes. LGTM, though I'm curious whether the sum of all the node timeouts should be the timeout value of the provision API. I think it could be an option, but perhaps it might be better to have the workflow execution run after the workflow_id response is sent back to the user.
* Add timeout for node execution
  Signed-off-by: Daniel Widdis <[email protected]>
* Properly implement delays using OpenSearch ThreadPool
  Signed-off-by: Daniel Widdis <[email protected]>
* Add coverage for plugin class, make test threshold dynamic
  Signed-off-by: Daniel Widdis <[email protected]>
* Tests don't like singletons
  Signed-off-by: Daniel Widdis <[email protected]>
* More thorough ProcessNode testing
  Signed-off-by: Daniel Widdis <[email protected]>
* Util method for timeout parsing
  Signed-off-by: Daniel Widdis <[email protected]>
---------
Signed-off-by: Daniel Widdis <[email protected]>
(cherry picked from commit 28326bd)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
I'll move them to the step JSON definition file described in #67.
Add timeout for node execution (#66)
* Add timeout for node execution
* Properly implement delays using OpenSearch ThreadPool
* Add coverage for plugin class, make test threshold dynamic
* Tests don't like singletons
* More thorough ProcessNode testing
* Util method for timeout parsing
---------
(cherry picked from commit 28326bd)
Signed-off-by: Daniel Widdis <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
Adds a timeout setting for each node's processing of the Workflow step's execute() method in the ProcessNode.
Some steps should complete quickly (e.g., create index) while others may take more time to complete (upload/deploy model).
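To illustrate the idea of a per-node timeout, here is a minimal sketch that uses only the JDK. It is not the code in this PR, which implements its delays with the OpenSearch ThreadPool; the class NodeTimeoutSketch and the helpers executeWithTimeout, timedOut, and slowStep are hypothetical names chosen for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical illustration only; the PR itself relies on the OpenSearch ThreadPool.
class NodeTimeoutSketch {

    // Runs a step asynchronously. The returned future completes exceptionally
    // with a TimeoutException if the step has not finished within timeoutMillis.
    static <T> CompletableFuture<T> executeWithTimeout(Supplier<T> step, long timeoutMillis) {
        return CompletableFuture.supplyAsync(step).orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Waits for the future and reports whether it failed because of the timeout.
    // Any other failure is reported as "not timed out".
    static boolean timedOut(CompletableFuture<?> future) {
        try {
            future.join();
            return false;
        } catch (CompletionException e) {
            return e.getCause() instanceof TimeoutException;
        }
    }

    // Stand-in for a slow step such as a model deployment (assumed for the example).
    static String slowStep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "deployed";
    }

    public static void main(String[] args) {
        // A quick step with a generous timeout, then a slow step with a short one.
        System.out.println(timedOut(executeWithTimeout(() -> "index created", 1_000)));
        System.out.println(timedOut(executeWithTimeout(() -> slowStep(500), 50)));
    }
}
```

Note that orTimeout only fails the future; it does not interrupt the underlying work, so a real implementation would still need to decide how to cancel or clean up a step that overruns.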
This is presently implemented in the user template (see #45, where I proposed this and nobody objected), with a default of 10s if the timeout is omitted.
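The default-when-omitted behavior could look roughly like the sketch below. It is an assumption-laden illustration, not the utility method added in this PR: the class TimeoutParsingSketch, the method parseTimeout, and the accepted suffixes (ms, s, m) are all hypothetical, and only the 10s default comes from the description above.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical illustration of parsing a per-node timeout from a template value.
class TimeoutParsingSketch {

    // Default taken from the PR description: 10s if the template omits a timeout.
    static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(10);

    // Parses values such as "500ms", "10s", or "2m" (suffix set is an assumption).
    // Returns the default when the value is missing or blank.
    static Duration parseTimeout(String raw) {
        if (raw == null || raw.isBlank()) {
            return DEFAULT_TIMEOUT;
        }
        String value = raw.trim().toLowerCase(Locale.ROOT);
        // Check "ms" before "s" so that "500ms" is not read as seconds.
        if (value.endsWith("ms")) {
            return Duration.ofMillis(number(value, 2));
        }
        if (value.endsWith("s")) {
            return Duration.ofSeconds(number(value, 1));
        }
        if (value.endsWith("m")) {
            return Duration.ofMinutes(number(value, 1));
        }
        throw new IllegalArgumentException("Unsupported timeout value: " + raw);
    }

    // Extracts the non-negative numeric part that precedes the unit suffix.
    private static long number(String value, int suffixLength) {
        long n = Long.parseLong(value.substring(0, value.length() - suffixLength).trim());
        if (n < 0) {
            throw new IllegalArgumentException("Timeout must not be negative: " + value);
        }
        return n;
    }
}
```

With this sketch, a node that omits the timeout gets Duration.ofSeconds(10), while a long-running node could declare something like "2m" in the template.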
I considered a global timeout, but decided against it as the "sum of node timeouts" effectively enforces that. I also considered a timeout by workflow type, tied to the default timeouts for their implemented APIs.
Other changes needed along the way:
Issues Resolved
Fixes #42 properly
Fixes #45
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.