Add a flowable EmbeddedSubflow allowing to run subflow's tasks within the same parent execution #6518

Open
anna-geller opened this issue Dec 18, 2024 · 1 comment · May be fixed by #6625
Labels
area/backend (Needs backend code changes), area/frontend (Needs frontend code changes), enhancement (New feature or request)

Comments

@anna-geller
Member

anna-geller commented Dec 18, 2024

Kestra initially supported Templates, which provided functionality similar to Subflows but lacked parameterization. This overlap caused frequent user confusion:
"What’s the difference between these? When should I use one over the other?"

Templates had several limitations:

  • Security: Sensitive task outputs, such as OAuth tokens, were accessible to every calling flow.
  • Privacy: All task outputs were exposed to consumers, whereas Subflows allow explicit control over what is returned.
  • No versioning: Templates lacked revision history, unlike Subflows.
  • Skewed topology view: Tasks from Templates appeared as part of the parent flow, obscuring task origins.
  • Task ID conflicts: Direct copying of tasks caused ID collisions in flows.
  • No dependency tracking: Templates did not track which flows were using them, functioning more as static copies than reusable modules.

For these reasons, Templates were [deprecated in Kestra v0.11](https://kestra.io/docs/migration-guide/0.11.0/templates) in August 2023. Subflows became the recommended approach for reusable workflow logic.

Remaining Gaps in Subflows

While Subflows addressed many limitations, certain gaps remained:

  1. Extra executions: Subflows create separate executions, increasing complexity for some users.
  2. Restarts: Restarting a parent flow after a Subflow task fails creates a new execution for the entire Subflow instead of restarting only the failed task.
  3. Billing/Tracking: Metrics tracking becomes difficult when Subflows span multiple namespaces.
  4. RBAC Conflicts: Namespace-based secrets in Subflows can lead to permission issues when a parent flow accesses child flow secrets.

Proposal: EmbeddedSubflow

EmbeddedSubflow addresses these gaps by embedding a Subflow's tasks directly into the parent flow at runtime, avoiding the need for separate executions while resolving the challenges described above.

Example Use Case

Flow f1 (reusable Subflow):

```yaml
id: f1
namespace: n1

tasks:
  - id: t1
    type: io.kestra.plugin.core.debug.Return
    format: using vars from n1 {{ namespace.myvar }}

  - id: t2
    type: io.kestra.plugin.core.debug.Return
    format: using secrets from n1 {{ secret('GCP_CREDS') }}

  - id: t3
    type: io.kestra.plugin.core.debug.Return
    format: value3

outputs:
  - id: myoutput
    type: STRING
    value: "{{ outputs.t3.value }}"
```

Flow f2 embedding f1 using EmbeddedSubflow:

```yaml
id: f2
namespace: n2

tasks:
  - id: first_task
    type: io.kestra.plugin.core.log.Log
    message: value1

  - id: group
    type: io.kestra.plugin.core.flow.EmbeddedSubflow
    namespace: n1
    flowId: f1

  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.group.myoutput }}"
```

In the topology, the tasks from f1 appear directly in f2, with their IDs prefixed by the EmbeddedSubflow task ID (group) for clarity. At runtime, they resolve secrets, KV pairs, and variables from f2's namespace (n2).
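For comparison, here is a minimal sketch of how a parent flow would call f1 today with the existing Subflow task. The flow ID f2_subflow and the task ID call_f1 are made up for illustration, and the properties shown are assumed to match the current Subflow task, so check them against the plugin documentation. This variant creates a separate execution of f1 in n1, which is what EmbeddedSubflow is meant to avoid.

```yaml
id: f2_subflow            # hypothetical flow ID, for illustration only
namespace: n2

tasks:
  - id: call_f1           # hypothetical task ID
    type: io.kestra.plugin.core.flow.Subflow
    namespace: n1
    flowId: f1
    wait: true            # block until the child execution finishes
    transmitFailed: true  # fail this task if the child execution fails

  - id: log
    type: io.kestra.plugin.core.log.Log
    # Flow-level outputs of the child execution are nested under `outputs`
    message: "{{ outputs.call_f1.outputs.myoutput }}"
```

Note the extra `outputs` level in the expression: with the Subflow task, the child's flow-level outputs belong to a different execution, whereas the proposal above reads them as `outputs.group.myoutput`.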


Requirements for EmbeddedSubflow

  1. No Task ID Collisions
    IDs of embedded tasks are prefixed with the EmbeddedSubflow task ID in the parent flow to avoid conflicts.

  2. No New Executions
    EmbeddedSubflows run within the parent flow's execution context, eliminating the need for separate executions.

  3. Restart Behavior
    Restarting a parent flow restarts only the failed tasks within the EmbeddedSubflow rather than rerunning all of its tasks from scratch.

  4. Privacy and Security
    Task-level outputs are not returned directly. Only explicitly defined flow-level outputs are available to the parent.

  5. Versioning
    Allow specifying which revision of the Subflow to use, with the default set to latest.

  6. Topology Integration
    Tasks embedded from a Subflow are visually linked in the topology, allowing navigation to their source flow while keeping them uneditable in the parent.

  7. Namespace Context

    • Metrics emitted by tasks in the EmbeddedSubflow count toward the parent flow's namespace.
    • Secrets, KV pairs, and variables are resolved in the context of the parent namespace unless explicitly defined otherwise.
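From the user's side, the requirements above might look roughly like the following. This is a sketch under assumptions, not a settled design: the flow ID f2_pinned is made up, the `revision` property is borrowed from the existing Subflow task and is only assumed for EmbeddedSubflow, and the `group.t1` form of prefixed IDs is a placeholder because this issue does not fix the separator.

```yaml
id: f2_pinned             # hypothetical flow ID, for illustration only
namespace: n2

tasks:
  - id: group
    type: io.kestra.plugin.core.flow.EmbeddedSubflow
    namespace: n1
    flowId: f1
    revision: 3           # assumed property; omit to use the latest revision (requirement 5)

  - id: log
    type: io.kestra.plugin.core.log.Log
    # Only the flow-level output declared in f1 is visible (requirement 4);
    # task-level outputs of t1, t2, and t3 would not be expected to resolve here.
    message: "{{ outputs.group.myoutput }}"

# Embedded task IDs as they might appear in the single f2_pinned execution
# (requirements 1 and 2), using a placeholder "." separator:
#   group.t1, group.t2, group.t3
```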

Additional Considerations

  1. Documentation

    • Clearly outline that outputs must be explicitly defined in the Subflow for access in the parent.
    • Provide best practices for avoiding deeply nested EmbeddedSubflows to prevent excessive task generation.
  2. Performance Constraints

    • Nesting depth is unbounded, so users must be mindful of execution context limits.
    • Future iterations may introduce parameters like maxDepth to mitigate performance issues.
  3. Templates Removal
    Templates will be removed entirely from the codebase, and users must migrate to Subflows or EmbeddedSubflows.
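If a depth limit is introduced, it might be set per task, along the lines of the fragment below. The `maxDepth` name comes from the note above, but its placement on the task and its exact semantics are guesses, not an agreed design.

```yaml
tasks:
  - id: group
    type: io.kestra.plugin.core.flow.EmbeddedSubflow
    namespace: n1
    flowId: f1
    # Hypothetical property: reject the flow if f1, and anything f1 embeds,
    # nests EmbeddedSubflow tasks more than 2 levels deep.
    maxDepth: 2
```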

@anna-geller anna-geller added the area/backend, enhancement, and area/frontend labels Dec 18, 2024
@github-project-automation github-project-automation bot moved this to Backlog in Issues Dec 18, 2024
@anna-geller anna-geller changed the title from "Add a flowable TaskGroup allowing to include subflow's tasks within the same parent execution without copy" to "Add a flowable TaskGroup allowing to include subflow's tasks within the same parent execution" Dec 18, 2024
@loicmathieu loicmathieu self-assigned this Jan 2, 2025
loicmathieu added a commit that referenced this issue Jan 2, 2025
Adds an EmbeddedFlow that allow to embed subflow tasks into a parent tasks.

Fixes #6518
@loicmathieu loicmathieu linked a pull request Jan 2, 2025 that will close this issue
loicmathieu added a commit that referenced this issue Jan 6, 2025
Adds an EmbeddedFlow that allow to embed subflow tasks into a parent tasks.

Fixes #6518
loicmathieu added a commit that referenced this issue Jan 7, 2025
Adds an EmbeddedFlow that allow to embed subflow tasks into a parent tasks.

Fixes #6518
@anna-geller
Member Author

Embedded Subflow Requirements

  1. Prevent Task ID Collisions

    • Investigate using task run values or a modified run context to avoid conflicts.
  2. Execution Context Isolation

    • Ensure embedded subflows execute within the parent flow's context without creating a new execution.
    • Embedded subflows can access flow-level inputs, task outputs, secrets, KV pairs, and namespace variables from the parent flow and its namespace.
  3. Input Handling

    • Implement explicit input mapping for embedded subflows.
  4. Loop Prevention

    • Add validation to ensure embedded subflows cannot embed themselves.
  5. Namespace-Based Access Control

    • Reinforce checks to ensure embedded subflows respect allowed namespace RBAC permissions.
  6. Tenant ID Isolation

    • Investigate the need to pass tenant IDs explicitly for subflows fetched without execution context.
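Two of these requirements, explicit input mapping and loop prevention, can be illustrated with flow definitions. The sketch below is hypothetical throughout: the flow IDs are made up, and the `inputs` property on EmbeddedSubflow is assumed to mirror the one on the existing Subflow task.

```yaml
# Child flow declaring the input it expects
id: f1_with_input         # hypothetical flow ID
namespace: n1

inputs:
  - id: bucket
    type: STRING

tasks:
  - id: t1
    type: io.kestra.plugin.core.debug.Return
    format: "reading from {{ inputs.bucket }}"
---
# Parent flow mapping a value onto that input (requirement 3)
id: f2_with_mapping       # hypothetical flow ID
namespace: n2

tasks:
  - id: group
    type: io.kestra.plugin.core.flow.EmbeddedSubflow
    namespace: n1
    flowId: f1_with_input
    inputs:               # assumed property, mirroring the Subflow task
      bucket: my-bucket
---
# A flow that embeds itself; validation should reject it (requirement 4)
id: loop                  # hypothetical flow ID
namespace: n2

tasks:
  - id: self
    type: io.kestra.plugin.core.flow.EmbeddedSubflow
    namespace: n2
    flowId: loop
```

Indirect cycles (a flow embedding another flow that embeds the first) are not covered by the self-embedding check as worded above and would need a separate decision.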

@anna-geller anna-geller changed the title from "Add a flowable TaskGroup allowing to include subflow's tasks within the same parent execution" to "Add a flowable EmbeddedSubflow allowing to include subflow's tasks within the same parent execution" Jan 7, 2025
@anna-geller anna-geller changed the title from "Add a flowable EmbeddedSubflow allowing to include subflow's tasks within the same parent execution" to "Add a flowable EmbeddedSubflow allowing to run subflow's tasks within the same parent execution" Jan 7, 2025
loicmathieu added a commit that referenced this issue Jan 8, 2025
Adds an EmbeddedFlow that allow to embed subflow tasks into a parent tasks.

Fixes #6518