Infrastructure tooling for automated ContinueAsNew-based worker versioning

kevindweb/temporal-version-drain

Temporal Version Drain

Version Drain is a workflow for dynamically migrating long-running Temporal workflows to a new versioned task queue to reduce risk and toil from backwards incompatibility.

Background

Temporal Versioning

Please read the Temporal worker versioning docs if you are not already familiar with them.

Context

Our team treats developer efficiency and deployment risk as top priorities. To enable quick iteration in Temporal with our month-long workflows, we tried other solutions, each of which had too many failure modes:

  • Patching with if/else is not comprehensive, and developers make mistakes
  • Using replayer.ReplayWorkflowHistoryFromJSONFile in CI is great but leaves a race condition: a new workflow can start after CI passes but before your new code executes
  • Versioning entire workflows leaves toilsome cleanup, especially at our rate of CD iteration (10 commits per day)

The Version Drain workflow has been running successfully in production for 5 months, performing hundreds of live workflow drains. Developers choose to version the task queue when they want to reduce risk. Because the drain workflow is idempotent, we also run it when the version has not been bumped, which avoids the risk of a race condition (e.g. CI goes down for hours in the middle of a deployment and new code gets pushed on top of the old).

QueueDrainWorkflow Logic

  1. Set the new compatible build version in the Temporal server
    • BuildIDOpAddNewIDInNewDefaultSet for new versions
    • BuildIDOpPromoteSet for existing versions (e.g. reverting to an old version)
  2. Use a query to find the running workflows with a specific WorkflowType. Filter out workflows that are already on the new version (maintains idempotency)
  3. Execute ContinuanceWorkflow to ContinueAsNew all running workflows in parallel
  4. Poll until each workflow exits with ContinuedAsNew status
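
The four steps above can be sketched without the Temporal SDK, as a rough illustration rather than the repository's implementation. Every name here (Cluster, Execution, needsDrain, Drain, the maxPolls limit) is a hypothetical stand-in: the function fields on Cluster mark where the real workflow would call the Temporal server, and the build ID comparison stands in for the visibility query filter.

```go
package main

import (
	"errors"
	"sync"
	"time"
)

// Execution is a hypothetical view of one running workflow.
type Execution struct {
	ID      string
	BuildID string // version the workflow currently runs on
}

// Cluster bundles the server calls the drain needs. These fields are
// assumptions made for this sketch, not the Temporal client API.
type Cluster struct {
	KnownBuildID        func(buildID string) bool
	AddNewDefault       func(buildID string) error // cf. BuildIDOpAddNewIDInNewDefaultSet
	PromoteSet          func(buildID string) error // cf. BuildIDOpPromoteSet
	ListRunning         func(workflowType string) ([]Execution, error)
	SignalContinueAsNew func(id string) error
	Status              func(id string) (string, error)
}

// needsDrain keeps only executions that are not yet on the target version.
// Skipping migrated workflows is what makes a repeated drain a no-op.
func needsDrain(running []Execution, target string) []Execution {
	var out []Execution
	for _, e := range running {
		if e.BuildID != target {
			out = append(out, e)
		}
	}
	return out
}

// Drain walks through the four steps for a single workflow type.
func Drain(c Cluster, workflowType, target string, interval time.Duration, maxPolls int) error {
	// Step 1: make the target version the default for the task queue.
	setDefault := c.AddNewDefault
	if c.KnownBuildID(target) {
		setDefault = c.PromoteSet
	}
	if err := setDefault(target); err != nil {
		return err
	}

	// Step 2: find running workflows that still need to move.
	running, err := c.ListRunning(workflowType)
	if err != nil {
		return err
	}
	pending := needsDrain(running, target)

	// Step 3: ask every remaining workflow to ContinueAsNew, in parallel.
	var wg sync.WaitGroup
	errs := make([]error, len(pending))
	for i, e := range pending {
		wg.Add(1)
		go func(i int, id string) {
			defer wg.Done()
			errs[i] = c.SignalContinueAsNew(id)
		}(i, e.ID)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err
		}
	}

	// Step 4: poll until every signaled run has closed as ContinuedAsNew.
	for poll := 0; len(pending) > 0; poll++ {
		if poll == maxPolls {
			return errors.New("drain timed out")
		}
		var still []Execution
		for _, e := range pending {
			status, err := c.Status(e.ID)
			if err != nil {
				return err
			}
			if status != "ContinuedAsNew" {
				still = append(still, e)
			}
		}
		pending = still
		if len(pending) > 0 {
			time.Sleep(interval)
		}
	}
	return nil
}
```

Running Drain a second time with the same target finds nothing left to signal, which is the idempotency property described in step 2.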

Requirements

Your system must meet the following requirements before invoking QueueDrainWorkflow:

  • The WorkflowType must be able to receive ContinueAsNewSignal and checkpoint its state so that the next run can resume from it
  • The drain workflow must be called separately for each WorkflowType you want to version

See the ExampleContinueWorkflow and worker to get started.
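
As a rough, dependency-free illustration of the first requirement (not the repository's ExampleContinueWorkflow), the sketch below shows the checkpoint-and-continue pattern. Checkpoint, RunUntilSignal, and the integer steps are hypothetical. In a real Temporal Go workflow the channel would typically come from workflow.GetSignalChannel, and the workflow would return workflow.NewContinueAsNewError with the checkpoint as its input.

```go
package main

// Checkpoint is the hypothetical state handed to the next run.
type Checkpoint struct {
	NextStep int // index of the first step that has not run yet
	Total    int // result accumulated so far
}

// RunUntilSignal processes steps starting at the checkpoint. Before each
// step it checks the signal channel without blocking. If a signal is
// waiting, it stops at a clean boundary and returns the checkpoint with
// continueAsNew set to true so the caller can start a fresh run from it.
func RunUntilSignal(cp Checkpoint, steps []int, continueSignal <-chan struct{}) (Checkpoint, bool) {
	for cp.NextStep < len(steps) {
		select {
		case <-continueSignal:
			return cp, true
		default:
			// No signal pending: carry on with the next step.
		}
		cp.Total += steps[cp.NextStep]
		cp.NextStep++
	}
	return cp, false
}
```

The important design point is that the signal is only honored between steps, so the checkpoint never captures a half-finished step.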

Benefits

  • History size is clipped because the mechanism uses ContinueAsNew, which helps workflow performance
  • There are never more than 2 versions running in production concurrently (and only during a migration), so cognitive complexity is very low compared to other versioning solutions
