Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add build-id-based reset #333

Merged
merged 4 commits into from
Nov 29, 2023
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 9 additions & 4 deletions temporal/api/batch/v1/message.proto
Original file line number Diff line number Diff line change
Expand Up @@ -90,10 +90,15 @@ message BatchOperationDeletion {
// BatchOperationReset sends reset requests to batch workflows.
// Keep the parameter in sync with temporal.api.workflowservice.v1.ResetWorkflowExecutionRequest.
message BatchOperationReset {
// Reset type.
temporal.api.enums.v1.ResetType reset_type = 1;
// History event reapply options.
temporal.api.enums.v1.ResetReapplyType reset_reapply_type = 2;
// The identity of the worker/client.
string identity = 3;

// Describes what to reset to and how (exclusive with `reset_type` and `reset_reapply_type`)
temporal.api.common.v1.ResetOptions options = 4;

// Reset type (deprecated, use `options`).
temporal.api.enums.v1.ResetType reset_type = 1;
// History event reapply options (deprecated, use `options`).
temporal.api.enums.v1.ResetReapplyType reset_reapply_type = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use a oneof if these are exclusive?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think oneof helps here. I'm moving those fields into options and deprecating them. Callers should just migrate to options, but I don't want to break existing code yet.

Maybe I should write "if this is set, the other fields are ignored" on options instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just totally spaced that they said deprecated.

I do like that change in docstring though 👍


}
29 changes: 29 additions & 0 deletions temporal/api/common/v1/message.proto
Original file line number Diff line number Diff line change
Expand Up @@ -32,8 +32,10 @@ option ruby_package = "Temporalio::Api::Common::V1";
option csharp_namespace = "Temporalio.Api.Common.V1";

import "google/protobuf/duration.proto";
import "google/protobuf/empty.proto";

import "temporal/api/enums/v1/common.proto";
import "temporal/api/enums/v1/reset.proto";

message DataBlob {
temporal.api.enums.v1.EncodingType encoding_type = 1;
Expand Down Expand Up @@ -147,3 +149,30 @@ message WorkerVersionCapabilities {

// Later, may include info like "I can process WASM and/or JS bundles"
}

// Describes where and how to reset a workflow, used for batch reset.
message ResetOptions {
// Which workflow task to reset to.
oneof target {
// Resets to the first workflow task completed or started event.
google.protobuf.Empty first_workflow_task = 1;
// Resets to the last workflow task completed or started event.
google.protobuf.Empty last_workflow_task = 2;
// The id of a specific `WORKFLOW_TASK_COMPLETED`,`WORKFLOW_TASK_TIMED_OUT`, `WORKFLOW_TASK_FAILED`, or
// `WORKFLOW_TASK_STARTED` event to reset to.
Comment on lines +162 to +163
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always got confused by this. Why not simply reset to the workflow task scheduled event?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand it, it always does effectively do that. It just branches history at the workflow task started event and adds a workflow task failed, so nothing from that workflow task takes effect. It's just that you can specify any of those types since they are always contiguous in history.

It does seem like it would be cleaner to branch before the workflow task scheduled event and then create a new one, but I guess there was/is probably a good reason to do it this way. I'm not proposing to change this here, I just copied the comment from ResetWorkflowExecutionRequest

int64 workflow_task_id = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are user suppose to use this in batch reset case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're not, this is so we can use the same message for single-workflow reset in the future. I'll add a comment

// Resets to the first workflow task processed by this build id.
// If the workflow was not processed by the build id, or the workflow task can't be
// determined, no reset will be performed.
// Note that by default, this reset is allowed to be to a prior run in a chain of
// continue-as-new.
string build_id = 4;
}

// History event reapply options.
temporal.api.enums.v1.ResetReapplyType reset_reapply_type = 10;

// If true, limit the reset to only within the current run. (Applies to build_id targets and
// possibly others in the future.)
bool current_run_only = 11;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm super late to this review but if this doesn't apply to all targets, maybe it would have been better to make build_id a message and put this in the message.
This would also make build_id extensible in the future.

message BuildId {
  string build_id = 1;
  bool current_run_only = 2;
}

...

BuildId build_id = 4;

}
6 changes: 3 additions & 3 deletions temporal/api/enums/v1/reset.proto
Original file line number Diff line number Diff line change
Expand Up @@ -40,11 +40,11 @@ enum ResetReapplyType {
RESET_REAPPLY_TYPE_NONE = 2;
}

// Reset type options
enum ResetType {
// Reset type options. Deprecated, see temporal.api.common.v1.ResetOptions.
enum ResetType {
RESET_TYPE_UNSPECIFIED = 0;
// Resets to event of the first workflow task completed, or if it does not exist, the event after task scheduled.
RESET_TYPE_FIRST_WORKFLOW_TASK = 1;
// Resets to event of the last workflow task completed, or if it does not exist, the event after task scheduled.
RESET_TYPE_LAST_WORKFLOW_TASK = 2;
}
}
8 changes: 6 additions & 2 deletions temporal/api/workflow/v1/message.proto
Original file line number Diff line number Diff line change
Expand Up @@ -106,9 +106,13 @@ message ResetPoints {
repeated ResetPointInfo points = 1;
}

// ResetPointInfo records the workflow event id that is the first one processed by a given
// build id or binary checksum. A new reset point will be created if either build id or binary
// checksum changes (although in general only one or the other will be used at a time).
message ResetPointInfo {
// A worker binary version identifier, will be deprecated and superseded by a newer concept of
// build_id.
// Worker build id.
string build_id = 7;
// A worker binary version identifier (deprecated).
string binary_checksum = 1;
// The first run ID in the execution chain that was touched by this worker build.
string run_id = 2;
Expand Down