-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bad build ids api changes #310
Changes from all commits
776bbb6
a9fcba2
e9efa10
b3c095e
88e9fb3
0346b32
e4fcfd3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -32,10 +32,12 @@ option ruby_package = "Temporalio::Api::Common::V1"; | |||||
option csharp_namespace = "Temporalio.Api.Common.V1"; | ||||||
|
||||||
import "google/protobuf/duration.proto"; | ||||||
import "google/protobuf/empty.proto"; | ||||||
|
||||||
import "dependencies/gogoproto/gogo.proto"; | ||||||
|
||||||
import "temporal/api/enums/v1/common.proto"; | ||||||
import "temporal/api/enums/v1/reset.proto"; | ||||||
|
||||||
message DataBlob { | ||||||
temporal.api.enums.v1.EncodingType encoding_type = 1; | ||||||
|
@@ -149,3 +151,29 @@ message WorkerVersionCapabilities { | |||||
|
||||||
// Later, may include info like "I can process WASM and/or JS bundles" | ||||||
} | ||||||
|
||||||
message ResetOptions { | ||||||
// Which workflow task to reset to. | ||||||
oneof target { | ||||||
// Resets to event of the first workflow task completed, or if it does not exist, the event after task scheduled. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What if there's no event after task scheduled either? I think this wording is slightly more clear:
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll just say "first workflow task completed or started event", I think that's enough for this purpose. |
||||||
google.protobuf.Empty first_workflow_task = 1; | ||||||
// Resets to event of the last workflow task completed, or if it does not exist, the event after task scheduled. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same comment as w/ first |
||||||
google.protobuf.Empty last_workflow_task = 2; | ||||||
// The id of a specific `WORKFLOW_TASK_COMPLETED`,`WORKFLOW_TASK_TIMED_OUT`, `WORKFLOW_TASK_FAILED`, or | ||||||
// `WORKFLOW_TASK_STARTED` event to reset to. | ||||||
int64 workflow_task_id = 3; | ||||||
// Resets to before the first workflow task processed by this build id. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Which event, specifically, before the first WFT processed by the build ID? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As I understand it, it doesn't matter whether it's the scheduled, started, or completed/failed/timedout event, the effect is the same for all. I'll just say "first workflow task processed by this build id" |
||||||
// If the workflow was not processed by the build id, or the workflow task can't be | ||||||
// determined, no reset will be performed. | ||||||
// Note that by default, this reset is allowed to be to a prior run in a chain of | ||||||
// continue-as-new. | ||||||
string build_id = 4; | ||||||
} | ||||||
|
||||||
// History event reapply options. | ||||||
temporal.api.enums.v1.ResetReapplyType reset_reapply_type = 10; | ||||||
|
||||||
// If true, limit the reset to only within the current run. (Applies to build_id targets and | ||||||
// possibly others in the future.) | ||||||
bool current_run_only = 11; | ||||||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -110,4 +110,6 @@ message WorkflowNotReadyFailure { | |
message NewerBuildExistsFailure { | ||
// The current default compatible build ID which will receive tasks | ||
string default_build_id = 1; | ||
// If this failure is due to the polling build id being marked as a bad build. | ||
bool is_bad_build = 2; | ||
Comment on lines
+113
to
+114
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe change to a reason enum to support other reasons in the future? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. can you imagine anything else (that would fit generally under There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not at the moment, but we didn't see this one coming either. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm. Except without another example it's not clear if these should be exclusive (like an enum) or orthogonal (like other fields). E.g. a build can be both not-default and also a bad build. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm still not sure about this one. Note that these two fields are not exclusive and actually maybe we wouldn't set
At this point it's really the name of the message/error type that's not accurate, but I don't want to change that now There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. TBH, I don't think we need a custom error here. Just use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We already have the custom error type, I'm just adding a field to it There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I know, this is more of a lesson for next time. |
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -95,6 +95,10 @@ message StickyExecutionAttributes { | |
message CompatibleVersionSet { | ||
// All the compatible versions, unordered, except for the last element, which is considered the set "default". | ||
repeated string build_ids = 1; | ||
// Build ids that have been marked as bad. If the last element in build_ids is also in | ||
// bad_build_ids, then no tasks will be dispatched on this version set until the default is | ||
// changed. | ||
repeated string bad_build_ids = 2; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In the internal representation, you can add a new state in the enum, this is fine here though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yup, that was the plan. we could try to transition the string list to a message but... seems more trouble than it's worth. I don't like the repetition but it's the simplest thing here. I guess it could be a parallel array of bools but that's worse There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yup, this is the best option given the current API, let's keep it and avoid breaking existing clients. |
||
} | ||
|
||
// Reachability of tasks for a worker on a single task queue. | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -668,7 +668,7 @@ message ResetWorkflowExecutionRequest { | |||||
int64 workflow_task_finish_event_id = 4; | ||||||
// Used to de-dupe reset requests | ||||||
string request_id = 5; | ||||||
// Reset reapplay(replay) options. | ||||||
// Reset reapply (replay) options. | ||||||
temporal.api.enums.v1.ResetReapplyType reset_reapply_type = 6; | ||||||
} | ||||||
|
||||||
|
@@ -1110,10 +1110,26 @@ message UpdateWorkerBuildIdCompatibilityRequest { | |||||
string secondary_set_build_id = 2; | ||||||
} | ||||||
|
||||||
message MarkBadBuild { | ||||||
// The build id to mark as bad. | ||||||
string bad_build_id = 1; | ||||||
// If promote_build_id is set, then also promote it to be the default of the set. | ||||||
// If it's not set, then no tasks will be dispatched on the task queue set until another | ||||||
// build id is made the default by another call to UpdateWorkerBuildIdCompatibility that | ||||||
// sets/adds a new default for the set. | ||||||
Comment on lines
+1116
to
+1119
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This comment implies that |
||||||
string promote_build_id = 2; | ||||||
} | ||||||
|
||||||
string namespace = 1; | ||||||
// Must be set, the task queue to apply changes to. Because all workers on a given task queue | ||||||
// must have the same set of workflow & activity implementations, there is no reason to specify | ||||||
// a task queue type here. | ||||||
// The task queue to apply changes to. Must be set for most operations. | ||||||
// | ||||||
// Because all workers on a given task queue must have the same set of workflow & activity | ||||||
// implementations, there is no reason to specify a task queue type here. | ||||||
// | ||||||
// When operation is mark_bad_build, task_queue may be left empty and the operation will be | ||||||
// applied to all task queues that have that build id registered. Note that in this case, and if | ||||||
// an error is returned, the operation may have been partially applied to some task queues and | ||||||
// not others. | ||||||
string task_queue = 2; | ||||||
oneof operation { | ||||||
// A new build id. This operation will create a new set which will be the new overall | ||||||
|
@@ -1142,6 +1158,14 @@ message UpdateWorkerBuildIdCompatibilityRequest { | |||||
// declare as compatible. The unusual case of incomplete replication during failover could | ||||||
// also result in a split set, which this operation can repair. | ||||||
MergeSets merge_sets = 7; | ||||||
// Mark a build as "bad" for a particular task queue. This means that: | ||||||
// 1. the build will not receive any more tasks (workflow or activity) | ||||||
// 2. it may not be promoted to the set default anymore | ||||||
// See the MarkBadBuild for more information. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
// Note that this only affects new task dispatch. To also reset workflows with tasks | ||||||
// that were processed by this build, use StartBatchOperation with a query that includes | ||||||
// BuildIds and use a build_id target in ResetOptions. | ||||||
MarkBadBuild mark_bad_build = 8; | ||||||
} | ||||||
} | ||||||
message UpdateWorkerBuildIdCompatibilityResponse { | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this comment should just say what this is for, and the comments for the older things can call out that they are old/deprecated.
Maybe keep the exclusivity mention, but don't need to say it's "New" I think.