TaskProxy.jobs: refactor job config #6442

Open
oliver-sanders opened this issue Oct 23, 2024 · 0 comments
Labels
code refactor Large code refactors
@oliver-sanders
Member

Separate the task-specific and job-specific data into distinct objects and ensure that each is persisted correctly.

The TaskProxy object contains some data that is specific to the task, e.g.:

  • tdef - The task definition.
  • tokens - The parsed task ID.
  • clock_trigger_time - External wallclock dependency.
  • state - Aggregate task state.

However, it also contains some data that is specific to particular job submissions:

  • platform - The platform that the current/next job submission is using / will use.
  • submit_num - The submit num of the current / next job.
  • is_manual_submit - Were any(?) of the submissions the result of a manual trigger?
  • local_job_file_path - Path of the current submission.

Additionally, the task proxy has two separate objects containing specific attributes of job submissions:

  • jobs - A (potentially incomplete) list of selected job configuration settings.
  • mode_settings - Transient job state pertaining to the current/next submission.

This is a bit of a jumble, and code can easily end up relying on interfaces which are not stable for the intended purpose (e.g. #6326).

Proposal:

  • Create a Job object to take over from TaskProxy.jobs[item] and TaskProxy.mode_settings. Use an efficient store, e.g. a class with __slots__ or a protobuf serialisation.
  • Migrate other job-specific TaskProxy fields into the job.
  • Ensure that these fields are either repopulated from the DB on task-re-spawn/workflow-reload/workflow-restart OR recomputed as required.
  • Manage job data lifecycle for efficiency (i.e. don't hang onto data we no longer need).
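
As a rough illustration of the first three points, here is a minimal sketch of what a `__slots__`-based job record might look like. The `Job` class, its `to_row` / `from_row` helpers and the example values are hypothetical and are not existing cylc-flow interfaces; the field names are simply the job-specific TaskProxy attributes listed above, and the dict "row" stands in for whatever DB representation is eventually chosen.

```python
from typing import Any, Dict, Optional


class Job:
    """Hypothetical per-submission record (not an existing cylc-flow class)."""

    # __slots__ avoids a per-instance __dict__, which keeps each record
    # small and rejects attributes that are not part of the interface.
    __slots__ = (
        'submit_num',
        'platform',
        'is_manual_submit',
        'local_job_file_path',
    )

    def __init__(
        self,
        submit_num: int,
        platform: Optional[Dict[str, Any]] = None,
        is_manual_submit: bool = False,
        local_job_file_path: Optional[str] = None,
    ) -> None:
        self.submit_num = submit_num
        self.platform = platform
        self.is_manual_submit = is_manual_submit
        self.local_job_file_path = local_job_file_path

    def to_row(self) -> Dict[str, Any]:
        """Flatten to a plain dict, standing in for a DB row."""
        return {name: getattr(self, name) for name in self.__slots__}

    @classmethod
    def from_row(cls, row: Dict[str, Any]) -> 'Job':
        """Rebuild a Job from a stored row, e.g. on restart or reload."""
        return cls(**row)


# Round trip: what is written out is enough to rebuild the same record.
job = Job(submit_num=1, platform={'name': 'localhost'}, is_manual_submit=True)
restored = Job.from_row(job.to_row())
```

Because every field goes through `to_row` / `from_row`, anything that cannot be restored this way would have to be recomputed explicitly, which makes the "repopulated OR recomputed" split visible in one place.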
@oliver-sanders oliver-sanders added this to the some-day milestone Oct 23, 2024
@oliver-sanders oliver-sanders added the code refactor Large code refactors label Oct 23, 2024