Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplifications for DependencyManager #13533

Merged
merged 1 commit into from
Jul 27, 2023
Merged

Conversation

AlanCoding
Copy link
Member

@AlanCoding AlanCoding commented Feb 7, 2023

SUMMARY

I want to pull in some of the core changes from #13061 which isn't viable due to other reasons.

EDITed for changes that were done since opening

Some principles here...

  • virtually all of the logic for spawning an update-on-launch project update is shared between inventory updates & jobs, as this feature applies to both of them in the same way. This replaces what seemed like less-clear code sharing before.
  • only jobs generate an inventory update, so this is left as boutique logic in get_dep_for_job
  • A few operations like add_dependencies is generic to all dependency spawning, so it is moved further up in the stack which eliminates redundancy.
  • The utility method should_update_again answers whether a new update is needed due to latest update and cache timeout rules. It applies to update-on-launch regardless of the type (project vs inventory update), so the method is consolidated to reflect this. The old code strongly appeared to hint at different rules for different job types, but this was never the case.
  • Projects are cached along with inventory sources. Caching is for performance, but this is done here mainly for consistency.
  • Stop tracking created dependencies vs. all dependencies, as this is covered by excluding tasks with dependencies_processed=True in the main loop
ISSUE TYPE
  • Bug, Docs Fix or other nominal change
COMPONENT NAME
  • API

@AlanCoding AlanCoding marked this pull request as ready for review February 9, 2023 14:20
@AlanCoding
Copy link
Member Author

Surprisingly, this passes all the integration tests.

@@ -323,21 +303,13 @@ def should_update_inventory_source(self, job, latest_inventory_update):
return False

timeout_seconds = timedelta(seconds=latest_inventory_update.inventory_source.update_cache_timeout)
if (latest_inventory_update.finished + timeout_seconds) < now:
if (latest_inventory_update.finished + timeout_seconds) < job.created:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not good, need to change this or otherwise think about more.

@@ -393,7 +354,7 @@ def gen_dep_for_job(self, task):
continue
if not inventory_source.update_on_launch:
continue
latest_inventory_update = self.get_latest_inventory_update(inventory_source)
latest_inventory_update = InventoryUpdate.objects.filter(inventory_source=inventory_source).order_by("-created").first()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably want inventory_source.inventory_updates

for task in undeped_tasks:
if task.dependencies_processed:
continue
processed_ids.append(task.id)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fosterseth I added this because we call generate_dependencies an extra time for in case of deps-of-deps, passing in deps. In most cases those deps do not need to have their own deps generated. I will trust my optimizations to create_unified_job, and if a dependent project update has dependencies_processed as False then obviously we would like to exclude it from this loop for performance and to avoid errors from re-processing.

@AlanCoding AlanCoding changed the title Add in easy simplifications into DependencyManager Simplifications for DependencyManager Feb 18, 2023
If it has never updated, we need to update
If there is already an update in progress then we do not need to a new create one
If the last update failed, we always need to try and update again
If current time is more than cache_timeout after last update, then we need a new one
'''
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can summarize the diff around here by saying that we combined 2 methods into 1. That is:

should_update_inventory_source & should_update_related_project --> should_update_again

the method signature changed slightly from (self, job, latest_project_update/latest_inventory_update) to (update, cache_timeout). Maybe I should call it latest_update instead of update for clarity, and the fact that it's the same as the last parameter of the old method. This can be either a project update or a inventory update. And now that I think about it, it might be better to infer cache_timeout inside of this method, getting us down to 1 argument, because that is truly the only thing we need.

Remove a few practices no longer needed
  do not backdate created time, let it auto-set

Remove duplicate checks for update-on-launch

Cache projects as well as inventory sources, combine logic

Combine logic for creating project update

Update dependency manager docs slightly

Get rid of process_tasks we did not need
Copy link
Member

@fosterseth fosterseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 simplification

@AlanCoding AlanCoding merged commit ab5cc2e into ansible:devel Jul 27, 2023
14 checks passed
djyasin pushed a commit to djyasin/awx that referenced this pull request Sep 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants