REF: Change how costs are reduced #327

Merged: 3 commits, Jul 31, 2024

Conversation

carterbox
Contributor

Purpose

Try to avoid host synchronization during cost function computation.

Approach

Store objective function values in a GPU array until the end of the epoch.
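
A minimal sketch of this approach, not the code from this PR: per-batch costs are written into a preallocated buffer and converted to a host-side number only once per epoch. NumPy is used here as a stand-in so the sketch runs without a GPU; the assumption is that the same pattern applies with a CuPy array, where calling `float()` or `.item()` on a device value inside the loop would force a host synchronization. The function `run_epoch` and the mean-square "cost" are hypothetical.

```python
import numpy as np  # stand-in for a GPU array library such as CuPy (assumption)


def run_epoch(batches):
    """Accumulate one cost per batch and reduce once at the end of the epoch."""
    # Preallocated buffer; with a GPU array library this would live on the device.
    costs = np.empty(len(batches), dtype=np.float32)
    for i, batch in enumerate(batches):
        # Hypothetical cost: mean of squares. The result is stored directly in
        # the buffer, so no float(...) or .item() call is made inside the loop.
        costs[i] = np.mean(np.square(batch))
    # Single conversion to a Python float at the end of the epoch.
    return float(costs.mean())


batches = [np.full(8, 1.0), np.full(8, 2.0), np.full(8, 3.0)]
epoch_cost = run_epoch(batches)
```

The design choice is to pay for one device-to-host transfer per epoch instead of one per batch.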

Pre-Merge Checklists

Submitter

  • Write a helpfully descriptive pull request title.
  • Organize changes into logically grouped commits with descriptive commit messages.
  • Document all new functions.
  • Click 'details' on the readthedocs check to view the updated docs.
  • Write tests for new functions or explain why they are not needed.
  • Address any complaints from pep8speaks.

Reviewer

  • Actually read all of the code.
  • Run the new code yourself; the included tests should make this easy.
  • Write a summary of the changes as you understand them.
  • Thank the submitter.

@pep8speaks

pep8speaks commented Jul 22, 2024

Hello @carterbox! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 291:81: E501 line too long (90 > 80 characters)
Line 321:81: E501 line too long (90 > 80 characters)

Line 105:81: E501 line too long (85 > 80 characters)
Line 185:81: E501 line too long (82 > 80 characters)

Comment last updated at 2024-07-29 15:19:00 UTC

@carterbox carterbox requested a review from a4894z July 22, 2024 16:28
Contributor

I had to look up what itertools.chain() does. It looks like the point of this change is to not reduce the costs (i.e. compute their mean) here in ptycho.py, because we need the unreduced values for momentum acceleration in rPIE or LSTSQ?

And by using itertools.chain() here, we keep track of the costs vs. scan position until any momentum acceleration computation is completed, and then we can take the mean in LSTSQ or rPIE?
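
For reference, a small sketch of what `itertools.chain()` does in general: it concatenates iterables lazily without reducing them. The variable names and values are made up for illustration, and how the PR actually calls `chain()` is not shown in this conversation.

```python
import itertools

# Hypothetical per-worker results: each worker returns a list of costs.
costs_worker0 = [0.9, 0.8]
costs_worker1 = [0.7, 0.6, 0.5]

# chain() yields every item of the first iterable, then every item of the
# second; the individual values are preserved, nothing is averaged.
flat = list(itertools.chain(costs_worker0, costs_worker1))

# chain.from_iterable() does the same for an iterable of iterables.
nested = [costs_worker0, costs_worker1]
flat_again = list(itertools.chain.from_iterable(nested))
```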

Contributor Author

We want to track a unique objective value for each GPU because of momentum acceleration. For example, if you do a reconstruction with 4 GPUs for 100 epochs, the resulting costs matrix should be shaped (100, 4).

Locally (on each GPU), the convergence behavior can differ from that of the other GPUs, so we don't want to use the global mean cost to make decisions about momentum acceleration.

The FIXME exists because I think the momentum acceleration method is still using the global mean to make decisions about momentum acceleration. I should just fix that before merging this PR.
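
A rough illustration of the (100, 4) costs layout described above and of why the global mean can hide local behavior. The cost values are fabricated for the example, `fake_epoch_costs` is a hypothetical helper, and the "did the cost go down" check is only a placeholder, not the momentum acceleration criterion used by rPIE or LSTSQ.

```python
from statistics import mean

NUM_EPOCHS = 100
NUM_GPUS = 4


def fake_epoch_costs(epoch):
    """Return one made-up cost per GPU; the last GPU slowly gets worse."""
    improving = [1.0 / (epoch + 1)] * (NUM_GPUS - 1)
    worsening = [0.01 * epoch]
    return improving + worsening


# One row per epoch, one column per GPU.
costs = [fake_epoch_costs(epoch) for epoch in range(NUM_EPOCHS)]
shape = (len(costs), len(costs[0]))

# Compare epoch 2 with epoch 1, once per GPU and once with the global mean.
per_gpu_improved = [now < before for before, now in zip(costs[1], costs[2])]
global_improved = mean(costs[2]) < mean(costs[1])
```

In this made-up data the global mean improves between the two epochs even though the last GPU's cost increases, which is the situation a per-GPU decision is meant to catch.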

@a4894z a4894z self-requested a review July 24, 2024 15:18
@carterbox carterbox marked this pull request as draft July 24, 2024 15:56
@carterbox carterbox marked this pull request as ready for review July 29, 2024 18:56
@a4894z a4894z merged commit 519a331 into AdvancedPhotonSource:main Jul 31, 2024
7 checks passed
@carterbox carterbox deleted the costs-reduce branch July 31, 2024 18:41
3 participants