
[BUG] Classification - forced to calculate all metrics #953

Open
Linardos opened this issue Oct 2, 2024 · 4 comments

Linardos (Contributor) commented Oct 2, 2024

Describe the bug

The current codebase forces the calculation of all possible classification metrics, even when I explicitly request just one metric in my config file.

To Reproduce

Steps to reproduce the behavior:

  1. In config.yaml, set problem_type to "classification"
  2. Define your own metrics in config.yaml
  3. Run GaNDLF
  4. You will likely see all possible metrics being calculated, even ones you did not define in the config
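
For illustration, a minimal sketch of the config shape described above. problem_type is the key named in the steps; the assumption that user metrics live under a top-level metrics key, and the choice of "accuracy", are just placeholders:

```yaml
# config.yaml (minimal sketch; key layout and metric choice are examples)
problem_type: classification
metrics:
  - accuracy   # the only metric explicitly requested
```

Even with a single entry under metrics, the logs end up showing the full set of per-cohort classification metrics.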

Expected behavior

I expected only the metrics I defined to be calculated.

Environment information

GaNDLF version: current master (0.1.0)

Solution

In my case, I solved it by setting the calculate_overall_metrics parameter to False in both scripts where it appears (training_loop and forward_pass; see https://github.com/search?q=repo%3Amlcommons%2FGaNDLF%20calculate_overall_metrics&type=code).
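
For anyone applying the same stop-gap, the change amounts to hard-coding the flag where it is defined in those two files. The sketch below is paraphrased from memory, not verbatim from the repo; see the code search link above for the exact locations:

```python
params = {"problem_type": "classification"}  # stand-in for the parsed config

# Original behavior (paraphrased): the flag is derived from the problem
# type, so classification and regression runs always compute the
# per-cohort ("overall") metrics, regardless of the metrics in the config.
calculate_overall_metrics = params["problem_type"] in ("classification", "regression")

# Stop-gap: hard-code the flag to False at both sites.
calculate_overall_metrics = False
```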

sarthakpati (Collaborator) commented

So, there are 2 types of metrics:

  1. Per-subject metrics (all of the metrics here [ref]) - these are calculated at the batch level [ref]
  2. Per-cohort metrics [ref] - these are calculated at the epoch level (since we need all targets and predictions) [ref]

Currently, only the per-subject metrics are user-controlled, and the per-cohort metrics are always calculated. I keep hearing that for regression and classification problems, the per-subject metrics are kind of meaningless (since we always end up calculating averages over the epoch). And from a statistical perspective, the per-cohort metrics are much more important.
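
To make the distinction concrete, here is a minimal sketch of the two computation patterns (illustrative only, not GaNDLF's actual code):

```python
import numpy as np

# Toy stand-in for one epoch of (predictions, targets) batches.
loader = [([1, 0, 1], [1, 1, 1]), ([0, 0], [0, 1])]

def accuracy(preds, targets):
    return float(np.mean(np.asarray(preds) == np.asarray(targets)))

# Per-subject (batch-level): one score per batch, averaged over the epoch.
batch_scores = [accuracy(p, t) for p, t in loader]
per_subject_avg = float(np.mean(batch_scores))  # (2/3 + 1/2) / 2 ~= 0.583

# Per-cohort (epoch-level): accumulate all targets/predictions, score once.
all_preds = [x for preds, _ in loader for x in preds]
all_targets = [x for _, targets in loader for x in targets]
per_cohort = accuracy(all_preds, all_targets)   # 3/5 = 0.6
```

The two numbers differ whenever batch sizes are unequal, and metrics like AUC cannot be meaningfully averaged per batch at all, which is why the per-cohort pass needs all targets and predictions.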

That being said, how would you propose we update this so that this can be accommodated?

Linardos (Contributor, Author) commented Oct 2, 2024

Hmm, I understand that per-cohort metrics are considered more important, but even so I would leave the flexibility to the user, in case they want to run sanity checks on per-subject metrics or other preliminary experiments before moving on to the per-cohort ones. In my case, the per-cohort metrics were triggering a bug in the FeTS Challenge code that remains to be traced, so I needed to disable them, at least for now.

I see two options:

  • Give the user the flexibility, and add a WARNING message presenting what you explained here whenever the user skips the per-cohort metrics (a minimal sketch of this option follows the list).
  • If you want to keep forcing the calculation of all per-cohort metrics as the expected behavior, there should still be a WARNING that all per-cohort metrics are calculated regardless of what the user passes in the config.
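
As a sketch of the first option (the config key, helper name, and wording below are invented for illustration, not an existing GaNDLF parameter):

```python
import logging

logger = logging.getLogger(__name__)

def resolve_overall_metrics(params: dict) -> bool:
    # Hypothetical key: honor the user's choice, but warn loudly on opt-out.
    enabled = params.get("calculate_overall_metrics", True)
    if not enabled:
        logger.warning(
            "Per-cohort (overall) metrics are disabled via the config. For "
            "classification/regression, per-subject metrics averaged over "
            "an epoch are statistically weaker than per-cohort metrics."
        )
    return enabled
```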

The main thing here is that the user should not be left in the dark about this. It seems unintuitive for the code to override settings that the user passes in the config, and it causes confusion.

sarthakpati (Collaborator) commented

> The main thing here is that the user should not be left in the dark about this. It seems unintuitive for the code to override settings that the user passes in the config, and it causes confusion.

I agree with you 💯%. I am fine with either suggestion. Do you want to put the PR together so I can review it?


github-actions bot commented Dec 2, 2024

Stale issue message
