
Tuning - Augmentation Subsets Support #35

Merged
8 commits merged into dev from feature/aug_subset_tune on Jun 19, 2024

Conversation

klemen1999
Collaborator

Adds the ability to randomly select a subset of augmentations for each tuning run. It can be defined in the config like this:

tuner:
  params:
    trainer.preprocessing.augmentations_subset: [["Defocus", "Sharpen", "Flip"], 2] # randomly selects 2 of these augmentations to be active in each run

Note: Currently this is only supported for augmentations.
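
For illustration, a minimal sketch of what this option does conceptually on each trial (the helper name sample_augmentation_subset is hypothetical, not part of the tuner API):

import random

def sample_augmentation_subset(whole_set: list[str], subset_size: int) -> list[str]:
    # Randomly pick `subset_size` augmentations from the listed set to be active this run.
    return random.sample(whole_set, subset_size)

# With the config above, each tuning run activates a random pair, e.g. ["Sharpen", "Flip"]:
active = sample_augmentation_subset(["Defocus", "Sharpen", "Flip"], 2)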

@klemen1999 klemen1999 changed the base branch from main to dev May 31, 2024 08:54

github-actions bot commented May 31, 2024

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines: 4900 · Covered: 3792 · Coverage: 77% · Threshold: 0% · Status: 🟢

New Files

No new covered files...

Modified Files

File Coverage Status
luxonis_train/core/tuner.py 80% 🟢
luxonis_train/utils/config.py 95% 🟢
TOTAL 88% 🟢

updated for commit: 4abe9f8 by action🐍


github-actions bot commented May 31, 2024

Test Results

  4 files    4 suites   1h 0m 32s ⏱️
 68 tests  43 ✅  25 💤 0 ❌
272 runs  172 ✅ 100 💤 0 ❌

Results for commit 4abe9f8.

♻️ This comment has been updated with latest results.

Collaborator

@kozlov721 kozlov721 left a comment


LGTM

@tersekmatija
Collaborator

Won't have time to review the code, just a conceptual question: what is the behavior now that we have both the is_active flag and the subset option for augmentations? Do inactive augmentations become active?

It would be good to document this somewhere.

@klemen1999
Collaborator Author

Won't have time to review the code, just a conceptual question: what is the behavior now that we have both the is_active flag and the subset option for augmentations? Do inactive augmentations become active?

It would be good to document this somewhere.

The _subset option sets is_active to True for the augmentations in the set that are chosen and to False for the other ones in the set. Augmentations not specified in the tuner set (but defined in the trainer preprocessing block) are left as they are. I've added additional clarification to the README.
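
A minimal sketch of the behavior described above, assuming each augmentation entry is a dict with the name and is_active fields referenced in this discussion (the helper itself is hypothetical):

def apply_subset(augmentations: list[dict], tuner_set: list[str], chosen: list[str]) -> None:
    # Augmentations listed in the tuner set: is_active is forced True if chosen this run, else False.
    # Augmentations outside the tuner set keep whatever is_active value was configured.
    for aug in augmentations:
        if aug["name"] in tuner_set:
            aug["is_active"] = aug["name"] in chosen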

Review thread on luxonis_train/core/tuner.py (outdated, resolved)
@kozlov721 kozlov721 added the enhancement New feature or request label Jun 6, 2024
@kozlov721 kozlov721 changed the title Feature/aug subset tune Tuning - Augmentation Subsets Support Jun 6, 2024
@kozlov721
Collaborator

Can we also tune the size of the selected subset?

@klemen1999
Collaborator Author

Can we also tune the size of the selected subset?

Not really. The way tuner params work is that they override existing Config params each trial. Since the size of the subset is not a Config param, we can't override it. A quick solution could be to check for a special tuner param key and then generate a random int in the desired range to use as the subset size, but this solution seems a bit dirty. Do you have something nicer in mind?
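
A minimal sketch of the "quick solution" described above (the _subset_size key and helper are hypothetical, not implemented):

import random

def resolve_tuner_param(key: str, value):
    # Hypothetical special-cased key: the value is a [low, high] range from which
    # a subset size is drawn each trial, instead of overriding an existing Config param.
    if key.endswith("_subset_size"):
        low, high = value
        return random.randint(low, high)
    return value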

"Subset sampling currently only supported for augmentations"
)
whole_set_indices = self._augs_to_indices(whole_set)
subset = random.sample(whole_set_indices, subset_size)
Collaborator


How are we preventing it from selecting a subset that was already selected in a previous run?

Collaborator


For this question, I assume that typically we would run the augmentation tuning with all the other tuning options disabled, to get purely augmentation-related results. Would that be correct?

Collaborator Author


Good point, there isn't really any prevention in place for selecting the same subset in subsequent runs. We could change it to first generate all possible combinations and then loop through them, one per run. But then the number of trials should also be set to the number of combinations so we go through all of them. I guess it depends on what the typical use case for this augmentation subset sampling would be.

As you mentioned below, perhaps looping over the whole powerset is a more sensible implementation, since normally we shouldn't be limited by the number of augmentations used and should instead use all of those that produce better results. But to see whether an augmentation helps, you normally have to train the model for more epochs (at the beginning, hard augmentations can produce worse results, but over time they might be the ones that actually improve accuracy on the test set), and going over the whole powerset could take a while (maybe add the ability to limit the minimum size of the subset to prune smaller ones).
CC: @tersekmatija on what the intended use case of augmentation subsampling would be.
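
A minimal sketch of the two ideas above (precomputing all fixed-size combinations, and optionally pruning subsets below a minimum size); illustrative only, not part of this PR:

from itertools import combinations

def all_fixed_size_subsets(whole_set: list[str], subset_size: int) -> list[tuple[str, ...]]:
    # Every possible subset of the given size; the number of trials would have to be
    # set to len(result) to cover each combination exactly once.
    return list(combinations(whole_set, subset_size))

def all_subsets_with_min_size(whole_set: list[str], min_size: int) -> list[tuple[str, ...]]:
    # Variant that prunes subsets smaller than min_size, as suggested above.
    return [s for size in range(min_size, len(whole_set) + 1)
            for s in combinations(whole_set, size)]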

Collaborator

@kozlov721 kozlov721 Jun 17, 2024


Thinking more about the augmentation tuning, I think this will actually be more complex and will need more optimization. For now, I can think of a few optimizations if we work with some assumptions about the augmentations' effect on the training.

My idea: I think we can assume that the order of augmentations should not (reasonably) matter, so if augmentation $A$ increases the model's performance, it doesn't matter where in the pipeline it's placed; we can lock it in place and continue only with the remaining augmentations and a one-smaller subset size.
This would decrease the number of possibilities from $\frac{n!}{(n-k)!}$ ordered selections to $\binom{n}{k}$ unordered ones ($n$ being the number of all augmentations and $k$ the subset size), but that's still way too many for any reasonable usage. We support ~80 augmentations, and even with a subset of size only 5 there are about 24 million combinations to go through.
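
For concreteness, assuming exactly $n = 80$ augmentations and a fixed subset size $k = 5$, the count works out to:

$\binom{80}{5} = \frac{80 \cdot 79 \cdot 78 \cdot 77 \cdot 76}{5!} = 24{,}040{,}016$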

However, as an extension of the above, if augmentation $A$ did not increase the performance, we can discard it for the rest of the tuning. That's because if $A$ didn't improve the performance, a different augmentation $B$ did, and the order doesn't matter, then $A \circ B \simeq B \simeq B \circ A$, so there's no need to try $A$ ever again.
This would bring the number of combinations all the way down to $n$, even with a tunable subset size.

More digestible in code:

def select_augmentations(all_augmentations):
    # Greedily keep only augmentations that improve performance on their own;
    # under the order-independence assumption, discarded ones never need retrying.
    best_augmentations = []
    for a in all_augmentations:
        if improves(a):  # `improves` would run a trial with `a` enabled and compare metrics
            best_augmentations.append(a)
    return best_augmentations

Someone would need to double-check my math on this though (and test whether the assumption even holds in the first place).

Collaborator


Math looks correct :) The challenge here is if you use more augmentations at once: if you use $A$ and $B$ and they improve the accuracy, you don't know whether the contribution was made by $A$ or $B$. This means that in any case you need to run at least $n+1$ (where $n = 80$) combinations to find out which ones show improvement and which do not. Furthermore, I think the challenge then might be that single augmentations improve the performance, but a combination of them decreases it (the image gets too corrupted to be useful).

I still think it should be up to the user to define which augmentations are reasonable, and then merely test a few combinations to check whether they are useful or not. I think in the above scenario it's also hard to define what $k$ should be.

@kozlov721
Collaborator

Can we also tune the size of the selected subset?

Not really. The way tuner params work is that they override existing Config params each trial. Since the size of the subset is not a Config param, we can't override it. A quick solution could be to check for a special tuner param key and then generate a random int in the desired range to use as the subset size, but this solution seems a bit dirty. Do you have something nicer in mind?

I think instead of tuning the subset size, we could add another option like _powerset that would go through all possible subsets. This might quickly get too resource-intensive though, so I'm not sure it even makes sense to implement.
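
A minimal sketch of what a hypothetical _powerset option would have to enumerate (using itertools; not implemented in this PR):

from itertools import chain, combinations

def powerset(whole_set: list[str]):
    # All subsets of every size: 2 ** len(whole_set) in total, which is where the
    # resource concern comes from for longer augmentation lists.
    return chain.from_iterable(
        combinations(whole_set, size) for size in range(len(whole_set) + 1)
    )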

@tersekmatija
Collaborator

Yeah, I'd skip it for now. You can still achieve this with the active flag on the augmentations if you wanted to?

@kozlov721 kozlov721 merged commit 88e8ff5 into dev Jun 19, 2024
8 checks passed
@kozlov721 kozlov721 deleted the feature/aug_subset_tune branch June 19, 2024 13:44
@kozlov721 kozlov721 mentioned this pull request Oct 9, 2024
kozlov721 added a commit that referenced this pull request Oct 9, 2024