refactor: a lot #167

eddiebergman · 2025-01-08T17:56:47Z

Overview, there was a lot of things which bled in to each-other. TLDR; the code-base is vastly simpler at this point which given the amount of actual work done on the code-base means it should be less to worry about maintaining.

Yaml

Some previous PR's aimed to add a lot of yaml support into neps through `neps.run(..., run_args="path/to/config.yaml"). These yamls could inside themselves, even then provide paths to other yamls and so on.

After talking to some people who used the feature it ended up being more confusing than useful, where just loading in yamls themselves and passing arguments into NePS would have been the preferred option, i.e. neps does not dictate experiment setup.

To actually make this easier, the preferred flow is now:

with open("config.yaml") as f:
    config = yaml.load(f)
    
neps.run(**config)

The only change to actually make this easily possible is that the neps.run can take in things like a search space definition in it's serialized dict version as an argument to neps.run(). For example, you could also specify neps.run(evaluate_pipeline="/path/to/file.py:my_eval_func"), or just import the function and pass it in yourself.

This prevents the issue where we try to see what was set in run_args=, but also merge that with defaults, but also merge that with nested yaml things.

Estimated this removed around ~1000 lines of code and about 3000 lines of test code/test yamls.

Optimizer Heirarchy

There is none anymore. At the end of the day, this is all the runtime needs from an optimizer. It can be a class, a dataclass, a function, a method, neps shouldn't care.

def __call__(
    self,
    trials: Mapping[str, Trial],     # Contains all the information we know about runs
    budget_info: BudgetInfo | None,  # If you need to be cost aware
    n: int | None = None,            # Don't support batching? raise an error
) -> SampledConfig | list[SampledConfig]: ...

What does this give us?

It's easier to actually develop an optimizer now and pass it in to neps to be used. The core logic of what's going on with each optimizer is also a lot more linear and legible. For example, you can see the core logic in the __call__ of the BracketOptimizer in 90 lines of code which supports all of the bracket/rung based optimizer we had before.

If you like heirarchies and such, you're still free to do so, neps runtime and logic loop, does not care.

The benefit of moving away from heirarchies is it actually forces us to have re-usable components, rather than re-using methods. We've all been there where the attributes in inheritance don't really make sense but you just want to change out one method.

Here's the simplest template we could give to implement your own optimizer:

@dataclass
class MyOpt:
    pipeline_space: SearchSpace
    # ... other attributes
    
    def __call__(
        self,
        trials: Mapping[str, Trial],
        budget_info: BudgetInfo | None,
        n: int | None = None,
    ) -> SampledConfig | list[SampledConfig]:
        # A high level overview of what the optimizer is actually doing logically,
        # calling out to functions where needed.
        # Calls out to other functions where needed

Yup, that's it. You can now do neps.run(..., optimizer=MyOpt)

Re-usable components

TODO

Graphs

TODO

Replaced with a partial of `SuccessiveHalvingBase`

The only thing inheriting from it was `PriorBand`, which ended up replacing all the defaults of it's `__init__()`. The only other thing the `HyperbandCustomDefault.__init__()` did was change the sampling args of the SH brackets, which then `PriorBand` would overwrite it its own `__init__()`.

The only user of the was `PriorBand`, in the only thing it did was explicitly pass `use_priors=`True`. Everything else it set was overwritten by `PriorBand` passing down its args.

eddiebergman added 16 commits October 21, 2024 10:04

refactor(mf): Remove AsynchronousSuccessiveHalvingWithPriors

50ca818

refactor(mf): Replace AsynchronousSuccessiveHalving

1cc1fa5

Replaced with a partial of `SuccessiveHalvingBase`

refactor(mf): Replace SuccessiveHalvingWithPriors with partial

1cdfba9

refactor(mf): Replace HyperbandWithPrior

8ff3c22

The only user of the was `PriorBand`, in the only thing it did was explicitly pass `use_priors=`True`. Everything else it set was overwritten by `PriorBand` passing down its args.

refactor(mf): Remove AsynchornousHyperbandWithPriors

da02112

refactor(mf): Handle completed and pending jointly to df

293d605

refactor(mf): Implementation of SH and ASHA

66ab244

checkpoint

4b44817

Merge branch 'master' into cleanup-mf-heirarchies

8246043

refactor: minor import changes

ae8c29f

refactor: Cleanup acquisition

8bfc88b

in progress... and lol amount of changes

a51db57

lol more

d96df5e

working version

88b5747

ayyy it works and passes, test, mypy, ruff, docs

08a2fb6

eddiebergman mentioned this pull request Jan 15, 2025

docs: cleanup #127

Closed

eddiebergman changed the title ~~refactor: optimizers~~ refactor: a lot Jan 15, 2025

eddiebergman merged commit fd6bf3b into master Jan 15, 2025
13 checks passed

eddiebergman deleted the cleanup-mf-heirarchies branch January 15, 2025 16:39

eddiebergman mentioned this pull request Jan 15, 2025

feat: graph space simplification #151

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: a lot #167

refactor: a lot #167

eddiebergman commented Jan 8, 2025 •

edited

Loading

refactor: a lot #167

refactor: a lot #167

Conversation

eddiebergman commented Jan 8, 2025 • edited Loading

Yaml

Optimizer Heirarchy

Re-usable components

Graphs

eddiebergman commented Jan 8, 2025 •

edited

Loading