The following is a to-do list of work that needs to be done for the pre-processing and histogramming codes to interface nicely and provide a consistent UX:
For sklimming:
Sklimming should replace the Branch objects with Variable or Observable.
The idea of a Functor sub-class called Builder for a Variable is much neater than the new_branches dictionary.
Remove the arg_types argument for Variable/Branch objects and replace it with an optional list_str_arg, where the default assumption when a str is provided as an argument becomes that it is another Branch
Sklimming back-end is too messy -- especially the organisation of the different types of branches (new, tmp, on, off)
Dask distribution for the reading in the backend in a similar way to the histogramming code.
Consistent way of outputting yields using a given branch
Config to use a schema in a similar way to histogramming
reader code should be a processor code
Argparser to select branches and samples in stirring script
Look into a coffea backend
Make variables/branches optional in Sample constructor -- in case it is needed at only histogramming
Remove or actually use job name in general settings
indirs in general_settings should be used as the default for samples if no where is given
Branch/Variable aware of tree rather than using a dict? This way the user can just specify which trees to use when looking for this variable, without repetition
Better multiple-tree support
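As a rough illustration of the Builder and list_str_arg items above, here is a minimal sketch. The class layout, the `required_branches` and `evaluate` method names, and the example data are all assumptions made for illustration; none of them is taken from the existing code. The point is only the default rule: a `str` argument is treated as the name of another branch unless it is listed in `list_str_arg`.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Mapping, Optional, Sequence


@dataclass
class Builder:
    """Hypothetical Functor sub-class: a callable plus the arguments it needs."""

    func: Callable[..., Any]
    args: Sequence[Any]
    # Strings in `args` that are plain literals, NOT branch names (assumed semantics).
    list_str_arg: Sequence[str] = field(default_factory=tuple)

    def _is_branch(self, arg: Any) -> bool:
        # Default assumption: every str argument is another branch.
        return isinstance(arg, str) and arg not in self.list_str_arg

    def required_branches(self) -> List[str]:
        """Branch names that must be read before this builder can run."""
        return [a for a in self.args if self._is_branch(a)]

    def evaluate(self, data: Mapping[str, Any]) -> Any:
        """Swap branch names for their data, then call the function."""
        resolved = [data[a] if self._is_branch(a) else a for a in self.args]
        return self.func(*resolved)


@dataclass
class Variable:
    name: str
    builder: Optional[Builder] = None  # None means "read the branch as stored"


# Toy event data standing in for what the reader would provide.
events = {"jet_pt": [30.0, 45.0]}

scaled = Variable(
    "jet_pt_scaled",
    Builder(lambda pt, factor: [p * factor for p in pt], args=["jet_pt", 2.0]),
)
labelled = Variable(
    "jet_pt_labelled",
    Builder(
        lambda pt, tag: [f"{tag}:{p}" for p in pt],
        args=["jet_pt", "raw"],
        list_str_arg=["raw"],  # "raw" is a literal string, not a branch
    ),
)

scaled_values = scaled.builder.evaluate(events)
labelled_values = labelled.builder.evaluate(events)
```

With this shape there is no separate new_branches dictionary to keep in sync: each Variable carries its own recipe, and the list of branches to read can be collected from `required_branches()`.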
For histogramming:
Provide an easy way for the user to change sample settings -- does the user really need anything other than to set regex (i.e. tag and specify if a sample isdata)?
The input_paths finder in InputManager should be able to handle user-provided methods of finding paths, possibly via a decorator in the config? This is for users who have NTuples that they want to Histogram without running through pre-processing
Data rendering before passing awkward arrays to boost_histograms. This probably means handling masked awkward arrays, since it seems like boost_histograms do not deal well with None in the awkward/numpy arrays (they get dumped in the underflow) boost-histogram issue
Regions should not be required -- only Observables should be (inclusive sample can be selected by a dummy function)
Overall Systematics
Observable.fromFunc(): does it really need args -- can we not just replace args with var?
Access to samples if sklimconfig is specified in settings, or not?
Auto binning (uniform between min and max if the user has not provided binning -- safeguard for user? warn them? flag to tell us it's acceptable not to have binning?)
Do we need general.from_hists flag to allow retrieving histograms from Histogram files?
Histo name goes to :: or __ between different XP components
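The data-rendering and auto-binning items above could be combined along the following lines. This is a sketch only: plain NumPy stands in for both awkward arrays and boost_histograms, and the `render` and `auto_edges` helpers are hypothetical names, not existing functions. It shows missing entries being dropped before filling and uniform edges being derived between min and max, with a logged warning, when no binning is supplied.

```python
import logging
from typing import Optional, Sequence

import numpy as np

logger = logging.getLogger(__name__)


def render(values: Sequence[Optional[float]]) -> np.ndarray:
    """Drop missing (None or NaN) entries so they never reach the histogram."""
    arr = np.array([np.nan if v is None else v for v in values], dtype=float)
    return arr[~np.isnan(arr)]


def auto_edges(values: np.ndarray, nbins: int = 10) -> np.ndarray:
    """Uniform bin edges between min and max, used when no binning is given."""
    if values.size == 0:
        raise ValueError("cannot derive binning from an empty array")
    logger.warning("no binning provided; using %d uniform bins", nbins)
    return np.linspace(values.min(), values.max(), nbins + 1)


raw = [1.0, None, 2.5, 4.0, None, 3.0]
clean = render(raw)                  # the two None entries are removed
edges = auto_edges(clean, nbins=3)   # edges at 1, 2, 3, 4
counts, _ = np.histogram(clean, bins=edges)
```

Whether dropping missing entries is the right policy (as opposed to counting them separately) is a design decision this sketch does not settle; an explicit flag, as suggested above, would let the user state that automatic binning is acceptable.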
For both sklimming and histogramming:
Remote distribution of jobs with dask (e.g. HTCondorCluster)
Can Functor fromStr, as it is, parse slicing syntax?
Test functions for all features
Variable and Observable can maybe inherit from a parent class -- what benefit will a user get if they have to specify the binning var by var? Except if we support no binning and just do a uniform binning on behalf of the user, then they can just import their variables from sklimming to histogramming.
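On the slicing question above: how Functor fromStr parses its input is not stated here, so the following is only a sketch of one way to check it, assuming the string is a valid Python expression. It uses the standard `ast` module to test whether an expression contains a subscript and to collect the names it refers to. The helper names `branch_names` and `uses_slicing` are hypothetical.

```python
import ast
from typing import List


def branch_names(expr: str) -> List[str]:
    """Sorted bare names used in `expr` (candidate branch names)."""
    tree = ast.parse(expr, mode="eval")  # raises SyntaxError on invalid input
    return sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})


def uses_slicing(expr: str) -> bool:
    """True if `expr` contains any subscript such as x[0] or x[:, 0]."""
    tree = ast.parse(expr, mode="eval")
    return any(isinstance(n, ast.Subscript) for n in ast.walk(tree))


expr = "jet_pt[:, 0] + met"
names = branch_names(expr)
sliced = uses_slicing(expr)
```

If fromStr relies on something other than Python's own parser (for example a regular expression over the string), the same test strings can serve as cases for the "Test functions for all features" item.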