Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration of correction-lib within CorrectedJets/METFactory #1134

Open
wants to merge 6 commits into
base: backports-v0.7.x
Choose a base branch
from

Conversation

anpicci
Copy link

@anpicci anpicci commented Jul 19, 2024

This PR is intended to track the progress on integrating correction-lib within the coffea JME tools.

The very first contribution is the calculation of the JECs using inputs from the correction-lib tool.

FYI @lgray @nsmith-

class CorrectedJetsFactory(object):
def __init__(self, name_map, jec_stack):
def __init__(self, name_map, jec_stack, jec_names, jec_year):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not alter the signature of this function (especially with new non-defaulted arguments). We want to maintain backwards compatibility for the time being.

Could you try implementing this such that the code detects correction-lib inputs coming in from the JEC stack, and then follow a different code path in the case of correctionlib inputs?

Copy link
Author

@anpicci anpicci Aug 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lgray thank you for pointing out this. Would it work if jec_stack is a list or a dict if one wants to use correction-lib? IMU, the JECStack class (as well as the auxiliary ones) is not needed when using correction-lib, as the latter already has functionalities to get the right numbers from the formulas that are part of the JME corrections/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JEC stack also defines the mapping from names in your data to the arguments of the functions to evaluate. Sure, you can have the jec_stack be something much simpler (like a list) if correctionlib does a bunch of the bookkeeping for you.

Though I think you'll need to do some arranging yourself for the JES uncertainties.

@@ -171,6 +191,29 @@ def build(self, jets, lazy_cache):
out_dict = dict(in_dict)

# take care of nominal JEC (no JER if available)
## General setup to use correction-lib
clib_json = self.jec_year
json_path = f"/cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/JME/{clib_json}/jet_jerc.json.gz"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not put hard dependencies on cvmfs paths in coffea.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be configurable.

I'd also be careful of building new objects every time you call uncertainties, this can become very slow.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the json file name would change depending on if it's AK8 or AK4. So, I would also recommend against hard-coding this and instead keeping this in the jec_stack.

self.jec_names = jec_names
self.jec_year = jec_year
## the last element of jec_names has to be a boolean, specifying if the use wants to save the different correction levels
self.separated = self.jec_names.pop()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a useful debug feature, but from the JME point of view, providing just the name of the JEC tag for the required step should suffice. I don't foresee a need for analyses to store the result of each step, but I'm open to hearing about any exceptions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@garvitaa I would suggest keeping it to provide users more flexibility, without the need for them to put the hands on the coffee scripts

@anpicci
Copy link
Author

anpicci commented Aug 22, 2024

@lgray @nsmith- @garvitaa I have pushed a commit to address your comments, as well as to extend the correction-lib implementation to JER and JEC uncertainties

@anpicci anpicci requested review from lgray and garvitaa September 12, 2024 12:04

cumCorr = None
if len(corrections_list) > 0:
ones = numpy.ones_like(corrections_list[-1], dtype=numpy.float32)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try without ones

has_jer = False
if self.jec_stack.jer is not None and self.jec_stack.jersf is not None:
has_jer = True
elif self.tool == "clib":
has_jer = True
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a sanity check if the input contains jer or not

elif self.tool == "clib":
## needed to attach to jets the JECs
jer_out_parms = out.layout.parameters
jer_out_parms["corrected"] = True
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out_dict["jet_energy_resolution"], numpy.float32
),
awkward.values_astype(
out_dict["jet_resolution_rand_gauss"], numpy.float32
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to use the correction-lib random values

@anpicci
Copy link
Author

anpicci commented Sep 17, 2024

Following the in-person discussion with @nsmith- , I will implement his suggestions (that I left here and there as comments to my commits), as well as work to make jec_stack a data class when using correction-lib in place of JECStack

@green-cabbage
Copy link
Contributor

Hello, may I ask for the status of this pull request? I am looking for a method where I could ask for compounded JEC from a given list, something like [L1FastJet, L2Relative, L3Absolute, L2L3Residual], and then also get JEC and JER uncertainties in a clean manner. This pull request seems to do that, but perhaps I am wrong?

@anpicci
Copy link
Author

anpicci commented Dec 13, 2024

Hi @green-cabbage , that's correct, feel free to use it and provide any type of feedback!
This is an example of how to use this version of the jetmet-tools:
https://github.com/TopEFT/topeft/blob/master/topeft/modules/corrections.py#L1425-L1462

@green-cabbage
Copy link
Contributor

@anpicci Noted! On the surface, this seems to work with coffea 0.7, which is convenient for me.

@green-cabbage
Copy link
Contributor

green-cabbage commented Jan 8, 2025

Dear @anpicci and @lgray ,

Quick question. Has this code been validated with coffea 202X ? It seems like from this code that I have to add lazy_cache to get the Factory. Unfortunately, events.caches[0] isn't available in the newest coffea afaik. I am tempted to just copy paste the CorrectedJetsFactory code from the newest coffea, but I am not so sure if that's a wise choice, given there must be some reason why CorrectedJetsFactory has also been touched from coffea 0.7

Thank you!

@anpicci
Copy link
Author

anpicci commented Jan 8, 2025

Hi @green-cabbage , I have validated this PR only with coffea 0.7

@green-cabbage
Copy link
Contributor

I see. Would it be theoretically difficult to migrate this method to coffea 202X? I am willing to put in some of my own time if I could get some test cases I could validate with. @anpicci @lgray

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants