
PicoProducer corrections

Izaak edited this page Feb 23, 2024 · 8 revisions

[Under construction. Please note that we are phasing out these tools as we migrate from ROOT files to correctionlib.]

Several tools to get corrections are provided in PicoProducer/python/corrections/. These include efficiencies, scale factors (SFs), event weights, etc. The data for the corrections are saved in PicoProducer/data/.

Pileup reweighting

PileupTool.py provides the pileup event weight based on the data and MC profiles in PicoProducer/data/pileup/. Please note that, as an alternative, CMS also provides the official puWeightProducer.py module, which adds the weights to the nanoAOD.
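Conceptually, a pileup weight is the ratio of the normalized data profile to the normalized MC profile, evaluated at the true number of interactions of the MC event. The following is a minimal sketch of that idea in plain Python, not the actual implementation in PileupTool.py: the function names, the unit-width binning starting at 0, and the toy profiles are all assumptions made for illustration.

```python
def pileup_weights(data_profile, mc_profile):
    """Return per-bin data/MC weights after normalizing both profiles to unit area."""
    if len(data_profile) != len(mc_profile):
        raise ValueError("profiles must have the same binning")
    data_sum = float(sum(data_profile))
    mc_sum = float(sum(mc_profile))
    weights = []
    for data, mc in zip(data_profile, mc_profile):
        if mc > 0:
            weights.append((data / data_sum) / (mc / mc_sum))
        else:
            weights.append(0.0)  # no MC events in this bin, so nothing to reweight
    return weights

def get_pileup_weight(weights, ntrueint):
    """Look up the weight of one MC event (assumes unit-width bins starting at 0)."""
    ibin = int(ntrueint)
    if ibin < 0 or ibin >= len(weights):
        return 0.0  # outside the range covered by the profiles
    return weights[ibin]

# Toy profiles, invented for illustration only
data_profile = [10.0, 30.0, 40.0, 20.0]
mc_profile = [25.0, 25.0, 25.0, 25.0]
weights = pileup_weights(data_profile, mc_profile)
weight = get_pileup_weight(weights, 2.7)  # plays the role of event.Pileup_nTrueInt
```

Normalizing both profiles first means the weights change the shape of the MC pileup distribution without changing the overall MC yield.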

The data profile can be computed with the pileupCalc.py tool. The MC profile can be taken from the distribution of the Pileup_nTrueInt variable in nanoAOD, by filling a histogram for each MC event:

    self.out.pileup.Fill(event.Pileup_nTrueInt)

and then extracted with PicoProducer/data/pileup/getPileupProfiles.py. Most analysis modules already store the relevant information, but for a quick run, one can also use the simple, reduced module, PicoProducer/python/analysis/PileUp.py. Run it as:

pico.py channel pileup PileUp # link channel to module
pico.py submit -c pileup -y UL2016 --dtype mc
# wait until the jobs are done
pico.py hadd -c pileup -y UL2016 --dtype mc
./getPileupProfiles.py -y UL2016 -c pileup

Comparisons are shown here: 2016, 2017, 2018, UL2016, UL2017, and UL2018.

[Figures: pileup profiles for 2016, 2017, and 2018.]

Lepton efficiencies

Several classes are available to get corrections for electrons, muons and hadronically-decayed tau leptons:

  • ScaleFactorTool.py
    • ScaleFactor: general class to get SFs from histograms
    • ScaleFactorHTT: class to get SFs from histograms, as measured by the HTT group
  • MuonSFs.py: class to get muon trigger / identification / isolation SFs
  • ElectronSFs.py: class to get electron trigger / identification / isolation SFs
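To illustrate what getting an SF from a histogram amounts to, here is a minimal sketch of a lookup in bins of pT and |eta|. The ToyScaleFactor class, its method names, the clamping of out-of-range values to the nearest bin, and the numbers are hypothetical. It is not the ScaleFactor class from ScaleFactorTool.py, which reads ROOT histograms.

```python
from bisect import bisect_right

class ToyScaleFactor:
    """Hypothetical stand-in for a histogram-based SF lookup in (pT, |eta|)."""

    def __init__(self, pt_edges, eta_edges, values):
        self.pt_edges = pt_edges    # ascending bin edges in pT
        self.eta_edges = eta_edges  # ascending bin edges in |eta|
        self.values = values        # values[ipt][ieta]

    @staticmethod
    def _find_bin(edges, x):
        # Clamp to the first/last bin, so that values outside the measured
        # range reuse the nearest measured bin (an assumption of this sketch)
        ibin = bisect_right(edges, x) - 1
        return min(max(ibin, 0), len(edges) - 2)

    def get_sf(self, pt, eta):
        ipt = self._find_bin(self.pt_edges, pt)
        ieta = self._find_bin(self.eta_edges, abs(eta))
        return self.values[ipt][ieta]

# Toy binning and SF values, invented for illustration only
toy_sf = ToyScaleFactor(
    pt_edges=[20.0, 30.0, 50.0, 100.0],
    eta_edges=[0.0, 1.2, 2.4],
    values=[[0.95, 0.93], [0.97, 0.96], [0.99, 0.98]],
)
sf_central = toy_sf.get_sf(35.0, -0.5)
sf_overflow = toy_sf.get_sf(500.0, 2.0)
```

Whether out-of-range leptons should reuse the last bin or get an SF of 1 is an analysis choice; the clamping here is only one option.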

ROOT files with efficiencies and SFs are saved in PicoProducer/data/lepton. Scale factors can be found here:

If you use lepton scale factors and efficiencies as measured by the HTT group, make sure to download them with:

cd lepton
git clone https://github.com/CMS-HTT/LeptonEfficiencies HTT

Tau scale factors

Please use the official TauID tool. Installation instructions are given in the installation wiki page.

B tagging tools

BTagTool.py provides two classes: BTagWPs for saving the working points (WPs) per year and type of tagger, and BTagWeightTool to provide b tagging weights. These can be called during the initialization of your analysis module, e.g.:

class ModuleMuTau(Module):
  
  def __init__(self, ... ):
    # ...
    if not self.isData:
      self.btagTool = BTagWeightTool('DeepCSV','medium',channel=channel,year=year)
    self.deepcsv_wp = BTagWPs('DeepCSV',year=year)
  
  def analyze(self, event):
    nbtag = 0
    jets  = [ ]
    for jet in Collection(event,'Jet'):
      # ...
      jets.append(jet)
      if jet.btagDeepB > self.deepcsv_wp.medium:
        nbtag += 1
    if not self.isData:
      self.out.btagweight[0] = self.btagTool.getWeight(event,jets)

BTagWeightTool calculates the b tagging event weight based on the SFs provided by the BTagging group and analysis-dependent efficiencies measured in MC. These are saved in ROOT files in PicoProducer/data/btag/. The event weight is calculated according to this method.
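As an illustration of how SFs and MC efficiencies can be combined into one event weight, the sketch below assumes a commonly used prescription: a tagged jet contributes its SF, and an untagged jet contributes (1 - SF*eff)/(1 - eff), which is the ratio of the probabilities of the observed tagging configuration in data and in MC. That this is the method referred to above is an assumption, and the function name and tuple layout are hypothetical.

```python
def btag_event_weight(jets):
    """Compute a b tagging event weight from per-jet information.

    jets: list of (is_tagged, sf, eff) tuples, where sf is the scale factor
    and eff the MC tagging efficiency for that jet's pT, eta and flavor.
    """
    weight = 1.0
    for is_tagged, sf, eff in jets:
        if is_tagged:
            weight *= sf  # P(data)/P(MC) = SF*eff / eff
        elif eff < 1.0:
            weight *= (1.0 - sf * eff) / (1.0 - eff)  # ratio of untagged probabilities
        # if eff == 1 the untagged factor is undefined; leave the weight unchanged
    return weight

# Toy event with one tagged and one untagged jet (numbers invented for illustration)
toy_jets = [(True, 0.9, 0.7), (False, 0.9, 0.7)]
toy_weight = btag_event_weight(toy_jets)
```

The dependence on the MC efficiency through the untagged jets is why the efficiencies have to be measured with the same selection as the analysis.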

Computing the b tag efficiencies

The b tag efficiencies are analysis-dependent. They can be computed from the analysis output run on MC samples. For each (pre-)selected MC event, fill the numerator and denominator histograms with BTagWeightTool.fillEfficiencies, after removing overlap with other selected objects, e.g. the muon and tau object in ModuleMuTau.py:

  def analyze(self, event):
    # select isolated muon and tau
    # ...
    jets = [ ]
    for jet in Collection(event,'Jet'):
      if jet.pt<30: continue
      if abs(jet.eta)>4.7: continue
      if muon.DeltaR(jet)<0.5: continue
      if tau.DeltaR(jet)<0.5: continue
      jets.append(jet)
    if not self.isData:
      self.btagTool.fillEfficiencies(jets)
    ...

Do this for as many MC samples as possible, to gain as many events as possible (also note that jets in Drell-Yan, W+jets and ttbar events typically have different jet flavor content). Then edit and run PicoProducer/data/btag/getBTagEfficiencies.py to extract all histograms from analysis output, add them together for maximum statistics, and compute the efficiencies. (You should edit this script to read in your analysis output.) Examples of efficiency maps per jet flavor, and as a function of jet pT versus jet eta for the mutau analysis in 2017 are shown here.
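The last step, adding up the histograms from several samples and dividing numerator by denominator, can be sketched in plain Python as follows. This is not the code in getBTagEfficiencies.py, which works on ROOT histograms; the function names, the nested-list representation of a 2D map, and the toy counts are assumptions made for illustration.

```python
def add_maps(maps):
    """Sum several 2D maps (nested lists with identical binning) bin by bin."""
    total = [[0.0] * len(row) for row in maps[0]]
    for single_map in maps:
        for i, row in enumerate(single_map):
            for j, value in enumerate(row):
                total[i][j] += value
    return total

def compute_efficiencies(numerator, denominator):
    """Divide two 2D maps bin by bin; bins with an empty denominator give None."""
    effs = []
    for num_row, den_row in zip(numerator, denominator):
        effs.append([n / float(d) if d > 0 else None
                     for n, d in zip(num_row, den_row)])
    return effs

# Toy jet counts per (pT bin, eta bin) for two samples, invented for illustration:
# numerator = jets passing the working point, denominator = all selected jets
num_dy = [[30, 10], [5, 0]]
den_dy = [[50, 20], [10, 0]]
num_tt = [[70, 30], [15, 0]]
den_tt = [[100, 40], [30, 0]]

num_total = add_maps([num_dy, num_tt])
den_total = add_maps([den_dy, den_tt])
eff_map = compute_efficiencies(num_total, den_total)
```

In practice one such map is needed per jet flavor (b, c, and light), since the tagging and mistagging efficiencies differ strongly between them.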

[Figures: b tagging efficiency map, b tagging c misidentification map, and b tagging misidentification map.]

Z pT reweighting

The observed Z pT spectrum is harder than in the LO MadGraph simulation, such as DYJetsToLL_*_TuneCP5_13TeV-madgraphMLM-pythia8 samples. Therefore LO Drell-Yan events have to be reweighted as a function of Z pT (and maybe other variables such as mass, jet multiplicity, and/or MET). The TauFW provides a measurement tool in Fitter/Zpt/. The weights are stored in PicoProducer/data/zpt/, and RecoilCorrectionTool.py provides a tool to read them. Alternatively, you can use a simple C++ macro to run it on the fly in TTree::Draw.

Trigger object matching

For matching trigger objects, please use the TrigObjMatcher.py tool. This tool uses the JSON files in PicoProducer/data/trigger to get a list of all available trigger filter bits in nanoAOD, as well as the commonly used combinations of trigger paths per year (at least by ditau analyses). Please use it with caution, as some parts may be incomplete and still need validation.

The JSON files were created with tools in this repo.
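The basic idea of trigger object matching can be sketched as follows: a reconstructed lepton is matched if a trigger object of the right type, with the required filter bits set, lies within a small DeltaR cone. This is a simplified illustration and not the TrigObjMatcher.py interface. The dictionaries stand in for nanoAOD TrigObj entries (with their id, eta, phi, and filterBits fields), and the function names, the DeltaR threshold of 0.3, the choice to require all bits, and the bit values are assumptions.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def match_trigger_object(lepton, trigobjs, obj_id, filter_bits, max_dr=0.3):
    """Return True if any trigger object of type obj_id has all the required
    filter bits set and lies within max_dr of the lepton."""
    for trigobj in trigobjs:
        if trigobj['id'] != obj_id:
            continue
        if (trigobj['filterBits'] & filter_bits) != filter_bits:
            continue
        if delta_r(lepton['eta'], lepton['phi'], trigobj['eta'], trigobj['phi']) < max_dr:
            return True
    return False

# Toy muon and trigger object close to the phi = +-pi boundary (invented numbers)
toy_muon = {'eta': 0.50, 'phi': 3.10}
toy_trigobjs = [{'id': 13, 'eta': 0.52, 'phi': -3.10, 'filterBits': 0b1010}]
is_matched = match_trigger_object(toy_muon, toy_trigobjs, obj_id=13, filter_bits=0b1000)
```

The meaning of each filter bit depends on the object type and the nanoAOD version, which is why the tool reads them from the JSON files rather than hard-coding them.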

Test SFs

testSFs.py provides a simple and direct way of testing the correction tool classes, without running the whole framework.