Tau ID scale factors

This repository contains the recommended scale factors (SFs) for several tau discriminators, and tools to read them. More detailed recommendations can be found on this TWiki page: https://twiki.cern.ch/twiki/bin/viewauth/CMS/TauIDRecommendationForRun2

⚠️ Please note that in the near future the SFs in the format of ROOT files will be phased out, and in the long term superseded by the correctionlib tool and JSON files provided centrally by the XPOG. More detailed instructions for tau corrections are here.

Installation of the tool

To install the tool for reading the tau ID SFs, do

export SCRAM_ARCH=slc6_amd64_gcc700 # for CMSSW_10_3_3, check "scram list"
CMSSW_BASE=CMSSW_10_3_3             # or whichever release you desire
cmsrel $CMSSW_BASE
cd $CMSSW_BASE/src
git clone https://github.com/cms-tau-pog/TauIDSFs TauPOG/TauIDSFs
cmsenv
scram b -j8

Python

After compiling with this directory hierarchy, you can access the tool (python/TauIDSFTool.py) in Python as

from TauPOG.TauIDSFs.TauIDSFTool import TauIDSFTool

A test of the tool in python can be run with

./test/testTauIDSFTool.py

C++

A similar C++ implementation is available in src/TauIDSFTool.cc, with a simple usage example in test/testTauIDSFTool.cc. The example also serves as an installation test, which can be compiled and run with

scram b runtests -j8

Python without CMSSW

Alternatively, if you want to use the Python tool standalone without CMSSW, clone the repository and ensure that your PYTHONPATH points to the TauIDSFTool module.

export PYTHONPATH=<path to python directory>:$PYTHONPATH

Afterwards, you should be able to do:

from TauIDSFTool import TauIDSFTool
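If setting PYTHONPATH in the shell is inconvenient, the same effect can be had from inside a script. The following is a minimal sketch; the checkout location `~/TauIDSFs` is a hypothetical path, and the tool presumably still needs ROOT's Python bindings to be importable.

```python
import os
import sys

# Hypothetical location of the cloned repository; adjust to your checkout.
repo_dir = os.path.expanduser("~/TauIDSFs")
python_dir = os.path.join(repo_dir, "python")

# Prepend the directory so that this copy of the module is found first.
if python_dir not in sys.path:
    sys.path.insert(0, python_dir)

# After this, `from TauIDSFTool import TauIDSFTool` should work,
# provided the directory exists and contains TauIDSFTool.py.
```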

Summary of available SFs

This is a rough summary of the available SFs for DeepTau2017v2p1 and DeepTau2018v2p5 in data/:

Tau component	`genmatch`	`DeepTau2017v2p1` `VSjet`	`DeepTau2017v2p1` `VSe`	`DeepTau2017v2p1` `VSmu`	`DeepTau2018v2p5` `VSjet`	energy scale
real tau	`5`	vs pT and DM (for MC) or vs. pT, or vs. DM (for Embed.)	– (*)	– (*)	vs pT and DM (for MC), no Embed. corrections derived yet	vs. DM
e -> tau fake	`1`, `3`	–	vs. eta	–		vs. DM and eta
mu -> tau fake	`2`, `4`	–	–	vs. eta		– (±1% unc.)

(*) The scale factors are provided only for a subset of the working points. For the VSele discriminator, they are measured for the VVLoose and Tight WPs. Users are strongly encouraged to use one of these two working points and should report to the TauPOG for approval if another working point is used. For the VSmu discriminator, they are measured for the Tight WP, but we don't expect a large dependence on the chosen VSmu WP in this case, so you are free to use any available WP for the muon rejection.

The gen-matching is defined as:

1 for prompt electrons
2 for prompt muons
3 for electrons from tau decay
4 for muons from tau decay
5 for real taus
6 for no match, or jets faking taus.

For more info on gen-matching of taus, please see here. Note that in nanoAOD this is available as Tau_GenPartFlav, but jet or no match corresponds to Tau_GenPartFlav==0 instead of 6.
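If your own code compares against the codes listed above while reading nanoAOD, the one differing value can be remapped. This is a minimal sketch; the function name `genmatch_from_nanoaod` is hypothetical and not part of this repository.

```python
def genmatch_from_nanoaod(gen_part_flav):
    """Convert a nanoAOD Tau_GenPartFlav value to the genmatch codes above."""
    # nanoAOD stores "jet or no match" as 0, whereas the list above uses 6.
    # All other codes (1 to 5) are assumed to be identical in both schemes.
    return 6 if gen_part_flav == 0 else gen_part_flav
```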

The SFs are meant for the following campaigns:

Year label	MC campaign	Data campaign
`2016Legacy` (*)	`RunIISummer16MiniAODv3`	`17Jul2018`
`2017ReReco` (*)	`RunIIFall17MiniAODv2`	`31Mar2018`
`2018ReReco` (*)	`RunIIAutumn18MiniAOD`	`17Sep2018`/`22Jan2019`
`UL2016_preVFP`	`RunIISummer20UL16*APV`	`(HIPM_)UL2016_MiniAODv*`
`UL2016_postVFP`	`RunIISummer20UL16`	`UL2016_MiniAODv*`
`UL2017`	`RunIISummer20UL17`	`UL2017_MiniAODv*`
`UL2018`	`RunIISummer20UL18`	`UL2018_MiniAODv*`

(*) The SFs provided for pre-UL samples follow the old conventions for the binning by either pT or DM, and follow the old uncertainty scheme where only total uncertainties are reported.

Usage

A simple script is given to dump the corrections saved in histograms or functions of the files in data/. Use for example

./test/dumpTauIDSFs.py data/TauID_SF_*_DeepTau2017v2p1VSjet_*.root

DM and pT-dependent SFs

The DM and pT dependent SFs are provided as TF1 functions in the "TauID_SF_dm_DeepTau2017v2p1VSjet_VSjetX_VSeleY_Mar07.root" ROOT files for DeepTau2017v2p1 and "TauID_SF_dm_DeepTau2018v2p5VSjet_VSjetX_VSeleY_Jul18.root" for DeepTau2018v2p5, where X corresponds to the VSjet WP and Y corresponds to the VSele WP.

The ROOT files contain several functions. The central values are obtained from the functions named like "DM$DM_$ERA_fit" where $DM is the decay mode = 0, 1, 10, or 11, and $ERA = 2016_preVFP, 2016_postVFP, 2017, or 2018.

For example to obtain the central value of the SFs for the Medium VSjet and VVLoose VSele WPs of the 'DeepTau2017v2p1VSjet' discriminator for DM=1 in 2018, use

from ROOT import TFile
file = TFile("data/TauID_SF_dm_DeepTau2017v2p1VSjet_VSjetMedium_VSeleVVLoose_Mar07.root")
func = file.Get('DM1_2018_fit')
sf   = func.Eval(pt)

There are also functions that correspond to systematic variations that can be accessed in the same way. The table below gives a summary of the function names and what uncertainties they correspond to for DeepTau2017v2p1:

Uncertainty	Function name in ROOT files	String to pass to the tool	Notes	Correlated by era	Correlated by DM
`Statistical uncertainty 1`	`DM$DM_$ERA_fit_uncert0_{up,down}`	`uncert0_{up,down}`	`Statistical uncertainty on linear fit parameters from eigendecomposition of covariance matrix.`	✗	✗
`Statistical uncertainty 2`	`DM$DM_$ERA_fit_uncert1_{up,down}`	`uncert1_{up,down}`	`Statistical uncertainty on linear fit parameters from eigendecomposition of covariance matrix.`	✗	✗
`Systematic alleras`	`DM$DM_$ERA_syst_alleras_{up,down}_fit`	`syst_alleras_{up,down}`	`The component of the systematic uncertainty that is correlated across DMs and eras`	✓	✓
`Systematic by-era`	`DM$DM_$ERA_syst_$ERA_{up,down}_fit`	`syst_$ERA_{up,down}`	`The component of the systematic uncertainty that is correlated across DMs but uncorrelated by eras`	✗	✓
`Systematic by-era and by-DM`	`DM$DM_$ERA_syst_dm$DM_$ERA_{up,down}_fit`	`syst_dm$DM_$ERA_{up,down}`	`The component of the systematic uncertainty that is uncorrelated across DMs and eras`	✗	✗
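The naming scheme in the DeepTau2017v2p1 table above can be sketched as a small helper. The function name `sf_function_name` is hypothetical; it only mirrors the patterns in the second and third columns and does not cover the `TES{Up,Down}` naming used for DeepTau2018v2p5.

```python
def sf_function_name(dm, era, unc=None):
    """Build the TF1 name for a decay mode, an era and an optional variation.

    `unc` is a string from the third column of the table, for example
    'uncert0_up' or 'syst_alleras_down'; None selects the central fit.
    """
    prefix = "DM%d_%s" % (dm, era)
    if unc is None:
        return prefix + "_fit"
    if unc.startswith("uncert"):
        # Statistical variations: the suffix comes after "_fit".
        return "%s_fit_%s" % (prefix, unc)
    # Systematic variations: "_fit" comes last.
    return "%s_%s_fit" % (prefix, unc)
```

For instance, `sf_function_name(1, '2018')` gives the name used in the `file.Get` example earlier in this section.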

The table below gives a summary of the function names and what uncertainties they correspond to for DeepTau2018v2p5:

Uncertainty	Function name in ROOT files	String to pass to the tool	Notes	Correlated by era	Correlated by DM
`Statistical uncertainty 1`	`DM$DM_$ERA_fit_uncert0_{up,down}`	`uncert0_{up,down}`	`Statistical uncertainty on linear fit parameters from eigendecomposition of covariance matrix.`	✗	✗
`Statistical uncertainty 2`	`DM$DM_$ERA_fit_uncert1_{up,down}`	`uncert1_{up,down}`	`Statistical uncertainty on linear fit parameters from eigendecomposition of covariance matrix.`	✗	✗
`Systematic alleras`	`DM$DM_$ERA_syst_alleras_{up,down}_fit`	`syst_alleras_{up,down}`	`The component of the systematic uncertainty that is correlated across DMs and eras`	✓	✓
`Systematic by-era`	`DM$DM_$ERA_syst_alldms_$ERA_{up,down}_fit`	`syst_alldms_$ERA_{up,down}`	`The component of the systematic uncertainty that is correlated across DMs but uncorrelated by eras`	✗	✗
`Systematic Tau Energy scale`	`DM$DM_$ERA_TES{Up,Down}_fit`	`TES_{up,down}`	`The uncertainty due to the tau energy scale systematic uncertainty`	✗	✗

The SFs can also be accessed using the tool:

from TauPOG.TauIDSFs.TauIDSFTool import TauIDSFTool
tauSFTool = TauIDSFTool(year='UL2018',id='DeepTau2017v2p1VSjet',wp='Medium',wp_vsele='VVLoose',ptdm=True)
sf        = tauSFTool.getSFvsDMandPT(pt,dm,genmatch)

And uncertainty variations can be accessed using:

sf        = tauSFTool.getSFvsDMandPT(pt,dm,genmatch,unc)

where the unc string identifies the systematic variation, as given in the third column of the tables above.
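When looping over all variations, it can help to generate the `unc` strings rather than typing them out. The sketch below does this for the DeepTau2017v2p1 table; the helper name is hypothetical, and it assumes that `$ERA` in the tool strings takes the same values as in the function names (2016_preVFP, 2016_postVFP, 2017, or 2018).

```python
def dm_pt_unc_strings(dm, era):
    """List the up/down `unc` strings for one decay mode and era."""
    sources = [
        "uncert0",                       # statistical uncertainty 1
        "uncert1",                       # statistical uncertainty 2
        "syst_alleras",                  # correlated across DMs and eras
        "syst_%s" % era,                 # correlated across DMs, per era
        "syst_dm%d_%s" % (dm, era),      # per DM and per era
    ]
    return ["%s_%s" % (src, direction)
            for src in sources
            for direction in ("up", "down")]
```

Each returned string could then be passed as the fourth argument of `getSFvsDMandPT`.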

High-pT pT-dependent SFs

Analyses that are sensitive to taus with pT>140 GeV should switch to the dedicated high-pT SFs, which are measured in bins of pT above 140 GeV.

The SFs are provided as TGraphAsymmErrors objects in the "TauID_SF_Highpt_DeepTau2017v2p1VSjet_VSjetX_VSeleY_Mar07.root" ROOT files, where X corresponds to the VSjet WP and Y corresponds to the VSele WP.

The ROOT files contain several graphs. The central values are obtained from the graphs named like "DMinclusive_$ERA" where $ERA = 2016_preVFP, 2016_postVFP, 2017, or 2018. These graphs contain 2 pT bins with pT 100-200, and pT>200 GeV. You should only use these as binned values. For taus between 140-200 GeV use the first bin, and for taus with pT>200 GeV use the second bin.
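The bin choice described above can be sketched as follows. The helper name is hypothetical, the returned value is a 0-based index of the graph point, and the treatment of a tau at exactly 200 GeV is an assumption since the text does not specify it.

```python
def highpt_bin_index(pt):
    """Return the graph point to use for a tau pT in GeV, or None.

    None means the tau is not above 140 GeV, so the high-pT SFs
    described in this section are not meant for it.
    """
    if pt <= 140.0:
        return None
    # First point for 140 < pT <= 200 GeV, second point above 200 GeV.
    # Assigning pT == 200 GeV to the first point is an assumption.
    return 0 if pt <= 200.0 else 1
```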

The SFs can also be accessed using the tool:

from TauPOG.TauIDSFs.TauIDSFTool import TauIDSFTool
tauSFTool = TauIDSFTool(year='UL2018',id='DeepTau2017v2p1VSjet',wp='Medium',wp_vsele='VVLoose',highpT=True)
sf        = tauSFTool.getHighPTSFvsPT(pt,genmatch)

And uncertainty variations can be accessed using:

sf        = tauSFTool.getHighPTSFvsPT(pt,genmatch,unc)

where "unc" depends on the uncertainty source. The table below describes the uncertainty sources and the string you need to pass to the tool to retrieve them:

Uncertainty	String to pass to the tool	Notes	Correlated by era	Correlated by pT
`Statistical uncertainty 1`	`stat_bin1_{up,down}`	`Statistical uncertainty on the pT 140-200 GeV bin. Note this also includes systematic uncertainties that are decorrelated by pT bin and era (since they also behave like statistical uncertainties)`	✗	✗
`Statistical uncertainty 2`	`stat_bin2_{up,down}`	`Statistical uncertainty on the pT >200 GeV bin. Note this also includes systematic uncertainties that are decorrelated by pT bin and era (since they also behave like statistical uncertainties)`	✗	✗
`Systematic`	`syst_{up,down}`	`The systematic uncertainty that is correlated across pT regions and eras`	✓	✓
`Extrapolation Systematic`	`extrap_{up,down}`	`The systematic uncertainty due to the extrapolation of the SF to higher pT regions`	✓	✓

pT-dependent SFs

Deprecated for UL MC - use DM and pT-dependent SFs instead!

The embedded scale factors still follow the old prescriptions for pT- or DM-binned SFs, so these instructions still apply in that case.

The pT-dependent SFs are provided as TF1 functions. For example, to obtain those for the medium WP of the 'DeepTau2017v2p1VSjet' discriminator for 2016, use

from ROOT import TFile
file = TFile("data/TauID_SF_pt_DeepTau2017v2p1VSjet_2016Legacy.root")
func = file.Get('Medium_cent')
sf   = func.Eval(pt)

The tool can be used as

from TauPOG.TauIDSFs.TauIDSFTool import TauIDSFTool
tauSFTool = TauIDSFTool('2016Legacy','DeepTau2017v2p1VSjet','Medium',ptdm=False)

and to retrieve the SF for a given tau pT, do

sf = tauSFTool.getSFvsPT(pt)

The SF should only be applied to tau objects that match "real" taus at gen-level (genmatch==5). You can pass the optional genmatch argument, and the function will return the appropriate SF if genmatch==5, and 1.0 otherwise:

sf = tauSFTool.getSFvsPT(pt,genmatch)
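To illustrate how the genmatch argument is typically used, here is a minimal sketch that multiplies the per-tau SFs into one event weight. Taking the product over taus is an assumption about common practice, not something this README prescribes, and `dummy_get_sf` with its value of 0.9 is a made-up stand-in for `tauSFTool.getSFvsPT` so that the example runs without ROOT.

```python
def event_weight(taus, get_sf):
    """Multiply the SFs of all selected taus in an event.

    taus   -- iterable of (pt, genmatch) pairs
    get_sf -- callable taking (pt, genmatch) and returning a float,
              for example tauSFTool.getSFvsPT
    """
    weight = 1.0
    for pt, genmatch in taus:
        weight *= get_sf(pt, genmatch)
    return weight


def dummy_get_sf(pt, genmatch=5):
    # Made-up placeholder: 0.9 for real taus, 1.0 for everything else,
    # mimicking the genmatch behavior described above.
    return 0.9 if genmatch == 5 else 1.0


# One real tau and one jet faking a tau: only the real tau is corrected.
weight = event_weight([(45.0, 5), (60.0, 6)], dummy_get_sf)
```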

The recommended uncertainties can be retrieved as

sf_up   = tauSFTool.getSFvsPT(pt,genmatch,unc='Up')
sf_down = tauSFTool.getSFvsPT(pt,genmatch,unc='Down')

or, all three in one go:

sf_down, sf, sf_up = tauSFTool.getSFvsPT(pt,genmatch,unc='All')

For the tau ID SF of the embedded samples, set the emb flag to True:

tauSFTool = TauIDSFTool('2017ReReco','DeepTau2017v2p1VSjet','Medium',emb=True)

If your analysis uses a DeepTauVSe WP looser than VLoose and/or a DeepTauVSmu WP looser than Medium, you should add an additional uncertainty using the otherVSlepWP flag:

tauSFTool = TauIDSFTool('2017ReReco','DeepTau2017v2p1VSjet','Medium',otherVSlepWP=True)

DM-dependent SFs

Deprecated for UL MC - use DM and pT-dependent SFs instead!

The embedded scale factors still follow the old prescriptions for pT- or DM-binned SFs, so these instructions still apply in that case.

Analyses using ditau triggers and tau pT > 40 GeV may use DM-dependent SFs. Please note that no SFs are available for decay modes 5 and 6, for which the tool returns 1 by default; please read this TWiki section. The SFs are provided as TH1 histograms. For example, to obtain those for the medium WP of the 'DeepTau2017v2p1VSjet' discriminator for 2016, use

from ROOT import TFile
file = TFile("data/TauID_SF_dm_DeepTau2017v2p1VSjet_2016Legacy.root")
hist = file.Get('Medium')
sf   = hist.GetBinContent(hist.GetXaxis().FindBin(dm))

or with the tool,

from TauPOG.TauIDSFs.TauIDSFTool import TauIDSFTool
tauSFTool = TauIDSFTool('2017ReReco','MVAoldDM2017v2','Tight',dm=True,ptdm=False)
sf        = tauSFTool.getSFvsDM(pt,dm,genmatch)
sf_up     = tauSFTool.getSFvsDM(pt,dm,genmatch,unc='Up')
sf_down   = tauSFTool.getSFvsDM(pt,dm,genmatch,unc='Down')

where genmatch is optional.

Eta-dependent fake rate SFs for the anti-lepton discriminators

To apply SFs to electrons or muons faking taus, use the eta of the reconstructed tau and the genmatch code. They are provided as TH1 histograms:

from ROOT import TFile
file = TFile("data/TauID_SF_eta_DeepTau2017v2p1VSmu_2016Legacy.root")
hist = file.Get('Medium')
sf   = hist.GetBinContent(hist.GetXaxis().FindBin(eta))

or with the tool,

from TauPOG.TauIDSFs.TauIDSFTool import TauIDSFTool
antiEleSFTool = TauIDSFTool('2017ReReco','antiEleMVA6','Loose')
antiMuSFTool  = TauIDSFTool('2017ReReco','antiMu3','Tight')
antiEleSF     = antiEleSFTool.getSFvsEta(eta,genmatch)
antiMuSF      = antiMuSFTool.getSFvsEta(eta,genmatch)

The uncertainty is obtained in a similar way as above.
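Following the summary table near the top of this document, each genmatch code maps to one kind of SF. The sketch below makes that mapping explicit; the function name and the returned labels are hypothetical, and since the tool methods accept genmatch themselves, this is only meant as an illustration.

```python
def sf_kind(genmatch):
    """Return which discriminator's SF applies to a given genmatch code."""
    if genmatch == 5:
        return "VSjet"   # real tau
    if genmatch in (1, 3):
        return "VSe"     # e -> tau fake, SF vs. eta
    if genmatch in (2, 4):
        return "VSmu"    # mu -> tau fake, SF vs. eta
    return None          # jet or no match: no SF from this repository
```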

DM-dependent tau energy scale

Usage for DeepTau2018v2p5

The tau energy scale (TES) corrections for taus with pT<140 GeV are provided in the files data/TauES_dm_DeepTau2018v2p5VSjet_$ERA_VSjet$X_VSele$Y_Jul18.root, where $X corresponds to the VSjet WP, $Y corresponds to the VSele WP, and $ERA = UL2016_preVFP, UL2016_postVFP, UL2017, or UL2018.

Each file contains one histogram ('tes') with the TES centered around 1.0 measured in bins of the tau decay mode. It should be applied to a genuine tau by multiplying the tau TLorentzVector, or equivalently, the tau energy, pT and mass as follows:

from ROOT import TFile
file = TFile("data/TauES_dm_DeepTau2018v2p5VSjet_UL2018_VSjetMedium_VSeleVVLoose_Jul18.root")
hist = file.Get('tes')
tes  = hist.GetBinContent(hist.GetXaxis().FindBin(dm))

# scale the tau's TLorentzVector
tau_tlv *= tes

# OR, scale the energy, mass and pT
tau_E  *= tes
tau_pt *= tes
tau_m  *= tes

The uncertainties are equal to 1.5% for decay modes 0, 1, and 10, and 2% for decay mode 11. The uncertainties should be decorrelated by decay modes and eras. For taus with pT>140 GeV, no corrections should be applied to the nominal TES value from MC but a larger 3% uncertainty should be included. A simple class, TauESTool, is provided to obtain the TES as

from TauPOG.TauIDSFs.TauIDSFTool import TauESTool
testool = TauESTool('UL2018','DeepTau2018v2p5VSjet',wp='Medium', wp_vsele='VVLoose')
tes     = testool.getTES(pt,dm,genmatch)
tesUp   = testool.getTES(pt,dm,genmatch,unc='Up')
tesDown = testool.getTES(pt,dm,genmatch,unc='Down')

This method computes the central values and uncertainty for low pT (20 GeV < pT < 140 GeV) and higher pT values (pT > 140 GeV).
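The DeepTau2018v2p5 prescription above can be written out by hand as follows. This is a sketch of the stated numbers only, not the implementation of TauESTool: the function name is hypothetical, the uncertainties are applied as absolute shifts of the scale (an assumption; with a TES close to 1.0 a relative shift gives nearly the same result), and a tau at exactly 140 GeV is treated as low pT.

```python
# Uncertainty on the TES per decay mode for pT below 140 GeV, from the text.
TES_UNC_LOWPT = {0: 0.015, 1: 0.015, 10: 0.015, 11: 0.02}


def tes_with_unc(tes_lowpt, pt, dm):
    """Return (down, central, up) TES values for a genuine tau.

    tes_lowpt -- TES read from the 'tes' histogram for this decay mode
    pt        -- tau pT in GeV
    dm        -- decay mode (0, 1, 10 or 11)
    """
    if pt > 140.0:
        # No correction to the nominal MC value, but a larger uncertainty.
        central, unc = 1.0, 0.03
    else:
        central, unc = tes_lowpt, TES_UNC_LOWPT[dm]
    return central - unc, central, central + unc
```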

Usage for DeepTau2017v2p1

The tau energy scale (TES) is provided in the files data/TauES_dm_*.root. Each file contains one histogram ('tes') with the TES centered around 1.0. It should be applied to a genuine tau by multiplying the tau TLorentzVector, or equivalently, the tau energy, pT and mass as follows:

from ROOT import TFile
file = TFile("data/TauES_dm_DeepTau2017v2p1VSjet_UL2018.root")
hist = file.Get('tes')
tes  = hist.GetBinContent(hist.GetXaxis().FindBin(dm))

# scale the tau's TLorentzVector
tau_tlv *= tes

# OR, scale the energy, mass and pT
tau_E  *= tes
tau_pt *= tes
tau_m  *= tes
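The two options are equivalent because the direction of the tau is unchanged: multiplying pT and mass by the same factor multiplies the energy by that factor too. The ROOT-free sketch below checks this numerically; the function names and the example kinematics are made up for illustration.

```python
import math


def energy(pt, eta, mass):
    """Energy of a particle from its pT, pseudorapidity and mass."""
    momentum = pt * math.cosh(eta)
    return math.sqrt(momentum ** 2 + mass ** 2)


def scale_tau(pt, eta, phi, mass, tes):
    """Apply the TES to (pt, eta, phi, mass); eta and phi are unchanged."""
    return pt * tes, eta, phi, mass * tes


# Made-up example tau and TES value.
pt, eta, phi, mass, tes = 45.0, 1.2, 0.3, 0.8, 0.99
new_pt, new_eta, new_phi, new_mass = scale_tau(pt, eta, phi, mass, tes)

# The energy of the scaled tau equals the original energy times the TES.
energy_ratio = energy(new_pt, new_eta, new_mass) / energy(pt, eta, mass)
```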

A simple class, TauESTool, is provided to obtain the TES as

from TauPOG.TauIDSFs.TauIDSFTool import TauESTool
testool = TauESTool('2017ReReco','DeepTau2017v2p1VSjet')
tes     = testool.getTES(pt,dm,genmatch)
tesUp   = testool.getTES(pt,dm,genmatch,unc='Up')
tesDown = testool.getTES(pt,dm,genmatch,unc='Down')

This method computes the right uncertainty at intermediate (34 GeV < pT < 170 GeV) and higher pT values (pT > 170 GeV). Analyses that only want to use the TES at high pT can use the following instead:

tes     = testool.getTES_highpt(dm,genmatch)

Eta- & DM-dependent e -> tau fake energy scale

The e -> tau fake energy scale (FES) is provided in the files data/TauFES_eta-dm_*.root. Each file contains one graph ('fes') with the FES centered around 1.0. It should only be applied to reconstructed taus that are faked by electrons (i.e. genmatch==1 or 3) and have DM 0 or 1. It is applied in the same way as the TES above. A simple class, TauFESTool, is provided to obtain the FES as

from TauPOG.TauIDSFs.TauIDSFTool import TauFESTool
festool = TauFESTool('2017ReReco')
fes     = festool.getFES(eta,dm,genmatch)
fesUp   = festool.getFES(eta,dm,genmatch,unc='Up')
fesDown = festool.getFES(eta,dm,genmatch,unc='Down')
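The applicability condition stated above can be sketched as a small guard around the scaling. Both function names are hypothetical, and the FES value is passed in by the caller (for example from `festool.getFES`); only pT and mass are scaled here, mirroring the TES example.

```python
def fes_applies(dm, genmatch):
    """True if the FES should be applied: electron fake with DM 0 or 1."""
    return genmatch in (1, 3) and dm in (0, 1)


def apply_fes(pt, mass, fes, dm, genmatch):
    """Scale pT and mass by the FES when applicable, else leave unchanged."""
    if not fes_applies(dm, genmatch):
        return pt, mass
    return pt * fes, mass * fes
```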
