Standard Ntuplizer

This code serves as a template for new Ntuplizers that work with CMSSW. Basic instructions for installation and standard modifications can be found below. It presents an example in which a ROOT tree is filled with plain ntuples made from pat::Muon variables read from MiniAOD. It is configured to read cosmic data from the NoBPTX dataset.

The Ntuplizer is an EDAnalyzer. More information about this class and its structure can be found in https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookWriteFrameworkModule.

How to install

The recommended release for this analyzer is CMSSW_12_4_0 or later. The commands to set up the analyzer are:

cmsrel CMSSW_12_4_0

cd CMSSW_12_4_0/src

cmsenv

mkdir Analysis

cd Analysis

git clone git@github.com:CeliaFernandez/standard-Ntuplizer.git

scram b -j 8

Ntuplizer structure

The analyzer consists of three folders, plus an optional fourth:

plugins/: contains the plugins (EDAnalyzers), where the analyzers are defined in .cc files. This is the main code.
python/: contains the cfi files that set up the sequences run with the plugins in plugins/. A sequence is a specific configuration of the parameters passed to one of those plugins. A single plugin may have several sequences, defined in the same file or in multiple files.
test/: contains the cfg files that run the sequences defined in the python/ folder.
macros/ (optional): contains .py files that read the produced ntuples and create the plots if we don't have an external analyzer.

EDAnalyzer plugin

EDAnalyzer is a class that is designed to loop over the events of one or several ROOT files. It has several actions that are executed before the event loop in the beginJob() function, actions that are executed per event in the analyze() function and actions that are executed once the loop has finished in the endJob() function.
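The call order described above can be illustrated without CMSSW. The following is a toy sketch in plain Python, not framework code: `ToyAnalyzer` and `run_job` are hypothetical names, and the real EDAnalyzer is a C++ class driven by cmsRun. It only shows that beginJob() runs once before the loop, analyze() runs once per event, and endJob() runs once after the loop.

```python
class ToyAnalyzer:
    """Hypothetical stand-in for an EDAnalyzer; records the order of calls."""

    def __init__(self):
        self.calls = []      # log of lifecycle calls, in order
        self.n_events = 0    # number of events seen by analyze()

    def beginJob(self):
        # Runs once, before the event loop (e.g. open the output file).
        self.calls.append("beginJob")

    def analyze(self, event):
        # Runs once per event (e.g. fill the tree).
        self.calls.append("analyze")
        self.n_events += 1

    def endJob(self):
        # Runs once, after the event loop (e.g. write and close the file).
        self.calls.append("endJob")


def run_job(analyzer, events):
    """Hypothetical driver that mimics what the framework does for us."""
    analyzer.beginJob()
    for event in events:
        analyzer.analyze(event)
    analyzer.endJob()
    return analyzer


toy = run_job(ToyAnalyzer(), events=[{"id": 1}, {"id": 2}, {"id": 3}])
print(toy.calls)
```

In the real analyzer the driver is the framework itself, so only the three member functions need to be written.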

Each EDAnalyzer class must be registered with the framework as a module (don't forget to include this line):

standard-Ntuplizer/plugins/ntuplizer.cc

Line 265 in 5e3b77f

DEFINE_FWK_MODULE(ntuplizer);

In the case of the ntuplizer we would like to initialize the output file in the beginJob() function, fill the information per event in the analyze() function, and finally save and close the file in the endJob() function once all the information has been filled.

Configuration cfi files and parameters

Parameters are values that are defined per sequence and configure how the code should run. For example, if we want to run the same EDAnalyzer on both data and Monte Carlo, the code needs to know whether the generator-level variables can be accessed, since trying to access them in data will likely produce an error. This can be handled with a parameter.

The parameter values are defined in a cfi file, e.g. python/ntuples_cfi.py, whose structure is as follows:

standard-Ntuplizer/python/ntuples_cfi.py

Lines 1 to 12 in ac4da89

    
    import FWCore.ParameterSet.Config as cms

    ntuples = cms.EDAnalyzer('ntuplizer',
        nameOfOutput = cms.string('output.root'),
        isData = cms.bool(True),
        EventInfo = cms.InputTag("generator"),
        RunInfo = cms.InputTag("generator"),
        BeamSpot = cms.InputTag("offlineBeamSpot"),
        displacedGlobalCollection = cms.InputTag("displacedGlobalMuons"),
        displacedStandAloneCollection = cms.InputTag("displacedStandAloneMuons"),
        displacedMuonCollection = cms.InputTag("slimmedDisplacedMuons")
    )

where ntuples is the name of the sequence (instance of EDAnalyzer) and 'ntuplizer' matches the name of the plugin we want to run.
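Since one plugin can have several sequences, a Monte Carlo variant could be derived from the data sequence in the same cfi file. This is a hedged sketch and not part of the repository: the name `ntuplesMC` and the output file name are assumptions. It relies on `clone()`, which copies a configured module while overriding the listed parameters.

```python
import FWCore.ParameterSet.Config as cms

# Data sequence, as defined in python/ntuples_cfi.py.
ntuples = cms.EDAnalyzer('ntuplizer',
    nameOfOutput = cms.string('output.root'),
    isData = cms.bool(True),
    EventInfo = cms.InputTag("generator"),
    RunInfo = cms.InputTag("generator"),
    BeamSpot = cms.InputTag("offlineBeamSpot"),
    displacedGlobalCollection = cms.InputTag("displacedGlobalMuons"),
    displacedStandAloneCollection = cms.InputTag("displacedStandAloneMuons"),
    displacedMuonCollection = cms.InputTag("slimmedDisplacedMuons")
)

# Hypothetical Monte Carlo sequence: same plugin, different parameter values.
# Parameters that are not listed keep the values they have in `ntuples`.
ntuplesMC = ntuples.clone(
    nameOfOutput = cms.string('output_mc.root'),
    isData = cms.bool(False)
)
```

Both sequences run the same 'ntuplizer' plugin; only the parameter values seen by the C++ code differ.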

Each parameter has a matching variable, declared as a private member of the EDAnalyzer class, that can be used while the code is running. For example, to indicate whether we are running on data samples we can define a bool variable isData:

standard-Ntuplizer/plugins/ntuplizer.cc

Line 73 in 5e3b77f

bool isData = false;

and we also define an isData parameter in the cfi file ntuples_cfi.py:

standard-Ntuplizer/python/ntuples_cfi.py

Line 5 in ac4da89

isData = cms.bool(True),

The isData variable is initialized with the value set in the cfi file. The values defined there can be accessed in the constructor through the iConfig argument:

standard-Ntuplizer/plugins/ntuplizer.cc

Line 115 in 5e3b77f

ntuplizer::ntuplizer(const edm::ParameterSet& iConfig) {

To access iConfig in other parts of the code it is useful to define an edm::ParameterSet variable, which in our case is called parameters. It is declared in the class definition as

standard-Ntuplizer/plugins/ntuplizer.cc

Line 49 in 5e3b77f

edm::ParameterSet parameters;

and initialized in the constructor as a copy of iConfig:

standard-Ntuplizer/plugins/ntuplizer.cc

Line 119 in 5e3b77f

parameters = iConfig;

Then we can assign the correct value to isData in the beginJob() function, before the event loop starts:

standard-Ntuplizer/plugins/ntuplizer.cc

Line 147 in 5e3b77f

isData = parameters.getParameter<bool>("isData");

Configuration cfg files

The configuration cfg file is what cmsRun executes: it runs the plugins with the sequences defined in the cfi files, as described in the section "How to run".
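The contents of test/runNtuplizer_cfg.py are not reproduced here. As a rough guide, a cfg file for this kind of setup usually creates a cms.Process, declares the input source, loads the sequence and schedules it in a cms.Path. The sketch below is an illustration under assumptions rather than the file shipped with the repository: the process name, the module path passed to process.load and the input file name are all placeholders.

```python
import FWCore.ParameterSet.Config as cms

# Process name is arbitrary (placeholder).
process = cms.Process("NTUPLIZER")

# Load the sequence defined in the cfi file.
# Assumed module path of the form <subsystem>.<package>.<cfi module>,
# following the install steps; adjust it to your own checkout.
process.load("Analysis.standard-Ntuplizer.ntuples_cfi")

# Input files (placeholder name; use a local path or an xrootd URL).
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring("file:sample.root")
)

# Number of events to process; -1 means all events in the input.
process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(-1))

# Schedule the analyzer: `ntuples` is the sequence name from the cfi file.
process.p = cms.Path(process.ntuples)
```

Parameters of the loaded sequence can still be overridden in the cfg file before the path is defined, for example by assigning a new value to process.ntuples.isData.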

How to run

This example runs on a file of the 2023 NoBPTX dataset that may need to be accessed through xrootd. Make sure that you have a valid proxy before running, creating one if needed with:

voms-proxy-init --voms cms

Then you can run the Ntuplizer with the setup configuration through the cfg file:

cmsRun test/runNtuplizer_cfg.py

Quick start: How to modify the analyzer

This section (to be completed) contains several examples of how to modify the existing analyzer.

How to add new variables of an existing collection

We first need to declare a new variable that will act as a container for the value we want to store, e.g. the number of displacedGlobalMuon tracks, ndgl. It is declared in the EDAnalyzer class as a private variable (although it could also be a global variable):

standard-Ntuplizer/plugins/ntuplizer.cc

Line 79 in 8656711

Int_t ndgl = 0;

We then need to link this variable's address, &ndgl, to the TTree branch. This is done at the beginning, in beginJob(), where the TTree is created:

standard-Ntuplizer/plugins/ntuplizer.cc

Line 147 in 8656711

tree_out->Branch("ndgl", &ndgl, "ndgl/I");

This variable will be saved inside the TTree once the Fill() command is executed:

standard-Ntuplizer/plugins/ntuplizer.cc

Line 244 in 8656711

tree_out->Fill();

So the value of this variable should be assigned before that call, like:

standard-Ntuplizer/plugins/ntuplizer.cc

Line 210 in 8656711

ndgl = 0;

standard-Ntuplizer/plugins/ntuplizer.cc

Line 216 in 8656711

ndgl++;

It is also possible to save an array of values. In this case we must define a container array that is long enough, since its length is fixed when the analyzer is declared and is the same for every event. For example, for the pt of the stored displacedGlobalMuon tracks we define a default array container of 200 entries (assuming no event has more than 200 displaced global muons):

standard-Ntuplizer/plugins/ntuplizer.cc

Line 80 in df839c1

Float_t dgl_pt[200] = {0.};

We then set the array length for the branch. If the array length depends on the event, we can make the branch length depend on another TTree variable. In this case we have as many pt measurements as displacedGlobalMuon tracks, i.e. ndgl:

standard-Ntuplizer/plugins/ntuplizer.cc

Line 148 in df839c1

tree_out->Branch("dgl_pt", dgl_pt, "dgl_pti[ndgl]/F");

Since the array name already evaluates to the address of its first element, the & is not needed. The pt values are filled per displacedGlobalMuon track in a loop:

standard-Ntuplizer/plugins/ntuplizer.cc

Line 213 in df839c1

dgl_pt[ndgl] = dgl.pt();
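Putting the pieces together, the per-event logic is: reset the counter, write each pt at index ndgl, then increment. The toy below mimics that pattern in plain Python with the standard array module instead of ROOT. `fill_event`, `MAX_DGL` and the input list are hypothetical, and the capacity check is an addition of this sketch rather than something the analyzer is stated to do.

```python
from array import array

MAX_DGL = 200  # capacity of the container, as in Float_t dgl_pt[200]

# Fixed-size buffer of 32-bit floats, allocated once and reused for every event.
dgl_pt = array('f', [0.0] * MAX_DGL)


def fill_event(track_pts):
    """Fill dgl_pt for one event and return ndgl (hypothetical helper)."""
    ndgl = 0  # reset the counter at the start of each event
    for pt in track_pts:
        if ndgl >= MAX_DGL:
            break  # extra guard: never write past the end of the buffer
        dgl_pt[ndgl] = pt  # store at the current index...
        ndgl += 1          # ...then move to the next slot
    return ndgl


ndgl = fill_event([12.5, 7.25, 3.0])
print(ndgl, list(dgl_pt[:ndgl]))
```

Only the first ndgl entries are meaningful for a given event; later slots may still hold values from previous events, which is why the branch length is tied to ndgl.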

How to read a new collection

To read a collection we need to know the class of the objects we want to access and the label of the collection itself. If you don't know this information, this command is useful:

edmDumpEventContent sample.root > eventcontent.txt

For example, to access displaced muons in MiniAOD we need to know that the name of the collection is slimmedDisplacedMuons and that these are saved as pat::Muon objects.

Then, we need to define a Token and a Handle in the EDAnalyzer class declaration as private variables:

standard-Ntuplizer/plugins/ntuplizer.cc

Lines 66 to 67 in 5e3b77f

    
    edm::EDGetTokenT<edm::View<pat::Muon> > dmuToken;
    edm::Handle<edm::View<pat::Muon> > dmuons;

The Token is initialized in the constructor with the consumes method, which takes the type and the label of the collection:

standard-Ntuplizer/plugins/ntuplizer.cc

Line 126 in 5e3b77f

    
dmuToken = consumes<edm::View<pat::Muon> >(parameters.getParameter<edm::InputTag>("displacedMuonCollection"));

In this case the name of the collection is given as a parameter in the cfi file with the name of displacedMuonCollection:

standard-Ntuplizer/python/ntuples_cfi.py

Line 11 in fa2da2d

displacedMuonCollection = cms.InputTag("slimmedDisplacedMuons")

Then, we use the Token to load the collection (per event) into the Handle:

standard-Ntuplizer/plugins/ntuplizer.cc

Line 205 in 5e3b77f

iEvent.getByToken(dmuToken, dmuons);

This collection can then be accessed inside analyze() through the Handle and iterated over much like an std::vector of pat::Muon.
