Plotting and Cutflow Software

Introduction

We have some standard programs to make plots and dump cutflows from the ttH ntuples. They use TTree::Draw() to draw quantities or count events with the appropriate cross sections and corrections applied. The cuts and plots are specified in files written in YAML syntax (see a very simple introduction here).

Installing the Software

The software has been separated from other code and is now its own standalone package. You need to check out the SimplePlottingCutflow package, e.g.

lsetup git
kinit cern_user_name@CERN.CH
git clone https://:@gitlab.cern.ch:8443/UTAustin/SimplePlottingCutflow.git Run2Plotting
cd Run2Plotting
lsetup "asetup 21.0.38,Athena"       # setup your favorite Athena release

You will need to set up an Athena release to use the code. Unfortunately it cannot run in a pure RootCore environment yet because we depend on libraries not available from RootCore.

Within the package there are two subdirectories. The subdirectory plotting contains all the scripts and configuration files. The subdirectory XsectionInput contains the configuration files for the sample cross sections, k-factors, filter efficiencies, and "priorities" (only priority 1 samples are included by default). It also contains a script that converts the .txt files (which are nicely human-readable) to the .yaml files that the scripts actually use.

The scripts we're going to use are called do_cutflow.py and dump_plots.py. They're part of a broader suite of programs based on stack.py, which allows you to do interactive querying and plotting of the MC and data from the command line. Basically do_cutflow.py and dump_plots.py automate calls to stack.py.

The Input Data

You can use the files in /data_ceph/harish/v29_aug/l30tau/mc_sys. Any output of our ntupler will be compatible with this software (see 13 TeV tth Analysis with MultiLepAnalysisNtupler for instructions to build and run that ntupler). The plotting software expects the MC files to have names of the form "DSID.root" and the data files to have names of the form "period*.root". All files should be in the same directory.
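Before running the scripts it can be useful to check that a directory follows the naming convention described above. The following is a minimal sketch, not part of the package: the function name classify_inputs is hypothetical, and it assumes that DSIDs are purely numeric.

```python
import os
import re

# Assumption: a DSID is a string of digits, e.g. "410000.root" (illustrative value)
MC_PATTERN = re.compile(r"^(\d+)\.root$")
# Data files are named "period*.root", e.g. "periodA.root"
DATA_PATTERN = re.compile(r"^period.*\.root$")

def classify_inputs(filedir):
    """Split the files in filedir into MC (keyed by DSID), data, and anything else."""
    mc, data, other = {}, [], []
    for name in sorted(os.listdir(filedir)):
        match = MC_PATTERN.match(name)
        if match:
            mc[int(match.group(1))] = os.path.join(filedir, name)
        elif DATA_PATTERN.match(name):
            data.append(os.path.join(filedir, name))
        else:
            # Files that match neither pattern probably do not follow the convention
            other.append(name)
    return mc, data, other
```

Anything that ends up in the third list is worth a second look before starting a long job.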

The Configuration and Cross Section Files

The cross sections for all processes are specified in XsectionInput/Xsection13TeV_tth_bkg_v1.yaml and Xsection13TeV_tth_sig_v1.yaml. These are autogenerated from the corresponding .txt files in the same directory, which are much easier to read. By default the "priority 1" samples are the ones that are used, although in practice we override these choices with some regularity. When you run either the cutflow dumper or the plotter, they will first print out all the MC files they have loaded, so you can keep track of what is being done.
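To make the role of these numbers concrete, here is a sketch of the conventional way a cross section, k-factor and filter efficiency are combined into a per-event normalization, together with a priority filter. This is an assumption about the standard procedure, not a transcription of what the scripts do internally; the function names, the dictionary keys and all numerical values are hypothetical.

```python
def mc_normalisation(xsec_pb, kfactor, filter_eff, lumi_ipb, sum_weights):
    """Scale factor that normalises a sample to an integrated luminosity.

    xsec_pb     : generator cross section in pb
    lumi_ipb    : integrated luminosity in pb^-1
    sum_weights : sum of generator weights over all generated events
    """
    if sum_weights <= 0:
        raise ValueError("sum of generator weights must be positive")
    return xsec_pb * kfactor * filter_eff * lumi_ipb / sum_weights

def select_priority(samples, priority=1):
    """Keep only the samples with the requested priority (hypothetical 'priority' key)."""
    return [s for s in samples if s["priority"] == priority]

# Hypothetical numbers, for illustration only
scale = mc_normalisation(xsec_pb=0.5, kfactor=1.2, filter_eff=0.5,
                         lumi_ipb=1000.0, sum_weights=100000.0)

samples = [{"dsid": 1, "priority": 1}, {"dsid": 2, "priority": 2}]
default_samples = select_priority(samples)
```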

You are unlikely to ever need to edit plotting/config_13TeV.yaml, but for completeness, this is the file that maps the process names given in the cross section files to categories of processes (e.g. "single top" or "diboson"). Effectively it groups the DSIDs and configures how they will be displayed in the cutflows and plots (and what colors the histograms will be). The color choices are the result of a carefully negotiated agreement, so don't touch them unless really necessary. This file also contains weights for various Alpgen Z samples, which you can mostly ignore.

If you need to tweak DSIDs for input files (e.g. you want to use an alternate MC for some process), you will likely need to add a command line argument to common_sample_arguments.py and configure the DSIDs in stack_defs.py (see those files for examples of doing this).

Making Cutflows

The cuts are specified in yaml files, in lists like:

- name: Three leptons, Mll
  cut: trilep_type>0&&passEventCleaning&&(lep_Pt_0>10e3&&lep_Pt_1>20e3&&lep_Pt_2>20e3)&&(top_hfor_type!=4)&&Mll01>12e3&&Mll02>12e3&&(abs(lep_ID_1)==13||lep_isVeryTightLH_1)&&(abs(lep_ID_2)==13||lep_isVeryTightLH_2)
- name: Trigger
  cut: trilep_type>0&&passEventCleaning&&(lep_Pt_0>10e3&&lep_Pt_1>20e3&&lep_Pt_2>20e3)&&(top_hfor_type!=4)&&Mll01>12e3&&Mll02>12e3&&(abs(lep_ID_1)==13||lep_isVeryTightLH_1)&&(abs(lep_ID_2)==13||lep_isVeryTightLH_2)&&(lep_Match_EF_mu24i_tight_0||lep_Match_EF_mu36_tight_0||lep_Match_EF_e24vhi_medium1_0||lep_Match_EF_e60_medium1_0||lep_Match_EF_mu24i_tight_1||lep_Match_EF_mu36_tight_1||lep_Match_EF_e24vhi_medium1_1||lep_Match_EF_e60_medium1_1||lep_Match_EF_mu24i_tight_2||lep_Match_EF_mu36_tight_2||lep_Match_EF_e24vhi_medium1_2||lep_Match_EF_e60_medium1_2)

Each cut is specified as a new list item ( - ) which contains a name (the description that will appear in the cutflow TeX table) and the cut definition.
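Because every entry carries its full cut string, a sequential cutflow repeats the earlier cuts in each later entry, as in the Trigger entry above. One way to avoid copy-and-paste errors is to generate the list from the incremental cuts. The sketch below is a hypothetical helper, not part of the package; it writes the YAML text by hand and assumes the names and cuts contain nothing that YAML would need quoted (such as ": " or " #").

```python
def cumulative_cutflow(steps):
    """Turn (name, extra_cut) pairs into entries whose cut strings accumulate."""
    entries, so_far = [], []
    for name, extra in steps:
        so_far.append("(%s)" % extra)
        entries.append({"name": name, "cut": "&&".join(so_far)})
    return entries

def to_yaml_text(entries):
    """Format the entries as a YAML list of name/cut pairs."""
    lines = []
    for entry in entries:
        lines.append("- name: %s" % entry["name"])
        lines.append("  cut: %s" % entry["cut"])
    return "\n".join(lines) + "\n"

# Cut fragments taken from the example above; the step names are illustrative
steps = [
    ("Three leptons", "trilep_type>0&&passEventCleaning"),
    ("Lepton pT", "lep_Pt_0>10e3&&lep_Pt_1>20e3&&lep_Pt_2>20e3"),
    ("Mll", "Mll01>12e3&&Mll02>12e3"),
]
entries = cumulative_cutflow(steps)
yaml_text = to_yaml_text(entries)
```

The resulting text can be saved to a file and passed to do_cutflow.py like any hand-written configuration.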

The simplest execution of the code is
DISPLAY="" python do_cutflow.py standard_3l_cutflow.yaml --filedir '/data_ceph/harish/v29_aug/l30tau/mc_sys' --texout mycutflow.tex
This will dump the expected yields for the cuts specified in standard_3l_cutflow.yaml to the output file mycutflow.tex. Since these cuts do not need to have any particular relationship to each other, you can make cutflows, or scan alternative signal regions, or whatever you would like to do.
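If you want to scan several cut files in one go, the command above can be wrapped in a small driver. This is only a sketch: the helper names are hypothetical, it uses only the --filedir and --texout options shown above, and it assumes it is run from the directory containing do_cutflow.py.

```python
import os
import subprocess

def cutflow_command(config, filedir, texout):
    """Build the do_cutflow.py command line as an argument list."""
    return ["python", "do_cutflow.py", config, "--filedir", filedir, "--texout", texout]

def run_cutflows(configs, filedir):
    """Run do_cutflow.py once per configuration file, writing <config name>.tex."""
    env = dict(os.environ, DISPLAY="")  # same effect as the DISPLAY="" prefix
    for config in configs:
        texout = os.path.splitext(os.path.basename(config))[0] + ".tex"
        subprocess.check_call(cutflow_command(config, filedir, texout), env=env)

# Example call (my_alternative_sr.yaml is a hypothetical second configuration):
# run_cutflows(["standard_3l_cutflow.yaml", "my_alternative_sr.yaml"],
#              "/data_ceph/harish/v29_aug/l30tau/mc_sys")
```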

To get a list of options, run
python do_cutflow.py - --help
(the extra hyphen is needed to tell ROOT not to interpret the --help itself). These options let you change the MC that is being used, tweak the output of the cutflow script, or change the event weights that are being used (this is important for running systematic variations, but not for general running).

Some standard configuration files:

standard_3l_cutflow.yaml: shows the standard 3l cutflow.

Dumping Plots

Two different things need to be specified for plots:

which selections (cuts) you want to apply for the events to appear in the plots,
which plots you want to make.

These are specified in two sections of the yaml file:

cuts:
- name: standardSR_l3
  label: 3l
  cut: trilep_type>0&&passEventCleaning&&(lep_Pt_0>10e3&&lep_Pt_1>20e3&&lep_Pt_2>20e3)&&(top_hfor_type!=4)&&Mll01>12e3&&Mll02>12e3&&(abs(lep_ID_1)==13||lep_isVeryTightLH_1)&&(abs(lep_ID_2)==13||lep_isVeryTightLH_2)&&(lep_Match_EF_mu24i_tight_0||lep_Match_EF_mu36_tight_0||lep_Match_EF_e24vhi_medium1_0||lep_Match_EF_e60_medium1_0||lep_Match_EF_mu24i_tight_1||lep_Match_EF_mu36_tight_1||lep_Match_EF_e24vhi_medium1_1||lep_Match_EF_e60_medium1_1||lep_Match_EF_mu24i_tight_2||lep_Match_EF_mu36_tight_2||lep_Match_EF_e24vhi_medium1_2||lep_Match_EF_e60_medium1_2)&&abs(total_charge)==1&&((nJets_OR_MV1_70>=1&&nJets_OR>=4)||(nJets_OR_MV1_70>=2&&nJets_OR==3))&&(lep_ID_0!=-lep_ID_1||(Mll01<81e3||Mll01>101e3))&&(lep_ID_0!=-lep_ID_2||(Mll02<81e3||Mll02>101e3))
- name: standardSR_l4Zdepleted
  label: 4l Z depleted
  cut: passEventCleaning&&(top_hfor_type!=4)&&quadlep_type>0&&(lep_Pt_0>25e3&&lep_Pt_1>15e3)&&passTriggerMatch&&abs(total_charge)==0&&minOSSFMll==0&&passZVeto&&nJets_OR>=2&&nJets_OR_MV1_70>=1&&100e3<Mllll0123&&Mllll0123<500e3

for cuts, and

plots:
    - name: lep_Pt_0
      x: lep_Pt_0/1e3
      xlabel: 'p_{T}(lepton 0)'
      rng: [0,200]
      nbins: 20
      units: GeV
    - name: 'lep_Eta_0'
      x: 'lep_Eta_0'
      xlabel: '#eta(lepton 0)'
      rng: [-3,3]
      nbins: 20

for the plot specifications.
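To show how the fields of a plot entry relate to TTree::Draw(), the sketch below builds the two strings that Draw() takes (the variable expression with a histogram booking, and the selection) from one plot entry and one cut. It illustrates the mapping only and is not necessarily how dump_plots.py does it; the function names, the h_ histogram prefix and the weight argument are assumptions.

```python
def draw_arguments(plot, cut, weight="1"):
    """Return (varexp, selection) in the form accepted by TTree::Draw."""
    lo, hi = plot["rng"]
    # "expr>>name(nbins,xmin,xmax)" books a histogram with the requested binning
    varexp = "%s>>h_%s(%d,%g,%g)" % (plot["x"], plot["name"], plot["nbins"], lo, hi)
    # Multiplying the weight by the cut gives weighted entries for passing events
    selection = "(%s)*(%s)" % (weight, cut)
    return varexp, selection

def axis_title(plot):
    """Append the units to the axis label when they are given."""
    units = plot.get("units")
    if units:
        return "%s [%s]" % (plot["xlabel"], units)
    return plot["xlabel"]

# The first plot entry from the example above, written as a Python dict
plot = {"name": "lep_Pt_0", "x": "lep_Pt_0/1e3", "xlabel": "p_{T}(lepton 0)",
        "rng": [0, 200], "nbins": 20, "units": "GeV"}
varexp, selection = draw_arguments(plot, "trilep_type>0&&passEventCleaning")
```

With PyROOT available, the two strings could then be passed as tree.Draw(varexp, selection).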

The simplest running of the code is
DISPLAY="" python dump_plots.py standardCR_fornote.yaml --filedir '/data_ceph/harish/v29_aug/l30tau/mc_sys' --outdir output_dir_for_plots/ --texout standardCR_fornote.tex
Many of the options to dump_plots.py are the same as for do_cutflow.py (especially regarding the MC samples that are used).

Some useful configuration files:

plotconfigs/standardCR_13TeV.yaml: control regions
