Difference: AnalysisTutorial (61 vs. 62)

Revision 62 - 2014/06/16 - Main.RyanKelley


Analysis Tutorial

  Open the ROOT file and try playing with the following examples:
[rwk7t@tensor lesson1]$ root ../data/tracking_ntuple.root
root [0] Attaching file ../data/tracking_ntuple.root as _file0...
 

List the branches in the tree:

 root [2] tree->Print()

List the branches in the tree with a less verbose printout:

 root [3] tree->GetListOfBranches()->ls()

Draw the tracking particles' pT.

root [4] tree->Draw("tps_p4.pt()")
(figure: tps_p4_pt_ex1.png)

On the previous plot, the automatic binning choice was suboptimal since it tries to include everything in the histogram range. To specify the binning explicitly:

root [5] tree->Draw("tps_p4.pt()>>(100,0,10)")
(figure: tps_p4_pt_ex2.png)

In order to keep a handle to the histogram for later manipulation, you can name the output histogram. You can then perform subsequent operations on it.

root [7] tree->Draw("tps_p4.pt()>>h1(100,0,10)")
root [8] h1->SetLineColor(kRed)
root [9] h1->SetTitle("tracking particle p_{T};p_{T} (GeV);A.U.")
(figure: tps_p4_pt_ex3.png)

To make a selection, use the second argument of TTree::Draw. This is also an example of how to overlay two plots.

root [12] tree->Draw("tps_p4.pt()>>h_pt_barrel(100,0,10)", "fabs(tps_p4.eta())<1.1");
root [13] h_pt_barrel->Draw();
root [14] h_pt_barrel->SetLineColor(kBlue);
(figure: tps_p4_pt_overlay.png)

Now at this point, you may be sick of typing in commands every time. Time to use a macro (see chapter 7 of the User's Guide). Consider the following macro (https://github.com/kelleyrw/AnalysisTutorial/tree/master/lesson1/macros/overlay.C), which is the same as the previous example.

{
    tree->Draw("tps_p4.pt()>>h_pt_barrel(100,0,10)", "fabs(tps_p4.eta())<1.1");
    h_pt_barrel->Draw();
 

To run it, you open the ROOT file and then run the macro:

[rwk7t@tensor lesson1]$ root ../data/tracking_ntuple.root
root [0] Attaching file ../data/tracking_ntuple.root as _file0...
 The numerator selection is the same as the denominator selection except that we require the tracking particle to be matched to a reconstructed track.

The following macro produces this plot: https://github.com/kelleyrw/AnalysisTutorial/tree/master/lesson1/macros/eff.C

[rwk7t@tensor lesson1]$ root ../data/tracking_ntuple.root
root [0] Attaching file ../data/tracking_ntuple.root as _file0...
As an exercise, we're going to make this same plot two more times. In the next example, we're going to compile the macro. In general it is a good idea to compile your macros because the interpreter (CINT) is not robust and can incorrectly interpret even simple code. Also, compiling will greatly reduce the execution time if the macro is doing anything significant. See chapter 7 of the User's Guide for more details. The following macro produces the same plot as above but is meant to be run compiled: https://github.com/kelleyrw/AnalysisTutorial/tree/master/lesson1/macros/eff_compiled.C
[rwk7t@tensor lesson1]$ root
root [0] .L macros/eff_compiled.C++
Info in <TUnixSystem::ACLiC>: creating shared library /Users/rwk7t/temp/.rootbuild//Users/rwk7t/Development/newbies/lesson1/./macros/eff_compiled_C.so
The final example from this lesson is a very simple "looper". A looper is simply a macro that manually loops over the entries in a TTree rather than relying on TTree::Draw(). This has the advantage of speed, since there will be only one loop over the tree instead of one per plot. It also has the flexibility to implement arbitrary logic, whereas TTree::Draw, while flexible, can still be limited in what you can calculate and plot.

The following macro is a simple looper: https://github.com/kelleyrw/AnalysisTutorial/tree/master/lesson1/macros/eff_looper.C. It breaks up the process into two steps: creating the numerator and denominator plots (CreatePlots) and doing the division to make the efficiency (FinalPlots). The point is that making the numerator/denominator is the "slow" part -- you don't want to remake these plots if all you need is to change a label on the efficiency plot. This is a simple example of breaking up the work flow to keep the analysis running efficiently. You can, of course, make a function that simply calls both if you want the option of doing it in one step (exercise left to the reader).

root [2] .L macros/eff_looper.C++
Info in <TUnixSystem::ACLiC>: creating shared library /Users/rwk7t/temp/.rootbuild//Users/rwk7t/Development/newbies/lesson1/./macros/eff_looper_C.so
root [3] CreatePlots("../data/tracking_ntuple.root", "plots/counts_vs_eta_looper.root")
 

  • Note that I'm using the CMSSW convention that *.C files are intended to be ROOT compiled macros and *.cc files are supposed to be fully C++ compliant code.
The main analysis is done in TrackingEfficiencyAnalysis.cc. This is a C++ class that holds all of the metadata and runs the analysis. The main reason to make this a class is to keep all of the relevant variables and data together. If this were a set of functions, you would have to pass a bunch of parameters back and forth -- a class is more suited for this purpose. In the class definition below:

class TrackingEfficiencyAnalysis
{
    public:
I've provided two ways to build the code. The first uses ROOT's wrapper around gcc called ACLiC (the .L macro.C++ mechanism). The second uses GCC directly, to show you what is under the hood. There is no reason you have to stick with these two methods, and each has its pluses and minuses; however, this is all that is really needed to get started on analysis.

ACLiC
I've provided a simple macro to compile the code called compile.C. To compile this analysis code, you simply run the macro in ROOT:

root [0] .x compile.C+
Info in <TUnixSystem::ACLiC>: creating shared library /Users/rwk7t/temp/.rootbuild//Users/rwk7t/Development/newbies/lesson2/./compile_C.so
Info in <TUnixSystem::ACLiC>: creating shared library /Users/rwk7t/Development/newbies/lesson2/lib/libHistTools.so
 

When you are ready to run the code, there is a simple wrapper called run_all.C that compiles the code, creates a TrackingEfficiencyAnalysis object, and runs it:

root [0] .x macros/run_all.C
Info in <TUnixSystem::ACLiC>: creating shared library /Users/rwk7t/temp/.rootbuild//Users/rwk7t/Development/newbies/lesson2/./compile_C.so
[TrackingEfficiencyAnalysis::ScanChain] finished processing 1000 events
To see what is really going on, I provided a simple script that contains a single line to compile this code as a standalone program (rather than running in ROOT/CINT). The main reason is to demonstrate that it is possible to compile ROOT objects and classes without the interpreter (CINT) at all. You will find the interpreter has limitations and sometimes you may want to go around it. One could easily extend this using GNU Make or even a fully fleshed out build system or IDE (Eclipse, Boost Build, CMake, CMSSW's scram, etc.).

To compile from the command prompt:

[rwkelley@uaf-7 lesson2]$ ./compile.sh
g++ -O2 source/TrackingEfficiencyAnalysis.cc source/TRKEFF.cc source/HistTools.cc -o tracking_eff_analysis -pthread -m64 -I/code/osgcode/imacneill/root/05.34.07/include -L/code/osgcode/imacneill/root/05.34.07/lib -lCore -lCint -lRIO -lNet -lHist -lGraf -lGraf3d -lGpad -lTree -lRint -lPostscript -lMatrix -lPhysics -lMathCore -lThread -pthread -lm -ldl -rdynamic -lGenVector -Iinclude

This produces an executable program called "tracking_eff_analysis".

[rwk7t@tensor lesson2]$ ./tracking_eff_analysis
[TrackingEfficiencyAnalysis::ScanChain] finished processing 1000 events
The output plots will be put in the plots directory. I've provided a macro called FormattedPlots.C to produce a nice version of the residual plots. First you must load the HistTools library so they can be used in the macro:
root [0] .L lib/libHistTools.so
root [1] .x macros/FormattedPlots.C+
Info in <TUnixSystem::ACLiC>: creating shared library /Users/rwk7t/temp/.rootbuild//Users/rwk7t/Development/newbies/lesson2/./macros/FormattedPlots_C.so
 
source /code/osgcode/cmssoft/cms/cmsset_default.sh
export SCRAM_ARCH=slc5_amd64_gcc462
    • for this exercise, use CMSSW_5_3_2_patch4 (this is what we currently use in cms2)
    • setup a release by doing the following:
cmsrel CMSSW_5_3_2_patch4
pushd CMSSW_5_3_2_patch4/src
cmsenv
popd
    • It is often convenient to create a CMSSW project with a special name, so that its contents are more easily recognized by you. For example, one could
scramv1 p -n CMSSW_5_3_2_patch4_Tutorial CMSSW CMSSW_5_3_2_patch4
pushd CMSSW_5_3_2_patch4_Tutorial/src
cmsenv
 
  • Get a ROOT file. You can find them on DBS (or DAS) and use xrootd to either open them or copy them:
    • Find a file that you want to download
 dbsql find file where dataset=/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
    • Open it in xrootd
 root root://xrootd.unl.edu//store/mc/Summer12_DR53X/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball/AODSIM/PU_S10_START53_V7A-v1/0000/00037C53-AAD1-E111-B1BE-003048D45F38.root
    • Or copy it locally
 xrdcp root://xrootd.unl.edu//store/mc/Summer12_DR53X/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball/AODSIM/PU_S10_START53_V7A-v1/0000/00037C53-AAD1-E111-B1BE-003048D45F38.root dy.root
    • If you don't have a grid certificate yet, then use the file I copied locally
 root /nfs-7/userdata/edm/53X/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball_AODSIM_PU_S10_START53_V7A-v1.root
[rwkelley@uaf-7 lesson3]$ root /nfs-7/userdata/edm/53X/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball_AODSIM_PU_S10_START53_V7A-v1.root
root [0] Attaching file /nfs-7/userdata/edm/53X/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball_AODSIM_PU_S10_START53_V7A-v1.root as _file0...
What were all these errors? The cause is that CINT/ROOT is independent of CMSSW and thus does not natively understand any of the CMSSW-specific classes (reco::Track, edm::EventAuxiliary, TrackingParticle, etc.). If you are running fully compiled, this is not an issue; however, if you want to look at this file via CINT, you will need to load the dictionaries so that ROOT will understand these classes. Fortunately, this can be accomplished by setting up CMSSW's FWLite:
{
    // Set up FW Lite for automatic loading of CMS libraries
    // and data formats. As you may have other user-defined setup
Before you load the file, load FWLite. Now the warnings should disappear.
[rwkelley@uaf-7 lesson3]$ root /nfs-7/userdata/edm/53X/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball_AODSIM_PU_S10_START53_V7A-v1.root
[rwkelley@uaf-7 lesson3]$ root -b
root [0] .x macros/load_fwlite.C
 

Why do we care?

C++ is object oriented. This encourages modular code that can be spread among multiple files. Unfortunately, nobody updated the old file management system from C to handle this, so C++ is stuck with quite an antiquated system -- very crude compared to more recently developed languages. The next subsections illustrate some of the issues.
Simple one file program
Consider the following simple program in a file called hello.cpp in the src directory.

#include <iostream>

int main()
{
    std::cout << "hello world!" << std::endl;
    return 0;
}

 

Use g++ to compile

 [rwk7t@tensor lesson4]$ g++ src/hello.cpp

This produces an exe called a.out. To run:

[rwk7t@tensor lesson4]$ ./a.out
hello world!

You can control the name of the output exe with the option -o to g++. Here we put the exe in the bin directory:

[rwk7t@tensor lesson4]$ g++ src/hello.cpp -o bin/hello
[rwk7t@tensor lesson4]$ ./bin/hello
hello world!
 
Multiple File programs

Now what happens if we want to write some reusable functions and put them in separate files? How does this change our building procedure? Consider the function Module1::PrintHello in a separate file called module1.cpp:

#include <iostream>

namespace Module1
{
    void PrintHello()
    {
        std::cout << "hello from module1!" << std::endl;
    }
}

 

And the main program that goes with it is called test1.cpp:

#include "module1.cpp"

int main()
{
    Module1::PrintHello();
    return 0;
}

 

Now we compile into an exe in the bin directory:

[rwk7t@tensor lesson4]$ g++ src/test1.cpp -o bin/test1
[rwk7t@tensor lesson4]$ ./bin/test1
hello from module1!

Everything here looks fine. Now consider another function added to our program which uses the Module1::PrintHello function from module1.cpp -- call this module2.cpp:

#include <iostream>
#include "module1.cpp"
 

Now our main program changes to the following which is saved in test2.cpp:

#include "module1.cpp"
#include "module2.cpp"
 

Now compiling will give you an error:

[rwk7t@tensor lesson4]$ g++ src/test2.cpp -o bin/test2
In file included from src/module2.cpp:2:0,
                 from src/test2.cpp:2:
The reason for the error is that module1.cpp has been included twice: once in test2.cpp and once in module2.cpp. Recall from C++ that you can declare a function or class many times but you can only define it once (the One Definition Rule). Because we are including the definition twice, we get a compile error.

To fix the problem, we forward declare the two functions in test3.cpp:

namespace Module1 { void PrintHello(); }
namespace Module2 { void PrintHello(); }
 

We also need to forward declare the function in "module2.cpp":

#include <iostream>

namespace Module1 { void PrintHello(); }

 

And when we compile, we need to supply the definitions to the compiler. This is done by adding all three files to the g++ call; g++ then knows to compile all three files and link them together:

[rwk7t@tensor lesson4]$ g++ src/test3.cpp src/module1.cpp src/module2.cpp -o bin/test3
[rwk7t@tensor lesson4]$ ./bin/test3
hello from module1!
Now, editing the files and forward declaring everything is quite cumbersome, so the convention is to factor out forward declarations into interface or "header" files, usually denoted by *.h or *.hpp. So, refactoring the modules and main program:

module1.h:

#ifndef MODULE1_H
#define MODULE1_H
Notice that the header file has its code surrounded by some C preprocessor commands (called a "header guard"). This is really only needed for class definitions, since these are typically defined in header files; the guard protects against multiple class definitions. In this particular example you wouldn't need one, but it's good practice.

module1.cpp:

#include "module1.h"
#include <iostream>
 

module2.h:

#ifndef MODULE2_H
#define MODULE2_H
 

module2.cpp:

#include "module2.h"
#include "module1.h"
#include <iostream>
 

test4.cpp:

#include "module1.h"
#include "module2.h"
It is customary to put the header files in a different folder than the source code. This is to provide a "user interface" that is easy to parse and not polluted with implementation details. This is merely a convention, but it is standard practice. Because the headers in this example live in the include directory, we need to tell g++ where to find them. This is done with the -I switch to g++:
[rwk7t@tensor lesson4]$ g++ src/test4.cpp -I include src/module1.cpp src/module2.cpp -o bin/test4
[rwk7t@tensor lesson4]$ ./bin/test4
hello from module1!
To save time, you could compile each piece separately; then only the pieces that changed need to be recompiled before linking, which can save some build time:
[rwk7t@tensor lesson4]$ g++ -c src/module1.cpp -Iinclude -o obj/module1.o
[rwk7t@tensor lesson4]$ g++ -c src/module2.cpp -Iinclude -o obj/module2.o
[rwk7t@tensor lesson4]$ g++ -c src/test4.cpp -Iinclude -o obj/test4.o
A static library is essentially a collection of the binary object files for a set of related code, combined into one file that can be linked against to create your main program. The main program pulls in the binaries that are relevant and absorbs the binary content into the exe file. In the following example, we pull module1 and module2 into a single reusable static library:
[rwk7t@tensor lesson4]$ g++ -c src/module1.cpp -Iinclude -o obj/module1.o
[rwk7t@tensor lesson4]$ g++ -c src/module2.cpp -Iinclude -o obj/module2.o
[rwk7t@tensor lesson4]$ ar rvs lib/libmodule.a obj/module1.o obj/module2.o
 NOTE: the file extension convention for static libraries is .a for "archive".

Now you can link against this reusable static library:

[rwk7t@tensor lesson4]$ g++ -Iinclude src/test4.cpp lib/libmodule.a
[rwk7t@tensor lesson4]$ ./bin/test4
hello from module1!
 

There is an interface that g++ uses to link against libraries. The following is equivalent to the above:

[rwk7t@tensor lesson4]$ g++ -Iinclude src/test4.cpp -L lib -l module
[rwk7t@tensor lesson4]$ ./bin/test4
hello from module1!
Static libraries are great for small or intermediate size systems; however, since the binary code is copied into the exe, this can cause "code bloat" and make the exe very large (~GB). Also, if you use a hierarchical library scheme where some libs depend on other libs, then you are duplicating binary code and again bloating the binaries. The solution is to use dynamic libraries (also called shared libraries). This allows the exe to find the library at runtime. There is a slight initialization penalty, but it is not really noticeable on modern computers. The advantage is that the exe will be small and you don't have to copy binaries into other libs or exes. This works well for systems with "standard" shared libraries installed, so only the exe needs to be shipped. The downside to shared libraries is that now you have to keep track of where the dynamic libraries are, and if you wish to run the exe somewhere else (like the grid), you will have to send the appropriate libraries as well if they are not already installed there.

To create a dynamic lib:

[rwk7t@tensor lesson4]$ g++ -shared obj/module*.o -o lib/libmodule.so
[rwk7t@tensor lesson4]$ g++ -I include src/test4.cpp -L lib -l module
[rwk7t@tensor lesson4]$ ./bin/test4
There is no reason you have to use ROOT interactively through CINT -- it is really just a set of C++ libraries. Consider the following simple program called with_root.cpp:
#include "TH1F.h"
#include "TFile.h"
  If I try to compile this, I'll get an error since g++ doesn't know where to find the ROOT headers and libraries:
[rwk7t@tensor lesson4]$ g++ src/with_root.cpp -o bin/with_root
src/with_root.cpp:1:18: fatal error: TH1F.h: No such file or directory
compilation terminated.
  ROOT ships with a nice command line utility that will give you the appropriate g++ options to pass to your command:
[rwk7t@tensor lesson4]$ root-config --cflags --libs
-pthread -std=c++0x -m64 -I/usr/local/cmssw/osx107_amd64_gcc472/cms/cmssw/CMSSW_6_1_2/external/osx107_amd64_gcc472/bin/../../../../../../lcg/root/5.34.03-cms2/include -L/usr/local/cmssw/osx107_amd64_gcc472/cms/cmssw/CMSSW_6_1_2/external/osx107_amd64_gcc472/bin/../../../../../../lcg/root/5.34.03-cms2/lib -lCore -lCint -lRIO -lNet -lHist -lGraf -lGraf3d -lGpad -lTree -lRint -lPostscript -lMatrix -lPhysics -lMathCore -lThread -lpthread -Wl,-rpath,/usr/local/cmssw/osx107_amd64_gcc472/cms/cmssw/CMSSW_6_1_2/external/osx107_amd64_gcc472/bin/../../../../../../lcg/root/5.34.03-cms2/lib -lm -ldl

So to build a standalone exe with ROOT objects, you can do the following:

[rwk7t@tensor lesson4]$ g++ src/with_root.cpp `root-config --cflags --libs` -o bin/with_root
[rwk7t@tensor lesson4]$ ./bin/with_root
Make has a simple structure: a series of rules of the form:
target: dependency1 dependency2 ...
	command1
	command2
The goal of each rule is to create the target. If the target is older than any of its dependencies, then the commands are executed. You can then build up a series of interconnected rules to build your code. Consider the following example:
 # simple Makefile for test4

# target

 

The ultimate goal is to create the all rule:

[rwk7t@tensor lesson4]$ make all
g++ -c src/module1.cpp -Iinclude -o obj/module1.o
g++ -c src/module2.cpp -Iinclude -o obj/module2.o
Notice the rules get called in reverse order since nothing has been made yet. Also, note that if you just type make, the all rule is implied. This is convenient: whenever a file is out of date, make calls the appropriate rule, and so on, until the all rule is satisfied.

Also, there is a user defined rule clean which deletes all of the binary and library files:

[rwk7t@tensor lesson4]$ make clean
if [ -f lib/libmodule.a ]; then rm lib/libmodule.a; fi
if [ -f bin/test4 ]; then rm bin/test4; fi
 
ACLiC

For interactive work, ROOT ships with a compiler interface for CINT called ACLiC (the Automatic Compiler of Libraries for CINT).

 root [0] .L macro.C++
CMSSW uses the SCRAM system (Source Configuration, Release, And Management tool). The following twikis explain it in more detail:
    • https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookBuildFilesIntro
 
    • https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideBuildFile
    • More in depth info: https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideScram
To properly compile and link this analysis package, consider the following BuildFile.xml in $CMSSW_BASE/src/Ryan/Analysis:
 
Not all the code in CMSSW is an EDM plugin. Sometimes you want to just make a reusable library that other packages use (e.g. data formats or analysis tools). This is simple in SCRAM: you just don't declare an EDM_PLUGIN, and you "export" the lib:
 
Here, the implementation (below the squiggly line) of MyTools uses three different packages: ROOT, the Boost C++ libraries, and the GNU Scientific Library (GSL). The MyTools package uses these three packages in the implementation of the code but not necessarily in the interface. The interface (above the squiggly line) only uses ROOT. For example, consider the following pseudo-code from MyTools:

MyAwesomePlotMaker.h

#ifndef MYAWESOMEPLOTTOOLS_H
#define MYAWESOMEPLOTTOOLS_H
#include "TH1F.h"
 

MyAwesomePlotMaker.cc

#include "MyAwesomePlotMaker.h"
#include "TH1F.h"
#include "boost/shared_ptr.hpp"
  To properly link against these, the following lines are needed in the BuildFile.xml:
 
However, the user or "client" of the MyTools package does not need to know about Boost and GSL, since these binaries are linked dynamically to MyTools. However, whenever you use something from another package in the interface (header files) of your package, you will need to tell the client about it. For example, if in MyTools I included
 #include "TH1F.h"

Now, the client of MyTools also has to know about ROOT. If he just includes the MyTools package, he will get a linker error. To fix this, he needs to explicitly include ROOT as a dependency in the client's BuildFile.xml. This is not sustainable: the client would have to figure out that he needs to include ROOT whenever he uses MyTools. This is error prone and inelegant -- what if MyTools adds another dependency? If there are 1000 "clients" of MyTools, are you going to change the 1000 build files that included MyTools?

The way SCRAM deals with this is using "export". When I export my lib, I also export the dependencies that the client needs to link this lib:

 
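The original BuildFile.xml contents did not survive in this revision; the following is a hedged sketch of what such an exporting BuildFile.xml typically looks like (the tag names follow common SCRAM usage, but check them against the SWGuideBuildFile twiki linked above):

```xml
<use name="root"/>
<use name="boost"/>
<use name="gsl"/>
<export>
  <lib name="1"/>       <!-- export this package's own library -->
  <use name="root"/>    <!-- clients also need ROOT, since it appears in the interface -->
</export>
```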
The client now gets the ROOT dependency implicitly. The final BuildFile.xml to realize this picture will look like:
 
  Create a base level project directory:
mkdir cms2_project_example
cd cms2_project_example

Check out the AnalysisTutorial code:

 git clone https://github.com/kelleyrw/AnalysisTutorial.git

Create a CMSSW release area and setup the environment:

scramv1 p -n CMSSW_5_3_2_patch4_cms2example CMSSW CMSSW_5_3_2_patch4
cd CMSSW_5_3_2_patch4_cms2example/src
cmsenv
  Checkout and compile the CMS2 CORE and Dictionaries:
git clone https://github.com/cmstas/Dictionaries.git CMS2/Dictionaries
git clone https://github.com/cmstas/CORE.git CMS2/NtupleMacrosCore
cd CMS2/NtupleMacrosCore
Copy the analysis code over to the CMSSW area (for your own code, depending on how you set it up, you could just check it out directly):
 cp -r ../../AnalysisTutorial/cms2_examples/CMS2Project/* .

Compile the code one last time to make sure all the modules are compiled:

 scram b -j20
 
    • ExampleCMS2BabyMaker: an example baby maker (see next section)
    • ...add as many analyses as you need...
.
|____CMS2
| |____Dictionaries
There are two examples of running the looper. Interactively, use the non-compiled ROOT script:
cd Analysis/ExampleCMS2Looper
root -b -q -l macros/run_sample.C

Also, there is an example of a fully compiled executable which is defined in bin/run_sample.cc:

 run_sample
 
    • ExampleCMS2BabyMaker: an example baby maker
    • ...add as many analyses as you need...
.
|____CMS2
| |____Dictionaries
  The following is an example of how to run the baby maker:
 root -b -q "macros/create_baby.C(\"dyll\", \"/nfs-7/userdata/rwkelley/cms2/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball_Summer12_DR53X-PU_S10_START53_V7A-v1.root\", \"babies/dyll.root\", \"json/Merged_190456-208686_8TeV_PromptReReco_Collisions12_goodruns.txt\", -1, 0.082, 0)"
  Just call the script to execute the baby maker on all the datasets:
 ./scripts/create_babies_all.sh

examples on the ROOT interpreter
// raw
root [1] tree->Draw("gen_p4.mass()>>h1(150, 0, 150)", "is_gen_ee");
Info in <TCanvas::MakeDefCanvas>: created default TCanvas with name c1
root [2] cout << h1->Integral(0,-1) << endl;
 This is the same work flow as before with one exception. This was written in a newer CMSSW release area to take advantage of the newer compiler. The instructions are the same as here except use the following CMSSW release:

Create a CMSSW release area and setup the environment:

export SCRAM_ARCH=slc5_amd64_gcc472
scramv1 p -n CMSSW_6_1_2_cms2example CMSSW CMSSW_6_1_2
cd CMSSW_6_1_2_cms2example/src
 
    • ExampleCMS2BabyMaker: an example baby maker (see previous sub-sections)
    • LeptonSelectionStudy: A single lepton analysis to study the selections.
.
|____CMS2
| |____Dictionaries
  To compile the code, you use scram:
 scram b -j20

This code defines an executable called create_singlelep_baby. It uses Boost Program Options to parse command line arguments. To see the available arguments, use --help.

[rwk7t@tensor LeptonSelectionStudy]$ create_singlelep_baby --help
Allowed options:
  --help        print this menu
  The following is an example of how to run the baby maker:
[rwk7t@tensor LeptonSelectionStudy]$ create_singlelep_baby --sample dyll
[create_singlep_baby] inputs:
sample = dyll
 
  • "QCD" → QCD dataset
If you run on another sample name, it should fail:
[rwk7t@tensor LeptonSelectionStudy]$ create_singlelep_baby --sample bogus
[create_singlep_baby] Error: failed...
[create_singlelp_baby] sample bogus is not valid.
  Just call the script to execute the baby maker on all the datasets:
 ./scripts/create_babies_all.py
 
Load Interactive HistTools

In order to maximize code-reuse and make plotting easier, use the HistTools interactively in CINT. To load the tools, add the following lines to your .rootlogon.C (or call it in each session):

// Load HistTools
gROOT->ProcessLine(".x $CMSSW_BASE/src/Packages/HistTools/macros/load_histtools.C");
 
  • CreatePlots → main program of sorts

Example:

[rwk7t@tensor LeptonSelectionStudy]$ root
Loading CMSSW FWLite.
Loading HistTools
Here I show a couple of examples of using scale1fb in TTree::Draw statements. Consider the following example code fragment:
TChain chain("Events");
chain.Add("/nfs-7/userdata/rwkelley/cms2/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball_Summer12_DR53X-PU_S10_START53_V7A-v1.root");
 
nevts_file -- the number of events in this cms2-level file

The result of this for lumi = 0.082 pb-1 is:

raw_count = 23332
scale1fb = 56.5367
scaled count = lumi * scale1fb * raw_count = 108167
scale1fb' = scale1fb x (N_CMS2 / N_AOD)

In code:

const double nevts_aod  = 30459503;
const double nevts_cms2 = 27137253;
const double cms2_filter_eff = nevts_cms2/nevts_aod;
 

This gives:

properly scaled count = lumi * scale1fb * raw_count * cms2_filter_eff = 96369.5
  Repeating the above exercise using the branches stored in the ntuple:
// scale1fb correction since we used a subset of the events
const double scale = nevts_cms2/nevts_file;
scale1fb for 1 file = evt_scale1fb x (N_cms2 / N_file)

The result is:

hist entries = 23332
scale1fb = 0.115984
applying weight = 0.082000*434.286380*evt_scale1fb*(Sum$(genps_status==3 && genps_id==11)>=1 && Sum$(genps_status==3 && genps_id==-11)>=1)
  The above example is useful for quick studies on the command line. Also, this can be done in looper code when filling histograms.
void DrellYanLooper::Analyze(const long event)
{
    // select e+e- events (NOTE: no test for Z)
Line: 1737 to 1737
 

The result is:

Changed:
<
<
>
>
 [rwkelley@uaf-7 DrellYan]$ dy_scale1fb_example
Loading CMSSW FWLite.
running drell-yan looper...
Line: 1786 to 1786
 

Single electron/muon data

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Data2012/CMSSW_5_3_2_patch4_V05-03-24/SingleMu_Run2012A-recover-06Aug2012-v1_AOD/merged/*.root");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Data2012/CMSSW_5_3_2_patch4_V05-03-24/SingleElectron_Run2012A-recover-06Aug2012-v1_AOD/merged/*.root");
Line: 1802 to 1802
 
    • We are using only a subset of the full dataset since this is an exercise and we want this to run interactively

Drell-Yan

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball_Summer12_DR53X-PU_S10_START53_V7A-v1/V05-03-23/merged_ntuple_1[0-9].root");

W + Jets → l + ν

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/WJetsToLNu_TuneZ2Star_8TeV-madgraph-tarball_Summer12_DR53X-PU_S10_START53_V7A-v1/V05-03-28/merged_ntuple_1[0-9].root");

tt(bar) → 2l + 2ν

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/TTJets_FullLeptMGDecays_8TeV-madgraph_Summer12_DR53X-PU_S10_START53_V7A-v2/V05-03-24/merged_ntuple_1[0-9].root");

tt(bar) → l + ν + jj

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/TTJets_SemiLeptMGDecays_8TeV-madgraph_Summer12_DR53X-PU_S10_START53_V7A_ext-v1/V05-03-24/merged_ntuple_1[0-9].root");

tt(bar) → hadronic

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/TTJets_HadronicMGDecays_8TeV-madgraph_Summer12_DR53X-PU_S10_START53_V7A_ext-v1/V05-03-24/merged_ntuple_1[0-9].root");

QCD muon enriched (use for μμ analysis). For now, let's ignore QCD for ee analysis since it requires more statistics than is practical without using the grid.

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/QCD_Pt_20_MuEnrichedPt_15_TuneZ2star_8TeV_pythia6_Summer12_DR53X-PU_S10_START53_V7A-v3/V05-03-18_slim/merged_ntuple_1[0-9].root");

WW → 2l + 2ν

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/WWJetsTo2L2Nu_TuneZ2star_8TeV-madgraph-tauola_Summer12_DR53X-PU_S10_START53_V7A-v1/V05-03-23/merged_ntuple_1[0-9].root");

WZ → 2l + 2q

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/WZJetsTo2L2Q_TuneZ2star_8TeV-madgraph-tauola_Summer12_DR53X-PU_S10_START53_V7A-v1/V05-03-23/merged_ntuple_1[0-9].root");

WZ → 3l + ν

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/WZJetsTo3LNu_TuneZ2_8TeV-madgraph-tauola_Summer12_DR53X-PU_S10_START53_V7A-v1/V05-03-23/merged_ntuple_1[0-9].root");

ZZ → 2l + 2ν

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/ZZJetsTo2L2Nu_TuneZ2star_8TeV-madgraph-tauola_Summer12_DR53X-PU_S10_START53_V7A-v3/V05-03-23/merged_ntuple_1[0-9].root");

ZZ → 2l + 2q

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/ZZJetsTo2L2Q_TuneZ2star_8TeV-madgraph-tauola_Summer12_DR53X-PU_S10_START53_V7A-v1/V05-03-23/merged_ntuple_1[0-9].root");

ZZ → 4l

Changed:
<
<
>
>
  TChain chain("Events");
chain.Add("/hadoop/cms/store/group/snt/papers2012/Summer12_53X_MC/ZZJetsTo4L_TuneZ2star_8TeV-madgraph-tauola_Summer12_DR53X-PU_S10_START53_V7A-v1/V05-03-23/merged_ntuple_1[0-9].root");
Line: 1907 to 1907
 
      • Liam: muon isolation (Friday)
  1. Due Friday 5/16: Implement a cut flow, kinematic and N-1 plots.
  2. Due Friday 5/27: Implement a cut flow, kinematic and N-1 plots (2nd iteration).
Changed:
<
<
  1. Due Friday 6/3: Implement all selections
  2. Due Friday 6/10: Slides of the Results at the level of a "full status report"
>
>
  1. Due Friday 6/13: Implement all selections
  2. Due Friday 6/20: Slides of the Results at the level of a "full status report"
 

Current Results

Line: 1954 to 1954
 
    • other dataset-dependent issues...
  • Write a script (bash, python, or whatever) to call the jobs in parallel on the uaf. I personally like to use python for this, but use whatever you are most comfortable with. The following is a pseudo-script to illustrate my point:
Changed:
<
<
>
>
 #!/bin/bash

nevts=-1

Line: 1987 to 1987
 
What branches should I use?
You can see what branches are available by:
Changed:
<
<
>
>
 Events->GetListOfAliases()->ls()
Line: 2014 to 2014
  It is often a good idea, if you can, to check your results in more than one way. You can do this on the ROOT command line as follows:
Changed:
<
<
>
>
 {
    // data
    TChain chain("Events");
 