This repository contains Jupyter notebooks with examples of how to use the CMS OpenData CNNPixelSeeds data set (see here).
First, clone the repository to your working directory. The required Python packages are listed in requirements.txt.
$ git clone git@github.com:cernopendata-datascience/CNNPixelSeedsMachineLearning.git
$ cd CNNPixelSeedsMachineLearning/
Then, after starting Jupyter Notebook with
$ jupyter notebook
you can start exploring the notebooks. A sample list of files is included in file_index.txt. The dataset.py module provides a helper class to access the dataset. The doublets_visualisation notebook helps you visualise and access the dataset, while ml_filtering contains a first example of how to apply ML techniques (BDTs or DNNs) to pixel seed filtering.
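As a minimal sketch, the sample file list could be read in Python as follows. It assumes that file_index.txt holds one file location per line, which is not confirmed here; `read_file_index` is a hypothetical helper and is not part of dataset.py.

```python
from pathlib import Path


def read_file_index(path="file_index.txt"):
    """Return the non-empty, non-comment lines of a file index as a list."""
    # Assumption: one file location per line; lines starting with '#' are skipped.
    entries = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


if __name__ == "__main__":
    index = Path("file_index.txt")
    if index.exists():
        print(f"{len(read_file_index(index))} files listed")
```

The returned list can then be passed to whatever loader you use; check dataset.py for the interface the helper class actually exposes.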
The Pixel Seeds dataset consists of a collection of pixel doublet seeds that would be used by the CMS track reconstruction workflow. Each doublet is characterised by a list of features:
Event Info | |
run | Run number |
evt | Event number |
lumi | Lumisection number |
PU | Number of primary vertices in the event |
bSX, bSY, bSZ, bSdZ | Beam spot coordinates (x,y,z) and \sigma_{z} |
Features | (“in” or “out” prefix to indicate the inner or the outer hit of the doublet, e.g. inDetSeq, outX . . .) |
DetSeq | Sequential number of the inner hit and outer hit layer. For the silicon pixel detectors these numbers may be {0,1,2,3} for the four pixel barrel layers, {14,15,16} for the three negative endcap layers and {29,30,31} for the three positive endcap layers. |
X, Y, Z, R | Doublet inner [outer] hit spatial coordinates. |
Phi | Doublet inner [outer] hit azimuthal angle \phi. |
R | Doublet inner [outer] hit radial coordinate (r=\sqrt{x^2 + y^2}). |
IsBarrel | Flag for inner [outer] hit being on a barrel layer |
Layer, Ladder, Side, Disk, Panel, Module | Inner [outer] hit detector specifics. For the barrel detector hit two numbers are meaningful: the layer number indicates on which cylindrical layer the hit lies; the ladder number |
IsFlipped | Flag indicating if the module is flipped with respect to the standard outward orientation. |
Ax1 | Length of the vector connecting the origin to the origin (0,0,0) of the local module coordinate reference system for the inner [outer] hit. |
Ax2 | Length of the vector connecting the origin to the point (0,0,1) in the local module coordinate reference system for the inner [outer] hit. |
ClustX, ClustY | Pixel cluster coordinates for the inner [outer] hit in the local module reference system. |
OverFlowX, OverFlowY | Flags indicating if the pixel cluster for the inner [outer] hit extends beyond the pad size (16) along the X or Y local detector module axes. |
ClustSize, ClustSizeX, ClustSizeY | Inner [outer] pixel cluster absolute size, i.e. the number of pixels composing it, and its sizes along the X and Y local detector module axes. |
SumADC | Sum of the A.D.C. levels of all the pixels composing the cluster. |
IsBig | Flag indicating that the inner [outer] hit spans two (or more) ROC modules. |
IsBad | Flag indicating that at least one pixel composing the inner [outer] hit is marked as malfunctioning. |
IsEdge | Flag indicating that the inner [outer] hit is on the edge of a ROC module. |
PixelZero | Highest equivalent released charge (in A.D.C. levels) for a single pixel belonging to the inner [outer] hit pixel cluster. |
AvgCharge | Average charge released on each pixel forming the inner [outer] pixel cluster. |
Skew | Ratio between the inner [outer] pixel cluster Y size and X size. |
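The feature descriptions above can be turned into a few small helpers. This is a sketch under stated assumptions: column names such as `inDetSeq` or `inClustSizeX` are built from the "in"/"out" prefix convention, the azimuthal angle is taken as atan2(y, x) (the dataset's own Phi convention is not confirmed here), and the function names and sample values are made up for illustration.

```python
import math

# DetSeq values quoted in the Features table for the silicon pixel detector.
BARREL_LAYERS = {0, 1, 2, 3}
NEGATIVE_ENDCAP_LAYERS = {14, 15, 16}
POSITIVE_ENDCAP_LAYERS = {29, 30, 31}


def pixel_region(det_seq):
    """Map a DetSeq value to the pixel region it belongs to."""
    det_seq = int(det_seq)
    if det_seq in BARREL_LAYERS:
        return "barrel"
    if det_seq in NEGATIVE_ENDCAP_LAYERS:
        return "negative endcap"
    if det_seq in POSITIVE_ENDCAP_LAYERS:
        return "positive endcap"
    return "other"  # any value not listed in the table


def radius_and_phi(x, y):
    """Return r = sqrt(x^2 + y^2) and the azimuthal angle atan2(y, x)."""
    return math.hypot(x, y), math.atan2(y, x)


def skew(clust_size_y, clust_size_x):
    """Ratio between the cluster Y size and X size, as Skew is described."""
    return clust_size_y / clust_size_x


# Made-up inner-hit values, for illustration only (not taken from the dataset).
hit = {"inDetSeq": 1, "inX": 3.0, "inY": 4.0, "inClustSizeX": 2, "inClustSizeY": 6}
region = pixel_region(hit["inDetSeq"])
r, phi = radius_and_phi(hit["inX"], hit["inY"])
print(region, r, round(phi, 4), skew(hit["inClustSizeY"], hit["inClustSizeX"]))
```

Comparing the recomputed r and phi with the stored R and Phi columns is a quick sanity check when first opening a file.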
Pixel Pads | (“in” or “out” prefix to indicate the inner or the outer hit of the doublet, e.g. inDetSeq, outX . . .) |
PixX with X = 0,...,255 | Inner [outer] hit pixels A.D.C. levels with X ranging from 0 to 255 (for a 16x16 pad). The X index spans from top left pad corner to bottom right: e.g. the last bottom row will span from inPix240 to inPix255. |
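To use the pad as an image, the 256 flat values can be rearranged into 16 rows of 16 pixels. The sketch below assumes the row-major ordering implied by the description (index = 16 * row + column, so the bottom row holds Pix240 to Pix255) and takes a plain dictionary as input; `pad_from_row` is a hypothetical helper. Which pad axis corresponds to the local X or Y module axis is not specified here.

```python
PAD_SIDE = 16  # the pad is 16x16 pixels


def pad_from_row(row, prefix="in"):
    """Build a 16x16 nested list from the flat <prefix>Pix0..<prefix>Pix255 values."""
    flat = [float(row[f"{prefix}Pix{i}"]) for i in range(PAD_SIDE * PAD_SIDE)]
    # Row-major layout: pad[r][c] holds the pixel with flat index 16 * r + c.
    return [flat[r * PAD_SIDE:(r + 1) * PAD_SIDE] for r in range(PAD_SIDE)]


# Made-up example: each pixel value is set to its own index to expose the layout.
example = {f"inPix{i}": i for i in range(PAD_SIDE * PAD_SIDE)}
pad = pad_from_row(example, prefix="in")
print(pad[0][0], pad[15][0], pad[15][15])
```

With NumPy, reshaping the same flat array to (16, 16) in the default C order gives an identical layout.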
Labels | (if the hit is not matched to any tracking particle all these labels are set to -1.0. “in” or “out” prefix to indicate the inner or the outer hit of the doublet, e.g. inDetSeq, outX . . .) |
PId | Flag set to 1.0 (-1.0) if the inner [outer] hit is (not) matched to a tracking particle. |
TId | Inner [outer] hit matched tracking particle key number in the event collection of tracking particles. |
Px,Py,Pz,Pt | Inner [outer] hit matched tracking particle momentum components (p_x, p_y, p_z) and transverse momentum (p_T). |
MT | Inner [outer] hit matched tracking particle transverse mass. |
ET | Inner [outer] hit matched tracking particle transverse energy. |
MSqr | Inner [outer] hit matched tracking particle mass squared. |
PdgId | Inner [outer] hit matched tracking particle PDG id, i.e. the index indicating which kind of particle it is. |
Charge | Inner [outer] hit matched tracking particle charge. |
NTrackerHits | Inner [outer] hit matched tracking particle number of tracker hits. |
NTrackerLayers | The number of tracker layers crossed by the inner [outer] hit matched tracking particle. |
Phi, Eta, Rapidity | Inner [outer] hit matched tracking particle phi, eta and y. |
VX, VY, VZ | Inner [outer] hit matched tracking particle vertex global coordinates. |
DXY | Inner [outer] hit matched tracking particle vertex transverse impact parameter. |
DZ | Inner [outer] hit matched tracking particle vertex longitudinal impact parameter. |
BunchCrossing | Event bunch crossing number. |
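One way the labels could be combined into a per-doublet truth flag is sketched below. The definition of a "true" doublet used here (both hits matched and pointing to the same tracking particle key) is an assumption made for illustration; the ml_filtering notebook may define its target differently. The column names `inPId`, `outPId`, `inTId` and `outTId` follow the prefix convention, and the sample values are made up.

```python
def is_matched(doublet, prefix):
    """True if the <prefix> hit is matched to a tracking particle (PId == 1.0)."""
    return doublet[f"{prefix}PId"] == 1.0


def is_true_doublet(doublet):
    """Assumed definition: both hits are matched and share the same TId."""
    return (
        is_matched(doublet, "in")
        and is_matched(doublet, "out")
        and doublet["inTId"] == doublet["outTId"]
    )


# Made-up doublets, for illustration only (not taken from the dataset).
doublets = [
    {"inPId": 1.0, "outPId": 1.0, "inTId": 12.0, "outTId": 12.0},   # same particle
    {"inPId": 1.0, "outPId": 1.0, "inTId": 12.0, "outTId": 40.0},   # different particles
    {"inPId": 1.0, "outPId": -1.0, "inTId": 12.0, "outTId": -1.0},  # outer hit unmatched
]
truth = [is_true_doublet(d) for d in doublets]
print(truth)
```

Checking PId first matters because unmatched hits carry -1.0 in every label, so two unmatched hits would otherwise appear to share the same TId.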
[1] https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideIterativeTracking