A Graph Neural Network Analysis Framework for High Energy Physics!

What is Analysis-G:

Analysis-G is a Python package which aims to abstract and automate novel High Energy Particle Physics Analyses. Since most of HEPP software relies on complicated ROOT files, the framework attempts to minimize the need for writing complicated and inefficient nested for loops to retrieve the content. This is achieved by defining particles and events as Python classes and assigning these attributes which allow the framework to infer and assign particles to events. To bypass the speed limitations of Python, the underlying code has been written in C++ to assure scalability and minimize RAM usage when reading large number of ROOT files, without compromising the generality Python offers.

The framework places heavy emphasis on optimizing the interfacing between HEPP software and the machine learning community, which utilize vastly different data structure input. As such, once events are interpreted, they are stored in the HDF5 file format, thus removing the need for interacting with ROOT as early as possible. A particular advantage of using the framework is made apparent in the context of constructing Graph like structures, which are required for training Graph Neural Networks. Since events and particles alike are abstracted into objects, their attributes can be utilized to populate Graphs with edge, node and global features in a fast and simplistic way, without resorting to repetitive code being written. This will be illustrated in dedicated tutorials of the docs.

Given the growing trend in machine learning, for instance Graph Neural Networks (GNNs), the framework aims to remain Analysis agnostic, such that several mutually exclusive ATLAS/Belle-II analyses can benefit from a single optimized framework.

Core Packages in Analysis-G:

  • EventTemplate: A class which is to be inherited by custom event implementations. This class interprets and retrieves data content found within ROOT files, and translates the ROOT structure in terms of particle content and per event attributes.

  • ParticleTemplate: A class used to define the most minimalistic representation of a particle expected to be found within the ROOT files. Particles derived from this class will be generated on a per event basis such that the exact number of particles in each event will be made available.

  • GraphTemplate: A template class which is used to specify which particles are to be used for Graph construction. Graph features are assigned at a later stage of the Analysis pipeline, and will be discussed and illustrated in a dedicated tutorial.

  • SelectionTemplate: A class used to define a simplistic selection strategy, where particles and events (derived from EventTemplates and ParticleTemplates) will be made available for additional processing. This class can be used to output ROOT n-tuples and then passed into some fitting tool, e.g. TRexFitter or PyHF

  • SampleTracer An abstract module which can be utilized as a completely standalone package and can be integrated with your own framework. This module aims to keep track of events and their original ROOT filename, and further permits fast event/graph retrieval. The output of this module is a small HDF5 file, which only holds meta-data and mappings between ROOT files and their associated event indices.

  • PyC (Python Custom) A completely detached package which implements high performance native C++ and CUDA code/kernels, which utilize the PyTorch API. Several interfaces are implemented, namely switching from Cartesian to Polar coordinates, computing scalar invariant masses from particles, single/double neutrino reconstruction and many more.

Work in Progress for DOCS

  • Model Docs: Need to complete the declaration examples.

  • SampleTracer and Generators: Need to move from attribute to ivars.

  • Clean up the docs structure: pyc and pyc++

Index and Directories

Advanced Object Definitions

Analysis and Other Generators

Read and Writing (IO)

Machine Learning (Graph Neural Network)

Condor and DAGMAN Submission Compilers

Plotting Functions

Tools Multi-Threading and Code Preservation

Data Types and Dictionary Mapping

PyC, CUDA and C++ API via PyTorch

Analysis and Truth Studies Documentation

Event Template Classes

Documentation and Codebase Status

Documentation Status Build Status: AnalysisG Build Status: Torch-Extension (pyc) PyPI Release Build Status Tox Testing Status Coverage Report https://static.pepy.tech/badge/analysisg/month