C++ Private Member Variables Reference

This document provides comprehensive documentation for all private member variables and methods in the C++ modules. Private members handle internal state management, caching, and implementation details not exposed to users.

Overview

Private members in AnalysisG serve several purposes:

  1. Internal State Management: Track processing state, caches, and temporary data

  2. Build Pipelines: Internal methods for constructing analysis workflows

  3. Data Structures: Maps and containers for efficient lookups

  4. Thread Safety: Mutexes and synchronization primitives

  5. File I/O: Internal file handles and iterators

Core Template Modules

analysis

Header: modules/analysis/include/AnalysisG/analysis.h

The analysis class has extensive private infrastructure for orchestrating the entire analysis workflow.

Private Build Methods

These methods construct different phases of the analysis pipeline:

void check_cache()

Check if event/graph data is already cached on disk to avoid reprocessing.

Called by: start()

Purpose: Populates in_cache map to skip redundant event building

void build_project()

Initialize the analysis project structure and output directories.

Called by: start()

Creates: Output directory structure, initializes sampletracer

void build_events()

Build event objects from ROOT files using registered event templates.

Called by: start()

Process:

  1. Reads ROOT files via io reader

  2. For each file, instantiates event templates

  3. Calls event_template::build() on each event

  4. Caches results if enabled

void build_selections()

Apply selection templates to filter events.

Called by: start()

Process: Iterates through registered selections, applies filters

void build_graphs()

Construct graph objects from events for GNN processing.

Called by: start()

Process:

  1. For each event, call registered graph templates

  2. Build adjacency matrices and node/edge features

  3. Cache graph data

void build_model_session()

Set up training sessions for registered models.

Called by: start()

Process:

  1. Initialize optimizers from model_sessions

  2. Create dataloaders with k-fold splits

  3. Spawn training threads

void build_inference()

Run inference on data using pre-trained models.

Called by: start()

Process: Load model weights, process data, save predictions

bool build_metric()

Compute metrics on model outputs.

Returns:

True if metrics computed successfully

Called by: start()

void build_metric_folds()

Compute metrics across k-fold cross-validation splits.

void build_dataloader(bool training)

Configure dataloader for training or inference.

Parameters:

training – If true, set up for training; if false, for inference

void fetchtags()

Fetch k-fold tags from metadata for cross-validation.

Purpose: Populates tags with fold assignments

Private Static Helper Methods

static int add_content(std::map<std::string, torch::Tensor*> *data, std::vector<variable_t> *content, int index, std::string prefx, TTree *tt = nullptr)

Add ROOT TTree content to tensor map.

Parameters:
  • data – Output tensor map

  • content – Variables to extract

  • index – Event index

  • prefx – Branch name prefix

  • tt – ROOT TTree pointer

Returns:

Number of variables added

static void add_content(std::map<std::string, torch::Tensor*> *data, std::vector<std::vector<torch::Tensor>> *buff, torch::Tensor *edge, torch::Tensor *node, torch::Tensor *batch, std::vector<long> mask)

Add graph data (edges, nodes) to tensor buffers.

Parameters:
  • data – Tensor map

  • buff – Buffer for batched tensors

  • edge – Edge index tensor

  • node – Node feature tensor

  • batch – Batch assignment tensor

  • mask – Masking for valid entries

static void execution(model_template *mdx, model_settings_t mds, std::vector<graph_t*> *data, size_t *prg, std::string output, std::vector<variable_t> *content, std::string *msg)

Execute model training/inference in separate thread.

Parameters:
  • mdx – Model instance

  • mds – Model settings

  • data – Graph data for processing

  • prg – Progress counter (updated atomically)

  • output – Output path for results

  • content – Variables to save

  • msg – Status message (updated by thread)

static void execution_metric(metric_t *mt, size_t *prg, std::string *msg)

Execute metric computation in separate thread.

Parameters:
  • mt – Metric data structure

  • prg – Progress counter

  • msg – Status message

static void initialize_loop(optimizer *op, int k, model_template *model, optimizer_params_t *config, model_report **rep)

Initialize training loop for k-fold.

Parameters:
  • op – Optimizer instance

  • k – Fold number

  • model – Model to train

  • config – Optimizer configuration

  • rep – Output report structure

Private Template Helper

template<typename g>
void safe_clone(std::map<std::string, g*> *mp, g *in)

Safely clone template object if not already in map.

Parameters:
  • mp – Map to check/insert into

  • in – Template object to clone

Purpose: Avoid duplicate clones of same template

Private State Variables

Label Management:

std::map<std::string, std::string> file_labels

Maps file paths to user-assigned labels.

std::map<std::string, event_template*> event_labels

Maps labels to event template instances.

std::map<std::string, metric_template*> metric_names

Maps metric names to metric template instances.

std::map<std::string, selection_template*> selection_names

Maps selection names to selection template instances.

std::map<std::string, std::map<std::string, graph_template*>> graph_labels

Two-level map: event label → graph label → graph template.

Purpose: Each event type can have multiple graph representations

Model Management:

std::vector<std::string> model_session_names

Names of all registered training sessions.

std::map<std::string, model_template*> model_inference

Models registered for inference (no training).

std::map<std::string, model_template*> model_metrics

Models to compute metrics on.

std::vector<std::tuple<model_template*, optimizer_params_t*>> model_sessions

Training sessions: (model, optimizer config) pairs.

Training Infrastructure:

std::map<std::string, optimizer*> trainer

Active optimizer instances for each model.

std::map<std::string, model_report*> reports

Training reports (loss curves, metrics) for each session.

std::vector<std::thread*> threads

Worker threads for parallel training/inference.

Cache Management:

std::map<std::string, std::map<std::string, bool>> in_cache

Cache status: file → event_label → is_cached.

Purpose: Skip processing if data already on disk

std::map<std::string, bool> skip_event_build

Whether to skip event building for each file.

std::map<std::string, std::string> graph_types

Maps graph labels to their type names.

Core Components:

std::vector<folds_t> *tags

K-fold cross-validation tags (train/val/test assignments).

dataloader *loader

Dataloader instance for batch iteration.

sampletracer *tracer

Tracks sample metadata (cross-sections, sum of weights).

io *reader

I/O handler for ROOT/HDF5 files.

bool started

Flag indicating if start() has been called.

Purpose: Prevent double-execution

event_template

Header: modules/event/include/templates/event_template.h

Private Methods:

void build_mapping(std::map<std::string, data_t*> *evnt)

Build internal mapping between ROOT branches and particle objects.

Parameters:

evnt – Event data from ROOT file

Process:

  1. Parse branch names from evnt

  2. Match to particle templates via add_leaf() mappings

  3. Populate tree_variable_link

void flush_leaf_string()

Clear leaf string caches after event building.

Purpose: Free memory from temporary branch name storage

Private Variables:

std::map<std::string, bool> next_

Tracks which trees have more entries to read.

Keys: Tree names Values: Has next entry

std::map<std::string, particle_template*> particle_generators

Template particles used to create actual particle instances.

Purpose: Factory pattern - clone generators for each event

Links ROOT tree variables to internal element_t structures.

Structure: tree_name → variable_name → element_t

Purpose: Fast lookup during event building

Links tree branches to particle collections.

Structure: tree_name → branch_name → particle_map

Purpose: Organize particles by their source tree/branch

std::map<std::string, particle_template*> garbage

Temporary particles to be deleted after event processing.

Purpose: Memory management for cloned particles

graph_template

Header: modules/graph/include/templates/graph_template.h

Private Variables:

bool is_owner

Whether this graph owns its data (vs. referencing external data).

Purpose: Prevent double-free of tensors

std::mutex mut

Mutex for thread-safe access to graph data.

Usage: Lock when modifying edge_index or node features

torch::Tensor *edge_index

Pointer to edge connectivity tensor [2, num_edges].

Format: Row 0 = source nodes, Row 1 = destination nodes

std::map<std::string, int> *data_map_graph

Maps graph-level feature names to column indices.

std::map<std::string, int> *data_map_node

Maps node feature names to column indices.

std::map<std::string, int> *data_map_edge

Maps edge feature names to column indices.

Data Management Modules

io

Header: modules/io/include/io/io.h

Private HDF5 Methods:

hid_t member(folds_t t)

Create HDF5 compound type for folds_t struct.

Returns:

HDF5 type identifier

hid_t member(graph_hdf5_w t)

Create HDF5 compound type for graph_hdf5_w struct.

static herr_t file_info(hid_t loc_id, const char *name, const H5L_info_t *linfo, void *opdata)

HDF5 callback for iterating through datasets.

Parameters:
  • loc_id – HDF5 location identifier

  • name – Dataset/group name

  • linfo – Link info

  • opdata – User data

Returns:

HDF5 error code

H5::DataSet *dataset(std::string set_name, hid_t type, long long unsigned int length)

Create or open HDF5 dataset for writing.

Parameters:
  • set_name – Dataset name

  • type – HDF5 datatype

  • length – Number of elements

H5::DataSet *dataset(std::string set_name)

Open existing HDF5 dataset for reading.

Private ROOT Methods:

void root_key_paths(std::string path)

Recursively scan ROOT file directory structure.

Parameters:

path – Current directory path

void root_key_paths(std::string path, TTree *t)

Scan TTree branches.

void root_key_paths(std::string path, TBranch *t)

Scan TBranch leaves.

Private Variables:

HDF5 State:

std::map<std::string, H5::DataSet*> data_w

Open HDF5 datasets for writing.

Keys: Dataset names

std::map<std::string, H5::DataSet*> data_r

Open HDF5 datasets for reading.

H5::H5File *file

Currently open HDF5 file handle.

Null when: No file open

ROOT State:

TFile *file_root

Currently open ROOT file handle.

std::map<std::string, data_t*> *iters

Iterator state for ROOT tree reading.

Purpose: Track current position in each tree

Trigger Tracking:

std::map<std::string, bool> missing_trigger

Tracks which triggers are missing from data.

Purpose: Warn user about missing branches

std::map<std::string, bool> success_trigger

Tracks which triggers were successfully loaded.

dataloader

Header: modules/dataloader/include/generators/dataloader.h

Private Members:

friend class analysis

Grants analysis class access to private members.

settings_t *setting

Pointer to global analysis settings.

Purpose: Access output paths, k-fold settings

std::thread *cuda_mem

Background thread for managing CUDA memory.

Purpose: Asynchronous GPU memory allocation/deallocation

Private Methods:

void cuda_memory_server()

Worker function for CUDA memory management thread.

Process:

  1. Monitor memory usage

  2. Allocate buffers as needed

  3. Clean up finished batches

void clean_data_elements(std::map<std::string, int> **data_map, std::vector<std::map<std::string, int>*> *loader_map)

Free data mapping structures.

Parameters:
  • data_map – Data index maps to free

  • loader_map – Loader maps to free

Analysis Modules

lossfx

Header: modules/lossfx/include/templates/lossfx.h

Private Methods:

void interpret(std::string *ox)

Parse optimizer name string.

Parameters:

ox – Optimizer name to interpret

Optimizer Builders (one for each optimizer type):

void build_adam(optimizer_params_t *op, std::vector<torch::Tensor> *params)

Create Adam optimizer instance.

void build_adagrad(optimizer_params_t *op, std::vector<torch::Tensor> *params)

Create Adagrad optimizer.

void build_adamw(optimizer_params_t *op, std::vector<torch::Tensor> *params)

Create AdamW optimizer (Adam with weight decay).

void build_lbfgs(optimizer_params_t *op, std::vector<torch::Tensor> *params)

Create L-BFGS optimizer (quasi-Newton method).

void build_rmsprop(optimizer_params_t *op, std::vector<torch::Tensor> *params)

Create RMSprop optimizer.

void build_sgd(optimizer_params_t *op, std::vector<torch::Tensor> *params)

Create SGD optimizer (stochastic gradient descent).

Loss Function Builders:

void build_fx_loss(torch::nn::BCELossImpl *lossfx_)

Build binary cross-entropy loss function.

Parameters:

lossfx – Loss function implementation

meta

Header: modules/meta/include/meta/meta.h

Private Methods:

void compiler()

Compile metadata from all sources.

Process: Aggregate cross-sections, weights, fold assignments

float parse_float(std::string key, TTree *tr)

Parse float value from ROOT TTree.

Parameters:
  • key – Branch name

  • tr – TTree pointer

Returns:

Float value

std::string parse_string(std::string key, TTree *tr)

Parse string value from ROOT TTree.

static void get_isMC(bool*, meta*)

Determine if sample is Monte Carlo or data.

static void get_found(bool*, meta*)

Check if metadata was successfully loaded.

Private Variables:

std::vector<folds_t> *folds

K-fold cross-validation assignments.

Structure: Vector of fold structures with train/val/test indices

metric_template

Header: modules/metric/include/templates/metric_template.h

Private Variables:

friend class metric_template

Self-friend for template specialization access.

friend class analysis

Grants analysis access to build metrics.

mode_enum train_mode

Current mode: TRAIN, VALIDATION, or TEST.

std::string *pth

Pointer to output path for metric results.

model_template *mdlx

Model being evaluated.

metric_template *mtx

Metric template instance.

size_t index

Current fold index.

Private Methods:

void build()

Build metric computation pipeline.

Common Patterns

Friend Classes

Many modules declare friend class analysis to allow the orchestrator access to private members:

class my_module {
private:
    friend class analysis;  // analysis can access private members
    // ...
};

Purpose: analysis needs to coordinate between modules without exposing implementation details to users.

Cache Maps

Pattern for tracking cached data:

std::map<std::string, std::map<std::string, bool>> in_cache;
// file_path -> event_label -> is_cached

Usage: Check cache before expensive operations.

Data Index Maps

Pattern for fast feature lookup:

std::map<std::string, int>* data_map_node;
// feature_name -> column_index

Purpose: Convert feature names to tensor column indices.

Thread Safety

Pattern for thread-safe data access:

class graph_template {
private:
    std::mutex mut;
    torch::Tensor* edge_index;

    void modify_graph() {
        std::lock_guard<std::mutex> lock(mut);
        // Modify edge_index safely
    }
};

Memory Management

Pattern for owning vs. referencing data:

class graph_template {
private:
    bool is_owner;
    torch::Tensor* data;

    ~graph_template() {
        if (is_owner) {
            delete data;  // Only free if we own it
        }
    }
};

See Also