Structs and Enumerations
Framework-wide plain-old-data structures and enumeration types. These types are used pervasively across the C++ core and are the primary data contracts between modules (e.g. between the IO layer and the GNN training pipeline).
Enumerations
-
enum class data_enum
Identifies the concrete C++ type of a stored data element.
Values:
-
enumerator d
Scalar double.
-
enumerator v_d
std::vector<double>.
-
enumerator vv_d
std::vector<std::vector<double>>.
-
enumerator vvv_d
std::vector<std::vector<std::vector<double>>>.
-
enumerator f
Scalar float.
-
enumerator v_f
std::vector<float>.
-
enumerator vv_f
std::vector<std::vector<float>>.
-
enumerator vvv_f
std::vector<std::vector<std::vector<float>>>.
-
enumerator l
Scalar long.
-
enumerator v_l
std::vector<long>.
-
enumerator vv_l
std::vector<std::vector<long>>.
-
enumerator vvv_l
std::vector<std::vector<std::vector<long>>>.
-
enumerator i
Scalar int.
-
enumerator v_i
std::vector<int>.
-
enumerator vv_i
std::vector<std::vector<int>>.
-
enumerator vvv_i
std::vector<std::vector<std::vector<int>>>.
-
enumerator ull
Scalar unsigned long long.
-
enumerator v_ull
std::vector<unsigned long long>.
-
enumerator vv_ull
std::vector<std::vector<unsigned long long>>.
-
enumerator vvv_ull
std::vector<std::vector<std::vector<unsigned long long>>>.
-
enumerator b
Scalar bool.
-
enumerator v_b
std::vector<bool>.
-
enumerator vv_b
std::vector<std::vector<bool>>.
-
enumerator vvv_b
std::vector<std::vector<std::vector<bool>>>.
-
enumerator ui
Scalar unsigned int.
-
enumerator v_ui
std::vector<unsigned int>.
-
enumerator vv_ui
std::vector<std::vector<unsigned int>>.
-
enumerator vvv_ui
std::vector<std::vector<std::vector<unsigned int>>>.
-
enumerator c
Scalar char.
-
enumerator v_c
std::vector<char>.
-
enumerator vv_c
std::vector<std::vector<char>>.
-
enumerator vvv_c
std::vector<std::vector<std::vector<char>>>.
-
enumerator undef
Type is unknown / undefined.
-
enumerator unset
Type has not yet been assigned.
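The enumerator names follow a visible pattern: a run of v characters gives the vector nesting depth and the suffix gives the element type. As a minimal sketch of how such a tag can be derived from a C++ type, the block below uses a reduced stand-in enum (data_enum_sketch) and a hypothetical helper tag_of(); neither is part of the framework, and C++17 is assumed.

```cpp
#include <type_traits>
#include <vector>

// Reduced stand-in for data_enum: only a handful of the documented enumerators.
enum class data_enum_sketch { d, v_d, vv_d, f, v_f, undef };

// Hypothetical helper: map a C++ type to its tag at compile time.
template <typename T>
constexpr data_enum_sketch tag_of() {
    if constexpr (std::is_same_v<T, double>) { return data_enum_sketch::d; }
    else if constexpr (std::is_same_v<T, std::vector<double>>) { return data_enum_sketch::v_d; }
    else if constexpr (std::is_same_v<T, std::vector<std::vector<double>>>) { return data_enum_sketch::vv_d; }
    else if constexpr (std::is_same_v<T, float>) { return data_enum_sketch::f; }
    else if constexpr (std::is_same_v<T, std::vector<float>>) { return data_enum_sketch::v_f; }
    else { return data_enum_sketch::undef; }  // anything not listed above
}

// Usage: the tag is known at compile time.
static_assert(tag_of<std::vector<double>>() == data_enum_sketch::v_d, "depth-1 double");
```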
-
enum class opt_enum
Identifies the PyTorch optimizer algorithm.
Values:
-
enumerator adam
Adam optimizer.
-
enumerator adagrad
Adagrad optimizer.
-
enumerator adamw
AdamW optimizer.
-
enumerator lbfgs
L-BFGS optimizer.
-
enumerator rmsprop
RMSprop optimizer.
-
enumerator sgd
Stochastic gradient descent optimizer.
-
enumerator invalid_optimizer
Sentinel value indicating no valid optimizer was selected.
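Since model_settings_t stores the optimizer as a string (s_optim, e.g. "adam"), a string-to-enum translation is needed somewhere. The sketch below shows one way this could look; opt_enum_sketch and parse_optimizer() are hypothetical names, and exact lower-case matching is an assumption rather than documented behaviour.

```cpp
#include <cassert>
#include <string>

// Mirror of the enumerators documented above.
enum class opt_enum_sketch { adam, adagrad, adamw, lbfgs, rmsprop, sgd, invalid_optimizer };

// Hypothetical parser: exact, lower-case matches only.
inline opt_enum_sketch parse_optimizer(const std::string& name) {
    if (name == "adam")    { return opt_enum_sketch::adam; }
    if (name == "adagrad") { return opt_enum_sketch::adagrad; }
    if (name == "adamw")   { return opt_enum_sketch::adamw; }
    if (name == "lbfgs")   { return opt_enum_sketch::lbfgs; }
    if (name == "rmsprop") { return opt_enum_sketch::rmsprop; }
    if (name == "sgd")     { return opt_enum_sketch::sgd; }
    return opt_enum_sketch::invalid_optimizer;  // sentinel for unknown names
}

// Usage: unknown or differently-cased names fall through to the sentinel.
inline void parse_optimizer_example() {
    assert(parse_optimizer("adam") == opt_enum_sketch::adam);
    assert(parse_optimizer("Adam") == opt_enum_sketch::invalid_optimizer);
}
```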
-
enum class mlp_init
Weight-initialisation schemes for torch::nn::Sequential modules.
Values:
-
enumerator uniform
Uniform random initialisation.
-
enumerator normal
Normal (Gaussian) random initialisation.
-
enumerator xavier_normal
Xavier normal initialisation.
-
enumerator xavier_uniform
Xavier uniform initialisation.
-
enumerator kaiming_uniform
Kaiming (He) uniform initialisation.
-
enumerator kaiming_normal
Kaiming (He) normal initialisation.
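For orientation, the uniform variants draw weights from a symmetric interval whose half-width depends on the layer fan-in and fan-out. The sketch below evaluates the commonly published bounds; the function names are hypothetical, and which gain the framework actually passes is not stated here, so the defaults are assumptions.

```cpp
#include <cassert>
#include <cmath>

// Xavier (Glorot) uniform: weights ~ U(-a, a) with a = gain * sqrt(6 / (fan_in + fan_out)).
inline double xavier_uniform_bound(int fan_in, int fan_out, double gain = 1.0) {
    assert(fan_in > 0 && fan_out > 0);
    return gain * std::sqrt(6.0 / (fan_in + fan_out));
}

// Kaiming (He) uniform: weights ~ U(-b, b) with b = gain * sqrt(3 / fan_in).
// The default gain sqrt(2) corresponds to a ReLU non-linearity.
inline double kaiming_uniform_bound(int fan_in, double gain = std::sqrt(2.0)) {
    assert(fan_in > 0);
    return gain * std::sqrt(3.0 / fan_in);
}
```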
-
enum class loss_enum
Identifies the PyTorch loss function to use.
Values:
-
enumerator bce
Binary cross-entropy loss.
-
enumerator bce_with_logits
Binary cross-entropy with logits loss.
-
enumerator cosine_embedding
Cosine embedding loss.
-
enumerator cross_entropy
Cross-entropy loss.
-
enumerator ctc
CTC loss.
-
enumerator hinge_embedding
Hinge embedding loss.
-
enumerator huber
Huber loss.
-
enumerator kl_div
Kullback-Leibler divergence loss.
-
enumerator l1
L1 (mean absolute error) loss.
-
enumerator margin_ranking
Margin ranking loss.
-
enumerator mse
Mean squared error loss.
-
enumerator multi_label_margin
Multi-label margin loss.
-
enumerator multi_label_soft_margin
Multi-label soft margin loss.
-
enumerator multi_margin
Multi-margin loss.
-
enumerator nll
Negative log-likelihood loss.
-
enumerator poisson_nll
Poisson NLL loss.
-
enumerator smooth_l1
Smooth L1 loss.
-
enumerator soft_margin
Soft margin loss.
-
enumerator triplet_margin
Triplet margin loss.
-
enumerator triplet_margin_with_distance
Triplet margin with distance loss.
-
enumerator invalid_loss
Sentinel value indicating no valid loss was selected.
-
enum class scheduler_enum
Identifies the learning-rate scheduler.
Values:
-
enumerator steplr
Step-based learning-rate decay (StepLR).
-
enumerator reducelronplateauscheduler
Reduce LR on plateau scheduler.
-
enumerator lrscheduler
Generic LR scheduler base.
-
enumerator invalid_scheduler
Sentinel value indicating no valid scheduler was selected.
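As an illustration of the steplr option, step-based decay multiplies the base learning rate by a factor gamma once every step_size epochs. The helper below is a hypothetical stand-alone sketch of that rule, not the framework's scheduler.

```cpp
#include <cassert>
#include <cmath>

// Learning rate after `epoch` epochs: base_lr * gamma^floor(epoch / step_size).
inline double step_lr(double base_lr, double gamma, int step_size, int epoch) {
    assert(step_size > 0 && epoch >= 0);
    const int n_decays = epoch / step_size;  // integer division floors for non-negative values
    return base_lr * std::pow(gamma, n_decays);
}
```

With base_lr = 0.1, gamma = 0.5 and step_size = 10, the rate is unchanged for epochs 0 to 9 and has been halved twice by epoch 25.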
-
enum class graph_enum
Identifies which tensor slot of a graph_t object is being accessed.
Values:
-
enumerator data_graph
Input feature tensor at graph level.
-
enumerator data_node
Input feature tensor at node level.
-
enumerator data_edge
Input feature tensor at edge level.
-
enumerator truth_graph
Ground-truth label tensor at graph level.
-
enumerator truth_node
Ground-truth label tensor at node level.
-
enumerator truth_edge
Ground-truth label tensor at edge level.
-
enumerator edge_index
COO edge-index tensor ([2, num_edges]).
-
enumerator weight
Per-event weight tensor.
-
enumerator batch_index
Batch-assignment index tensor for batched graphs.
-
enumerator batch_events
Global event indices of all graphs in the batch.
-
enumerator pred_graph
Model prediction tensor at graph level.
-
enumerator pred_node
Model prediction tensor at node level.
-
enumerator pred_edge
Model prediction tensor at edge level.
-
enumerator pred_extra
Miscellaneous model prediction tensor.
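The [2, num_edges] layout of edge_index can be pictured with plain vectors: row 0 holds the source node of each edge and row 1 the destination. The sketch below builds that layout for a fully connected graph; coo_t and fully_connected() are hypothetical names, and omitting self-loops is a choice made for the example only.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Row 0: source nodes, row 1: destination nodes, i.e. shape [2, num_edges].
using coo_t = std::array<std::vector<long>, 2>;

// Fully connected directed graph without self-loops (illustrative choice).
inline coo_t fully_connected(int num_nodes) {
    assert(num_nodes >= 0);
    coo_t edges;
    for (long src = 0; src < num_nodes; ++src) {
        for (long dst = 0; dst < num_nodes; ++dst) {
            if (src == dst) { continue; }
            edges[0].push_back(src);
            edges[1].push_back(dst);
        }
    }
    return edges;
}
```

For three nodes this yields six edges, starting with (0, 1) and ending with (2, 1).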
-
enum class mode_enum
Identifies the training phase.
Values:
-
enumerator training
Model is in training mode.
-
enumerator validation
Model is in validation mode.
-
enumerator evaluation
Model is in evaluation (test/inference) mode.
-
enum class particle_enum
Identifies which kinematic or metadata attribute of a particle to read or write.
Values:
-
enumerator index
Particle index in the event.
-
enumerator pdgid
PDG Monte Carlo particle ID.
-
enumerator pt
Transverse momentum.
-
enumerator eta
Pseudorapidity.
-
enumerator phi
Azimuthal angle.
-
enumerator energy
Energy.
-
enumerator px
x-component of momentum.
-
enumerator pz
z-component of momentum.
-
enumerator py
y-component of momentum.
-
enumerator mass
Invariant mass.
-
enumerator charge
Electric charge.
-
enumerator is_b
Flag: particle is a b-quark/hadron.
-
enumerator is_lep
Flag: particle is a lepton.
-
enumerator is_nu
Flag: particle is a neutrino.
-
enumerator is_add
Flag: additional particle classification.
-
enumerator pmc
Bulk Cartesian four-momentum write-out (px, py, pz, e).
-
enumerator pmu
Bulk polar four-momentum write-out (pt, eta, phi, e).
Core Data Structs
bsc_t — polymorphic leaf buffer
bsc_t (base struct) is the polymorphic root of the ROOT leaf-reading
hierarchy. Each instantiation holds at most one heap-allocated buffer
corresponding to the concrete data_enum type discovered at runtime.
All overloads of element() read the buffer at index and write into
the caller-supplied pointer.
-
struct bsc_t
Type-erased data container that stores a single typed pointer.
bsc_t holds one pointer for each supported data type (scalars, vectors, 2-deep nested vectors, and 3-deep nested vectors of all the numeric types declared in data_enum). Exactly one pointer should be non-null at any time; the active type is recorded in the type member. The index member controls which element of a vector-typed buffer is returned by the element() accessors. flush_buffer() either zeros or deletes the active pointer depending on the clear flag. The class is the base of data_t (read buffer for ROOT branches) and variable_t (write buffer for output ROOT branches).
Subclassed by data_t, variable_t
Public Functions
-
bsc_t()
Default constructor. Initialises all pointers to nullptr.
-
virtual ~bsc_t()
Virtual destructor.
-
void flush_buffer()
Clear or delete the active data pointer according to bsc_t::clear.
-
std::string as_string()
Return the string name of the currently active type.
- Returns:
Type name as a string.
-
std::string scan_buffer()
Scan and return a debug summary of the buffer state.
- Returns:
String describing the active pointer.
-
data_enum root_type_translate(std::string*)
Translate a ROOT leaf-type string to the corresponding data_enum.
- Parameters:
root_str – Pointer to the ROOT type string (e.g. "D" for double).
- Returns:
Corresponding data_enum value.
-
bool element(std::vector<std::vector<std::vector<float>>> *el)
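Copy the buffer element at index into the object pointed to by el, consistent with the overview of element() given at the start of this section.
- Parameters:
el – Pointer that receives the value.
- Returns:
true on success.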
-
bool element(std::vector<std::vector<std::vector<double>>> *el)
-
bool element(std::vector<std::vector<std::vector<long>>> *el)
-
bool element(std::vector<std::vector<std::vector<int>>> *el)
-
bool element(std::vector<std::vector<std::vector<bool>>> *el)
-
bool element(std::vector<std::vector<float>> *el)
-
bool element(std::vector<std::vector<double>> *el)
-
bool element(std::vector<std::vector<long>> *el)
-
bool element(std::vector<std::vector<int>> *el)
-
bool element(std::vector<std::vector<bool>> *el)
-
bool element(std::vector<float> *el)
-
bool element(std::vector<double> *el)
-
bool element(std::vector<long> *el)
-
bool element(std::vector<int> *el)
-
bool element(std::vector<char> *el)
-
bool element(std::vector<bool> *el)
-
bool element(double *el)
-
bool element(float *el)
-
bool element(long *el)
-
bool element(int *el)
-
bool element(bool *el)
-
bool element(unsigned long long *el)
-
bool element(unsigned int *el)
-
bool element(char *el)
Public Members
-
std::vector<std::vector<std::vector<unsigned long long>>> *vvv_ull = nullptr
-
std::vector<std::vector<std::vector<unsigned int>>> *vvv_ui = nullptr
-
std::vector<std::vector<std::vector<double>>> *vvv_d = nullptr
-
std::vector<std::vector<std::vector<long>>> *vvv_l = nullptr
-
std::vector<std::vector<std::vector<float>>> *vvv_f = nullptr
-
std::vector<std::vector<std::vector<int>>> *vvv_i = nullptr
-
std::vector<std::vector<std::vector<bool>>> *vvv_b = nullptr
-
std::vector<std::vector<std::vector<char>>> *vvv_c = nullptr
-
std::vector<std::vector<unsigned long long>> *vv_ull = nullptr
-
std::vector<std::vector<unsigned int>> *vv_ui = nullptr
Pointer to depth-2 unsigned int data, or nullptr.
-
std::vector<std::vector<double>> *vv_d = nullptr
-
std::vector<std::vector<long>> *vv_l = nullptr
-
std::vector<std::vector<float>> *vv_f = nullptr
-
std::vector<std::vector<int>> *vv_i = nullptr
-
std::vector<std::vector<bool>> *vv_b = nullptr
-
std::vector<std::vector<char>> *vv_c = nullptr
-
std::vector<unsigned long long> *v_ull = nullptr
Pointer to a flat vector of unsigned long longs, or nullptr.
-
std::vector<unsigned int> *v_ui = nullptr
Pointer to a flat vector of unsigned ints, or nullptr.
-
std::vector<double> *v_d = nullptr
Pointer to a flat vector of doubles, or nullptr.
-
std::vector<long> *v_l = nullptr
Pointer to a flat vector of longs, or nullptr.
-
std::vector<float> *v_f = nullptr
Pointer to a flat vector of floats, or nullptr.
-
std::vector<int> *v_i = nullptr
Pointer to a flat vector of ints, or nullptr.
-
std::vector<bool> *v_b = nullptr
Pointer to a flat vector of bools, or nullptr.
-
std::vector<char> *v_c = nullptr
Pointer to a flat vector of chars, or nullptr.
-
unsigned long long *ull = nullptr
Pointer to a scalar unsigned long long, or nullptr.
-
unsigned int *ui = nullptr
Pointer to a scalar unsigned int, or nullptr.
-
double *d = nullptr
Pointer to a scalar double, or nullptr.
-
long *l = nullptr
Pointer to a scalar long, or nullptr.
-
float *f = nullptr
Pointer to a scalar float, or nullptr.
-
int *i = nullptr
Pointer to a scalar int, or nullptr.
-
bool *b = nullptr
Pointer to a scalar bool, or nullptr.
-
char *c = nullptr
Pointer to a scalar char, or nullptr.
-
long index = 0
Current index into a vector-typed buffer (used during iteration).
-
bool clear = false
If true, flush_buffer deletes the pointed-to object; otherwise it only resets the value to 0 / clears the vector.
-
data_enum type = data_enum::unset
Active data type of this container (default data_enum::unset).
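The interplay of the typed pointers, index, and clear can be sketched with a two-buffer stand-in. Everything below (bsc_sketch, tag_sketch) is hypothetical and much smaller than bsc_t; in particular, letting element(double*) read from either the scalar or the flat vector, and resetting the tag on deletion, are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class tag_sketch { d, v_d, unset };

// Stand-in with two of the many buffers; member names mirror the documented ones.
struct bsc_sketch {
    std::vector<double>* v_d = nullptr;
    double* d = nullptr;
    long index = 0;
    bool clear = false;
    tag_sketch type = tag_sketch::unset;

    // Copy the current element into *el; false if no suitable buffer is active.
    bool element(double* el) const {
        assert(el != nullptr);
        if (d) { *el = *d; return true; }
        if (v_d && index >= 0 && static_cast<std::size_t>(index) < v_d->size()) {
            *el = (*v_d)[static_cast<std::size_t>(index)];
            return true;
        }
        return false;
    }

    // clear == true: delete the buffer; otherwise only reset its contents.
    void flush_buffer() {
        if (v_d) {
            if (clear) { delete v_d; v_d = nullptr; type = tag_sketch::unset; }
            else { v_d->clear(); }
        }
        if (d) {
            if (clear) { delete d; d = nullptr; type = tag_sketch::unset; }
            else { *d = 0; }
        }
    }
};
```

The two flush modes differ in what a later element() call sees: an emptied vector yields false because index is out of range, while a deleted buffer yields false because no pointer is set.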
data_t — single ROOT leaf accessor
data_t extends bsc_t with ROOT bookkeeping (TLeaf/TBranch/TTree
pointers, file path, leaf type string) and sequential iteration helpers.
-
struct data_t : public bsc_t
Represents a single ROOT branch/leaf with its associated metadata and typed data buffer.
Inherits bsc_t to store the branch data in a type-safe way. A data_t instance keeps references to the TFile, TTree, TBranch, and TLeaf it reads from and iterates over events via next().
Public Functions
-
data_t()
Default constructor.
-
~data_t() override
Destructor. Releases the typed data buffer.
-
void initialize()
Open the ROOT file and prepare the branch for reading.
-
void flush()
Clear the internal data buffer.
-
bool next()
Advance to the next event entry.
- Returns:
true if a next entry was found; false at end of file.
Public Members
-
std::string leaf_name = ""
Name of the ROOT TLeaf.
-
std::string branch_name = ""
Name of the ROOT TBranch.
-
std::string tree_name = ""
Name of the ROOT TTree.
-
std::string leaf_type = ""
ROOT leaf type string (e.g. "D" for double, "F" for float).
-
std::string path = ""
File path from which this branch is read.
-
std::string *fname = nullptr
Pointer to the current filename (not owned).
-
TLeaf *leaf = nullptr
Pointer to the associated ROOT TLeaf object.
-
TBranch *branch = nullptr
Pointer to the associated ROOT TBranch object.
-
TTree *tree = nullptr
Pointer to the ROOT TTree that contains this branch.
-
TFile *file = nullptr
Pointer to the ROOT TFile.
-
int file_index = 0
Index into the files_s / files_i / files_t arrays for multi-file iteration.
-
std::vector<std::string> *files_s = nullptr
Pointer to the list of file paths for multi-file reading.
-
std::vector<long> *files_i = nullptr
Pointer to per-file event counts.
-
std::vector<TFile*> *files_t = nullptr
Pointer to the list of open TFile objects.
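Multi-file iteration with file_index and per-file event counts can be pictured as a cursor that rolls over at file boundaries. The sketch below is a hypothetical stand-in (cursor_sketch) that touches no ROOT objects; how data_t::next() actually switches files is not described here, so the roll-over logic is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct cursor_sketch {
    const std::vector<long>* files_i = nullptr;  // per-file event counts (not owned)
    int file_index = 0;                          // which file is being read
    long entry = -1;                             // entry within the current file

    // Advance by one entry, skipping to the next non-empty file when needed.
    // Returns false once every file has been exhausted.
    bool next() {
        assert(files_i != nullptr);
        ++entry;
        while (static_cast<std::size_t>(file_index) < files_i->size()
               && entry >= (*files_i)[static_cast<std::size_t>(file_index)]) {
            entry = 0;
            ++file_index;
        }
        return static_cast<std::size_t>(file_index) < files_i->size();
    }
};
```

With counts {2, 0, 1} the cursor visits three entries in total and silently skips the empty middle file.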
element_t — per-event leaf handle map
element_t is handed to particle_template::build() and
event_template::build(). The get<T>(key, ptr) template method
looks up the named data_t and copies the current element into *ptr
via the appropriate bsc_t::element() overload.
-
struct element_t
Aggregates all data_t branches for one event, providing a named-key access interface.
Public Functions
-
bool next()
Advance all branches to the next event.
- Returns:
true if successful.
-
void set_meta()
Update metadata (filename, event index) from the active branches.
-
bool boundary()
Check whether the current entry is on a file boundary.
- Returns:
true if the current entry is the first of a new file.
-
template<typename g>
inline bool get(std::string key, g *var)
Retrieve the value stored under key and place it in var.
- Template Parameters:
g – Target type; must match the leaf type or an abort is triggered.
- Parameters:
key – Leaf/branch name used as the lookup key.
var – Pointer to the variable to fill.
- Returns:
true on success; aborts on type mismatch.
write_t / writer — ROOT output helpers
-
struct write_t
Groups a ROOT TFile, TTree, and meta_t pointer together with a map of named variable_t write buffers.
Public Functions
-
variable_t *process(std::string *name)
Look up or create the variable_t for name.
- Parameters:
name – Variable name.
- Returns:
Pointer to the corresponding variable_t.
-
void write()
Fill the current TTree entry from all registered variable_t buffers.
-
void create(std::string tr_name, std::string path)
Open the output ROOT file at path and create a TTree named tr_name.
- Parameters:
tr_name – TTree name.
path – Output file path.
-
void close()
Write and close the output ROOT file.
Public Members
-
TFile *file = nullptr
Output ROOT TFile.
-
TTree *tree = nullptr
Output ROOT TTree.
-
std::map<std::string, variable_t*> *data = nullptr
Map of variable names to their variable_t write buffers.
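The look-up-or-create behaviour of process() is a common map idiom. The sketch below shows it with hypothetical stand-ins (write_sketch, variable_sketch); it swaps the documented raw-pointer map for std::unique_ptr purely to keep the example leak-free, which is not how write_t is declared.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct variable_sketch { std::string name; };

struct write_sketch {
    std::map<std::string, std::unique_ptr<variable_sketch>> data;

    // Return the buffer registered under *name, creating it on first use.
    variable_sketch* process(const std::string* name) {
        assert(name != nullptr);
        auto it = data.find(*name);
        if (it == data.end()) {
            it = data.emplace(*name, std::make_unique<variable_sketch>()).first;
            it->second->name = *name;
        }
        return it->second.get();
    }
};
```

Calling process() twice with the same name returns the same pointer, so repeated fills reuse one buffer per variable.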
-
struct writer
High-level helper that manages multiple write_t objects keyed by TTree name.
Public Functions
-
writer()
Default constructor.
-
~writer()
Destructor. Closes all open files.
-
void create(std::string *pth)
Open the output ROOT file at the given path.
- Parameters:
pth – Pointer to the file path string.
-
void write(std::string *tree)
Fill one TTree entry for the tree named tree.
- Parameters:
tree – Pointer to the TTree name.
Event, Particle, and Graph Payload Structs
particle_t — raw kinematic payload
particle_t carries the floating-point kinematics and integer metadata
for a single particle. particle_template stores one of these as its
internal representation.
-
struct particle_t
Raw kinematic data, identification data, and classification flags for a single particle.
Public Members
-
double e = -0.000000000000001
Energy in MeV (default -1e-15 to signal “unset”).
-
double mass = -1
Invariant mass in MeV (default -1).
-
double px = 0
Cartesian x-component of three-momentum.
-
double py = 0
Cartesian y-component of three-momentum.
-
double pz = 0
Cartesian z-component of three-momentum.
-
double pt = 0
Transverse momentum.
-
double eta = 0
Pseudorapidity.
-
double phi = 0
Azimuthal angle in radians.
-
bool cartesian = false
true if Cartesian (px, py, pz, e) coordinates are valid.
-
bool polar = false
true if polar (pt, eta, phi, e) coordinates are valid.
-
double charge = 0
Electric charge.
-
int pdgid = 0
PDG Monte Carlo particle identifier.
-
int index = -1
Index of this particle within its parent event container.
-
std::string type = ""
String label identifying the particle type (e.g. "top", "b").
-
std::string hash = ""
Unique hash string for this particle instance.
-
std::string symbol = ""
LaTeX-style or text symbol for the particle (e.g. "t", "b").
-
std::vector<int> lepdef = {11, 13, 15}
PDG IDs considered as leptons for is_lep classification (default {11, 13, 15}).
-
std::vector<int> nudef = {12, 14, 16}
PDG IDs considered as neutrinos for is_nu classification (default {12, 14, 16}).
-
std::map<std::string, bool> children = {}
Map of child particle hashes to a presence flag.
-
std::map<std::string, bool> parents = {}
Map of parent particle hashes to a presence flag.
-
std::map<std::string, particle_template*> *data_p = nullptr
Pointer to the event-level particle map; used during event building.
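The cartesian and polar flags suggest that only one representation may be filled at a given time, with the other derived on demand. The sketch below uses the standard collider relations between (pt, eta, phi) and (px, py, pz); kin_sketch and both helpers are hypothetical, and when the framework performs such conversions is not stated here.

```cpp
#include <cassert>
#include <cmath>

struct kin_sketch {
    double px = 0, py = 0, pz = 0;
    double pt = 0, eta = 0, phi = 0;
    bool cartesian = false, polar = false;
};

// (pt, eta, phi) -> (px, py, pz)
inline void to_cartesian(kin_sketch& p) {
    assert(p.polar);
    p.px = p.pt * std::cos(p.phi);
    p.py = p.pt * std::sin(p.phi);
    p.pz = p.pt * std::sinh(p.eta);
    p.cartesian = true;
}

// (px, py, pz) -> (pt, eta, phi)
inline void to_polar(kin_sketch& p) {
    assert(p.cartesian);
    p.pt  = std::hypot(p.px, p.py);
    p.phi = std::atan2(p.py, p.px);
    // eta diverges for pt == 0; returning 0 there is a choice made for this sketch.
    p.eta = (p.pt > 0) ? std::asinh(p.pz / p.pt) : 0.0;
    p.polar = true;
}
```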
event_t — event identity
event_t is embedded in every event_template and carries the event
index, weight, ROOT tree name and the unique hash used for graph caching.
-
struct event_t
Minimal event identification and state container.
Public Members
-
std::string name = ""
Human-readable name of the event type.
-
double weight = 1
Event weight (default 1.0).
-
long index = -1
Sequential event index within the file (default -1 = unset).
-
std::string hash = ""
Unique hash string identifying this event.
-
std::string tree = ""
Name of the ROOT TTree from which this event was read.
graph_t — GNN tensor container
graph_t is the central data structure passed between the graph builder,
the dataloader, and the model_template::forward() method. It stores
batched PyTorch tensors for node/edge/graph data features, truth features,
the COO edge index, and batching meta-data.
-
struct graph_t
Runtime container for a single graph’s input features, truth labels, edge index, and device-resident tensors.
Stores input features (data_graph, data_node, data_edge), truth labels (truth_graph, truth_node, truth_edge), a COO edge-index tensor, an event-weight tensor, and a batch-index tensor. All tensors are cached per CUDA device index (in the dev_* maps) so that multi-GPU training can transfer data without repeated host-to-device copies.
The has_feature(graph_enum, name, dev) method is the unified lookup entry-point for all feature categories. Friends graph_template and dataloader have write access to the private tensor storage. The in_use counter is managed by dataloader to implement a simple object pool.
Public Functions
-
template<typename g>
inline torch::Tensor *get_truth_graph(std::string _name, g *mdl)
Retrieve the truth graph-level tensor named _name on the device of model mdl.
- Template Parameters:
g – A model type with a device_index member.
- Parameters:
_name – Feature name.
mdl – Pointer to the model.
- Returns:
Pointer to the tensor, or nullptr if not found.
-
template<typename g>
inline torch::Tensor *get_truth_node(std::string _name, g *mdl)
Retrieve the truth node-level tensor named _name.
- Template Parameters:
g – A model type with a device_index member.
- Parameters:
_name – Feature name.
mdl – Pointer to the model.
- Returns:
Pointer to the tensor, or nullptr.
-
template<typename g>
inline torch::Tensor *get_truth_edge(std::string _name, g *mdl)
Retrieve the truth edge-level tensor named _name.
- Template Parameters:
g – A model type with a device_index member.
- Parameters:
_name – Feature name.
mdl – Pointer to the model.
- Returns:
Pointer to the tensor, or nullptr.
-
template<typename g>
inline torch::Tensor *get_data_graph(std::string _name, g *mdl)
Retrieve the input data graph-level tensor named _name.
- Template Parameters:
g – A model type with a device_index member.
- Parameters:
_name – Feature name.
mdl – Pointer to the model.
- Returns:
Pointer to the tensor, or nullptr.
-
template<typename g>
inline torch::Tensor *get_data_node(std::string _name, g *mdl)
Retrieve the input data node-level tensor named _name.
- Template Parameters:
g – A model type with a device_index member.
- Parameters:
_name – Feature name.
mdl – Pointer to the model.
- Returns:
Pointer to the tensor, or nullptr.
-
template<typename g>
inline torch::Tensor *get_data_edge(std::string _name, g *mdl)
Retrieve the input data edge-level tensor named _name.
- Template Parameters:
g – A model type with a device_index member.
- Parameters:
_name – Feature name.
mdl – Pointer to the model.
- Returns:
Pointer to the tensor, or nullptr.
-
template<typename g>
inline torch::Tensor *get_edge_index(g *mdl)
Retrieve the COO edge-index tensor for the device of model mdl.
- Template Parameters:
g – A model type with a device_index member.
- Parameters:
mdl – Pointer to the model.
- Returns:
Pointer to the edge-index tensor, or nullptr.
-
template<typename g>
inline torch::Tensor *get_event_weight(g *mdl)
Retrieve the event-weight tensor.
- Template Parameters:
g – A model type with a device_index member.
- Parameters:
mdl – Pointer to the model.
- Returns:
Pointer to the weight tensor.
-
template<typename g>
inline torch::Tensor *get_batch_index(g *mdl)
Retrieve the batch-assignment index tensor.
- Template Parameters:
g – A model type with a device_index member.
- Parameters:
mdl – Pointer to the model.
- Returns:
Pointer to the batch-index tensor.
-
template<typename g>
inline torch::Tensor *get_batched_events(g *mdl)
Retrieve the global event-index tensor for the batch.
- Template Parameters:
g – A model type with a device_index member.
- Parameters:
mdl – Pointer to the model.
- Returns:
Pointer to the tensor.
-
torch::Tensor *has_feature(graph_enum tp, std::string _name, int dev)
Generic feature lookup by type, name, and device index.
- Parameters:
tp – Feature category (graph_enum).
_name – Feature name.
dev – Device index.
- Returns:
Pointer to the tensor, or nullptr.
-
void add_truth_graph(std::map<std::string, torch::Tensor*> *data, std::map<std::string, int> *maps)
Register truth graph-level tensors from data with index map maps.
-
void add_truth_node(std::map<std::string, torch::Tensor*> *data, std::map<std::string, int> *maps)
Register truth node-level tensors.
-
void add_truth_edge(std::map<std::string, torch::Tensor*> *data, std::map<std::string, int> *maps)
Register truth edge-level tensors.
-
void add_data_graph(std::map<std::string, torch::Tensor*> *data, std::map<std::string, int> *maps)
Register input data graph-level tensors.
-
void add_data_node(std::map<std::string, torch::Tensor*> *data, std::map<std::string, int> *maps)
Register input data node-level tensors.
-
void add_data_edge(std::map<std::string, torch::Tensor*> *data, std::map<std::string, int> *maps)
Register input data edge-level tensors.
-
void transfer_to_device(torch::TensorOptions *dev)
Copy all tensors to the device described by dev.
- Parameters:
dev – Device options.
-
void _purge_all()
Delete all owned tensor data.
Public Members
-
int num_nodes = 0
Number of nodes in this graph.
-
long event_index = 0
Global event index.
-
double event_weight = 1
Event weight (default 1.0).
-
bool preselection = false
Result of the graph’s pre-selection flag.
-
std::vector<long> batched_events = {}
Global event indices of all graphs combined in the batch.
-
std::vector<std::string*> batched_filenames = {}
Source filenames of all graphs combined in the batch.
-
std::string *hash = nullptr
Pointer to the event hash string (not owned).
-
std::string *filename = nullptr
Pointer to the source file path (not owned).
-
std::string *graph_name = nullptr
Pointer to the graph type name (not owned).
-
c10::DeviceType device = c10::kCPU
Device on which data tensors reside (default CPU).
-
int in_use = 1
Reference count / in-use flag for the dataloader pool.
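A per-device cache with a name-to-slot map can be sketched without PyTorch. In the block below, tensor_sketch stands in for torch::Tensor, and graph_sketch, node_map, and dev_node are hypothetical names covering only node-level features; the real private storage of graph_t is not documented here, so the layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using tensor_sketch = std::vector<float>;  // stand-in for torch::Tensor

struct graph_sketch {
    std::map<std::string, int> node_map;                 // feature name -> slot
    std::map<int, std::vector<tensor_sketch>> dev_node;  // device index -> slots

    // Return the named node feature cached on device `dev`, or nullptr if absent.
    tensor_sketch* has_feature(const std::string& name, int dev) {
        auto slot = node_map.find(name);
        auto cache = dev_node.find(dev);
        if (slot == node_map.end() || cache == dev_node.end()) { return nullptr; }
        const int idx = slot->second;
        if (idx < 0 || static_cast<std::size_t>(idx) >= cache->second.size()) { return nullptr; }
        return &cache->second[static_cast<std::size_t>(idx)];
    }
};

// Usage: a feature is only visible on devices it has been copied to.
inline void graph_sketch_example() {
    graph_sketch g;
    g.node_map["pt"] = 0;
    g.dev_node[0] = {tensor_sketch{1.0f, 2.0f}};
    assert(g.has_feature("pt", 0) != nullptr);   // cached on device 0
    assert(g.has_feature("pt", 1) == nullptr);   // never transferred to device 1
    assert(g.has_feature("eta", 0) == nullptr);  // unknown feature name
}
```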
folds_t — k-fold assignment
-
struct folds_t
Associates an event hash with its k-fold split membership.
Public Functions
-
inline void flush_data()
Free the heap-allocated hash string and set it to nullptr.
Public Members
-
int k = -1
K-fold index to which this event belongs (default -1 = unset).
-
bool is_train = false
true if this event is assigned to the training set.
-
bool is_valid = false
true if this event is assigned to the validation set.
-
bool is_eval = false
true if this event is assigned to the evaluation (test) set.
-
char *hash = nullptr
Null-terminated C string holding the event hash.
graph_hdf5 / graph_hdf5_w — HDF5 serialisation records
-
struct graph_hdf5
std::string-based representation of a serialised graph for HDF5 I/O.
All tensor and map data are encoded as base-64 strings.
Public Members
-
int num_nodes = -1
Number of nodes in the graph.
-
double event_weight = 1
Event weight.
-
long event_index = -1
Global event index.
-
std::string hash
Event hash.
-
std::string filename
Source file path.
-
std::string edge_index
Serialised edge-index tensor.
-
std::string data_map_graph
Serialised data-feature name-to-index map (graph level).
-
std::string data_map_node
Serialised data-feature name-to-index map (node level).
-
std::string data_map_edge
Serialised data-feature name-to-index map (edge level).
-
std::string truth_map_graph
Serialised truth-feature name-to-index map (graph level).
-
std::string truth_map_node
Serialised truth-feature name-to-index map (node level).
-
std::string truth_map_edge
Serialised truth-feature name-to-index map (edge level).
-
std::string data_graph
Serialised data-feature tensors (graph level).
-
std::string data_node
Serialised data-feature tensors (node level).
-
std::string data_edge
Serialised data-feature tensors (edge level).
-
std::string truth_graph
Serialised truth-label tensors (graph level).
-
std::string truth_node
Serialised truth-label tensors (node level).
-
std::string truth_edge
Serialised truth-label tensors (edge level).
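Base-64 turns arbitrary bytes into printable text, which is what allows binary tensor data to be stored in string fields. The encoder below is a generic sketch of the standard alphabet with = padding; base64_encode() is a hypothetical name, and the framework's own encoder and the byte layout it serialises are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline std::string base64_encode(const std::string& bytes) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((bytes.size() + 2) / 3) * 4);
    for (std::size_t i = 0; i < bytes.size(); i += 3) {
        const std::size_t remaining = bytes.size() - i;
        // Pack up to three bytes into a 24-bit chunk.
        unsigned int chunk = static_cast<unsigned int>(static_cast<unsigned char>(bytes[i])) << 16;
        if (remaining > 1) { chunk |= static_cast<unsigned int>(static_cast<unsigned char>(bytes[i + 1])) << 8; }
        if (remaining > 2) { chunk |= static_cast<unsigned int>(static_cast<unsigned char>(bytes[i + 2])); }
        // Emit four 6-bit symbols, padding with '=' where input bytes are missing.
        out += table[(chunk >> 18) & 0x3F];
        out += table[(chunk >> 12) & 0x3F];
        out += (remaining > 1) ? table[(chunk >> 6) & 0x3F] : '=';
        out += (remaining > 2) ? table[chunk & 0x3F] : '=';
    }
    return out;
}

// Usage: three input bytes map to four output characters.
inline void base64_example() {
    assert(base64_encode("Man") == "TWFu");
    assert(base64_encode("Ma") == "TWE=");
}
```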
-
struct graph_hdf5_w
C-string (char*) variant of graph_hdf5 used for direct HDF5 write operations.
All string fields are heap-allocated null-terminated C strings. Call flush_data() to release them.
Public Functions
-
void flush_data()
Free all heap-allocated char* members and set them to nullptr.
Public Members
-
int num_nodes = -1
Number of nodes in the graph.
-
double event_weight = 1
Event weight.
-
long event_index = -1
Global event index.
-
char *hash = nullptr
Null-terminated event hash string.
-
char *filename = nullptr
Null-terminated source file path.
-
char *edge_index = nullptr
Null-terminated serialised edge-index tensor.
-
char *data_map_graph = nullptr
-
char *data_map_node = nullptr
-
char *data_map_edge = nullptr
-
char *truth_map_graph = nullptr
-
char *truth_map_node = nullptr
-
char *truth_map_edge = nullptr
-
char *data_graph = nullptr
-
char *data_node = nullptr
-
char *data_edge = nullptr
-
char *truth_graph = nullptr
-
char *truth_node = nullptr
-
char *truth_edge = nullptr
Settings and Configuration Structs
settings_t — global analysis settings
settings_t is the POD configuration object stored inside analysis
(and accessible from Python as properties). All Analysis.* properties
map onto fields of this struct.
-
struct settings_t
Global analysis run settings.
Aggregated run configuration covering I/O paths, ML hyperparameters, and plotting settings.
Passed directly to the analysis class via analysis::m_settings and propagated to dataloader, optimizer, metrics, and io.
The I/O fields (output_path, run_name, sow_name, metacache_path) control where results are written and how the sum-of-weights histogram is named.
The ML fields (epochs, kfolds, batch_size, kfold, num_examples, train_size, training, validation, evaluation, continue_training, training_dataset, graph_cache) configure the k-fold cross-validation loop, mini-batch size, which phases to run, whether to resume from a checkpoint, and paths to pre-built graph caches.
The plotting fields (var_pt, var_eta, var_phi, var_energy, targets, nbins, max_range, logy) configure auto-generated invariant-mass histograms.
The execution fields (threads, intra_th, debug_mode, build_cache, selection_root) control parallelism, intra-op thread count, verbose output, and output format.
Public Members
-
std::string output_path = "./ProjectName"
Root output directory for plots, models, and selection files (default "./ProjectName").
-
std::string run_name = ""
Optional tag appended to output directories for this run.
-
std::string sow_name = ""
Name of the sum-of-weights histogram in the ROOT file.
-
std::string metacache_path = "./"
Path to the directory used for caching AMI metadata (default "./").
-
bool fetch_meta = false
If true, query AMI for dataset metadata; otherwise use cache.
-
bool pretagevents = false
If true, pre-tag events before building graphs.
-
int epochs = 10
Number of training epochs (default 10).
-
int kfolds = 10
Number of k-fold splits (default 10).
-
int batch_size = 1
Minibatch size for training (default 1).
-
std::vector<int> kfold = {}
Explicit list of k-fold indices to run; empty means run all.
-
int num_examples = 3
Number of example graphs displayed during progress reporting (default 3).
-
float train_size = 50
Percentage of data used for training (default 50).
-
bool training = true
Enable the training phase (default true).
-
bool validation = true
Enable the validation phase (default true).
-
bool evaluation = true
Enable the evaluation phase (default true).
-
bool continue_training = true
If true, resume training from the last saved checkpoint (default true).
-
std::string training_dataset = ""
Path to a pre-built graph dataset for training.
-
std::string graph_cache = ""
Path to a graph cache directory.
-
std::string var_pt = "pt"
Leaf name of the transverse-momentum variable (default "pt").
-
std::string var_eta = "eta"
Leaf name of the pseudorapidity variable (default "eta").
-
std::string var_phi = "phi"
Leaf name of the azimuthal angle variable (default "phi").
-
std::string var_energy = "energy"
Leaf name of the energy variable (default "energy").
-
std::vector<std::string> targets = {}
List of variable names to use as training targets.
-
int nbins = 400
Number of histogram bins (default 400).
-
int max_range = 400
Maximum axis range for mass histograms (default 400 GeV).
-
bool logy = false
Use logarithmic y-axis on histograms (default false).
-
int threads = 10
Number of parallel worker threads (default 10).
-
int intra_th = -1
Number of intra-op threads for PyTorch (default -1 = use system default).
-
bool debug_mode = false
Enable verbose debug output (default false).
-
bool build_cache = false
If true, build and save a local graph cache (default false).
-
bool selection_root = false
If true, write selection output to ROOT files (default false).
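Because train_size is expressed as a percentage rather than a fraction, it has to be divided by 100 before being applied to an event count. The helper below is a hypothetical sketch of that arithmetic; truncating towards zero and treating everything else as one held-out remainder are assumptions, since the actual split and its interaction with kfolds are not described here.

```cpp
#include <cassert>
#include <utility>

// Returns {n_train, n_rest} for a percentage-valued train_size.
inline std::pair<long, long> split_counts(long n_events, float train_size) {
    assert(n_events >= 0 && train_size >= 0.0f && train_size <= 100.0f);
    const long n_train = static_cast<long>(n_events * (train_size / 100.0));  // truncates
    return {n_train, n_events - n_train};
}
```

With the default train_size of 50, a sample of 1000 events would give 500 training events under these assumptions.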
model_settings_t — per-model ML configuration
model_settings_t is populated by model_template and carries
optimizer choice, I/O feature maps, weight/tree names, and device info.
-
struct model_settings_t
Snapshot of a
model_template configuration suitable for serialisation and transfer between objects.
Public Members
-
std::string s_optim
Optimizer type as a string (e.g.
"adam").
-
std::string weight_name
Name of the event-weight leaf in the ROOT tree.
-
std::string tree_name
Name of the ROOT TTree to read from.
-
std::string model_name
Human-readable model name.
-
std::string model_device
PyTorch device string (e.g.
"cpu" or "cuda:0").
-
std::string model_checkpoint_path
Path where model checkpoints are saved.
-
bool inference_mode
true when the model is used for inference only.
-
bool is_mc
true if the dataset is Monte Carlo simulation.
-
std::map<std::string, std::string> o_graph
Output feature names to loss-function names (graph level).
-
std::map<std::string, std::string> o_node
Output feature names to loss-function names (node level).
-
std::map<std::string, std::string> o_edge
Output feature names to loss-function names (edge level).
-
std::vector<std::string> i_graph
Names of requested input graph-level features.
-
std::vector<std::string> i_node
Names of requested input node-level features.
-
std::vector<std::string> i_edge
Names of requested input edge-level features.
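The three o_* maps pair each output feature with the name of a loss function. As a rough illustration of that layout, the sketch below gathers the distinct loss names requested across all three levels. The struct output_maps, the function distinct_losses, and the feature and loss names used in the usage note are hypothetical stand-ins, not part of the framework.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <set>
#include <string>

// Hypothetical subset of model_settings_t: only the three output maps.
struct output_maps {
    std::map<std::string, std::string> o_graph;
    std::map<std::string, std::string> o_node;
    std::map<std::string, std::string> o_edge;
};

// Collects the distinct loss-function names requested at any level.
inline std::set<std::string> distinct_losses(const output_maps& s) {
    std::set<std::string> names;
    for (const auto* level : {&s.o_graph, &s.o_node, &s.o_edge}) {
        for (const auto& entry : *level) {
            names.insert(entry.second);  // entry.first is the feature name
        }
    }
    return names;
}
```

For instance, mapping a graph feature and a node feature to the same loss name and an edge feature to a different one yields a set of two names.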
loss_opt — loss function options
-
struct loss_opt
Options controlling the behaviour of a loss function.
Public Members
-
loss_enum fx = loss_enum::invalid_loss
Loss function type (default
invalid_loss).
-
bool mean = false
Use
torch::kMean reduction.
-
bool sum = false
Use
torch::kSum reduction.
-
bool none = false
Use
torch::kNone reduction.
-
bool swap = false
Enable the
swap option (TripletMargin).
-
bool full = false
Enable the
full option (KLDiv).
-
bool batch_mean = false
Use
torch::kBatchMean reduction (KLDiv).
-
bool target = false
Enable
log_target (KLDiv).
-
bool zero_inf = false
Enable
zero_infinity (CTC).
-
bool defaults = true
If
true, use the loss function’s default options.
-
int ignore = 1000
Index to ignore in cross-entropy/NLL (default 1000 = disabled).
-
int blank = 0
Blank token index for CTC (default 0).
-
double margin = 0
Margin value for margin-based losses.
-
double beta = 0
Beta parameter for Huber/SmoothL1.
-
double eps = 0
Epsilon for numerical stability.
-
double smoothing = 0
Label-smoothing factor for cross-entropy.
-
double delta = 0
Delta parameter for Huber loss.
-
std::vector<double> weight = {}
Per-class weight vector for cross-entropy and NLL.
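Because loss_opt exposes each reduction mode as a separate boolean, more than one flag can be set at once. The sketch below shows one way a consumer might turn the flags into a single choice. The struct loss_flags, the function resolve_reduction, and the "default"/"conflict" return values are hypothetical; how the framework actually handles conflicting flags is not documented here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the reduction-related flags of loss_opt.
struct loss_flags {
    bool mean = false;
    bool sum = false;
    bool none = false;
    bool batch_mean = false;
};

// Returns the name of the single requested reduction, "default" when no
// flag is set, or "conflict" when more than one flag is set.
inline std::string resolve_reduction(const loss_flags& f) {
    std::vector<std::string> chosen;
    if (f.mean)       chosen.push_back("mean");
    if (f.sum)        chosen.push_back("sum");
    if (f.none)       chosen.push_back("none");
    if (f.batch_mean) chosen.push_back("batchmean");
    if (chosen.empty()) return "default";
    if (chosen.size() > 1) return "conflict";
    return chosen.front();
}
```

Reporting a conflict explicitly, rather than silently picking the first flag, makes a misconfigured loss easier to spot.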
optimizer_params_t — optimizer hyper-parameters
optimizer_params_t is the C++ counterpart of OptimizerConfig
(Cython layer). Each cproperty field sets an m_* sentinel flag so
the optimizer builder knows which hyper-parameters have been explicitly
specified.
-
class optimizer_params_t
Typed hyper-parameter set for a PyTorch optimizer.
Wraps each hyper-parameter as a
cproperty so that setting one automatically records that it was explicitly provided (the corresponding m_* flag is set to true).
Public Functions
-
optimizer_params_t()
Constructor. Registers setter callbacks for all hyper-parameters.
Public Members
-
std::string optimizer = ""
Optimizer name string (e.g.
"adam").
-
cproperty<double, optimizer_params_t> lr
Learning rate.
-
cproperty<double, optimizer_params_t> lr_decay
Learning-rate decay (Adagrad).
-
cproperty<double, optimizer_params_t> weight_decay
L2 weight-decay regularisation.
-
cproperty<double, optimizer_params_t> initial_accumulator_value
Initial accumulator value (Adagrad).
-
cproperty<double, optimizer_params_t> eps
Epsilon for numerical stability.
-
cproperty<double, optimizer_params_t> tolerance_grad
Gradient tolerance (L-BFGS).
-
cproperty<double, optimizer_params_t> tolerance_change
Function value/parameter change tolerance (L-BFGS).
-
cproperty<double, optimizer_params_t> alpha
Alpha / smoothing constant (RMSprop).
-
cproperty<double, optimizer_params_t> momentum
Momentum factor (SGD / RMSprop).
-
cproperty<double, optimizer_params_t> dampening
Dampening for momentum (SGD).
-
cproperty<bool, optimizer_params_t> amsgrad
Enable AMSGrad variant of Adam/AdamW.
-
cproperty<bool, optimizer_params_t> centered
Centered RMSprop.
-
cproperty<bool, optimizer_params_t> nesterov
Nesterov momentum (SGD).
-
cproperty<int, optimizer_params_t> max_iter
Maximum number of iterations per optimisation step (L-BFGS).
-
cproperty<int, optimizer_params_t> max_eval
Maximum number of function evaluations per step (L-BFGS).
-
cproperty<int, optimizer_params_t> history_size
History size for L-BFGS.
-
cproperty<std::tuple<float, float>, optimizer_params_t> betas
Adam β₁/β₂ coefficients.
-
cproperty<std::vector<float>, optimizer_params_t> beta_hack
Alternative way to set β₁/β₂ as a vector (for Cython compatibility).
-
std::string scheduler = ""
Scheduler type string (e.g.
"steplr").
-
unsigned int step_size = 1
StepLR step size (default 1).
-
double gamma = 0.1
Multiplicative factor for LR decay (default 0.1).
-
bool m_lr = false
-
bool m_lr_decay = false
-
bool m_weight_decay = false
-
bool m_initial_accumulator_value = false
-
bool m_eps = false
-
bool m_betas = false
-
bool m_amsgrad = false
-
bool m_max_iter = false
-
bool m_max_eval = false
-
bool m_tolerance_grad = false
-
bool m_tolerance_change = false
-
bool m_history_size = false
-
bool m_alpha = false
-
bool m_momentum = false
-
bool m_centered = false
-
bool m_dampening = false
-
bool m_nesterov = false
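The relationship between the cproperty fields and the m_* flags can be illustrated with a small stand-in. The sketch below does not use the real cproperty template, whose interface is not shown in this section. Instead it defines a hypothetical wrapper, tracked<T>, that runs a callback on assignment, and a hypothetical optimizer_sketch whose constructor registers callbacks that raise the matching flag.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for cproperty: stores a value and runs a
// callback each time a new value is assigned.
template <typename T>
class tracked {
public:
    void on_set(std::function<void()> cb) { callback_ = std::move(cb); }
    tracked& operator=(const T& v) {
        value_ = v;
        if (callback_) callback_();
        return *this;
    }
    const T& get() const { return value_; }
private:
    T value_{};
    std::function<void()> callback_;
};

// Hypothetical two-parameter analogue of optimizer_params_t.
struct optimizer_sketch {
    tracked<double> lr;
    tracked<bool> nesterov;
    bool m_lr = false;
    bool m_nesterov = false;

    // The constructor wires each property to its sentinel flag.
    optimizer_sketch() {
        lr.on_set([this] { m_lr = true; });
        nesterov.on_set([this] { m_nesterov = true; });
    }
    // The callbacks capture `this`, so copying is disabled in this sketch.
    optimizer_sketch(const optimizer_sketch&) = delete;
    optimizer_sketch& operator=(const optimizer_sketch&) = delete;
};
```

After `p.lr = 1e-3;` on a fresh optimizer_sketch, m_lr is true while m_nesterov stays false, so a builder can tell which values were supplied and fall back to library defaults for the rest.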
Meta Structs
meta_t — ATLAS dataset metadata
meta_t holds all AMI / ATLAS metadata for a dataset: DSID, campaign,
generator, cross-section, filter efficiency, luminosity, sum-of-weights,
run numbers, file GUIDs and per-systematic weight dictionaries.
-
struct meta_t
Complete metadata for one Monte Carlo or data dataset.
Aggregates ATLAS AnalysisBase tracking values, AMI attributes, physics cross-section and luminosity information, provenance, and the ROOT file inventory.
Public Members
-
unsigned int dsid = 0
Dataset identifier.
-
bool isMC = true
true if the dataset is Monte Carlo simulation.
-
std::string derivationFormat = ""
ATLAS derivation format string (e.g.
"DAOD_PHYS").
-
std::map<int, std::string> inputfiles = {}
Map of integer index to input file path.
-
std::map<std::string, std::string> config = {}
Arbitrary key-value configuration map.
-
std::string AMITag = ""
AMI tag string (e.g.
"e8496_s3126_r12305_p5169").
-
std::string generators = ""
Space-separated list of generator names.
-
std::map<int, int> inputrange = {}
Map of file index to event count in that file.
-
double eventNumber = -1
ROOT event number (reserved for ROOT-specific mapping).
-
double event_index = -1
Free-use event index within the framework.
-
bool found = false
true if the dataset was found in AMI / the local cache.
-
std::string DatasetName = ""
Full AMI logical dataset name.
-
double totalSize = 0
Total dataset size in bytes.
-
double kfactor = 0
QCD k-factor for the process.
-
double ecmEnergy = 0
Centre-of-mass energy in MeV.
-
double genFiltEff = 0
Generator filter efficiency.
-
double completion = 0
Fraction of the dataset that has been processed.
-
double beam_energy = 0
Beam energy in MeV.
-
double crossSection = 0
Cross section in nb.
-
double crossSection_mean = 0
Mean cross section (e.g. averaged over PDF members).
-
double campaign_luminosity = 0
Integrated luminosity of the corresponding campaign in fb⁻¹.
-
unsigned int nFiles = 0
Number of files in the dataset.
-
unsigned int totalEvents = 0
Total number of events across all files.
-
unsigned int datasetNumber = 0
Numeric dataset identifier (redundant with
dsid for legacy use).
-
std::string identifier = ""
Human-readable unique identifier string for the dataset.
-
std::string prodsysStatus = ""
Production system status string.
-
std::string dataType = ""
Data type string (e.g.
"MC" or "DATA").
-
std::string version = ""
Software version tag.
-
std::string PDF = ""
PDF set used for generation.
-
std::string AtlasRelease = ""
ATLAS software release string.
-
std::string principalPhysicsGroup = ""
ATLAS physics group that owns the dataset.
-
std::string physicsShort = ""
Short physics description tag.
-
std::string generatorName = ""
Primary generator name.
-
std::string geometryVersion = ""
ATLAS detector geometry version.
-
std::string conditionsTag = ""
Conditions database tag.
-
std::string generatorTune = ""
Generator tune identifier.
-
std::string amiStatus = ""
AMI dataset status string.
-
std::string beamType = ""
Beam type string (e.g.
"collisions").
-
std::string productionStep = ""
Production step label.
-
std::string projectName = ""
ATLAS project name.
-
std::string statsAlgorithm = ""
Statistics algorithm used.
-
std::string genFilterNames = ""
Comma-separated generator filter names.
-
std::string file_type = ""
File type identifier (e.g.
"ROOT" or "HDF5").
-
std::string sample_name = ""
Short sample name used in plots and outputs.
-
std::string logicalDatasetName = ""
Full logical dataset name (LDN) in AMI.
-
std::string campaign = ""
ATLAS MC campaign identifier (e.g.
"mc21").
-
std::vector<std::string> keywords = {}
List of AMI keyword strings.
-
std::vector<std::string> weights = {}
Names of the event weights stored in the dataset.
-
std::vector<std::string> keyword = {}
Additional keyword list (complementary to
keywords).
-
std::vector<int> events = {}
Per-file event counts.
-
std::vector<int> run_number = {}
Per-file run numbers.
-
std::vector<double> fileSize = {}
Per-file sizes in bytes.
-
std::vector<std::string> fileGUID = {}
Per-file GUIDs.
-
std::map<std::string, int> LFN = {}
Map of logical file name (LFN) to integer index.
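The documented units matter when these fields are combined: crossSection is in nb while campaign_luminosity is in fb⁻¹, so a factor of 10⁶ (1 nb = 10⁶ fb) is needed before multiplying. The sketch below applies the standard expected-yield formula N = σ · ε_filter · k · L under that conversion. The struct meta_subset and the function expected_events are hypothetical, and whether the framework normalises samples in exactly this way is an assumption.

```cpp
#include <cassert>

// Hypothetical subset of meta_t with the documented units.
struct meta_subset {
    double crossSection = 0;         // nb
    double genFiltEff = 0;           // dimensionless
    double kfactor = 0;              // dimensionless
    double campaign_luminosity = 0;  // fb^-1
};

// Expected number of events: sigma * filter efficiency * k-factor * luminosity,
// with the cross section converted from nb to fb first.
inline double expected_events(const meta_subset& m) {
    constexpr double nb_to_fb = 1e6;
    return m.crossSection * nb_to_fb * m.genFiltEff * m.kfactor * m.campaign_luminosity;
}
```

Note that kfactor and genFiltEff default to 0, so a meta_t that was never filled yields an expected count of 0 rather than an unscaled cross section.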
weights_t — per-systematic sum-of-weights record
-
struct weights_t
Sample weighting information from AMI.
Holds normalisation weights and statistics for a single dataset.
Public Members
-
int dsid = -1
Dataset identifier (DSID).
-
bool isAFII = false
true if the dataset uses ATLAS Fast II simulation.
-
std::string generator = ""
Name of the Monte Carlo generator.
-
std::string ami_tag = ""
AMI processing tag string.
-
float total_events_weighted = -1
Sum of weights for all events in the dataset.
-
float total_events = -1
Raw total event count.
-
float processed_events = -1
Number of events processed.
-
float processed_events_weighted = -1
Sum of weights for processed events.
-
float processed_events_weighted_squared = -1
Sum of squared weights for processed events (for uncertainty estimation).
-
std::map<std::string, float> hist_data = {}
Histogram data keyed by variable name.
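The sum of squared weights is what makes an uncertainty estimate possible for a weighted sample. As a minimal sketch using the standard formulas, the statistical uncertainty on the weighted count is √(Σw²) and the effective number of unweighted events is (Σw)² / Σw². The struct weights_subset and both function names are hypothetical, and the framework may compute these quantities differently.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical subset of weights_t.
struct weights_subset {
    float processed_events_weighted = -1;          // sum of weights
    float processed_events_weighted_squared = -1;  // sum of squared weights
};

// Statistical uncertainty on the weighted event count: sqrt(sum of w^2).
inline double stat_uncertainty(const weights_subset& w) {
    assert(w.processed_events_weighted_squared >= 0);
    return std::sqrt(static_cast<double>(w.processed_events_weighted_squared));
}

// Effective number of unweighted events: (sum of w)^2 / (sum of w^2).
inline double effective_events(const weights_subset& w) {
    assert(w.processed_events_weighted_squared > 0);
    const double sw = w.processed_events_weighted;
    return sw * sw / static_cast<double>(w.processed_events_weighted_squared);
}
```

The assertions guard against the default value of -1, which would otherwise produce a NaN or a negative effective count.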
Training Report Structs
model_report — per-epoch training summary
model_report is produced by the dataloader after each epoch and
carries loss/accuracy maps keyed by mode_enum and feature name, as
well as the current learning rates and iteration counters.
-
struct model_report
Stores and formats the training progress of a single epoch for one k-fold.
Tracks per-epoch, per-mode (training / validation / evaluation) loss and accuracy values at graph, node, and edge level, indexed by feature name. The
current_lr vector records the learning rate of each parameter group at the end of the epoch. is_complete is set to true when the epoch loop finishes; waiting_plot points to the metrics object that is waiting to dump plots for this epoch (or nullptr). progress tracks fractional completion within an epoch; iters and num_evnt count gradient updates and processed events respectively. print() formats all maps as a multi-line human-readable string. prx() formats one individual map with a given title prefix.
Public Functions
-
std::string print()
Format all accumulated metrics as a multi-line human-readable string.
- Returns:
Formatted report string.
Public Members
-
int k
K-fold index for this report.
-
int epoch
Epoch number for this report.
-
bool is_complete = false
true once the epoch has finished processing.
-
metrics *waiting_plot = nullptr
Pointer to the
metrics object waiting to produce plots for this epoch, or nullptr if not applicable.
-
std::vector<double> current_lr = {}
Per-parameter-group learning rates at the end of this epoch.
-
std::map<mode_enum, std::map<std::string, float>> loss_graph = {}
Graph-level loss values keyed by mode then feature name.
-
std::map<mode_enum, std::map<std::string, float>> loss_node = {}
Node-level loss values keyed by mode then feature name.
-
std::map<mode_enum, std::map<std::string, float>> loss_edge = {}
Edge-level loss values keyed by mode then feature name.
-
std::map<mode_enum, std::map<std::string, float>> accuracy_graph = {}
Graph-level accuracy values keyed by mode then feature name.
-
std::map<mode_enum, std::map<std::string, float>> accuracy_node = {}
Node-level accuracy values keyed by mode then feature name.
-
std::map<mode_enum, std::map<std::string, float>> accuracy_edge = {}
Edge-level accuracy values keyed by mode then feature name.
-
std::string run_name
Name of the training run (as passed to
analysis::add_model).
-
std::string mode
Current mode string (
"training", "validation", or "evaluation").
-
long iters = 0
Number of gradient-update iterations performed so far.
-
long num_evnt = 0
Number of events processed in the current epoch.
-
float progress
Fractional progress through the epoch in [0, 1].
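All six loss and accuracy members share one nested layout: an outer map keyed by mode and an inner map keyed by feature name. The sketch below walks such a map and writes one line per entry, roughly in the spirit of formatting one map with a title prefix. The enum mode_sketch stands in for mode_enum (whose enumerator names are not listed in this section), and format_map and its output format are hypothetical, not the output of print() or prx().

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for mode_enum.
enum class mode_sketch { training, validation, evaluation };

using metric_map = std::map<mode_sketch, std::map<std::string, float>>;

inline std::string mode_name(mode_sketch m) {
    switch (m) {
        case mode_sketch::training:   return "training";
        case mode_sketch::validation: return "validation";
        case mode_sketch::evaluation: return "evaluation";
    }
    return "unknown";
}

// Writes one "<title> [<mode>] <feature>: <value>" line per entry.
inline std::string format_map(const std::string& title, const metric_map& values) {
    std::ostringstream out;
    for (const auto& mode_entry : values) {
        for (const auto& feature_entry : mode_entry.second) {
            out << title << " [" << mode_name(mode_entry.first) << "] "
                << feature_entry.first << ": " << feature_entry.second << "\n";
        }
    }
    return out.str();
}
```

Because std::map iterates in key order, the lines come out grouped by mode and then sorted by feature name, which keeps successive epoch reports easy to compare.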
roc_t — ROC curve data
-
struct roc_t
Holds the data for one ROC curve (one class, one k-fold, one model).
Aggregates the ground-truth label arrays (
truth) and classifier score arrays (scores) together with the computed true-positive rate (tpr_), false-positive rate (fpr_), and area-under-curve values (_auc) for a specific class index (cls) and k-fold (kfold). The truth and scores pointers are not owned by this struct.
Public Members
-
int cls = 0
Class index.
-
int kfold = 0
K-fold index.
-
std::string model = ""
Name of the model that produced this ROC curve.
-
std::vector<double> _auc = {}
Area under the ROC curve for each threshold sweep.
-
std::vector<std::vector<double>> tpr_ = {}
True positive rate (recall) values; one vector per threshold sweep.
-
std::vector<std::vector<double>> fpr_ = {}
False positive rate values; one vector per threshold sweep.
-
std::vector<std::vector<int>> *truth = nullptr
Pointer to the ground-truth label arrays (not owned).
-
std::vector<std::vector<double>> *scores = nullptr
Pointer to the classifier score arrays (not owned).
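Given one pair of fpr_ and tpr_ vectors, an area-under-curve value can be obtained by trapezoidal integration. The sketch below shows that calculation. The function trapezoid_auc is hypothetical, it assumes the two vectors have equal length and that the false-positive rates are sorted in ascending order, and it is not necessarily how the framework fills _auc.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Trapezoidal area under a ROC curve.
// Assumes fpr is sorted ascending and has the same length as tpr.
inline double trapezoid_auc(const std::vector<double>& fpr,
                            const std::vector<double>& tpr) {
    assert(fpr.size() == tpr.size());
    double area = 0.0;
    for (std::size_t i = 1; i < fpr.size(); ++i) {
        // Width of the interval times the mean height of its two endpoints.
        area += 0.5 * (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]);
    }
    return area;
}
```

A diagonal curve from (0, 0) to (1, 1) integrates to 0.5 under this sketch, the value expected of a random classifier.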