Analysis Interface
The main interface class. It defines and automates the analysis workflow, including MVA training, ROOT n-tuple production, GNN inference, sample generation and much more.
The C++ Interface
-
class analysis : public notification, public tools
-
settings_t m_settings
A member struct variable used to control and specify runtime behaviour.
-
void add_samples(std::string path, std::string label)
A function used to specify the directory of the ROOT samples used for the analysis. The path parameter accepts either /path/<name>.root or /path/*.root. The label parameter is optional and is useful when samples need to be kept separate.
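The two accepted path forms behave like ordinary shell-style patterns: a literal file name selects one file, and a wildcard selects every matching file in the directory. The following is a minimal Python sketch of that idea using only the standard library. The helper name resolve_root_files is hypothetical and is not part of the framework; how the framework itself expands paths is not described here.

```python
import glob
import os
import tempfile


def resolve_root_files(path):
    """Expand a sample path into a sorted list of existing .root files.

    Hypothetical helper for illustration only; not the framework's code.
    Accepts a literal path (/path/name.root) or a wildcard (/path/*.root).
    """
    matches = glob.glob(path)
    return sorted(
        m for m in matches if m.endswith(".root") and os.path.isfile(m)
    )


def demo():
    """Create a throwaway directory and resolve both path forms against it."""
    with tempfile.TemporaryDirectory() as tmp:
        # Two fake ROOT files and one unrelated file.
        for name in ("a.root", "b.root", "notes.txt"):
            open(os.path.join(tmp, name), "w").close()

        single = resolve_root_files(os.path.join(tmp, "a.root"))
        wildcard = resolve_root_files(os.path.join(tmp, "*.root"))

        # Return base names only, since the temporary directory is random.
        return (
            [os.path.basename(p) for p in single],
            [os.path.basename(p) for p in wildcard],
        )


if __name__ == "__main__":
    print(demo())
```

In this sketch the literal form returns only the named file, while the wildcard form picks up every .root file and ignores the text file.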
-
- class Analysis
- AddSamples(str path, str label)
A function used to assign a sample label (arbitrary name) to a particular dataset.
- AddEvent(EventTemplate ev, str label)
A function used to pass the event implementation to be used for subsequent compilations.
- AddGraph(GraphTemplate ev, str label)
A function used to tell the framework which graph implementation should be used.
- AddSelection(SelectionTemplate selc)
A function which adds any selection templates to the current analysis workflow.
- AddModel(ModelTemplate model, OptimizerConfig op, str run_name)
A function used to add a model to be trained, along with any optimizer hyperparameters that should be applied to the model. The additional run_name variable is used to create folders that contain the training output.
- AddModelInference(ModelTemplate model, str run_name = "run_name")
A function used to add a trained model that should be used for inference studies. The run_name variable is used to generate folder structures for output ROOT files that hold model predictions.
- Start()
- Variables:
BatchSize (int)
FetchMeta (bool)
BuildCache (str)
PreTagEvents (bool)
SaveSelectionToROOT (bool)
GetMetaData (bool) – Attempts to identify any meta-data associated with the input samples and queries PyAMI to match any results.
SumOfWeightsTreeName (list) – Scans the ROOT file for possible sum of weights trees and histograms.
OutputPath (str) – The output path of the results.
kFolds (int) – Number of folds to train the model with.
kFold (list) – A list of specific folds to train the model on. Useful when not enough resources are available to run a full k-fold training at once.
Epochs (int) – Number of epochs to train the model.
NumExamples (int) – Number of test examples used to validate the runtime of the model.
TrainingDataset (str) – Path of the training set to use. If a value is given but no training set is available, the framework will dump a .h5 file.
TrainSize (int) – Size of the training set, given as a percentage.
Training (bool) – Run the model over the training set.
Validation (bool) – Run the model over the validation set in a k-fold training session.
Evaluation (bool) – Run the model over the evaluation set.
ContinueTraining (bool) – Continue training the model at the last known checkpoint.
nBins (int) – Number of bins to plot the invariant mass metrics with.
Refresh (int) – Progress bar refresh step.
MaxRange (float) – Maximum range of the invariant mass metric plots.
VarPt (str) – The transverse momentum variable string name to use for the invariant mass computation.
VarEta (str) – The pseudorapidity variable string name to use for the invariant mass computation.
VarPhi (str) – The azimuthal angle variable string name to use for the invariant mass computation.
VarEnergy (str) – The energy variable string name to use for the invariant mass computation.
Targets (list) – The targets to plot (the output of the model) e.g. top_edge.
DebugMode (bool) – Disables all threading.
Threads (int) – Number of threads to run the framework over.
GraphCache (str) – Specifies a directory in which graph_template outputs should be cached. This will generate .h5 files that can be reused.
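The VarPt, VarEta, VarPhi and VarEnergy settings name the four kinematic quantities needed for an invariant mass. As a rough illustration of how those quantities combine, the sketch below converts (pt, eta, phi, energy) to Cartesian four-vectors, sums them, and evaluates m = sqrt(E^2 - |p|^2). It assumes eta is the pseudorapidity and that all inputs share the same energy units. The function names are hypothetical; this is the standard textbook formula, not necessarily how the framework computes its metric.

```python
import math


def to_cartesian(pt, eta, phi, energy):
    """Convert (pt, eta, phi, E) to (px, py, pz, E).

    Assumes eta is the pseudorapidity, so pz = pt * sinh(eta).
    """
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    return px, py, pz, energy


def invariant_mass(particles):
    """Invariant mass of a collection of (pt, eta, phi, E) tuples.

    Hypothetical helper for illustration; not part of the framework.
    """
    sum_px = sum_py = sum_pz = sum_e = 0.0
    for pt, eta, phi, energy in particles:
        px, py, pz, e = to_cartesian(pt, eta, phi, energy)
        sum_px += px
        sum_py += py
        sum_pz += pz
        sum_e += e

    mass_squared = sum_e**2 - (sum_px**2 + sum_py**2 + sum_pz**2)
    # Clamp small negative values caused by floating-point rounding.
    return math.sqrt(max(mass_squared, 0.0))


if __name__ == "__main__":
    # Two massless particles, back to back in the transverse plane.
    pair = [(50.0, 0.0, 0.0, 50.0), (50.0, 0.0, math.pi, 50.0)]
    print(invariant_mass(pair))
```

For the back-to-back pair in the usage example the momenta cancel, so the mass is simply the summed energy up to floating-point rounding.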