The SampleTracer
A core module inherited by all generator classes. It tracks sample generation, including events, graphs and selections, and how they are cached in memory. The class can be used as a standalone package, but it is primarily intended to be integrated into any abstract generator class that one might want to implement; this will be illustrated later. Most of the functionality, such as the magic functions, is implemented within this class, making it a rather useful core module.
Methods and Attributes
- class Event
A simple wrapper class used to batch cached objects into a single object. When a specific attribute is requested, the class scans the available objects for that attribute. Unlike most sub-modules within the package, this class implements only a limited set of magic functions.
- release_event() -> EventTemplate
A method which releases the event object from the sample tracer batch.
- release_graph() -> GraphTemplate
A method which releases the graph object from the sample tracer batch.
- release_selection() -> SelectionTemplate
A method which releases the selection object from the sample tracer batch.
- event_cache_dir() -> dict
Returns a dictionary of the current caching directory of the given event name.
- graph_cache_dir() -> dict
Returns a dictionary of the current caching directory of the given graph name.
- selection_cache_dir() -> dict
Returns a dictionary of the current caching directory of the given selection name.
- __eq__() -> bool
Returns true if the events have the same hash.
- __hash__() -> int
Allows for the use of set and dict, where the event can be interpreted as a key in a dictionary.
- hash -> str
Returns the hash of the current event.
- __getstate__ -> tuple[meta_t, batch_t]
Allows the event to be pickled.
- __setstate__(tuple[meta_t, batch_t])
Rebuilds the Event from a meta_t and batch_t data type.
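The behaviour described for Event (attribute scanning, hash-based equality, and state export) can be sketched with a small stand-in class. This is only an illustration under stated assumptions: `BatchedEvent` is a hypothetical name, the state is a plain tuple rather than the package's `meta_t` and `batch_t` types, and the attribute names `weight` and `num_nodes` are invented for the example.

```python
from types import SimpleNamespace


class BatchedEvent:
    """Hypothetical stand-in for the Event wrapper; not the package's class."""

    def __init__(self, event_hash, objects):
        self._hash = event_hash      # string identifying the event
        self._objects = objects      # cached objects batched together

    @property
    def hash(self):
        return self._hash

    def __getattr__(self, name):
        # Only called when normal lookup fails: scan the batched objects.
        if name.startswith("_"):
            raise AttributeError(name)
        for obj in self._objects:
            if hasattr(obj, name):
                return getattr(obj, name)
        raise AttributeError(name)

    def __eq__(self, other):
        return isinstance(other, BatchedEvent) and self.hash == other.hash

    def __hash__(self):
        # Must return an int so the object works in sets and as a dict key.
        return hash(self._hash)

    def __getstate__(self):
        return (self._hash, self._objects)

    def __setstate__(self, state):
        self._hash, self._objects = state


ev = BatchedEvent("0xabc", [SimpleNamespace(weight=1.5), SimpleNamespace(num_nodes=4)])
same = BatchedEvent("0xabc", [])

# Round trip through the same hooks that pickle would call.
rebuilt = BatchedEvent.__new__(BatchedEvent)
rebuilt.__setstate__(ev.__getstate__())
```

Because equality and hashing depend only on the hash string, `ev` and `same` collapse into a single entry when placed in a set, even though they carry different objects.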
- class SampleTracer
- __getstate__ -> tracer_t
Export this tracer including all samples (selections, graphs, events) and state (settings).
- __setstate__(tracer_t inpt)
Import tracer parameters including all samples (selections, graphs, events) and state (settings).
- __getitem__(key: list | str) -> bool or list
Scan the indexed content and return a list of matches, or a boolean if nothing has been found.
- __contains__(str val) -> bool
Check whether the query is in the sample tracer.
- __len__ -> int
Return the length of the entire sample.
- __add__(SampleTracer other) -> SampleTracer
Add two SampleTracers to create an independent SampleTracer. The content of both samples is compared and combined as a set.
- __radd__(other) -> SampleTracer
- __iadd__(SampleTracer other) -> SampleTracer
Append the incoming tracer object to this tracer.
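The difference between `__add__` (a new, independent object) and `__iadd__` (in-place append) can be sketched on a toy container. `HashIndex` is a hypothetical stand-in that stores only hash strings. The role given to `__radd__` here, supporting the built-in `sum()`, is an assumption, since the reference does not describe it.

```python
class HashIndex:
    """Hypothetical stand-in for a tracer's hash index; not the package's class."""

    def __init__(self, hashes=()):
        self._hashes = set(hashes)

    def __len__(self):
        return len(self._hashes)

    def __contains__(self, val):
        return val in self._hashes

    def __add__(self, other):
        if not isinstance(other, HashIndex):
            return NotImplemented
        # Set union: shared hashes are counted once, inputs are left untouched.
        return HashIndex(self._hashes | other._hashes)

    def __radd__(self, other):
        # Assumption: sum() starts from the integer 0, so treat 0 as empty.
        if isinstance(other, int) and other == 0:
            return HashIndex(self._hashes)
        return NotImplemented

    def __iadd__(self, other):
        if not isinstance(other, HashIndex):
            return NotImplemented
        self._hashes |= other._hashes   # modifies this object in place
        return self


a = HashIndex(["h1", "h2"])
b = HashIndex(["h2", "h3"])

merged = a + b            # independent object with three unique hashes
size_before = len(a)      # a itself is unchanged by +
total = sum([a, b])       # relies on __radd__ for the initial 0 + a
a += b                    # in-place append
```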
- __iter__()
Iterate over the SampleTracer with the given parameters, e.g. cache type.
- __next__ -> Event
The iterator returns an Event (not to be confused with EventTemplate). This Event is a batched version of SelectionTemplate/GraphTemplate/EventTemplate and MetaData.
- preiteration -> bool
A placeholder for adding last-minute behaviour changes to the iteration process. This can include loading specific caches or changing general behaviour, e.g. pre-fetching.
- DumpTracer(retag: str | None) -> None
Preserve the index map of the samples within the tracer. The output is a set of HDF5 files, written under their Logical File Names or original sample name.
- Parameters:
retag (str, None) – Allows specific samples of the tracer to be tagged.
- RestoreTracer(dict tracers = {}, sample_name: Union[None, str]) -> None
Restore the index map of the samples within the tracer.
- Parameters:
tracers (dict) – Restore these HDF5 file directories.
sample_name (None, str) – Restore only tracer samples with a particular sample name tag.
- DumpEvents -> None
Preserve the EventTemplates in HDF5 files.
- DumpGraphs -> None
Preserve the GraphTemplates in HDF5 files.
- DumpSelections -> None
Preserve the SelectionTemplates in HDF5 files.
- RestoreEvents(list these_hashes = []) -> None
Restore EventTemplates matching a particular set of hashes.
- Parameters:
these_hashes (list) – A list of hashes consistent with events indexed by the tracer.
- RestoreGraphs(list these_hashes = []) -> None
Restore GraphTemplates matching a particular set of hashes.
- Parameters:
these_hashes (list) – A list of hashes consistent with events indexed by the tracer.
- RestoreSelections(list these_hashes = []) -> None
Restore SelectionTemplates matching a particular set of hashes.
- Parameters:
these_hashes (list) – A list of hashes consistent with events indexed by the tracer.
- FlushEvents(list these_hashes = []) -> None
Delete EventTemplates matching a particular set of hashes from RAM.
- Parameters:
these_hashes (list) – A list of hashes consistent with events indexed by the tracer.
- FlushGraphs(list these_hashes = []) -> None
Delete GraphTemplates matching a particular set of hashes from RAM.
- Parameters:
these_hashes (list) – A list of hashes consistent with events indexed by the tracer.
- FlushSelections(list these_hashes = []) -> None
Delete SelectionTemplates matching a particular set of hashes from RAM.
- Parameters:
these_hashes (list) – A list of hashes consistent with events indexed by the tracer.
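The dump, flush and restore cycle can be illustrated with a toy cache. This is a rough sketch and not the package's implementation: `TemplateCache` is a hypothetical class, JSON files stand in for the HDF5 files mentioned above, and treating an empty hash list as "all hashes" is an assumption about the default argument.

```python
import json
import os
import tempfile


class TemplateCache:
    """Hypothetical stand-in; the real tracer writes HDF5 files, not JSON."""

    def __init__(self, directory):
        self.directory = directory
        self.in_ram = {}                     # hash -> payload held in memory

    def _path(self, hash_):
        return os.path.join(self.directory, hash_ + ".json")

    def dump(self):
        # Write every in-memory payload to disk; RAM content is kept.
        for hash_, payload in self.in_ram.items():
            with open(self._path(hash_), "w") as handle:
                json.dump(payload, handle)

    def flush(self, these_hashes=()):
        # Drop payloads from RAM only; files on disk are left alone.
        for hash_ in list(these_hashes or self.in_ram):
            self.in_ram.pop(hash_, None)

    def restore(self, these_hashes=()):
        # Reload the requested hashes (or everything on disk) into RAM.
        wanted = list(these_hashes) or [
            name[:-5] for name in os.listdir(self.directory) if name.endswith(".json")
        ]
        for hash_ in wanted:
            with open(self._path(hash_)) as handle:
                self.in_ram[hash_] = json.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    cache = TemplateCache(tmp)
    cache.in_ram = {"h1": {"njets": 4}, "h2": {"njets": 6}}
    cache.dump()
    cache.flush(["h1"])
    after_flush = sorted(cache.in_ram)
    cache.restore(["h1"])
    after_restore = sorted(cache.in_ram)
    restored_payload = cache.in_ram["h1"]
```

The point of the pattern is that flushing frees memory without losing data, provided a dump has been made first.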
- _makebar(inpt: int, CustTitle: None | str = None)
Creates a tqdm progress bar.
- Parameters:
inpt (int) – Length of the sample, i.e. the range of the bar.
CustTitle (None, str) – Override the default progress prefix title (see Caller).
- trace_code(obj) -> code_t
Preserve an object which is independent of the current file implementation (see Code).
- Parameters:
obj – Any Python object.
- rebuild_code(val: list | str | None) -> list[Code]
Rebuild a set of Code objects which mimic the originally traced code.
- Parameters:
val (list, str, None) – Rebuild these strings from the traced code of the SampleTracer.
- ImportSettings(settings_t inpt) -> None
Apply settings from the input to the current SampleTracer.
- Parameters:
inpt (settings_t) – A dictionary-like object with specific keys. See the Data Type and Dictionary Section.
- ExportSettings -> settings_t
Export the current settings of the SampleTracer.
- clone -> SampleTracer
Returns a copy of the current SampleTracer object. This will NOT clone the content of the source tracer.
- is_self(inpt, obj=SampleTracer) -> bool
Checks whether the input has a type consistent with the object type (inherited objects are also permitted).
- Parameters:
inpt – Any Python object.
obj – The target object type to check against, e.g. SampleTracer type.
- makehashes() -> dict
Returns a dictionary of current hashes not found in RAM.
- makelist() -> list[Event]
Returns a list of Event objects, regardless of whether the Templates are loaded in memory.
- AddEvent(event_inpt, meta_inpt=None) -> None
- AddGraph(graph_inpt, meta_inpt=None) -> None
- AddSelections(selection_inpt, meta_inpt=None) -> None
- SetAttribute(fx, str name) -> bool
- Tree -> str
Returns current ROOT Tree being used.
- ShowTrees -> list[str]
Returns a list of ROOT Trees found within the index.
- Event -> EventTemplate or Code
Specifies the EventTemplate-inherited event implementation to use for building Event objects from ROOT files.
- ShowEvents -> list[str]
Returns a list of EventTemplate implementations found within the index.
- GetEvent -> bool
Forcefully get or ignore EventTemplate types from the Event object. This is useful to avoid redundant sample fetching from RAM.
- EventCache -> bool
Specifies whether to generate a cache after constructing Event objects. If this is enabled without specifying a ProjectName, a folder called UNTITLED is generated.
- EventName -> str
- Graph -> GraphTemplate or Code
Specifies the event graph implementation to use for constructing graphs.
- ShowGraphs -> list[str]
- GetGraph -> bool
- DataCache -> bool
Specifies whether to generate a cache after constructing graph objects. If this is enabled without having an event cache, the Event attribute needs to be set.
- GraphName -> str
- Selections -> dict[str, SelectionTemplate or Code]
- ShowSelections -> list[str]
- GetSelection -> bool
- SelectionName -> str
- Optimizer -> str
Expects a string naming the optimizer to use. Current choices are SGD (Stochastic Gradient Descent) and ADAM.
- Scheduler -> str
Expects a string naming the scheduler to use. Current choices are ExponentialLR and CyclicLR.
- Model -> ModelWrapper or Code
The target model to be trained.
- OptimizerParams -> dict
A dictionary containing the specific input parameters for the chosen Optimizer.
- SchedulerParams -> dict
A dictionary containing the specific input parameters for the chosen Scheduler.
- ModelParams -> dict
- kFold -> list[int]
Explicitly use these kFolds during training. This can be quite useful when doing parallel training, since each kFold is trained completely independently. The variable can be set to a single integer or a list of integers.
- Epoch -> int
- kFolds -> int
Number of folds to use for training.
- Epochs -> int
- BatchSize -> int
How many Graphs to group into a single big graph (also known as batch training).
- GetAll -> bool
- nHashes -> int
- ShowLength -> dict
- EventStart -> int or None
The event to start from given a set of ROOT samples. Useful for debugging specific events.
- EventStop -> int or None
The number of events to generate.
- EnablePyAMI -> bool
- Files -> dict
- SampleMap -> dict
- ProjectName -> str
Specifies the output folder of the analysis. If the folder is non-existent, a folder will be created.
- OutputDirectory -> str
Specifies the output directory of the analysis. This is useful if the output needs to be placed outside of the working directory.
- WorkingPath -> str
Returns the current working path of the Analysis, constructed as OutputDirectory/ProjectName.
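The relationship between OutputDirectory, ProjectName and WorkingPath can be sketched as a simple path join. The helper `working_path` is hypothetical and not part of the package. The `UNTITLED` default is borrowed from the EventCache description, and creating the folder eagerly is an assumption made for this example.

```python
import os
import tempfile


def working_path(output_directory, project_name="UNTITLED"):
    """Hypothetical helper: join the two settings and create the folder."""
    path = os.path.join(output_directory, project_name)
    os.makedirs(path, exist_ok=True)   # no error if the folder already exists
    return path


with tempfile.TemporaryDirectory() as base:
    path = working_path(base, "MyProject")
    created = os.path.isdir(path)
    tail = os.path.basename(path)
    default_tail = os.path.basename(working_path(base))
```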
- RunName -> str
The name given to the particular training session of the Graph Neural Network.
- Caller -> str
A string controlling the verbose information prefix.
- Verbose -> int
An integer which increases the verbosity of the framework, with 3 being the highest and 0 the lowest.
- DebugMode -> bool
Expects a boolean. If this is set to True, a complete printout of the training is displayed.
- Chunks -> int
An integer which regulates the number of entries to process on each core. This is particularly relevant when constructing events, since it helps to avoid memory issues. As an example, if Threads is set to 2 and Chunks is set to 10, then 10 events will be processed per core.
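How a chunk size limits the work handed to each worker can be sketched as follows. This is a minimal illustration and not the framework's scheduler: `split_into_chunks`, `process_chunk` and `run` are hypothetical names, a thread pool stands in for whatever parallel backend the framework uses, and the doubling step is a placeholder for real event construction.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(entries, chunk_size):
    """Cut the entries into consecutive pieces of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [entries[i:i + chunk_size] for i in range(0, len(entries), chunk_size)]


def process_chunk(chunk):
    # Placeholder for per-event work such as building an event object.
    return [entry * 2 for entry in chunk]


def run(entries, threads, chunks):
    pieces = split_into_chunks(entries, chunks)
    # Each worker holds only one chunk at a time, which bounds memory use.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(process_chunk, pieces))
    return [item for piece in results for item in piece]


entries = list(range(25))
sizes = [len(piece) for piece in split_into_chunks(entries, 10)]
output = run(entries, threads=2, chunks=10)
```

With 25 entries and a chunk size of 10, the last chunk is shorter than the others, so worker load is bounded by the chunk size rather than by the total sample size.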
- Threads -> int
The number of CPU threads to use for running the framework. If the number of threads is set to 1, then the framework will not print a progress bar.
- Device -> str
The device used to run PyTorch training on. Options are cuda or cpu.
- TrainingName -> str
Name of the training sample to be used.
- SortByNodes -> bool
Sort the input graph sample by nodes. This is useful when the model is node agnostic, but requires recomputation of internal variables based on variable graph node sizes. For instance, when computing the combinatorial of a graph, it is faster to compute the combinations for n-nodes and batch n-sized graphs into a single sample set.
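The motivation for sorting by node count can be sketched with plain dictionaries. Everything here is illustrative: the graph representation, the `num_nodes` key and the helper names are assumptions, and node pairs are used as one example of a per-size combinatorial quantity.

```python
from collections import defaultdict
from itertools import combinations


def group_by_node_count(graphs):
    """Bucket graphs by their number of nodes, smallest size first."""
    groups = defaultdict(list)
    for graph in graphs:
        groups[graph["num_nodes"]].append(graph)
    return dict(sorted(groups.items()))


def node_pairs(n):
    """All unordered node pairs for a graph with n nodes."""
    return list(combinations(range(n), 2))


graphs = [
    {"name": "g1", "num_nodes": 3},
    {"name": "g2", "num_nodes": 2},
    {"name": "g3", "num_nodes": 3},
]

groups = group_by_node_count(graphs)

# The combinations are computed once per node count and shared by the group.
pairs_per_size = {n: node_pairs(n) for n in groups}
```

Here the pairs for 3-node graphs are computed once and reused for both `g1` and `g3`, instead of being recomputed for every graph.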
- ContinueTraining -> bool
Whether to continue the training from the last known checkpoint (after each epoch).
- KinematicMap -> dict
place holder
- EnableReconstruction -> bool
place holder
- PlotLearningMetrics -> bool
Whether to output various metric plots whilst training. This can be enabled before training or re-run after training from the training cache.