Data Loader / Generator
The dataloader class manages the in-memory graph dataset, providing k-fold
splitting, test-set generation, HDF5-based graph caching, and batching for
multi-GPU training. It is populated by sampletracer::populate_dataloader
and used exclusively by optimizer.
Class: dataloader
Header: <generators/dataloader.h>
Inheritance: notification, tools
Dataset Management Methods
| Signature | Description |
|---|---|
| | Reserves percentage percent of the dataset as the held-out test set. The remainder is used for k-fold training/validation. |
| | Partitions the training set into k folds. |
| | Returns the training subset for fold k. |
| | Returns the validation subset for fold k. |
| | Returns the held-out test set. |
| | Returns the inference dataset (label → graphs). |
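
The splitting scheme described in this table (reserve a test fraction, then partition the remainder into k folds) can be illustrated with a minimal standalone sketch. It does not use the dataloader class: `split_indices`, `make_split`, `training_of`, and `validation_of` are hypothetical names, and the shuffle-then-round-robin strategy is an assumption for illustration, not necessarily how the framework assigns graphs to folds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical container for the split; not the framework's own type.
struct split_indices {
    std::vector<std::size_t> test;                // held-out test set
    std::vector<std::vector<std::size_t>> folds;  // k disjoint folds of the remainder
};

// Shuffle the indices 0..n-1, reserve `percentage` percent for testing,
// then deal the remaining indices round-robin into k folds.
split_indices make_split(std::size_t n, double percentage, std::size_t k,
                         unsigned seed = 42) {
    assert(k > 0 && percentage >= 0.0 && percentage <= 100.0);
    std::vector<std::size_t> idx(n);
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::mt19937 rng(seed);
    std::shuffle(idx.begin(), idx.end(), rng);

    const std::size_t n_test = static_cast<std::size_t>(n * percentage / 100.0);
    split_indices out;
    out.test.assign(idx.begin(), idx.begin() + n_test);
    out.folds.resize(k);
    for (std::size_t i = n_test; i < n; ++i) {
        out.folds[(i - n_test) % k].push_back(idx[i]);
    }
    return out;
}

// The validation subset for fold f is the fold itself.
const std::vector<std::size_t>& validation_of(const split_indices& s, std::size_t f) {
    return s.folds.at(f);
}

// The training subset for fold f is every other fold, concatenated.
std::vector<std::size_t> training_of(const split_indices& s, std::size_t f) {
    std::vector<std::size_t> out;
    for (std::size_t j = 0; j < s.folds.size(); ++j) {
        if (j == f) continue;
        out.insert(out.end(), s.folds[j].begin(), s.folds[j].end());
    }
    return out;
}
```

In this sketch the test indices never appear in any fold, so a model selected through k-fold validation is still evaluated on data it has not seen.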
Batching Methods
| Signature | Description |
|---|---|
| | Builds a batched |
| | Deletes batched |
| | Returns num randomly sampled |
| | Moves the tensor data of gr to CPU memory (for serialisation). |
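
As a rough illustration of the random-sampling entry in this table, the sketch below draws `num` elements from a pool without replacement. The helper name `sample_random` is hypothetical, and sampling without replacement via shuffle-and-truncate is an assumption; the framework's method may sample differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: returns up to `num` distinct elements of `data`,
// chosen by shuffling a copy and keeping the first `num` entries.
template <typename T>
std::vector<T> sample_random(std::vector<T> data, std::size_t num, unsigned seed = 0) {
    std::mt19937 rng(seed);
    std::shuffle(data.begin(), data.end(), rng);
    if (num < data.size()) {
        data.erase(data.begin() + num, data.end());
    }
    assert(data.size() <= num);  // never returns more than requested
    return data;
}
```

Requesting more elements than the pool holds simply returns the whole pool in shuffled order, which avoids a failure mode when the dataset is smaller than expected.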
Device Transfer Methods
| Signature | Description |
|---|---|
| | Transfers the entire dataset to the device specified by op. |
| | Transfers to multiple devices simultaneously (one per kfold/GPU). |
| | Starts the background CUDA memory management thread (CUDA builds only). |
Cache / Restore Methods
| Signature | Description |
|---|---|
| | Serialises all graphs to HDF5 files in path using threads workers. Returns |
| | Deserialises graphs from HDF5 files at paths. |
| | Deserialises graphs from all HDF5 files in directory paths. |
| | Dumps the k-fold train/validation/test split indices to path. |
| | Restores the split indices from path. Returns |
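
The idea behind dumping and restoring split indices is that a later run can reuse exactly the same train/validation/test assignment. The sketch below shows that round trip with a plain text file. It is only a stand-in: `dump_indices`, `restore_indices`, the one-set-per-line format, and the `bool` return values are all assumptions made for illustration, and the framework's own on-disk format is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical representation: one vector of graph indices per subset.
using index_sets = std::vector<std::vector<std::size_t>>;

// Writes one index set per line, space-separated.
// Returns false if the file cannot be opened or written.
bool dump_indices(const std::string& path, const index_sets& sets) {
    assert(!path.empty());
    std::ofstream out(path);
    if (!out) return false;
    for (const auto& set : sets) {
        for (std::size_t i = 0; i < set.size(); ++i) {
            if (i) out << ' ';
            out << set[i];
        }
        out << '\n';
    }
    return static_cast<bool>(out);
}

// Reads the file written by dump_indices back into `sets`.
// Returns false if the file cannot be opened.
bool restore_indices(const std::string& path, index_sets& sets) {
    assert(!path.empty());
    std::ifstream in(path);
    if (!in) return false;
    sets.clear();
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream row(line);
        std::vector<std::size_t> set;
        std::size_t value;
        while (row >> value) set.push_back(value);
        sets.push_back(set);
    }
    return true;
}
```

Storing indices rather than the graphs themselves keeps the file small and lets the split be reapplied to graphs reloaded from the cache.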