spine.io.factories
Functions that instantiate IO tools from configuration blocks.
Functions
| collate_factory | Instantiate a collate function from configuration. |
| dataset_factory | Instantiate a dataset from configuration. |
| loader_factory | Instantiate a PyTorch DataLoader from configuration. |
| reader_factory | Instantiate a reader from a configuration block. |
| sampler_factory | Instantiate a sampler from configuration. |
| writer_factory | Instantiate a writer from a configuration block. |
- spine.io.factories.reader_factory(reader_cfg: Mapping[str, Any] | str) -> Any
Instantiate a reader from a configuration block.
The configured name must match a reader class exported from spine.io.read.
- Parameters:
reader_cfg (Mapping[str, Any] or str) – Reader configuration mapping or the short reader name.
- Returns:
Instantiated reader object.
- Return type:
object
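The name-based lookup described above can be sketched as follows. This is a minimal illustration, not the spine implementation: `ToyReader`, the `READERS` registry, `make_reader`, and the `file_keys` argument are all hypothetical stand-ins, and the assumption that the remaining configuration keys are passed to the class constructor is not confirmed by this page.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


class ToyReader:
    """Hypothetical stand-in for a reader class; not part of spine."""

    def __init__(self, file_keys: list[str] | None = None) -> None:
        self.file_keys = file_keys or []


# Hypothetical registry standing in for the classes a reader module exports.
READERS: dict[str, type] = {"toy": ToyReader}


def make_reader(reader_cfg: Mapping[str, Any] | str) -> Any:
    """Build a reader from a config mapping or a bare name (illustrative only)."""
    if isinstance(reader_cfg, str):
        # A bare string is treated as the name, with no extra arguments.
        name, kwargs = reader_cfg, {}
    else:
        # Copy so that the caller's configuration is not mutated.
        kwargs = dict(reader_cfg)
        name = kwargs.pop("name")
    if name not in READERS:
        raise ValueError(f"Unknown reader name: {name!r}")
    # Assumption: the remaining keys become constructor keyword arguments.
    return READERS[name](**kwargs)


reader = make_reader({"name": "toy", "file_keys": ["a.h5", "b.h5"]})
default_reader = make_reader("toy")
```

Both call forms produce the same kind of object; the string form simply leaves every constructor argument at its default.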
- spine.io.factories.writer_factory(writer_cfg: Mapping[str, Any] | str, prefix: str | list[str] | None = None, split: bool = False) -> Any
Instantiate a writer from a configuration block.
The configured name must match a writer class exported from spine.io.write.
- Parameters:
writer_cfg (Mapping[str, Any] or str) – Writer configuration mapping or the short writer name.
prefix (str or list[str], optional) – Input file prefix or per-file list of prefixes used to derive output names when the writer supports prefix-based naming.
split (bool, default False) – Request one output file per input file. Writers that do not support unsplit output may reject split=False explicitly.
- Returns:
Instantiated writer object.
- Return type:
object
- spine.io.factories.loader_factory(dataset: Mapping[str, Any] | str, dtype: str, batch_size: int | None = None, minibatch_size: int | None = None, shuffle: bool = True, sampler: Mapping[str, Any] | str | None = None, num_workers: int = 0, collate_fn: Mapping[str, Any] | str | None = None, entry_list: list[int] | None = None, distributed: bool = False, world_size: int = 0, rank: int | None = None, **kwargs: Any) -> Any
Instantiate a PyTorch DataLoader from configuration.
- Parameters:
dataset (mapping or str) – Dataset configuration mapping or short dataset name.
dtype (str) – Floating-point dtype passed to the dataset factory.
batch_size (int, optional) – Global batch size. Mutually exclusive with minibatch_size.
minibatch_size (int, optional) – Per-process batch size. Mutually exclusive with batch_size.
shuffle (bool, default True) – Whether to shuffle batches in the underlying loader.
sampler (mapping or str, optional) – Sampler configuration mapping or short sampler name.
num_workers (int, default 0) – Number of loader worker processes.
collate_fn (mapping or str, optional) – Collate function configuration mapping or short collate name.
entry_list (list[int], optional) – Explicit subset of dataset entries to expose.
distributed (bool, default False) – If True, wrap the sampler for distributed loading.
world_size (int, default 0) – Number of distributed processes/devices.
rank (int, optional) – Distributed process rank. Required when distributed=True.
**kwargs (dict) – Extra keyword arguments forwarded to torch.utils.data.DataLoader.
- Returns:
Instantiated data loader.
- Return type:
torch.utils.data.DataLoader
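The relationship between batch_size (global) and minibatch_size (per process) can be illustrated with a small helper. This is a sketch under stated assumptions rather than the library's logic: `resolve_minibatch_size` is a hypothetical name, the global size is assumed to be split evenly across processes, exactly one of the two options is assumed to be required, and world_size=0 is assumed to mean a single process.

```python
from __future__ import annotations


def resolve_minibatch_size(
    batch_size: int | None = None,
    minibatch_size: int | None = None,
    world_size: int = 0,
) -> int:
    """Return a per-process batch size from exactly one of the two options."""
    # The two options are mutually exclusive; this sketch also rejects
    # the case where neither is given (an assumption).
    if (batch_size is None) == (minibatch_size is None):
        raise ValueError("Provide exactly one of batch_size or minibatch_size.")
    if minibatch_size is not None:
        return minibatch_size
    # Assumption: world_size=0 is treated as a single process.
    num_procs = max(world_size, 1)
    if batch_size % num_procs != 0:
        raise ValueError("batch_size must be divisible by the number of processes.")
    # Assumption: the global batch is divided evenly across processes.
    return batch_size // num_procs
```

Under these assumptions, a global batch of 32 across 4 processes gives each process 8 entries, the same result as passing minibatch_size=8 directly.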
- spine.io.factories.dataset_factory(dataset_cfg: Mapping[str, Any] | str, entry_list: list[int] | None = None, dtype: str | None = None) -> Any
Instantiate a dataset from configuration.
- Parameters:
dataset_cfg (Mapping[str, Any] or str) – Dataset configuration mapping or short dataset name.
entry_list (list[int], optional) – Explicit subset of dataset entries to expose. When provided here, it overrides any entry_list already present in dataset_cfg.
dtype (str, optional) – Floating-point dtype forwarded to the dataset constructor.
- Returns:
Instantiated dataset object.
- Return type:
object
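The override rule for entry_list can be sketched with plain dictionaries. The helper below is hypothetical (`merge_dataset_cfg` is not a spine function), and it assumes that a short dataset name expands to a mapping with a `name` key and that dtype is stored under a `dtype` key.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def merge_dataset_cfg(
    dataset_cfg: Mapping[str, Any] | str,
    entry_list: list[int] | None = None,
    dtype: str | None = None,
) -> dict[str, Any]:
    """Return the effective dataset configuration (illustrative only)."""
    if isinstance(dataset_cfg, str):
        # Assumption: a bare string is shorthand for {"name": <string>}.
        cfg: dict[str, Any] = {"name": dataset_cfg}
    else:
        # Copy so that the caller's configuration is left untouched.
        cfg = dict(dataset_cfg)
    if entry_list is not None:
        # The explicit argument wins over any entry_list in the mapping.
        cfg["entry_list"] = list(entry_list)
    if dtype is not None:
        cfg["dtype"] = dtype
    return cfg


base_cfg = {"name": "toy", "entry_list": [0, 1, 2]}
overridden = merge_dataset_cfg(base_cfg, entry_list=[5, 6], dtype="float32")
untouched = merge_dataset_cfg(base_cfg)
```

Leaving the argument as None keeps whatever entry_list the mapping already carries, so the override only applies when a list is passed explicitly.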
- spine.io.factories.sampler_factory(sampler_cfg: Mapping[str, Any] | str, dataset: Any, minibatch_size: int, distributed: bool = False, num_replicas: int = 1, rank: int | None = None) -> Any
Instantiate a sampler from configuration.
- Parameters:
sampler_cfg (mapping or str) – Sampler configuration mapping or short sampler name.
dataset (object) – Dataset instance used to initialize the sampler.
minibatch_size (int) – Per-process batch size passed to the sampler.
distributed (bool, default False) – If True, wrap the sampler in DistributedProxySampler.
num_replicas (int, default 1) – Number of distributed processes/devices.
rank (int, optional) – Distributed process rank. Required when distributed=True.
- Returns:
Instantiated sampler object, optionally wrapped for distributed loading.
- Return type:
object
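To show why rank is required when distributed=True, the sketch below splits a sampler's indices across processes by interleaved striding. This is one common scheme and is used here purely as an assumption; this page does not describe how DistributedProxySampler actually partitions entries, and `shard_indices` is a hypothetical helper.

```python
from __future__ import annotations

from collections.abc import Iterable


def shard_indices(
    indices: Iterable[int],
    distributed: bool = False,
    num_replicas: int = 1,
    rank: int | None = None,
) -> list[int]:
    """Return the indices one process would see (illustrative striding scheme)."""
    all_indices = list(indices)
    if not distributed:
        # Single-process case: every index is visible.
        return all_indices
    if rank is None:
        raise ValueError("rank is required when distributed=True.")
    if not 0 <= rank < num_replicas:
        raise ValueError("rank must lie in [0, num_replicas).")
    # Assumption: process `rank` takes every num_replicas-th index.
    return all_indices[rank::num_replicas]
```

With this scheme, the shards of different ranks are disjoint and together cover every index, which is the property a distributed wrapper generally needs.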
- spine.io.factories.collate_factory(collate_cfg: Mapping[str, Any] | str, data_types: Mapping[str, str], overlay_methods: Mapping[str, str]) -> Any
Instantiate a collate function from configuration.
- Parameters:
collate_cfg (Mapping[str, Any] or str) – Collate configuration mapping or short collate function name.
data_types (Mapping[str, str]) – Mapping from parser output keys to their declared data type.
overlay_methods (Mapping[str, str]) – Mapping from parser output keys to the overlay method used when combining data from multiple sources.
- Returns:
Instantiated collate callable.
- Return type:
collections.abc.Callable