spine.io.dataset.LArCVDataset

class spine.io.dataset.LArCVDataset(schema: Mapping[str, Mapping[str, Any]], dtype: str, overlay_methods: Mapping[str, str] | None = None, augment: Mapping[str, Any] | None = None, **kwargs: Any)

Torch dataset that parses LArCV entries into SPINE products.

The dataset wraps spine.io.read.LArCVReader and a parser schema. The schema maps output product names to parser configurations from spine.io.parse.larcv. During initialization, the dataset instantiates each parser, collects every LArCV tree key required by those parsers, and passes the union of those tree keys to the reader.
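The initialization steps described above can be illustrated with a minimal, self-contained sketch. It does not use the real SPINE classes: ToyParser, build_parsers, the tree_keys attribute, and the tree names are hypothetical stand-ins chosen only to show how a schema could be turned into parsers and a de-duplicated list of tree keys for the reader.

```python
from typing import Any, Dict, List, Mapping, Tuple


class ToyParser:
    """Hypothetical stand-in for a parser from spine.io.parse.larcv."""

    def __init__(self, **products: str) -> None:
        # Assumption: every keyword value names one LArCV tree to read
        self.tree_keys: List[str] = list(products.values())


def build_parsers(
    schema: Mapping[str, Mapping[str, Any]],
) -> Tuple[Dict[str, ToyParser], List[str]]:
    """Instantiate one parser per schema entry and collect the tree keys."""
    parsers: Dict[str, ToyParser] = {}
    tree_keys: List[str] = []
    for product, config in schema.items():
        config = dict(config)
        # Drop the entries that select the parser; the rest name its inputs
        config.pop("parser", None)
        config.pop("name", None)
        parser = ToyParser(**config)
        parsers[product] = parser
        # Keep the union of tree keys, preserving first-seen order
        for key in parser.tree_keys:
            if key not in tree_keys:
                tree_keys.append(key)
    return parsers, tree_keys


# Illustrative schema: product names, parser name and tree names are made up
schema = {
    "data": {"parser": "toy", "sparse_event": "tree_a"},
    "labels": {"parser": "toy", "sparse_event": "tree_b"},
    "clusters": {"parser": "toy", "cluster_event": "tree_c", "sparse_event": "tree_a"},
}
parsers, tree_keys = build_parsers(schema)
print(tree_keys)  # ['tree_a', 'tree_b', 'tree_c']
```

In this sketch, tree_a is requested by two parsers but appears only once in the list that would be handed to the reader.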

Each loaded entry is returned as a dictionary containing standard dataset metadata, such as index and source-file provenance fields, plus one parsed product per schema entry. Optional augmentation is applied after all parser products are produced.
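The order of operations for one entry can be sketched as follows. This is an illustration under stated assumptions, not the dataset's actual code: assemble_entry, the file_index metadata key, and the callables used as parsers and augmenter are hypothetical.

```python
from typing import Any, Callable, Dict, Mapping, Optional


def assemble_entry(
    reader_output: Mapping[str, Any],
    parsers: Mapping[str, Callable[[Mapping[str, Any]], Any]],
    augmenter: Optional[Callable[[Dict[str, Any]], Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Build one sample: metadata first, then parsed products, then augmentation."""
    # Metadata keys are assumptions standing in for the real provenance fields
    entry: Dict[str, Any] = {
        "index": reader_output["index"],
        "file_index": reader_output["file_index"],
    }
    # One parsed product per schema entry
    for product, parse in parsers.items():
        entry[product] = parse(reader_output)
    # Augmentation runs only after every product exists
    if augmenter is not None:
        entry = augmenter(entry)
    return entry


def reverse_data(entry: Dict[str, Any]) -> Dict[str, Any]:
    """Toy augmenter: reverse the 'data' product and leave the rest untouched."""
    out = dict(entry)
    out["data"] = list(reversed(out["data"]))
    return out


reader_output = {"index": 7, "file_index": 0, "tree_a": [1.0, 2.0, 3.0]}
toy_parsers = {
    "data": lambda out: list(out["tree_a"]),
    "count": lambda out: len(out["tree_a"]),
}
sample = assemble_entry(reader_output, toy_parsers, augmenter=reverse_data)
print(sample)
```

Because the augmenter receives the complete dictionary, it can transform several products consistently in one call.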

__init__(schema: Mapping[str, Mapping[str, Any]], dtype: str, overlay_methods: Mapping[str, str] | None = None, augment: Mapping[str, Any] | None = None, **kwargs: Any) → None

Instantiate the LArCV-backed dataset.

Parameters:
  • schema (mapping) – Mapping from output product name to parser configuration. Each parser configuration must identify a parser from spine.io.parse.larcv through its parser or name key and provide any parser-specific LArCV product names.

  • dtype (str) – Floating-point dtype forwarded to parser factories.

  • overlay_methods (mapping, optional) – Explicit overlay-method overrides for parser products.

  • augment (mapping, optional) – Augmentation configuration applied to each parsed sample.

  • **kwargs (Any) – Reader-specific keyword arguments forwarded to spine.io.read.LArCVReader, such as file_keys and entry-list filters.
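To make the split between dataset parameters and reader keyword arguments concrete, here is a small sketch that mirrors the constructor signature without importing SPINE. The function split_config, the parser name, the product names, and the file path are hypothetical; only the parameter names and file_keys come from the description above.

```python
from typing import Any, Dict, Mapping, Optional, Tuple


def split_config(
    schema: Mapping[str, Mapping[str, Any]],
    dtype: str,
    overlay_methods: Optional[Mapping[str, str]] = None,
    augment: Optional[Mapping[str, Any]] = None,
    **kwargs: Any,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate the named dataset arguments from the reader-bound remainder."""
    dataset_args = {
        "schema": schema,
        "dtype": dtype,
        "overlay_methods": overlay_methods,
        "augment": augment,
    }
    # Anything not named in the signature would be forwarded to the reader
    return dataset_args, dict(kwargs)


# Illustrative configuration: parser, product and file names are placeholders
config = {
    "schema": {"data": {"parser": "toy", "sparse_event": "tree_a"}},
    "dtype": "float32",
    "file_keys": ["/path/to/input.root"],
}
dataset_args, reader_args = split_config(**config)
print(reader_args)  # {'file_keys': ['/path/to/input.root']}
```

Keeping reader options in the same flat configuration lets one block configure both the parsing and the file access.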

Methods

__init__(schema, dtype[, overlay_methods, ...])

Instantiate the LArCV-backed dataset.

apply_augmenter(data)

Apply the configured augmenter, if present.

build_augmenter(augment)

Instantiate the configured augmenter, if any.

index_data_types()

Return the standard collate types for metadata keys.

index_overlay_methods()

Return the standard overlay methods for metadata keys.

list_data(file_path)

List top-level products available in an input LArCV file.

metadata_dict(data)

Extract standard dataset metadata from one reader output.

Attributes

data_keys

Return metadata and parser-product keys exposed by this dataset.

data_types

Return the collate type for metadata and parsed products.

name

overlay_methods

Return overlay methods for metadata and parsed products.

parsers

reader

augmenter

name: ClassVar[str] = 'larcv'
parsers: dict[str, Any]
reader: LArCVReader
property data_types: dict[str, str]

Return the collate type for metadata and parsed products.

Parser return types are consumed by spine.io.collate.CollateAll to batch products consistently.

property overlay_methods: dict[str, str]

Return overlay methods for metadata and parsed products.

Parser overlay metadata is consumed by spine.io.overlay.Overlayer when multiple entries are combined into one training sample.
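One plausible way to combine the three sources of overlay methods (metadata defaults, parser defaults, explicit overrides) is sketched below. The precedence order, the rejection of overrides for unknown products, and the method names "first" and "cat" are assumptions made for illustration; they are not taken from the SPINE implementation.

```python
from typing import Dict, Mapping, Optional


def merge_overlay_methods(
    index_methods: Mapping[str, str],
    parser_methods: Mapping[str, str],
    overrides: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Merge overlay methods, letting explicit overrides win for parser products."""
    methods = dict(index_methods)
    methods.update(parser_methods)
    for key, method in (overrides or {}).items():
        # Assumption: overrides may only target products defined by a parser
        if key not in parser_methods:
            raise KeyError(f"No parser product named '{key}' to override")
        methods[key] = method
    return methods


# Method names here are placeholders, not a list of supported values
index_methods = {"index": "first", "file_index": "first"}
parser_methods = {"data": "cat", "labels": "cat"}
merged = merge_overlay_methods(index_methods, parser_methods, {"labels": "first"})
print(merged)
```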

property data_keys: tuple[str, ...]

Return metadata and parser-product keys exposed by this dataset.

static list_data(file_path: str) → dict[str, list[str]]

List top-level products available in an input LArCV file.

Parameters:

file_path (str) – Path to one LArCV input file.

Returns:

Mapping from product category to available product names.

Return type:

dict
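The shape of the returned mapping can be illustrated without opening a file. The sketch below assumes that tree names follow a <category>_<name>_tree pattern; group_products and the example tree names are hypothetical and only show how a flat list of names could be grouped into a dict[str, list[str]].

```python
from typing import Dict, Iterable, List


def group_products(tree_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group '<category>_<name>_tree' strings by category."""
    suffix = "_tree"
    products: Dict[str, List[str]] = {}
    for tree in tree_names:
        # Skip anything that does not follow the assumed naming pattern
        if not tree.endswith(suffix):
            continue
        category, _, name = tree[: -len(suffix)].partition("_")
        if not name:
            continue
        products.setdefault(category, []).append(name)
    return products


# Example names are invented for illustration
listing = group_products(
    ["sparse3d_pcluster_tree", "sparse3d_reco_tree", "particle_corrected_tree", "unrelated"]
)
print(listing)
```

A listing of this form makes it easy to check which product names are available for each category before writing a schema.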