spine.data.batch.IndexBatch
- class spine.data.batch.IndexBatch(data: Any | Sequence[Any], spans: Sequence[int] | Any, counts: Sequence[int] | Any | None = None, single_counts: Sequence[int] | Any | None = None, batch_ids: Sequence[int] | Any | None = None, batch_size: int | None = None, default: Any | None = None)[source]
Batched index with the necessary methods to slice it.
- spans
(B) Per-entry parent spans used to build the batch offsets. This is the same quantity as the parser-side span and may be required when serializing unwrapped indexes for later rebatching.
- Type:
Union[np.ndarray, torch.Tensor]
- offsets
(B) Offsets between successive indexes in the batch, computed from the cumulative sum of spans.
- Type:
Union[np.ndarray, torch.Tensor]
- single_counts
(I) Number of index elements per index in the index list. This is the same as counts if the underlying data is a single index.
- Type:
Union[np.ndarray, torch.Tensor]
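The relation between spans and offsets can be illustrated with a minimal NumPy sketch. This is not taken from the library source: the span values are made up, and it assumes the first entry starts at offset zero (an exclusive cumulative sum), which the description above does not state explicitly.

```python
import numpy as np

# Hypothetical spans: size of the parent object each of the three
# batch entries indexes into.
spans = np.array([5, 3, 4])

# Assumption: the first entry starts at offset 0, so each offset is the
# sum of the spans of all preceding entries.
offsets = np.zeros(len(spans), dtype=spans.dtype)
offsets[1:] = np.cumsum(spans)[:-1]
# offsets is [0, 5, 8]
```

Under this assumption, adding offsets[b] to the indexes of entry b would turn per-entry indexes into indexes that are valid in the concatenated batch.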
- Attributes:
batch_ids – Returns the batch ID of each index in the list.
full_batch_ids – Returns the batch ID of each element in the full index list.
full_counts – Returns the total number of elements in each batch entry.
full_index – Returns the index combining all sub-indexes, if relevant.
index – Alias for the underlying data stored.
index_ids – Returns the ID of the index in the list each element belongs to.
index_list – Alias for the underlying data list stored.
shape – Shape of the underlying data.
splits – Boundaries needed to split the data into its constituents.
Methods
get_counts(batch_ids, batch_size) – Finds the number of elements in each entry, provided a batch ID list.
get_edges(counts) – Finds the edges between successive entries in the batch.
merge(index_batch) – Merge this index batch with another.
split() – Breaks up the index batch into its constituents.
to_numpy() – Cast underlying index to a np.ndarray and return a new instance.
to_tensor([dtype, device]) – Cast underlying index to a torch.Tensor and return a new instance.
- __init__(data: Any | Sequence[Any], spans: Sequence[int] | Any, counts: Sequence[int] | Any | None = None, single_counts: Sequence[int] | Any | None = None, batch_ids: Sequence[int] | Any | None = None, batch_size: int | None = None, default: Any | None = None) None[source]
Initialize the attributes of the class.
- Parameters:
data (Union[np.ndarray, torch.Tensor, List[Union[np.ndarray, torch.Tensor]]]) – Simple batched index or list of indexes
spans (Union[List[int], np.ndarray, torch.Tensor]) – Per-entry parent spans used to derive offsets.
counts (Union[List[int], np.ndarray, torch.Tensor], optional) – Number of indexes in the batch
single_counts (Union[List[int], np.ndarray, torch.Tensor], optional) – (I) Number of index elements per index in the index list. This is the same as counts if the underlying data is a single index.
batch_ids (Union[List[int], np.ndarray, torch.Tensor], optional) – (I) Batch index of each of the clusters. If not specified, the assumption is that each count corresponds to a specific entry.
batch_size (int, optional) – Number of entries in the batch. Must be specified together with batch_ids.
default (Union[np.ndarray, torch.Tensor], optional) – Empty-index prototype used when initializing an empty index list
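To see why batch_size has to accompany batch_ids, consider the following NumPy sketch of how per-entry counts and edges could be derived from a batch ID list. It is an illustration only, not the library's get_counts or get_edges implementation; the input values are made up, and the convention that the edges start at zero and have one more element than the number of entries is an assumption.

```python
import numpy as np

# Hypothetical batch IDs: entries 1 and 3 contain nothing.
batch_ids = np.array([0, 0, 2, 2, 2])
batch_size = 4

# Without minlength, the trailing empty entry (ID 3) would be dropped,
# since it cannot be inferred from the batch IDs alone.
counts = np.bincount(batch_ids, minlength=batch_size)
# counts is [2, 0, 3, 0]

# Assumed convention: edges[b] and edges[b + 1] bound entry b.
edges = np.concatenate(([0], np.cumsum(counts)))
# edges is [0, 2, 2, 5, 5]
```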
- data: Any | Sequence[Any]
- counts: Any
- single_counts: Any
- edges: Any
- spans: Any
- offsets: Any
- batch_size: int
- property index: Any
Alias for the underlying data stored.
- Returns:
Underlying index
- Return type:
Union[np.ndarray, torch.Tensor]
- property index_list: Sequence[Any]
Alias for the underlying data list stored.
- Returns:
Underlying index list
- Return type:
List[Union[np.ndarray, torch.Tensor]]
- property full_index: Any
Returns the index combining all sub-indexes, if relevant.
- Returns:
Complete concatenated index
- Return type:
Union[np.ndarray, torch.Tensor]
- property index_ids: Any
Returns the ID of the index in the list each element belongs to.
- Returns:
List of index IDs for each element
- Return type:
Union[np.ndarray, torch.Tensor]
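For a list of indexes, the relation between index_list, single_counts, full_index and index_ids can be sketched with plain NumPy. The snippet below does not use the class itself; the index values are invented and the variable names simply mirror the attributes described above.

```python
import numpy as np

# Hypothetical list of three indexes.
index_list = [np.array([0, 1, 2]), np.array([5, 6]), np.array([8, 9, 10, 11])]

# Number of elements in each index of the list.
single_counts = np.array([len(idx) for idx in index_list])
# single_counts is [3, 2, 4]

# Concatenation of all sub-indexes.
full_index = np.concatenate(index_list)
# full_index is [0, 1, 2, 5, 6, 8, 9, 10, 11]

# Position in the list of the index each element came from.
index_ids = np.repeat(np.arange(len(index_list)), single_counts)
# index_ids is [0, 0, 0, 1, 1, 2, 2, 2, 2]
```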
- property full_counts: Any
Returns the total number of elements in each batch entry.
- Returns:
Number of elements in each batch entry
- Return type:
Union[np.ndarray, torch.Tensor]
- property batch_ids: Any
Returns the batch ID of each index in the list.
- Returns:
Batch ID array, one per index in the list
- Return type:
Union[np.ndarray, torch.Tensor]
- property full_batch_ids: Any
Returns the batch ID of each element in the full index list.
- Returns:
Complete batch ID array, one per element
- Return type:
Union[np.ndarray, torch.Tensor]
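The difference between the per-index and per-element quantities can be shown with a short NumPy sketch. It is a hedged illustration with made-up values, not the library's code; the variable names mirror the properties described above.

```python
import numpy as np

# Hypothetical batch of two entries holding three indexes in total:
# the first two indexes belong to entry 0, the last one to entry 1.
batch_ids = np.array([0, 0, 1])
single_counts = np.array([3, 2, 4])  # elements in each index
batch_size = 2

# One batch ID per element rather than per index.
full_batch_ids = np.repeat(batch_ids, single_counts)
# full_batch_ids is [0, 0, 0, 0, 0, 1, 1, 1, 1]

# Total number of elements in each batch entry.
full_counts = np.bincount(full_batch_ids, minlength=batch_size)
# full_counts is [5, 4]
```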
- split() list[Any] | list[list[Any]][source]
Breaks up the index batch into its constituents.
- Returns:
List of list of indexes per entry in the batch
- Return type:
List[List[Union[np.ndarray, torch.Tensor]]]
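The list-of-lists return shape can be pictured with the following sketch, which groups a flat list of indexes by batch entry using per-entry counts. It is an assumption-laden illustration with invented values; it ignores any offset handling the real split() may perform.

```python
import numpy as np

# Hypothetical flat list of three indexes and the number of indexes
# belonging to each of the two batch entries.
index_list = [np.array([0, 1, 2]), np.array([5, 6]), np.array([8, 9, 10, 11])]
counts = np.array([2, 1])

# Boundaries of each entry within the flat list.
bounds = np.concatenate(([0], np.cumsum(counts)))

# One list of indexes per batch entry.
per_entry = [index_list[bounds[i]:bounds[i + 1]] for i in range(len(counts))]
# per_entry[0] holds the first two indexes, per_entry[1] the last one
```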
- merge(index_batch: IndexBatch) IndexBatch[source]
Merge this index batch with another.
- Parameters:
index_batch (IndexBatch) – Other index batch object to merge with
- Returns:
Merged index batch
- Return type:
IndexBatch
- to_numpy() IndexBatch[source]
Cast underlying index to a np.ndarray and return a new instance.
- Returns:
New IndexBatch object with an underlying np.ndarray index.
- Return type:
IndexBatch
- to_tensor(dtype: Any = None, device: Any = None) IndexBatch[source]
Cast underlying index to a torch.Tensor and return a new instance.
- Parameters:
dtype (torch.dtype, optional) – Data type of the tensor to create
device (torch.device, optional) – Device on which to put the tensor
- Returns:
New IndexBatch object with an underlying torch.Tensor index.
- Return type:
IndexBatch