Encoding Base Class

Bases: ABC

Base class for hypervector encoding schemes.

An encoding defines how hypervectors are generated and how operations (similarity, bundling, binding) are performed on them.

bind(*hypervectors, batch_dim=None)

Bind multiple hypervectors, optionally in batches.

Parameters:

*hypervectors – Hypervector objects, raw arrays, or lists to bind
batch_dim – If provided with 3D+ array, split along this dimension for batching

Returns:

Single Hypervector (if not batched) or List of Hypervectors (if batched)
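The single versus batched return shapes can be illustrated with a small sketch. It assumes a binary encoding whose binding is element-wise XOR (the usual choice for binary spatter codes; the binding_fn of a given encoding may differ) and assumes that a list of lists is treated as a batch, mirroring the bundle examples. The helpers bind_xor and bind_maybe_batched are hypothetical, and plain Python lists stand in for Hypervector objects.

```python
from functools import reduce


def bind_xor(*hypervectors):
    """Bind binary vectors by element-wise XOR (assumed binding rule)."""
    return [reduce(lambda x, y: x ^ y, bits) for bits in zip(*hypervectors)]


def bind_maybe_batched(*hypervectors):
    """Hypothetical dispatcher: a list of lists of vectors means 'batched'."""
    if len(hypervectors) == 1:
        (arg,) = hypervectors
        if arg and isinstance(arg[0][0], list):
            # [[hv1, hv2], [hv3, hv4]] -> one bound vector per inner group
            return [bind_xor(*group) for group in arg]
        # [hv1, hv2, hv3] -> a single bound vector
        return bind_xor(*arg)
    return bind_xor(*hypervectors)


hv1, hv2, hv3, hv4 = [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]
single = bind_maybe_batched(hv1, hv2)                    # one vector
batched = bind_maybe_batched([[hv1, hv2], [hv3, hv4]])   # list of two vectors
```

The unbatched call yields one vector, while the nested-list call yields one bound vector per inner group.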

bundle(*hypervectors, batch_dim=None)

Bundle multiple hypervectors, optionally in batches.

Parameters:

*hypervectors – Hypervector objects, raw arrays, or lists to bundle
batch_dim – If provided with 3D+ array, split along this dimension for batching

Returns:

Single Hypervector (if not batched) or List of Hypervectors (if batched)

Examples

>>> # Single bundle (current behavior)
>>> bsc.bundle(hv1, hv2, hv3)  # Returns: Hypervector
>>> bsc.bundle([hv1, hv2, hv3])  # Returns: Hypervector

>>> # Batched bundles (new)
>>> bsc.bundle([[hv1, hv2], [hv3, hv4]])  # Returns: [bundled1, bundled2]
>>> bsc.bundle(array_3d, batch_dim=0)  # Returns: list of bundled hypervectors
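To make the bundling semantics concrete, here is a minimal sketch that assumes bundling is a per-element majority vote over binary vectors, with ties resolved to 1. Both the vote and the tie rule are assumptions for illustration; the encoding's actual bundling_fn may behave differently. bundle_majority is a hypothetical helper operating on plain lists.

```python
def bundle_majority(*hypervectors):
    """Bundle binary vectors by per-element majority vote (ties go to 1)."""
    n = len(hypervectors)
    return [1 if 2 * sum(bits) >= n else 0 for bits in zip(*hypervectors)]


hv1, hv2, hv3 = [1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 1]

# Single bundle: three inputs, one output vector.
bundled = bundle_majority(hv1, hv2, hv3)

# Batched bundle: each inner group is bundled independently.
groups = [[hv1, hv2, hv3], [hv3, hv3, hv1]]
bundled_batch = [bundle_majority(*group) for group in groups]
```

Each inner group in the batched form produces its own bundled vector, which matches the list-of-results shape described above.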

from_array(array: ndarray | torch.Tensor, backend: Literal['numpy', 'torch'] | None = None) → Hypervector: Create a Hypervector from an existing array.

generate(size=None, backend=None, device=None, use_generator=None)

Generate random hypervector(s).

Hypervectors are dimension-first. A scalar (or None) size produces a single (D,) hypervector; a tuple (D, N) produces a batch of N hypervectors of dimension D stored as columns of a (D, N) array (and likewise (D, N, M) for higher-rank batches).

Batched generation is defined as generating the N hypervectors one at a time and stacking them as columns, so under a fixed seed generate(size=(D, N)) yields exactly the same vectors as N successive generate(size=D) calls – including for ordered generators (LCG/LFSR/…).

Parameters:

size – None or int for a single (D,) vector; a tuple (D, *batch) for a batch of prod(batch) vectors of dimension D.
backend – Backend override (defaults to the encoding’s backend).
device – Device override for the torch backend.
use_generator – Whether to use the HDCGenerator pathway. Defaults to True if a custom generator was passed at construction, False otherwise (uses element_generator directly, which gives the correct per-encoding distribution).

Returns:

A new Hypervector.
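The column-stacking definition of batched generation can be sketched as follows. The example uses Python's random.Random as a stand-in generator and nested lists as a stand-in for a (D, N) array; generate_one and generate_batch are hypothetical helpers, not the library's functions.

```python
import random


def generate_one(rng, dimension):
    """Draw a single (D,) binary vector from the given generator."""
    return [rng.randrange(2) for _ in range(dimension)]


def generate_batch(rng, dimension, n):
    """Draw N vectors one at a time, then stack them as columns of (D, N)."""
    vectors = [generate_one(rng, dimension) for _ in range(n)]
    # Transpose: row d holds element d of every vector, so column j is vector j.
    return [list(row) for row in zip(*vectors)]


D, N, SEED = 8, 3, 1234

batch = generate_batch(random.Random(SEED), D, N)

rng = random.Random(SEED)
singles = [generate_one(rng, D) for _ in range(N)]

# Column j of the batch equals the j-th single draw under the same seed.
columns = [[row[j] for row in batch] for j in range(N)]
```

Because the batch is built from sequential single draws, a stateful or ordered generator advances identically in both paths.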

get_generator() → HDCGenerator: Get the current generator.

set_generator(generator: HDCGenerator) → None

Set a new generator for this encoding.

Parameters: generator – The new generator to use.
Raises: GeneratorNotSupportedError – If the generator doesn’t support the required output type.

similarity(hvA, hvB=None)

Compute similarity between hypervector(s).

Hypervectors are dimension-first (D, N). Calling conventions:

similarity(a, b) with two (D,) vectors -> a scalar score
similarity(A, B) with two (D, N) batches -> N per-column scores
similarity(v, B) with a vector and a (D, N) batch -> N scores
similarity(batch) with one (D, N) batch -> N-1 scores of column 0 against each remaining column
similarity([..], [..]) with two equal-length lists -> pairwise scores

Parameters:

hvA – First hypervector(s) (Hypervector, array, or list), or a single (D, N) batch when hvB is omitted.
hvB – Optional second hypervector(s).

Returns:

A scalar, a 1D array of scores, or a list of scores (for list inputs).

Examples

>>> bsc.similarity(hv1, hv2)                       # scalar
>>> bsc.similarity(codebook)                       # col 0 vs the rest
>>> bsc.similarity([hv1, hv2], [hv4, hv5])         # [sim(1,4), sim(2,5)]
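The dimension-first calling conventions can be spelled out in a short sketch. It assumes cosine similarity as the metric (the encoding's similarity_fn may be something else, such as a Hamming-based score) and represents a (D, N) batch as a list of D rows. All function names here are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two (D,) vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def column(batch, j):
    """Extract column j of a dimension-first (D, N) batch."""
    return [row[j] for row in batch]


def sim_batches(A, B):
    """(D, N) vs (D, N): one score per column pair."""
    return [cosine(column(A, j), column(B, j)) for j in range(len(A[0]))]


def sim_vector_batch(v, B):
    """(D,) vs (D, N): the vector against every column."""
    return [cosine(v, column(B, j)) for j in range(len(B[0]))]


def sim_within(batch):
    """Single (D, N) batch: column 0 against each remaining column."""
    return sim_vector_batch(column(batch, 0), [row[1:] for row in batch])


# A (4, 3) batch whose columns are [1,1,1,1], [1,1,-1,-1], [-1,-1,-1,-1].
codebook = [
    [1, 1, -1],
    [1, 1, -1],
    [1, -1, -1],
    [1, -1, -1],
]
scores = sim_within(codebook)  # N - 1 = 2 scores
```

Note that the single-batch form returns N-1 scores, not N, because column 0 is the query and is not compared with itself.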

thin(hypervector: ndarray | torch.Tensor | Hypervector | List) → Hypervector | List[Hypervector]

Apply thinning to hypervector(s).

Supports batching: if a list is provided, applies thinning independently to each hypervector in the list.

Parameters: hypervector – Hypervector object, raw array, or list of hypervectors to thin
Returns: Single Hypervector (if single input) or List of Hypervectors (if list input)

Examples

>>> # Single thinning
>>> bsc.thin(hv)  # Returns: Hypervector

>>> # Batched thinning
>>> bsc.thin([hv1, hv2, hv3])  # Returns: [thinned1, thinned2, thinned3]
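The single versus list dispatch can be sketched as below. The thinning rule used here (keep the k largest entries and zero the rest) is only a placeholder chosen for illustration; the actual rule is whatever the encoding's thinning_fn implements. Both function names and the parameter k are hypothetical.

```python
def thin_top_k(vector, k=2):
    """Placeholder thinning rule: keep the k largest entries, zero the rest."""
    order = sorted(range(len(vector)), key=lambda i: vector[i], reverse=True)
    keep = set(order[:k])
    return [x if i in keep else 0 for i, x in enumerate(vector)]


def thin(hypervector, k=2):
    """List of vectors in -> list out; single vector in -> single vector out."""
    if hypervector and isinstance(hypervector[0], list):
        return [thin_top_k(v, k) for v in hypervector]
    return thin_top_k(hypervector, k)


one = thin([3, 0, 5, 1])                      # a single thinned vector
many = thin([[3, 0, 5, 1], [2, 2, 0, 9]])     # one thinned vector per input
```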

unbind(*hypervectors, batch_dim=None)

Unbind hypervectors, optionally in batches.

Parameters:

*hypervectors – Hypervector objects, raw arrays, or lists to unbind
batch_dim – If provided with 3D+ array, split along this dimension for batching

Returns:

Single Hypervector (if not batched) or List of Hypervectors (if batched)
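As a small illustration of what unbinding recovers, the sketch below assumes XOR binding on binary vectors, where binding is its own inverse. Other encodings use different binding and unbinding functions, and some do not support unbinding at all. The xor helper is hypothetical.

```python
def xor(a, b):
    """Element-wise XOR of two binary vectors."""
    return [x ^ y for x, y in zip(a, b)]


role = [1, 0, 1, 1, 0, 0]
filler = [0, 1, 1, 0, 1, 0]

bound = xor(role, filler)      # plays the part of bind(role, filler)
recovered = xor(bound, role)   # plays the part of unbind(bound, role)
```

Under this assumption the recovered vector equals the original filler exactly; encodings with approximate unbinding would only recover a vector similar to it.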

zeros(size: int | Tuple[int, ...] = None, backend: Literal['numpy', 'torch'] | None = None, device: str | torch.device | None = None) → Hypervector: Generate zero hypervector(s).

Constructor parameters

All encoding classes share this constructor signature:

Encoding(
    dimension=10_000,
    backend=None,
    device=None,
    dtype=None,
    mask=None,
    generator=None,
    similarity_remap=None,
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `dimension` | `int` | `10_000` | Number of elements per hypervector. |
| `backend` | `str` or `None` | `None` | `"numpy"` or `"torch"`. `None` inherits the global default (see `prefer_torch()` / `prefer_numpy()`), which is `"numpy"` unless changed. |
| `device` | `str` or `None` | `None` | PyTorch device string (`"cpu"`, `"cuda"`, `"cuda:1"`, …). Only meaningful when `backend="torch"`. |
| `dtype` | dtype or `None` | `None` | Override the encoding’s default data type. If `None`, uses the type specified by `EncodingSpec.dtype`. |
| `mask` | `int` or `None` | `None` | Bit mask for `MAP_I_Bits`; sets the integer bit width. Ignored by all other encodings. |
| `generator` | `HDCGenerator` or `None` | `None` | Custom random generator. If `None`, uses a `DefaultGenerator` backed by NumPy. |
| `similarity_remap` | `callable` or `None` | `None` | Function applied to every similarity result. E.g., `remap_to_unit()` to map [-1,1] → [0,1]. |
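The effect of similarity_remap can be sketched as a wrapper around a similarity function. The linear map (s + 1) / 2 is an assumption about how a [-1, 1] to [0, 1] remap would typically be written, not a statement of how remap_to_unit() is implemented; remap_to_unit_sketch, with_remap, and normalized_dot are hypothetical names.

```python
def remap_to_unit_sketch(score):
    """Assumed linear map from [-1, 1] to [0, 1]."""
    return (score + 1.0) / 2.0


def with_remap(similarity_fn, remap=None):
    """Wrap a similarity function so every result passes through remap."""
    if remap is None:
        return similarity_fn
    return lambda a, b: remap(similarity_fn(a, b))


def normalized_dot(a, b):
    """Dot product of bipolar vectors divided by dimension, in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


sim = with_remap(normalized_dot, remap_to_unit_sketch)
same = sim([1, -1, 1, -1], [1, -1, 1, -1])       # raw score 1.0
opposite = sim([1, -1, 1, -1], [-1, 1, -1, 1])   # raw score -1.0
```

With the remap in place, identical vectors score 1.0 and opposite vectors score 0.0, so downstream thresholds can be written on a unit scale.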

Properties

property Encoding.dimension: int: Number of elements per hypervector.

property Encoding.backend: str: "numpy" or "torch".

property Encoding.device: str or None: PyTorch device string, or None for the NumPy backend.

Methods

Encoding.generate(size=None, backend=None, device=None, use_generator=None)

Generate one or more hypervectors.

Parameters:

size – None -> single (D,) vector; int -> a single vector of that dimension; tuple (D, N) -> a dimension-first batch of N vectors (each column a hypervector).
backend – Override the encoding’s default backend for this call.
device – Override the encoding’s default device for this call.
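use_generator – True forces the HDCGenerator pathway; False uses the encoding’s element_generator directly; None (the default) resolves to True if a custom generator was passed at construction and to False otherwise.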

Returns:

Hypervector

Encoding.zeros(size=None, backend=None, device=None)

Return a zero-valued hypervector or batch.

Returns: Hypervector

Encoding.from_array(array, backend=None)

Wrap an existing NumPy array or PyTorch tensor as a Hypervector.

Parameters:

array – ndarray or Tensor with last dimension equal to self.dimension.
backend – Override backend detection.

Returns:

Hypervector

Raises:

DimensionsNotMatchingError – If the array’s last dimension ≠ self.dimension.
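The dimension check described here can be sketched without any array library. The exception class below is a local stand-in that only shares its name with the library's exception (its base class is an assumption), and shape_of and check_dimension are hypothetical helpers working on nested lists.

```python
class DimensionsNotMatchingError(ValueError):
    """Local stand-in for the library's exception of the same name."""


def shape_of(array):
    """Shape of a rectangular nested list, e.g. [[1, 2, 3]] -> (1, 3)."""
    shape = []
    while isinstance(array, list):
        shape.append(len(array))
        array = array[0] if array else None
    return tuple(shape)


def check_dimension(array, dimension):
    """Raise if the last axis of `array` does not have length `dimension`."""
    shape = shape_of(array)
    if not shape or shape[-1] != dimension:
        raise DimensionsNotMatchingError(
            f"expected last dimension {dimension}, got shape {shape}"
        )
    return shape


ok_shape = check_dimension([[0, 1, 1, 0], [1, 1, 0, 0]], dimension=4)

try:
    check_dimension([0, 1, 1], dimension=4)
    mismatch_raised = False
except DimensionsNotMatchingError:
    mismatch_raised = True
```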

Encoding.similarity(hvA, hvB=None)

Compute similarity. Accepts Hypervector objects, raw arrays, or lists. If hvB is omitted, hvA must be a (D, N) batch and column 0 is compared against each remaining column. The full set of calling conventions, including the batched forms, is listed in the similarity() description earlier in this reference.

Returns: float, ndarray, Tensor, or list depending on inputs.

Encoding.bundle(*hypervectors, batch_dim=None)

Bundle hypervectors.

Parameters:

hypervectors – Positional Hypervector arguments, raw arrays, or a list of lists for batched bundling.
batch_dim – For 3-D (or higher) array inputs, the axis along which the array is split into batches.

Returns:

Hypervector or list[Hypervector]

Encoding.bind(*hypervectors, batch_dim=None)

Bind hypervectors.

Returns: Hypervector or list[Hypervector]

Encoding.unbind(*hypervectors, batch_dim=None)

Unbind to recover a component.

Returns: Hypervector or list[Hypervector]
Raises: NotImplementedError – For encodings that do not support unbinding.

Encoding.thin(hypervector)

Apply the encoding’s thinning operation.

Returns: Hypervector or list[Hypervector]

Encoding.set_generator(generator)

Replace the encoding’s generator.

Parameters: generator – An HDCGenerator instance.

Encoding.get_generator()

Return the current generator.

Returns: HDCGenerator

Abstract method (for subclassers)

Encoding._get_encoding_spec()

Return an EncodingSpec that wires together the component functions for this encoding.

Returns: EncodingSpec
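The subclassing pattern implied by this abstract method can be sketched with Python's abc module. EncodingSketch and XorEncodingSketch are hypothetical stand-ins, a plain dict takes the place of EncodingSpec to keep the example short, and the way the base class stores and uses the spec is an assumption rather than the library's implementation.

```python
from abc import ABC, abstractmethod


class EncodingSketch(ABC):
    """Stand-in for the Encoding base class; not the library's implementation."""

    def __init__(self, dimension=10_000):
        self.dimension = dimension
        # The base class asks the subclass for its component functions.
        self.spec = self._get_encoding_spec()

    @abstractmethod
    def _get_encoding_spec(self):
        """Return the object that wires the encoding's components together."""

    def bind(self, a, b):
        return self.spec["binding_fn"](a, b)


class XorEncodingSketch(EncodingSketch):
    """Hypothetical subclass whose binding is element-wise XOR."""

    def _get_encoding_spec(self):
        # A plain dict stands in for EncodingSpec in this sketch.
        return {"binding_fn": lambda a, b: [x ^ y for x, y in zip(a, b)]}


enc = XorEncodingSketch(dimension=4)
bound = enc.bind([0, 1, 1, 0], [1, 1, 0, 0])

try:
    EncodingSketch()  # abstract: cannot be instantiated directly
    base_instantiable = True
except TypeError:
    base_instantiable = False
```

The base class owns the public operations and delegates to whatever the subclass returns, so a new encoding only has to supply its component functions.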

EncodingSpec dataclass

class pyhdc.EncodingSpec

Specification dataclass that links an encoding to its component functions.

| Field | Description |
| --- | --- |
| `dtype` | NumPy data type for elements (e.g., `np.float32`, `np.int8`) |
| `element_generator` | Callable producing random element values given `(size, dtype)` |
| `similarity_fn` | Callable implementing the similarity metric |
| `bundling_fn` | Callable implementing bundling |
| `thinning_fn` | Callable implementing thinning (or `NoThin` if not applicable) |
| `binding_fn` | Callable implementing binding |
| `unbinding_fn` | Callable implementing unbinding |
| `mask` | Optional integer bit mask (used by `MAP_I_Bits`) |
| `generator_output_type` | `"bits"`, `"words"`, or `"floats"`: the output type this encoding requires from a custom generator |
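A dataclass with these fields might look roughly like the sketch below. The field names follow the table; the type annotations, the defaults for mask and generator_output_type, the frozen setting, and every component function used to fill it in are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class EncodingSpecSketch:
    """Field names follow the EncodingSpec table; the types are assumptions."""

    dtype: Any
    element_generator: Callable[..., Any]
    similarity_fn: Callable[..., Any]
    bundling_fn: Callable[..., Any]
    thinning_fn: Callable[..., Any]
    binding_fn: Callable[..., Any]
    unbinding_fn: Callable[..., Any]
    mask: Optional[int] = None
    generator_output_type: str = "bits"


def xor(a, b):
    """Element-wise XOR of two binary vectors."""
    return [x ^ y for x, y in zip(a, b)]


spec = EncodingSpecSketch(
    dtype=int,
    element_generator=lambda size, dtype: [dtype(0)] * size,  # placeholder
    similarity_fn=lambda a, b: sum(x == y for x, y in zip(a, b)) / len(a),
    bundling_fn=lambda *hvs: [int(2 * sum(b) >= len(hvs)) for b in zip(*hvs)],
    thinning_fn=lambda hv: hv,  # stands in for "no thinning"
    binding_fn=xor,
    unbinding_fn=xor,  # XOR is its own inverse
)
```

Grouping the callables in one immutable object keeps an encoding's behaviour in a single place that the base class can query.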

BackendManager

class pyhdc.BackendManager

Static utility for backend detection and conversion.

static get_backend(array): Return "numpy" or "torch" for the given array.

static to_numpy(array): Convert to numpy.ndarray. Detaches from autograd if needed.

static to_torch(array, device=None): Convert to torch.Tensor on the specified device.

static get_device(array): Return the device string of a tensor, or None for NumPy arrays.