Encoding Base Class

Bases: ABC

Base class for hypervector encoding schemes.

An encoding defines how hypervectors are generated and how operations (similarity, bundling, binding) are performed on them.

bind(*hypervectors, batch_dim=None)

Bind hypervectors, batching automatically when the input is batched.

A single (D, N) (or higher-rank) batch is bound in one call: the element-wise binders (MAP multiply, BSC XOR, FHRR angle addition) broadcast natively, and the others (convolution, shifting, matrix, VTB, CDT) are applied per column internally, so the result is one batched Hypervector. batch_dim is no longer required.

Parameters:

*hypervectors – Hypervector objects, raw arrays, or lists to bind.
batch_dim – Deprecated. Splits a 3D+ array along this axis and returns a list of results; pass a batched array directly instead.

Returns:

A single Hypervector, or a list of Hypervectors for the deprecated batch_dim / nested-list forms.
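
The two batching paths described above can be illustrated in plain NumPy. This is only a sketch under stated assumptions: it does not call the library, and the helper name circ_conv is hypothetical. Element-wise binders operate on a whole (D, N) array at once, while a non-element-wise binder (circular convolution here) is applied column by column and the results are stacked back into a (D, N) array.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N = 8, 3

# MAP-style bipolar batches: each column is one hypervector.
a = rng.choice([-1, 1], size=(D, N))
b = rng.choice([-1, 1], size=(D, N))
bound_map = a * b                          # element-wise multiply, shape (D, N)

# BSC-style binary batches bound with XOR.
p = rng.integers(0, 2, size=(D, N), dtype=np.uint8)
q = rng.integers(0, 2, size=(D, N), dtype=np.uint8)
bound_bsc = np.bitwise_xor(p, q)           # shape (D, N)

# A single (D,) vector broadcasts against a batch once it has a column axis.
key = rng.choice([-1, 1], size=D)
bound_key = key[:, None] * a               # shape (D, N)


def circ_conv(x, y):
    """Circular convolution of two (D,) vectors (hypothetical helper)."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(y)))


# Non-element-wise binder: apply per column, then restack as (D, N).
x = rng.standard_normal((D, N))
y = rng.standard_normal((D, N))
bound_conv = np.stack(
    [circ_conv(x[:, j], y[:, j]) for j in range(N)], axis=1
)
```

In both paths the result keeps the dimension-first layout, so column j of the output is the binding of column j of each input.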

bundle(*hypervectors, axis=None, batch_dim=None)

Bundle multiple hypervectors, optionally in batches.

Parameters:

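*hypervectors – Hypervector objects, raw arrays, or lists to bundle.
axis – Batch axis (or tuple of axes) to reduce for a batched (D, *batch) input; defaults to the last. Axis 0 is the hypervector dimension and cannot be reduced.
batch_dim – Deprecated as of 2.1.0. If provided with a 3D+ array, splits along this dimension and returns a list of results; pass a batched array or use axis= instead.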

Returns:

Single Hypervector (if not batched) or List of Hypervectors (if batched)

Examples

>>> # Single bundle
>>> bsc.bundle(hv1, hv2, hv3)  # Returns: Hypervector
>>> bsc.bundle([hv1, hv2, hv3])  # Returns: Hypervector

>>> # Grouped and batched bundles
>>> bsc.bundle([[hv1, hv2], [hv3, hv4]])  # Returns: [bundled1, bundled2]
>>> enc.bundle(tensor_DNM, axis=2)  # (D, N, M) -> (D, N)

>>> # Deprecated batch_dim form
>>> bsc.bundle(array_3d, batch_dim=0)  # Returns: list of bundled hypervectors
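
To make the shape change of the axis= form concrete, here is a NumPy-only sketch. It assumes a bipolar, MAP-style bundle implemented as a sum followed by a sign threshold; the library's actual bundling function depends on the encoding and may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
D, N, M = 16, 4, 5
tensor_DNM = rng.choice([-1, 1], size=(D, N, M))

# Collapse the last batch axis: sum the M vectors in each group, then
# threshold. M is odd, so a sum of +/-1 values is never zero (no ties).
bundled = np.sign(tensor_DNM.sum(axis=2))        # (D, N, M) -> (D, N)

# A tuple of batch axes collapses several at once. The sum is left
# unthresholded here because N * M is even and ties are possible.
bundled_all = tensor_DNM.sum(axis=(1, 2))        # (D, N, M) -> (D,)
```

Axis 0 never appears in the reduction, which matches the rule that the hypervector dimension itself cannot be reduced.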

from_array(array: ndarray | torch.Tensor, backend: Literal['numpy', 'torch'] | None = None) → Hypervector[source]

Create a Hypervector from an existing array.

generate(size=None, backend=None, device=None, use_generator=None)

Generate random hypervector(s).

Hypervectors are dimension-first. A scalar (or None) size produces a single (D,) hypervector; a tuple (D, N) produces a batch of N hypervectors of dimension D stored as columns of a (D, N) array (and likewise (D, N, M) for higher-rank batches).

Batched generation is reproducible under a fixed seed for a given batch shape. For the i.i.d. element generators the batch is drawn in one vectorized call. Ordered and custom generators draw per vector.

Parameters:

size – None or int for a single (D,) vector; a tuple (D, *batch) for a batch of prod(batch) vectors of dimension D.
backend – Backend override (defaults to the encoding’s backend).
device – Device override for the torch backend.
use_generator – Whether to use the HDCGenerator pathway. Defaults to True if a custom generator was passed at construction, False otherwise (uses element_generator directly, which gives the correct per-encoding distribution).

Returns:

A new Hypervector.
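
The size conventions and the reproducibility claim can be sketched without the library. The function generate_sketch below is hypothetical and assumes bipolar elements drawn from a seeded NumPy generator; it only mirrors the shape rules described above.

```python
import numpy as np


def generate_sketch(size=None, dimension=10_000, seed=None):
    """Hypothetical stand-in: bipolar hypervectors in dimension-first layout."""
    rng = np.random.default_rng(seed)
    if size is None:
        shape = (dimension,)          # single (D,) vector
    elif isinstance(size, int):
        shape = (size,)               # single vector of that dimension
    else:
        shape = tuple(size)           # (D, *batch)
    # One vectorized draw for the whole batch (i.i.d. elements).
    return rng.choice(np.array([-1, 1], dtype=np.int8), size=shape)


single = generate_sketch(seed=42)                  # shape (10000,)
batch = generate_sketch((10_000, 4), seed=42)      # shape (10000, 4)
again = generate_sketch((10_000, 4), seed=42)      # same seed, same shape
```

With the same seed and the same batch shape, batch and again are identical, which is the sense of "reproducible for a given batch shape" used above.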

get_generator() → HDCGenerator[source]

Get the current generator.

inverse(hypervector: ndarray | torch.Tensor | Hypervector) → Hypervector[source]

Binding inverse of a hypervector.

Raises NotImplementedError for encodings whose binding has no defined inverse (e.g. MAP_C continuous, VTB, MBAT, BSDC_*).

negative(hypervector: ndarray | torch.Tensor | Hypervector) → Hypervector[source]

Bundling (additive) inverse of a hypervector.

Raises NotImplementedError for encodings with no defined negative (e.g. FHRR, BSC, BSDC_*).
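
The difference between the binding inverse and the bundling negative is easiest to see on bipolar vectors. The sketch below uses plain NumPy and assumes MAP-style multiply binding and additive bundling; it is an illustration, not the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(3)
D = 1_000
x = rng.choice([-1, 1], size=D)
y = rng.choice([-1, 1], size=D)

# Binding inverse: a bipolar vector is its own inverse under multiplication,
# because x * x is the all-ones vector.
x_inv = x
recovered = (x * y) * x_inv        # equals y

# Bundling negative: adding -x cancels x from an unthresholded sum.
x_neg = -x
residual = (x + y) + x_neg         # equals y
```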

normalize(hypervector: ndarray | torch.Tensor | Hypervector) → Hypervector[source]

Normalize a hypervector to its encoding’s canonical form (L2 unit length for real encodings, bipolar sign for MAP, phase wrap for FHRR).

Raises NotImplementedError for encodings with no defined normalization (e.g. BSC, BSDC_*).
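
The three canonical forms named above can be sketched in NumPy as follows. The function names are hypothetical, and two details are assumptions rather than documented behavior: zeros are mapped to +1 in the bipolar case, and phases are wrapped into [-pi, pi).

```python
import numpy as np


def l2_normalize(x):
    """Scale each column (axis 0 is the dimension) to unit L2 length."""
    return x / np.linalg.norm(x, axis=0, keepdims=True)


def bipolar_sign(x):
    """Threshold to +/-1; zeros go to +1 (tie-break is an assumption)."""
    return np.where(x >= 0, 1, -1)


def wrap_phase(theta):
    """Wrap angles into [-pi, pi) (the interval is an assumption)."""
    return (theta + np.pi) % (2 * np.pi) - np.pi


unit = l2_normalize(np.array([3.0, 4.0]))
signs = bipolar_sign(np.array([-2, 0, 5]))
wrapped = wrap_phase(np.array([3 * np.pi / 2]))
```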

permute(hypervector: ndarray | torch.Tensor | Hypervector, shift: int = 1) → Hypervector[source]

Permute (cyclic-shift) a hypervector along the dimension axis (axis 0).

Parameters:

hypervector – Hypervector or raw array to permute.
shift – Number of positions to roll along axis 0; a negative value inverts the permutation.

Returns:

A new permuted Hypervector.
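
A cyclic shift along axis 0 corresponds to np.roll with axis=0. The sketch below uses raw NumPy arrays rather than Hypervector objects and shows that a negative shift undoes a positive one.

```python
import numpy as np

hv = np.arange(6)                           # a (D,) vector: [0 1 2 3 4 5]
shifted = np.roll(hv, 1, axis=0)            # [5 0 1 2 3 4]
restored = np.roll(shifted, -1, axis=0)     # back to [0 1 2 3 4 5]

# For a (D, N) batch every column is rolled along the dimension axis.
batch = np.arange(12).reshape(6, 2)
batch_shifted = np.roll(batch, 2, axis=0)
```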

set_generator(generator: HDCGenerator) → None[source]

Set a new generator for this encoding.

Parameters: generator – The new generator to use.
Raises: GeneratorNotSupportedError – If the generator doesn’t support the required output type.

similarity(hvA, hvB=None)

Compute similarity between hypervector(s).

Hypervectors are dimension-first (D, N). Calling conventions:

similarity(a, b) with two (D,) vectors -> a scalar score
similarity(A, B) with two (D, N) batches -> N per-column scores
similarity(v, B) with a vector and a (D, N) batch -> N scores
similarity(batch) with one (D, N) batch -> N-1 scores of column 0 against each remaining column
similarity([..], [..]) with two equal-length lists -> pairwise scores

Parameters:

hvA – First hypervector(s) (Hypervector, array, or list), or a single (D, N) batch when hvB is omitted.
hvB – Optional second hypervector(s).

Returns:

A scalar, a 1D array of scores, or a list of scores (for list inputs).

Examples

>>> bsc.similarity(hv1, hv2)                       # scalar
>>> enc.similarity(codebook)                       # col 0 vs the rest
>>> bsc.similarity([hv1, hv2], [hv4, hv5])         # [sim(1,4), sim(2,5)]
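
The output shapes of the calling conventions can be reproduced with a small NumPy helper. This sketch assumes cosine similarity, which is only one possible metric (the real metric depends on the encoding), and cosine_columns is a hypothetical name.

```python
import numpy as np


def cosine_columns(a, b):
    """Cosine similarity along axis 0, broadcasting a vector over a batch."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1 and b.ndim > 1:
        a = a[:, None]
    if b.ndim == 1 and a.ndim > 1:
        b = b[:, None]
    num = (a * b).sum(axis=0)
    den = np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0)
    return num / den


rng = np.random.default_rng(2)
D, N = 1_000, 4
A = rng.choice([-1, 1], size=(D, N))
B = rng.choice([-1, 1], size=(D, N))

scalar = cosine_columns(A[:, 0], B[:, 0])          # two (D,) vectors -> scalar
per_column = cosine_columns(A, B)                  # two (D, N) batches -> N scores
vec_vs_batch = cosine_columns(A[:, 0], B)          # vector vs batch -> N scores
first_vs_rest = cosine_columns(A[:, 0], A[:, 1:])  # one batch -> N-1 scores
```

The last call mirrors the single-argument form: column 0 is treated as the query and the remaining N-1 columns as the candidates.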

thin(hypervector: ndarray | torch.Tensor | Hypervector | List) → Hypervector | List[Hypervector][source]

Apply thinning to hypervector(s).

Supports batching: if a list is provided, applies thinning independently to each hypervector in the list.

Parameters: hypervector – Hypervector object, raw array, or list of hypervectors to thin.
Returns: Single Hypervector (if single input) or list of Hypervectors (if list input).

Examples

>>> # Single thinning
>>> bsc.thin(hv)  # Returns: Hypervector

>>> # Batched thinning
>>> bsc.thin([hv1, hv2, hv3])  # Returns: [thinned1, thinned2, thinned3]

unbind(*hypervectors, batch_dim=None)

Unbind hypervectors, batching automatically when the input is batched.

Mirrors bind(): a single (D, N) batch is unbound in one call (element-wise unbinders broadcast; the others are applied per column), so batch_dim is no longer required.

Parameters:

*hypervectors – Hypervector objects, raw arrays, or lists to unbind.
batch_dim – Deprecated. Splits a 3D+ array along this axis and returns a list of results; pass a batched array directly instead.

Returns:

A single Hypervector, or a list of Hypervectors for the deprecated batch_dim / nested-list forms.

Raises:

NotImplementedError – For encodings that do not support unbinding.
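
For element-wise binders, unbinding a whole (D, N) batch is a single broadcasted operation. The NumPy sketch below assumes BSC-style XOR binding and FHRR-style angle addition (phases left unwrapped for simplicity); it does not use the library.

```python
import numpy as np

rng = np.random.default_rng(4)
D, N = 32, 3

# XOR binding is its own inverse: unbinding reuses the same keys.
keys = rng.integers(0, 2, size=(D, N), dtype=np.uint8)
values = rng.integers(0, 2, size=(D, N), dtype=np.uint8)
bound = np.bitwise_xor(keys, values)            # bind, shape (D, N)
recovered = np.bitwise_xor(bound, keys)         # unbind, shape (D, N)

# Angle addition is undone by angle subtraction.
phase_a = rng.uniform(-np.pi, np.pi, size=(D, N))
phase_b = rng.uniform(-np.pi, np.pi, size=(D, N))
bound_phase = phase_a + phase_b
unbound_phase = bound_phase - phase_a           # approximately phase_b
```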

zeros(size: int | Tuple[int, ...] = None, backend: Literal['numpy', 'torch'] | None = None, device: str | torch.device | None = None) → Hypervector[source]

Generate zero hypervector(s).

Constructor parameters

All encoding classes share this constructor signature:

Encoding(
    dimension=10_000,
    backend=None,
    device=None,
    dtype=None,
    mask=None,
    generator=None,
    similarity_remap=None,
)

Parameter	Type	Default	Description
`dimension`	`int`	`10_000`	Number of elements per hypervector.
`backend`	`str` or `None`	`None`	`"numpy"` or `"torch"`. `None` inherits the global default (see `prefer_torch()` / `prefer_numpy()`), which is `"numpy"` unless changed.
`device`	`str` or `None`	`None`	PyTorch device string (`"cpu"`, `"cuda"`, `"cuda:1"`, …). Only meaningful when `backend="torch"`.
`dtype`	dtype or `None`	`None`	Override the encoding’s default data type. If `None`, uses the type specified by `EncodingSpec.dtype`.
`mask`	`int` or `None`	`None`	Bit mask for `MAP_I_Bits`; sets the integer bit width. Ignored by all other encodings.
`generator`	`HDCGenerator` or `None`	`None`	Custom random generator. If `None`, uses a `DefaultGenerator` backed by NumPy.
`similarity_remap`	`callable` or `None`	`None`	Function applied to every similarity result. E.g., `remap_to_unit()` to map [-1,1] → [0,1].
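
As an illustration of what a similarity_remap callable might look like, the sketch below maps a score from [-1, 1] onto [0, 1]. The name remap_to_unit_sketch is hypothetical and is not the library's remap_to_unit(); it only shows the kind of function the parameter accepts.

```python
def remap_to_unit_sketch(score):
    """Map a similarity in [-1, 1] linearly onto [0, 1]."""
    return (score + 1.0) / 2.0


# Hypothetical usage with some concrete encoding class:
#   SomeEncoding(dimension=10_000, similarity_remap=remap_to_unit_sketch)
low = remap_to_unit_sketch(-1.0)    # 0.0
mid = remap_to_unit_sketch(0.0)     # 0.5
high = remap_to_unit_sketch(1.0)    # 1.0
```

Because the function uses only arithmetic operators, it also works element-wise on NumPy arrays and torch tensors of scores.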

Properties

property Encoding.dimension: int: Number of elements per hypervector.

property Encoding.backend: str: "numpy" or "torch".

property Encoding.device: str or None: PyTorch device string, or None for the NumPy backend.

Methods

Encoding.generate(size=None, backend=None, device=None, use_generator=None)[source]

Generate one or more hypervectors.

Parameters:

size – None -> single (D,) vector; int -> a single vector of that dimension; tuple (D, N) -> a dimension-first batch of N vectors (each column a hypervector).
backend – Override the encoding’s default backend for this call.
device – Override the encoding’s default device for this call.
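use_generator – True forces the HDCGenerator pathway; False uses the encoding’s element_generator directly; None (the default) resolves to True if a custom generator was passed at construction and False otherwise.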

Returns:

Hypervector

Encoding.zeros(size=None, backend=None, device=None)[source]

Return a zero-valued hypervector or batch.

Returns: Hypervector

Encoding.from_array(array, backend=None)[source]

Wrap an existing NumPy array or PyTorch tensor as a Hypervector.

Parameters:

array – ndarray or Tensor with last dimension equal to self.dimension.
backend – Override backend detection.

Returns:

Hypervector

Raises:

DimensionsNotMatchingError – If the array’s last dimension ≠ self.dimension.

Encoding.similarity(hvA, hvB=None)[source]

Compute similarity. Accepts Hypervector objects, raw arrays, or lists. If hvB is omitted, hvA must be a (D, N) batch and column 0 is compared against each remaining column. See batched calling conventions.

Returns: float, ndarray, Tensor, or list depending on inputs.

Encoding.bundle(*hypervectors, axis=None, batch_dim=None)[source]

Bundle hypervectors. A batched (D, *batch) input is reduced automatically. axis= selects which batch axis to collapse (an int or a tuple of ints, defaulting to the last). Axis 0 is the hypervector dimension and cannot be reduced.

Parameters:

hypervectors – Positional Hypervector arguments, a batched array, or a list of lists for grouped bundling.
axis – Batch axis (or tuple of axes) to reduce.
batch_dim – Deprecated as of 2.1.0; emits DeprecationWarning and will be removed. Pass a batched array or use axis=.

Returns:

Hypervector or list[Hypervector]

Encoding.bind(*hypervectors, batch_dim=None)[source]

Bind hypervectors. Batched inputs are handled automatically. The element-wise binders broadcast, and every other binder is applied per column, returning one batched result.

Parameters: batch_dim – Deprecated as of 2.1.0; emits DeprecationWarning and will be removed in a future release. Pass a batched array instead.
Returns: Hypervector

Encoding.unbind(*hypervectors, batch_dim=None)[source]

Unbind to recover a component. Batched inputs are handled automatically, the same way as bind().

Parameters: batch_dim – Deprecated as of 2.1.0; emits DeprecationWarning and will be removed in a future release. Pass a batched array instead.
Raises: NotImplementedError – For encodings that do not support unbinding.
Returns: Hypervector

Encoding.thin(hypervector)[source]

Apply the encoding’s thinning operation.

Returns: Hypervector or list[Hypervector]

Encoding.set_generator(generator)[source]

Replace the encoding’s generator.

Parameters: generator – An HDCGenerator instance.

Encoding.get_generator()[source]

Return the current generator.

Returns: HDCGenerator

Abstract method (for subclassers)

Encoding._get_encoding_spec()[source]

Return an EncodingSpec that wires together the component functions for this encoding.

Returns: EncodingSpec

EncodingSpec dataclass

class pyhdc.EncodingSpec[source]

Specification dataclass that links an encoding to its component functions.

Field	Description
`dtype`	NumPy data type for elements (e.g., `np.float32`, `np.int8`)
`element_generator`	Callable producing random element values given `(size, dtype)`
`similarity_fn`	Callable implementing the similarity metric
`bundling_fn`	Callable implementing bundling
`thinning_fn`	Callable implementing thinning (or `NoThin` if not applicable)
`binding_fn`	Callable implementing binding
`unbinding_fn`	Callable implementing unbinding
`mask`	Optional integer bit mask (used by `MAP_I_Bits`)
`generator_output_type`	`"bits"`, `"words"`, or `"floats"`: the output type this encoding requires from a custom generator
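
To show how such a spec ties component functions together, here is a standalone sketch using only the standard library. EncodingSpecSketch is a hypothetical mirror of the fields in the table, not the real pyhdc.EncodingSpec; the field order, the default for mask, the list-based component functions, and the chosen generator_output_type value are all assumptions for illustration.

```python
import random
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class EncodingSpecSketch:
    """Hypothetical mirror of the documented fields."""
    dtype: Any
    element_generator: Callable
    similarity_fn: Callable
    bundling_fn: Callable
    thinning_fn: Callable
    binding_fn: Callable
    unbinding_fn: Callable
    generator_output_type: str
    mask: Optional[int] = None


def bipolar_elements(size, dtype):
    """Draw `size` random +/-1 elements, converted with `dtype`."""
    return [dtype(random.choice((-1, 1))) for _ in range(size)]


def multiply(a, b):
    """Element-wise product; self-inverse for bipolar vectors."""
    return [x * y for x, y in zip(a, b)]


spec = EncodingSpecSketch(
    dtype=int,
    element_generator=bipolar_elements,
    similarity_fn=lambda a, b: sum(x * y for x, y in zip(a, b)) / len(a),
    bundling_fn=lambda *hvs: [1 if sum(col) >= 0 else -1 for col in zip(*hvs)],
    thinning_fn=lambda hv: hv,           # no-op stand-in for "no thinning"
    binding_fn=multiply,
    unbinding_fn=multiply,
    generator_output_type="words",       # arbitrary choice for illustration
)

hv1 = spec.element_generator(8, spec.dtype)
hv2 = spec.element_generator(8, spec.dtype)
bound = spec.binding_fn(hv1, hv2)
recovered = spec.unbinding_fn(bound, hv1)   # equals hv2
```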

BackendManager

class pyhdc.BackendManager[source]

Static utility for backend detection and conversion.

static get_backend(array)[source]: Return "numpy" or "torch" for the given array.

static to_numpy(array)[source]: Convert to numpy.ndarray. Detaches from autograd if needed.

static to_torch(array, device=None)[source]: Convert to torch.Tensor on the specified device.

static get_device(array)[source]: Return the device string of a tensor, or None for NumPy arrays.
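
The behavior described for these helpers can be approximated as below. This is a sketch with hypothetical function names, not the BackendManager source; it treats torch as optional so the NumPy path works even when PyTorch is not installed.

```python
import numpy as np

try:
    import torch
except ImportError:      # torch is optional in this sketch
    torch = None


def get_backend_sketch(array):
    """Return "numpy" or "torch" depending on the array type."""
    if torch is not None and isinstance(array, torch.Tensor):
        return "torch"
    if isinstance(array, np.ndarray):
        return "numpy"
    raise TypeError(f"unsupported array type: {type(array).__name__}")


def to_numpy_sketch(array):
    """Convert to numpy.ndarray, detaching from autograd if needed."""
    if get_backend_sketch(array) == "torch":
        return array.detach().cpu().numpy()
    return array


def get_device_sketch(array):
    """Return the device string of a tensor, or None for NumPy arrays."""
    if get_backend_sketch(array) == "torch":
        return str(array.device)
    return None


backend = get_backend_sketch(np.zeros(3))     # "numpy"
device = get_device_sketch(np.zeros(3))       # None
as_numpy = to_numpy_sketch(np.ones(2))        # unchanged ndarray
```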