Encoding Base Class

class pyhdc.Encoding(dimension: int = 10000, backend: Literal['numpy', 'torch'] | None = None, device: str | torch.device | None = None, dtype: Any | None = None, mask: int | None = None, generator: HDCGenerator | None = None, similarity_remap: Callable | None = None)[source]

Bases: ABC

Base class for hypervector encoding schemes.

An encoding defines how hypervectors are generated and how operations (similarity, bundling, binding) are performed on them.

bind(*hypervectors: ndarray | torch.Tensor | Hypervector | List, batch_dim: int | None = None) Hypervector | List[Hypervector][source]

Bind hypervectors, batching automatically when the input is batched.

A single (D, N) (or higher-rank) batch is bound in one call. The element-wise binders (MAP multiply, BSC xor, FHRR angle add) broadcast natively, and the others (convolution, shifting, matrix, VTB, CDT) are applied per column internally, so the result is one batched Hypervector. batch_dim is no longer required.

Parameters:
  • *hypervectors – Hypervector objects, raw arrays, or lists to bind.

  • batch_dim – Deprecated. Splits a 3D+ array along this axis and returns a list of results; pass a batched array directly instead.

Returns:

A single Hypervector, or a list of Hypervectors for the deprecated batch_dim / nested-list forms.
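The broadcasting behaviour of the element-wise binders can be illustrated outside the library. The following is a minimal NumPy sketch, not pyhdc code: it assumes bipolar (+1/-1) vectors for MAP-style multiplication and 0/1 vectors for BSC-style XOR, and shows that binding a (D,) key against a (D, N) batch yields a single (D, N) result.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N = 1000, 4

# MAP-style binding: element-wise multiply of bipolar vectors.
key = rng.choice(np.array([-1, 1], dtype=np.int8), size=D)
batch = rng.choice(np.array([-1, 1], dtype=np.int8), size=(D, N))
bound_map = key[:, None] * batch          # (D, 1) broadcasts against (D, N)

# BSC-style binding: element-wise XOR of binary vectors.
key_bits = rng.integers(0, 2, size=D, dtype=np.uint8)
batch_bits = rng.integers(0, 2, size=(D, N), dtype=np.uint8)
bound_bsc = np.bitwise_xor(key_bits[:, None], batch_bits)

print(bound_map.shape, bound_bsc.shape)   # both are (D, N)
```

Each column of the result equals the key bound with the matching column of the batch, which is why no per-column loop is needed for these binders.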

bundle(*hypervectors: ndarray | torch.Tensor | Hypervector | List, axis: None | int | Tuple[int, ...] = None, batch_dim: int | None = None) Hypervector | List[Hypervector][source]

Bundle multiple hypervectors, optionally in batches.

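Parameters:
  • *hypervectors – Hypervector objects, raw arrays, or lists to bundle.

  • axis – Batch axis, or tuple of batch axes, to reduce. Defaults to the last axis. Axis 0 is the hypervector dimension and cannot be reduced.

  • batch_dim – Deprecated. If provided with a 3D+ array, splits along this dimension and returns a list of results; pass a batched array or use axis= instead.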

Returns:

Single Hypervector (if not batched) or List of Hypervectors (if batched)

Examples

>>> # Single bundle
>>> bsc.bundle(hv1, hv2, hv3)  # Returns: Hypervector
>>> bsc.bundle([hv1, hv2, hv3])  # Returns: Hypervector
>>> # Batched bundles from nested lists
>>> bsc.bundle([[hv1, hv2], [hv3, hv4]])  # Returns: [bundled1, bundled2]
>>> # Deprecated batch_dim form
>>> bsc.bundle(array_3d, batch_dim=0)  # Returns: list of bundled hypervectors
>>> # Reduce a batch axis
>>> enc.bundle(tensor_DNM, axis=2)  # (D, N, M) -> (D, N)

from_array(array: ndarray | torch.Tensor, backend: Literal['numpy', 'torch'] | None = None) Hypervector[source]

Create a Hypervector from an existing array.

generate(size: int | Tuple[int, ...] = None, backend: Literal['numpy', 'torch'] | None = None, device: str | torch.device | None = None, use_generator: bool | None = None) Hypervector[source]

Generate random hypervector(s).

Hypervectors are dimension-first. A scalar (or None) size produces a single (D,) hypervector; a tuple (D, N) produces a batch of N hypervectors of dimension D stored as columns of a (D, N) array (and likewise (D, N, M) for higher-rank batches).

Batched generation is reproducible under a fixed seed for a given batch shape. For the i.i.d. element generators the batch is drawn in one vectorized call. Ordered and custom generators draw per vector.

Parameters:
  • size – None or int for a single (D,) vector; a tuple (D, *batch) for a batch of prod(batch) vectors of dimension D.

  • backend – Backend override (defaults to the encoding’s backend).

  • device – Device override for the torch backend.

  • use_generator – Whether to use the HDCGenerator pathway. Defaults to True if a custom generator was passed at construction, False otherwise (uses element_generator directly, which gives the correct per-encoding distribution).

Returns:

A new Hypervector.
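The dimension-first layout and the seed reproducibility described above can be pictured with plain NumPy. This is an illustrative sketch only: draw_batch is a hypothetical helper, and bipolar elements are an assumed distribution, not necessarily what any particular encoding draws.

```python
import numpy as np

D, N = 10_000, 3

def draw_batch(seed):
    # Hypothetical helper: one vectorized i.i.d. draw of a (D, N) bipolar batch.
    rng = np.random.default_rng(seed)
    return rng.choice(np.array([-1, 1], dtype=np.int8), size=(D, N))

batch_a = draw_batch(seed=42)
batch_b = draw_batch(seed=42)

# Column 1 is the second hypervector of the batch, with shape (D,).
second_hv = batch_a[:, 1]

# Same seed and same batch shape give the same batch.
same = np.array_equal(batch_a, batch_b)
```

Because the whole batch comes from one draw, changing the batch shape changes which random numbers land in which column, which is why reproducibility is stated per batch shape.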

get_generator() HDCGenerator[source]

Get the current generator.

inverse(hypervector: ndarray | torch.Tensor | Hypervector) Hypervector[source]

Binding inverse of a hypervector.

Raises NotImplementedError for encodings whose binding has no defined inverse (e.g. MAP_C continuous, VTB, MBAT, BSDC_*).
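For encodings that do define a binding inverse, the idea can be sketched in NumPy. The example assumes two common cases, bipolar vectors bound by multiplication and phase angles bound by addition; it does not call pyhdc and is not a statement of how the library implements inverse().

```python
import numpy as np

rng = np.random.default_rng(1)
D = 1000

# Bipolar multiply binding: every vector is its own inverse.
x = rng.choice(np.array([-1, 1]), size=D)
y = rng.choice(np.array([-1, 1]), size=D)
recovered_y = (x * y) * x            # x * x == 1 element-wise

# Phase-angle binding (angles add): the inverse negates every angle.
theta_x = rng.uniform(-np.pi, np.pi, size=D)
theta_y = rng.uniform(-np.pi, np.pi, size=D)
bound = theta_x + theta_y
recovered_theta = bound + (-theta_x)
```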

negative(hypervector: ndarray | torch.Tensor | Hypervector) Hypervector[source]

Bundling (additive) inverse of a hypervector.

Raises NotImplementedError for encodings with no defined negative (e.g. FHRR, BSC, BSDC_*).

normalize(hypervector: ndarray | torch.Tensor | Hypervector) Hypervector[source]

Normalize a hypervector to its encoding’s canonical form (L2 unit length for real encodings, bipolar sign for MAP, phase wrap for FHRR).

Raises NotImplementedError for encodings with no defined normalization (e.g. BSC, BSDC_*).
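The three canonical forms named above can be written out in NumPy as a rough sketch. The tie-breaking rule for zeros and the [-pi, pi) wrapping interval are assumptions made for this example, not documented pyhdc behaviour.

```python
import numpy as np

rng = np.random.default_rng(2)
v = rng.normal(size=1000)

# Real-valued: scale to unit L2 length.
unit = v / np.linalg.norm(v)

# Bipolar: keep only the sign (exact zeros mapped to +1 here, an arbitrary choice).
bipolar = np.where(v >= 0, 1, -1)

# Phase: wrap angles into [-pi, pi).
angles = rng.uniform(-10.0, 10.0, size=1000)
wrapped = (angles + np.pi) % (2 * np.pi) - np.pi
```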

permute(hypervector: ndarray | torch.Tensor | Hypervector, shift: int = 1) Hypervector[source]

Permute (cyclic-shift) a hypervector along the dimension axis (axis 0).

Parameters:
  • hypervector – Hypervector or raw array to permute.

  • shift – Positions to roll along axis 0; negative inverts the permute.

Returns:

A new permuted Hypervector.
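A cyclic shift along axis 0 corresponds to numpy.roll with axis=0. The sketch below uses a tiny (D=4, N=3) batch to show that every column is rotated together and that the negative shift undoes it; whether pyhdc rolls in the same direction as numpy.roll is an assumption here.

```python
import numpy as np

batch = np.arange(12).reshape(4, 3)     # D = 4, N = 3; columns are hypervectors

shifted = np.roll(batch, shift=1, axis=0)      # rotate each column down by one
restored = np.roll(shifted, shift=-1, axis=0)  # negative shift undoes it

print(shifted[:, 0])   # column 0 was [0 3 6 9], now [9 0 3 6]
```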

set_generator(generator: HDCGenerator) None[source]

Set a new generator for this encoding.

Parameters:

generator – The new generator to use

Raises:

GeneratorNotSupportedError – If generator doesn’t support required output type

similarity(hvA: ndarray | torch.Tensor | Hypervector | List, hvB: ndarray | torch.Tensor | Hypervector | List | None = None, *, axis: int | None = None) float | ndarray | torch.Tensor | List[float | ndarray | torch.Tensor][source]

Compute similarity between hypervector(s).

Hypervectors are dimension-first (D, N). Calling conventions:

  • similarity(a, b) with two (D,) vectors -> a scalar score

  • similarity(A, B) with two (D, N) batches -> N per-column scores

  • similarity(v, B) with a vector and a (D, N) batch -> N scores

  • similarity(batch) with one (D, N) batch -> N-1 scores of column 0 against each remaining column

  • similarity([..], [..]) with two equal-length lists -> pairwise scores

Parameters:
  • hvA – First hypervector(s) (Hypervector, array, or list), or a single (D, N) batch when hvB is omitted.

  • hvB – Optional second hypervector(s).

Returns:

A scalar, a 1D array of scores, or a list of scores (for list inputs).

Examples

>>> bsc.similarity(hv1, hv2)                       # scalar
>>> enc.similarity(codebook)                       # col 0 vs the rest
>>> bsc.similarity([hv1, hv2], [hv4, hv5])         # [sim(1,4), sim(2,5)]

thin(hypervector: ndarray | torch.Tensor | Hypervector | List) Hypervector | List[Hypervector][source]

Apply thinning to hypervector(s).

Supports batching: if a list is provided, applies thinning independently to each hypervector in the list.

Parameters:

hypervector – Hypervector object, raw array, or list of hypervectors to thin

Returns:

Single Hypervector (if single input) or List of Hypervectors (if list input)

Examples

>>> # Single thinning
>>> bsc.thin(hv)  # Returns: Hypervector
>>> # Batched thinning
>>> bsc.thin([hv1, hv2, hv3])  # Returns: [thinned1, thinned2, thinned3]

unbind(*hypervectors: ndarray | torch.Tensor | Hypervector | List, batch_dim: int | None = None) Hypervector | List[Hypervector][source]

Unbind hypervectors, batching automatically when the input is batched.

Mirrors bind(): a single (D, N) batch is unbound in one call (element-wise unbinders broadcast; the others are applied per column), so batch_dim is no longer required.

Parameters:
  • *hypervectors – Hypervector objects, raw arrays, or lists to unbind.

  • batch_dim – Deprecated. Splits a 3D+ array along this axis and returns a list of results; pass a batched array directly instead.

Returns:

A single Hypervector, or a list of Hypervectors for the deprecated batch_dim / nested-list forms.

Raises:

NotImplementedError – For encodings that do not support unbinding.

zeros(size: int | Tuple[int, ...] = None, backend: Literal['numpy', 'torch'] | None = None, device: str | torch.device | None = None) Hypervector[source]

Generate zero hypervector(s).

Constructor parameters

All encoding classes share this constructor signature:

Encoding(
    dimension=10_000,
    backend=None,
    device=None,
    dtype=None,
    mask=None,
    generator=None,
    similarity_remap=None,
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| dimension | int | 10_000 | Number of elements per hypervector. |
| backend | str or None | None | "numpy" or "torch". None inherits the global default (see prefer_torch() / prefer_numpy()), which is "numpy" unless changed. |
| device | str or None | None | PyTorch device string ("cpu", "cuda", "cuda:1", …). Only meaningful when backend="torch". |
| dtype | dtype or None | None | Override the encoding’s default data type. If None, uses the type specified by EncodingSpec.dtype. |
| mask | int or None | None | Bit mask for MAP_I_Bits; sets the integer bit width. Ignored by all other encodings. |
| generator | HDCGenerator or None | None | Custom random generator. If None, uses a DefaultGenerator backed by NumPy. |
| similarity_remap | callable or None | None | Function applied to every similarity result. E.g., remap_to_unit() to map [-1,1] → [0,1]. |
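A similarity remap is just a callable applied to each score. As a minimal sketch, assuming scores in [-1, 1], the affine map below sends them to [0, 1]; remap_unit_interval is a hypothetical stand-in written for this example and is not the library's remap_to_unit().

```python
def remap_unit_interval(score):
    # Hypothetical stand-in: affine map sending -1 -> 0, 0 -> 0.5, 1 -> 1.
    return (score + 1.0) / 2.0

print([remap_unit_interval(s) for s in (-1.0, 0.0, 1.0)])   # [0.0, 0.5, 1.0]
```

Because the function uses only arithmetic, it works unchanged on scalars and on NumPy arrays of scores.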

Properties

property Encoding.dimension: int

Number of elements per hypervector.

property Encoding.backend: str

"numpy" or "torch".

property Encoding.device: str or None

PyTorch device string, or None for the NumPy backend.

Methods

Encoding.generate(size=None, backend=None, device=None, use_generator=None)[source]

Generate one or more hypervectors.

Parameters:
  • size – None -> single (D,) vector; int -> a single vector of that dimension; tuple (D, N) -> a dimension-first batch of N vectors (each column a hypervector).

  • backend – Override the encoding’s default backend for this call.

  • device – Override the encoding’s default device for this call.

  • use_generator – True forces the custom generator; False forces NumPy’s default; None uses the encoding’s setting.

Returns:

Hypervector

Encoding.zeros(size=None, backend=None, device=None)[source]

Return a zero-valued hypervector or batch.

Returns:

Hypervector

Encoding.from_array(array, backend=None)[source]

Wrap an existing NumPy array or PyTorch tensor as a Hypervector.

Parameters:
  • array – ndarray or Tensor with last dimension equal to self.dimension.

  • backend – Override backend detection.

Returns:

Hypervector

Raises:

DimensionsNotMatchingError – If the array’s last dimension ≠ self.dimension.

Encoding.similarity(hvA, hvB=None)[source]

Compute similarity. Accepts Hypervector objects, raw arrays, or lists. If hvB is omitted, hvA must be a (D, N) batch and column 0 is compared against each remaining column. See batched calling conventions.

Returns:

float, ndarray, Tensor, or list depending on inputs.

Encoding.bundle(*hypervectors, axis=None, batch_dim=None)[source]

Bundle hypervectors. A batched (D, *batch) input is reduced automatically. axis= selects which batch axis to collapse (an int or a tuple of ints, defaulting to the last). Axis 0 is the hypervector dimension and cannot be reduced.

Parameters:
  • hypervectors – Positional Hypervector arguments, a batched array, or a list of lists for grouped bundling.

  • axis – Batch axis (or tuple of axes) to reduce.

  • batch_dim – Deprecated as of 2.1.0; emits DeprecationWarning and will be removed. Pass a batched array or use axis=.

Returns:

Hypervector or list[Hypervector]
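The axis semantics can be sketched with a NumPy reduction. The example assumes bundling is an element-wise sum (as in MAP-style encodings; other encodings bundle differently), and bundle_sum is a hypothetical helper that mimics the rule that axis 0 cannot be reduced.

```python
import numpy as np

def bundle_sum(batch, axis=-1):
    # Hypothetical helper: bundle by summing over one or more batch axes.
    axes = (axis,) if isinstance(axis, int) else tuple(axis)
    if any(ax % batch.ndim == 0 for ax in axes):
        raise ValueError("axis 0 is the hypervector dimension and cannot be reduced")
    return batch.sum(axis=axes)

rng = np.random.default_rng(4)
tensor_dnm = rng.choice(np.array([-1, 1]), size=(100, 4, 3))   # (D, N, M)

print(bundle_sum(tensor_dnm).shape)                # last axis reduced -> (100, 4)
print(bundle_sum(tensor_dnm, axis=(1, 2)).shape)   # both batch axes reduced -> (100,)
```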

Encoding.bind(*hypervectors, batch_dim=None)[source]

Bind hypervectors. Batched inputs are handled automatically. The element-wise binders broadcast, and every other binder is applied per column, returning one batched result.

Parameters:

batch_dim – Deprecated as of 2.1.0; emits DeprecationWarning and will be removed in a future release. Pass a batched array instead.

Returns:

Hypervector
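For binders that do not broadcast, "applied per column" can be pictured as a loop over column pairs followed by restacking. The sketch below uses circular convolution via the FFT as an assumed example of such a binder; circular_convolve and bind_per_column are hypothetical helpers and do not reflect pyhdc internals.

```python
import numpy as np

def circular_convolve(a, b):
    # Circular convolution of two real (D,) vectors via the FFT.
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=a.shape[0])

def bind_per_column(A, B):
    # Apply the binder to each column pair, then restack as one (D, N) batch.
    columns = [circular_convolve(A[:, j], B[:, j]) for j in range(A.shape[1])]
    return np.stack(columns, axis=1)

rng = np.random.default_rng(5)
D, N = 256, 3
A = rng.normal(size=(D, N)) / np.sqrt(D)
B = rng.normal(size=(D, N)) / np.sqrt(D)
bound = bind_per_column(A, B)     # one batched result of shape (D, N)
```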

Encoding.unbind(*hypervectors, batch_dim=None)[source]

Unbind to recover a component. Batched inputs are handled automatically, the same way as bind().

Parameters:

batch_dim – Deprecated as of 2.1.0; emits DeprecationWarning and will be removed in a future release. Pass a batched array instead.

Raises:

NotImplementedError – For encodings that do not support unbinding.

Returns:

Hypervector
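What "recover a component" means can be shown with a key-value record in plain NumPy. The sketch assumes bipolar vectors, multiplication as binding (so unbinding is the same multiplication), and summation as bundling; it does not use pyhdc.

```python
import numpy as np

rng = np.random.default_rng(6)
D = 10_000
bipolar = np.array([-1, 1])

keys = rng.choice(bipolar, size=(D, 3))
values = rng.choice(bipolar, size=(D, 3))

# Bind each key to its value, then bundle the three pairs into one record.
record = (keys * values).sum(axis=1)

# Unbind with key 0: multiplication is its own inverse for bipolar vectors.
noisy_value = record * keys[:, 0]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The recovered vector is close to value 0 and nearly orthogonal to the others.
scores = [cosine(noisy_value, values[:, j]) for j in range(3)]
```

The recovered vector is a noisy copy of the stored value, so in practice it is compared against a codebook with similarity() to identify the best match.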

Encoding.thin(hypervector)[source]

Apply the encoding’s thinning operation.

Returns:

Hypervector or list[Hypervector]

Encoding.set_generator(generator)[source]

Replace the encoding’s generator.

Parameters:

generator – An HDCGenerator instance.

Encoding.get_generator()[source]

Return the current generator.

Returns:

HDCGenerator

Abstract method (for subclassers)

Encoding._get_encoding_spec()[source]

Return an EncodingSpec that wires together the component functions for this encoding.

Returns:

EncodingSpec

EncodingSpec dataclass

class pyhdc.EncodingSpec[source]

Specification dataclass that links an encoding to its component functions.

| Field | Description |
| --- | --- |
| dtype | NumPy data type for elements (e.g., np.float32, np.int8) |
| element_generator | Callable producing random element values given (size, dtype) |
| similarity_fn | Callable implementing the similarity metric |
| bundling_fn | Callable implementing bundling |
| thinning_fn | Callable implementing thinning (or NoThin if not applicable) |
| binding_fn | Callable implementing binding |
| unbinding_fn | Callable implementing unbinding |
| mask | Optional integer bit mask (used by MAP_I_Bits) |
| generator_output_type | "bits", "words", or "floats": the output type this encoding requires from a custom generator |
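To make the wiring role of these fields concrete, here is a hypothetical stand-in dataclass filled in for a toy bipolar encoding. SketchSpec is not the library's EncodingSpec: the field names follow the fields listed above, but the types, defaults, callable signatures (other than the (size, dtype) generator arguments), and the lambdas are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

import numpy as np

@dataclass
class SketchSpec:
    # Hypothetical stand-in; field names follow the fields listed above, types are guesses.
    dtype: Any
    element_generator: Callable
    similarity_fn: Callable
    bundling_fn: Callable
    thinning_fn: Callable
    binding_fn: Callable
    unbinding_fn: Callable
    mask: Optional[int] = None
    generator_output_type: str = "bits"

_rng = np.random.default_rng(7)

bipolar_spec = SketchSpec(
    dtype=np.int8,
    element_generator=lambda size, dtype: _rng.choice(np.array([-1, 1], dtype=dtype), size=size),
    similarity_fn=lambda a, b: float(a.astype(np.int64) @ b.astype(np.int64)) / a.shape[0],
    bundling_fn=lambda *hvs: np.sign(np.sum(hvs, axis=0)),
    thinning_fn=lambda hv: hv,          # no thinning for this toy encoding
    binding_fn=lambda a, b: a * b,
    unbinding_fn=lambda a, b: a * b,    # multiply is self-inverse for bipolar vectors
)

x = bipolar_spec.element_generator(1000, bipolar_spec.dtype)
y = bipolar_spec.element_generator(1000, bipolar_spec.dtype)
bound = bipolar_spec.binding_fn(x, y)
```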

BackendManager

class pyhdc.BackendManager[source]

Static utility for backend detection and conversion.

static get_backend(array)[source]

Return "numpy" or "torch" for the given array.

static to_numpy(array)[source]

Convert to numpy.ndarray. Detaches from autograd if needed.

static to_torch(array, device=None)[source]

Convert to torch.Tensor on the specified device.

static get_device(array)[source]

Return the device string of a tensor, or None for NumPy arrays.