Encodings Overview

An encoding in PyHDC is a complete specification of how hypervectors are generated and how the three primitives (bundle, bind, similarity) are implemented. The sections below cover the shared base class and then each of the four families.

The Encoding base class

All encoding classes inherit from Encoding. The constructor accepts these shared parameters:

Parameter

Type

Description

dimension

int

Number of elements per hypervector (default: 10,000)

backend

str

"numpy" (default) or "torch"

device

str or None

PyTorch device string (e.g., "cuda", "cpu"); ignored for NumPy

dtype

dtype or None

Override the default data type

mask

int or None

Bit mask for MAP_I_Bits; sets the integer bit width

generator

HDCGenerator or None

Custom random generator; if None, uses NumPy’s default RNG

similarity_remap

callable or None

Function applied to every similarity result (e.g., remap_to_unit)

The encoding delegates to an EncodingSpec dataclass that wires together the five component functions:

@dataclass
class EncodingSpec:
    dtype: Any
    element_generator: Callable
    similarity_fn: Callable
    bundling_fn: Callable
    thinning_fn: Callable
    binding_fn: Callable
    unbinding_fn: Callable
    mask: Optional[int] = None
    generator_output_type: Literal["bits", "words", "floats"] = "floats"

Subclasses implement _get_encoding_spec() to return a populated EncodingSpec. Users never interact with EncodingSpec directly.

MAP family

Multiplicative-Additive-Permutation encodings use dense bipolar or binary vectors.

MAP_C; Continuous MAP
  • Elements: float in [-1, 1], default dtype float32

  • Binding: element-wise multiplication (self-inverse)

  • Bundling: element-wise addition with threshold/cut

  • Similarity: cosine

  • Unbind: yes

  • Best for: general-purpose, well-studied, good continuous capacity

MAP_I; Integer MAP
  • Elements: int {-1, +1}, default dtype int32

  • Same operations as MAP_C but in integer arithmetic

  • Requires a bit-output or word-output generator

  • Unbind: yes

MAP_I_Bits; Fixed-Width Integer MAP
  • Like MAP_I but the mask parameter sets a custom integer bit width

  • Useful for hardware implementations targeting specific word sizes

MAP_B; Binary MAP
  • Elements: binary {0, 1}, default dtype int8

  • Dense binary variant; binding is element-wise product; bundling clips to {0, 1}

  • Unbind: yes

HRR family

Holographic Reduced Representations use dense continuous vectors with circular convolution binding.

HRR; Holographic Reduced Representation
  • Elements: normal float, default dtype float32

  • Binding: circular convolution (implemented via FFT)

  • Unbinding: circular correlation

  • Bundling: element-wise addition then L2 normalisation

  • Similarity: cosine

  • Best for: large capacity requirements

HRR_NoNorm
  • Like HRR but bundling does not normalise the result

  • Vector magnitude grows with the number of bundled items

HRR_ConstNorm
  • Bundling normalises by \(\sqrt{M}\) where \(M\) is the number of bundled vectors, keeping constant norm regardless of bundle size

FHRR; Fourier HRR
  • Elements: angles in [0, 2π], stored as float32

  • Binding: element-wise angle addition (modular arithmetic)

  • Unbinding: element-wise angle subtraction

  • Bundling: compute resultant angle of summed phasors

  • Similarity: cosine of element-wise angle difference

  • Best for: periodic signals, phase-based feature spaces

Matrix family

These encodings use matrix operations for binding, giving them stronger algebraic properties at the cost of additional storage.

VTB; Vector-derived Transformation Binding
  • Elements: normal float, dtype float32

  • Binding: constructs a matrix from the key vector and applies it to the value

  • Unbinding: transpose of the transformation matrix

  • Bundling: normalised addition (same as HRR)

MBAT; Matrix Binding of Additive Terms
  • Elements: normal float, dtype float32

  • Binding: multiply by a random matrix; the matrix is stored in metadata

  • Unbinding: multiply by the matrix inverse (retrieved from get_metadata())

  • Important: you must preserve the metadata dict from the binding result to perform unbinding later

Binary family

BSC; Binary Spatter Code
  • Elements: binary {0, 1}, default dtype int8

  • Dense (Bernoulli p = 0.5)

  • Binding: XOR (exactly self-inverse)

  • Bundling: majority-vote threshold (element is 1 if > half the inputs are 1)

  • Similarity: Hamming distance (remapped to [-1, 1])

  • Unbind: yes (exact, not approximate)

  • Best for: hardware efficiency and situations where exact unbinding is needed

Sparse Binary (BSDC) family

All BSDC variants share:

  • Elements: sparse binary {0, 1}, dtype int8

  • Initial density \(\approx\) 1-5% (controlled by BernoulliSparse element generator)

  • Bundling: bitwise OR (with or without thinning)

  • Similarity: Overlap (remapped to [-1, 1])

BSDC_CDT; Context-Dependent Thinning
  • Binding: Additive context-dependent thinning

  • Unbind: not supported (thinning is not invertible)

  • Bundling: OR only (no thinning); density grows with each bundle step

BSDC_S; Sparse with Shifting
  • Binding: circular shift by one position per bind step

  • Unbind: yes (inverse shift)

  • Bundling: OR only; density grows

BSDC_SEG; Sparse Segmented
  • Like BSDC_S but shift is applied per-segment of the vector

  • Useful for segment-wise positional encoding

BSDC_THIN; Sparse with Thinning (v1.1.0)
  • Binding: circular shift (same as BSDC_S)

  • Unbind: yes

  • Bundling: OR followed by random thinning to maintain target density

  • Best for: applications requiring many bundle steps without density saturation