Bundling Operations

Bundling combines multiple hypervectors into one that is similar to all inputs.

All bundling operations are available in pyhdc.components.bundling.

Reducing over a batch axis

Bundling reduces a batch of hypervectors along one axis and returns the combined result. Axis 0 is always the hypervector dimension \(D\) and is never a reduce axis, reduction happens over the batch axes (axis 1 and higher).

The axis keyword on bundle() selects which batch axis to collapse. The default is the last axis. For a (D, N) batch this reduces axis 1 and returns a single (D,) hypervector, matching the 2.0 behaviour. For a (D, N, M) tensor it reduces axis 2 and returns (D, N):

import pyhdc

batch = pyhdc.MAP_I().generate(size=(10000, 8))      # (D, N)
combined = pyhdc.bundle(batch)                       # (D,) over axis 1

tensor = pyhdc.MAP_I().generate(size=(10000, 8, 4))  # (D, N, M)
per_column = pyhdc.bundle(tensor)                    # (D, N) over axis 2
over_axis1 = pyhdc.bundle(tensor, axis=1)            # (D, M)

Passing axis=0 raises ValueError because axis 0 is the dimension and cannot be reduced.

Tuple of axes. The additive, element-wise bundlers (ElementAddition and its variants, AnglesOfElementAddition, and Disjunction) accept a tuple of axes and fold them together in one reduction. DisjunctionThinned (BSDC_THIN) reduces a single axis only, because its thinning is per column. A tuple applies to a single batched tensor, collapsing axes 1 and 2 of a (D, N, M) tensor yields a single (D,) hypervector:

tensor = pyhdc.MAP_I().generate(size=(10000, 8, 4))  # (D, N, M)
flat = pyhdc.bundle(tensor, axis=(1, 2))             # (D,)

Bundling multiple separate operands requires (D,) or (D, N) inputs, an operand with three or more axes raises ValueError.

ElementAddition

Used by: MAP_I (HRR_NoNorm internally)

Simple element-wise sum with no normalisation:

\[(\bigoplus_k \mathbf{v}_k)_i = \sum_k v_{k,i}\]

The result’s magnitude grows with the number of bundled vectors. Similarity decreases slightly with each additional vector.

ElementAdditionCut

Used by: MAP_C

Element-wise sum followed by clipping each element back into the valid range:

\[(\bigoplus \mathbf{v})_i = \text{clip}\!\left(\sum_k v_{k,i},\; -1,\; 1\right)\]

ElementAdditionBits

Used by: MAP_I_Bits

Element-wise sum in a wide (int64) accumulator, then a single saturating clip to the int32 range. The clip happens once after the full reduction, not per addition, so a running sum that would overflow mid-accumulation saturates at the bounds instead of wrapping.

ElementAdditionBinaryThreshold

Used by: BSC

Element-wise sum followed by a majority-vote threshold: each output element is 1 if more than half the input elements at that position are 1, otherwise 0.

For an odd number of inputs, this is deterministic. For an even number, ties at exactly 0.5 are resolved randomly.

\[\begin{split}(\bigoplus \mathbf{v})_i = \begin{cases} 1 & \text{if } \sum_k v_{k,i} > n/2 \\ 0 & \text{if } \sum_k v_{k,i} < n/2 \\ \text{rand}(\{0,1\}) & \text{if } \sum_k v_{k,i} = n/2 \end{cases}\end{split}\]

Randomized-bundling metadata. The tie-randomizing bundlers report how many coordinates were resolved by a coin flip through random_zone_count. The type follows the result shape: for a (D,) result it is a Python int (the number of tie coordinates in that single vector). For a batched result it is a per-output-vector count array, one entry for each bundled output. Because the value at each tie coordinate is drawn at random, batched bundling that resolves ties has no fixed-seed guarantee, use axis= for the reproducible vectorized form.

ElementAdditionBipolarThreshold

Used by: MAP_B

Element-wise sum followed by a sign function, remapping to {-1, +1}.

ElementAdditionNormalized

Used by: HRR, VTB, MBAT

Element-wise sum followed by L2 normalisation:

\[\bigoplus \mathbf{v} = \frac{\sum_k \mathbf{v}_k}{\left\|\sum_k \mathbf{v}_k\right\|_2}\]

The result is always a unit vector, which preserves the geometric properties needed for cosine similarity to work reliably.

ElementAdditionConstantNormalized

Used by: HRR_ConstNorm

Divides by \(\sqrt{M}\) where \(M\) is the number of bundled vectors:

\[\bigoplus_M \mathbf{v} = \frac{\sum_k \mathbf{v}_k}{\sqrt{M}}\]

This normalises the expected magnitude rather than the actual magnitude, which gives a different noise profile than L2 normalisation.

AnglesOfElementAddition

Used by: FHRR

For angle-valued vectors, bundling sums the phasors and extracts the angle of the resultant:

\[(\bigoplus \mathbf{v})_i = \arg\!\left(\sum_k e^{j v_{k,i}}\right)\]

This is the circular mean of a set of angles: appropriate when values are periodic (e.g., directions, phases).

Disjunction (bitwise OR)

Used by: BSDC_CDT, BSDC_S, BSDC_SEG

Element-wise bitwise OR of binary vectors:

\[(\bigoplus \mathbf{v})_i = \bigvee_k v_{k,i}\]

OR can only turn bits on, never off, so density monotonically increases with each bundle step. After \(n\) steps with initial density \(\rho\):

\[\rho_n \approx 1 - (1 - \rho)^n\]

After 100 bundle steps from \(\rho_0 = 0.01\), density reaches \(1 - (0.99)^{100} \approx 0.63\). This makes all vectors indistinguishable. Use BSDC_THIN to avoid this problem.

DisjunctionThinned

Used by: BSDC_THIN

Bitwise OR followed by random thinning to maintain a target density:

  1. Compute element-wise OR

  2. Count bits that are 1 and compute actual density \(\rho_{\text{actual}}\)

  3. If \(\rho_{\text{actual}} > \rho_{\text{target}}\), randomly clear bits until density returns to \(\rho_{\text{target}}\)

\[p(\text{clear bit } i \mid v_i = 1) = 1 - \frac{\rho_{\text{target}}}{\rho_{\text{actual}}}\]

This keeps density stable at the initial level regardless of how many bundle steps are performed.

The density parameter of DisjunctionThinned is supplied by the BSDC_THIN encoding from its own density constructor argument (default 0.5).