How to Bundle Hypervectors
Bundling combines multiple hypervectors into one that is similar to all inputs. Four equivalent ways to bundle are provided.
Four bundling methods
1. Instance method: bundle one hypervector with others:
import pyhdc
enc = pyhdc.MAP_C(dimension=10_000)
a, b, c = enc.generate(), enc.generate(), enc.generate()
result = a.bundle(b, c) # result is similar to a, b, and c
2. Encoding method: bundle via the encoding object:
result = enc.bundle(a, b, c)
You can also pass a list:
hvs = [enc.generate() for _ in range(10)]
result = enc.bundle(*hvs)
3. Convenience function: module-level shortcut:
result = pyhdc.bundle(a, b, c)
4. Batched bundling: bundle multiple groups at once:
# Bundle [[a,b], [c,d]] -> returns [bundle(a,b), bundle(c,d)]
results = enc.bundle([a, b], [c, d]) # list of two Hypervectors
Collapse a whole batch to one prototype
A batch of N hypervectors is a (D, N) array, where each column is one
hypervector. Passing such a batch to bundle folds over the N columns and
returns a single (D,) prototype:
import pyhdc
enc = pyhdc.MAP_C(dimension=10_000)
batch = enc.generate(size=(10_000, 50)) # 50 vectors as columns (shape: (10000, 50))
# collapse the 50 columns into one prototype
result = enc.bundle(batch)
print(result.shape) # (10000,)
Choose which axis to collapse
A higher-rank batch is a (D, N, M) tensor, where axis 0 is the dimension
D and each trailing-axis column is one hypervector. The axis keyword
selects which batch axis bundle folds over.
Default is the last batch axis. With axis=None (the default), bundle
reduces the last axis. A (D, N) batch still collapses to (D,), and a
(D, N, M) tensor collapses to (D, N):
import pyhdc
enc = pyhdc.MAP_C(dimension=10_000)
tensor = enc.generate(size=(10_000, 8, 5)) # shape: (10000, 8, 5)
# default axis=None reduces the last axis (axis 2)
result = enc.bundle(tensor)
print(result.shape) # (10000, 8)
Pass an explicit axis to collapse a different batch axis. Axis 0 is the
hypervector dimension and is never reducible, passing axis=0 raises
ValueError. Negative indices are normalized the usual way:
# reduce axis 1, keeping axis 2 as the remaining batch
result = enc.bundle(tensor, axis=1)
print(result.shape) # (10000, 5)
A tuple of axes folds several batch axes at once. This applies to the additive, element-wise bundlers (the MAP, HRR, FHRR addition variants and the BSDC bitwise-OR disjunction, BSDC_S/SEG/CDT), and it operates on a single batched tensor. BSDC_THIN (thinned OR) reduces a single axis only:
# reduce axes 1 and 2 together -> one prototype
result = enc.bundle(tensor, axis=(1, 2))
print(result.shape) # (10000,)
Get per-group results with axis
axis reduces the selected batch axis in place and returns a single
Hypervector. To bundle each group of a 3-D batch separately, reduce the
other batch axis and read the result columns, column j is group j’s
bundle:
single = enc.bundle(tensor, axis=2) # one Hypervector, shape (10000, 8)
groups = enc.bundle(tensor, axis=1) # one Hypervector, shape (10000, 5)
# groups[:, j] is the bundle of group j
The older batch_dim= keyword split a 3-D array along a dimension and returned
a Python list of hypervectors. It is deprecated as of 2.1.0, emits a
DeprecationWarning, and will be removed. axis= returns the same content as
one tensor. Passing both axis and batch_dim raises ValueError.
Zero vector as the bundle identity
The zero hypervector acts as the additive identity for bundling; bundling anything with zero leaves it unchanged:
zero = enc.zeros()
result = enc.bundle(a, zero)
print(result.similarity(a)) # ~= 1.0
This is useful when building bundles iteratively:
accumulator = enc.zeros()
for hv in hvs:
accumulator = enc.bundle(accumulator, hv)
Capacity limits
Bundling is lossy: each bundle adds noise to every component. The more hypervectors you bundle, the harder it is to distinguish individual members via similarity.
Approximate rule of thumb: bundling more than \(O(N \times ln(M))\) vectors into a single hypervector of dimension \(D\) causes the similarity to each component to drop below a useful threshold.
Encoding-specific notes
Different encoding families use different bundling implementations, but the interface is always the same:
Encoding |
Bundling behaviour |
|---|---|
MAP_C |
Element-wise addition then clip to [-1,1]; ties broken randomly ( |
MAP_I |
Element-wise addition (plain sum), near-zero-sum ties broken randomly ( |
MAP_B |
Element-wise addition then sign threshold (majority vote) to {-1, +1} |
HRR |
Element-wise addition followed by L2 normalisation |
HRR_NoNorm |
Element-wise addition without normalisation (vectors grow in magnitude) |
FHRR |
Sum phasors, extract resultant angle |
BSC |
Majority-vote threshold: each element is 1 if more than half the inputs are 1 |
BSDC family |
Bitwise OR. BSDC_THIN applies random thinning after OR to maintain density |