How to Choose the Right Encoding
PyHDC provides 15 encoding classes. The decision tree and notes below narrow the choice for most use cases.
Quick decision guide
Start here and follow the branches:
- Need GPU / PyTorch integration? → all encodings support both backends
- Need binary output (1-bit elements)?
  - Dense binary → BSC or MAP_B
  - Sparse binary (low density) → BSDC_THIN (best default) or BSDC_S / BSDC_SEG / BSDC_CDT
- Need complex / phase-based values? → FHRR
- Need matrix-style binding? → VTB or MBAT
- Otherwise (continuous float vectors)?
  - General purpose → MAP_C (best default)
  - Normalized bundling theory → HRR
  - Fixed-width integers → MAP_I or MAP_I_Bits
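The branches above can also be written as a small lookup function. This is only a sketch of the guide, not part of PyHDC: the function name and its keyword flags are hypothetical, and it returns class names as strings so that it runs without the library installed.

```python
def suggest_encoding(binary=False, sparse=False, phase=False,
                     matrix_binding=False, normalized_bundling=False,
                     fixed_width_int=False):
    """Return candidate encoding class names, suggested default first.

    Hypothetical helper: the flags are illustrative names, and the
    branch order simply mirrors the decision guide.
    """
    if binary:
        if sparse:
            # Sparse binary (low density)
            return ["BSDC_THIN", "BSDC_S", "BSDC_SEG", "BSDC_CDT"]
        # Dense binary
        return ["BSC", "MAP_B"]
    if phase:
        return ["FHRR"]
    if matrix_binding:
        return ["VTB", "MBAT"]
    if normalized_bundling:
        return ["HRR"]
    if fixed_width_int:
        return ["MAP_I", "MAP_I_Bits"]
    # General-purpose fallback
    return ["MAP_C"]
```

For instance, `suggest_encoding(binary=True, sparse=True)` returns the four BSDC variants with BSDC_THIN first, and calling it with no arguments returns `["MAP_C"]`.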
Full comparison table
| Encoding | Element type | Default dtype | Binding | Bundling | Similarity | Unbind | Notes |
|---|---|---|---|---|---|---|---|
| MAP_C | float [-1,1] | float32 | ElementMultiply | Add + cut | Cosine | Yes | Best all-round default |
|  | int {-1,1} | int32 | ElementMultiply | Add + cut | Cosine | Yes | Integer; needs bit generator |
|  | int (custom width) | int32 | ElementMultiply | Add (clipped) | Cosine | Yes |  |
| MAP_B | binary {0,1} | int8 | ElementMultiply | Add (clipped) | Cosine | Yes | Binary MAP |
| HRR | float (normal) | float32 | CircularConv | Add + normalise | Cosine | Yes | Theoretically clean |
|  | float (normal) | float32 | CircularConv | Add (no norm) | Cosine | Yes | Faster than HRR |
|  | float (normal) | float32 | CircularConv | Add / √M | Cosine | Yes | Constant-norm bundles |
| FHRR | angle [-π, π) | float32 | AngleAdd | AngleAdd | AngleDist | Yes | Phase/periodic signals |
| VTB | float (normal) | float32 | VDTransform | Add + normalise | Cosine | Yes | Matrix derived from key |
| MBAT | float (normal) | float32 | MatrixMult | Add + normalise | Cosine | Yes (+ metadata) | Random matrix; save |
| BSC | binary {0,1} | int8 | XOR | Majority vote | Hamming | Yes (exact) | Dense binary; XOR is exact inverse |
| BSDC_CDT | sparse {0,1} | int8 | CDThinning | OR | Overlap | No | Context-dependent thinning |
| BSDC_S | sparse {0,1} | int8 | CircShift | OR | Overlap | Yes | Shift-based; good for sequences |
| BSDC_SEG | sparse segmented | int8 | SegShift | OR | Overlap | Yes | Per-segment shift |
| BSDC_THIN | sparse {0,1} | int8 | CircShift | OR + thin | Overlap | Yes | Best sparse default (v1.1.0+) |
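The table's note that XOR is an exact inverse for dense binary vectors can be checked without the library. The sketch below uses only the Python standard library and stores each hypervector as a 10,000-bit integer. The helper names are hypothetical, and reporting similarity as the fraction of agreeing bits is an assumption made for this sketch; PyHDC's own Hamming similarity may be scaled differently.

```python
import random

DIM = 10_000
rng = random.Random(0)  # fixed seed so the run is repeatable

def random_hv():
    """Dense binary hypervector stored as a DIM-bit Python integer."""
    return rng.getrandbits(DIM)

def hamming_similarity(x, y):
    """Fraction of bit positions where x and y agree (1.0 = identical)."""
    return 1.0 - bin(x ^ y).count("1") / DIM

key, value = random_hv(), random_hv()
bound = key ^ value        # XOR binding
recovered = bound ^ key    # XOR with the same key undoes the binding

exact = recovered == value                           # True: XOR is self-inverse
sim_bound_value = hamming_similarity(bound, value)   # close to 0.5
print(exact, round(sim_bound_value, 2))
```

The bound vector agrees with `value` on roughly half of its bits, which is what two unrelated random binary vectors would do, while unbinding recovers `value` bit for bit.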
Side-by-side example
The encoding API is identical across families; only the constructor call changes:
    import pyhdc

    # The same calls work for every family; only the class changes.
    for EncClass in [pyhdc.MAP_C, pyhdc.HRR, pyhdc.BSC, pyhdc.BSDC_THIN]:
        enc = EncClass(dimension=10_000)
        a = enc.generate()
        b = enc.generate()
        c = a.bind(b)  # bind a with b
        print(f"{EncClass.__name__:12s} "
              f"sim(a,a)={a.similarity(a):.2f} "
              f"sim(a,bind(a,b))={a.similarity(c):.2f}")
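As a rough guide to what the two printed similarities mean, the following pure-Python sketch reproduces the same two comparisons for a MAP-style algebra (element-wise multiply binding, cosine similarity). It does not call PyHDC: the helper names are hypothetical, and it assumes bipolar ±1 elements for simplicity, so the library's actual output may differ in detail.

```python
import math
import random

DIM = 10_000
rng = random.Random(1)  # fixed seed so the run is repeatable

def random_bipolar():
    """Random vector with elements drawn from {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]

def bind(x, y):
    """Element-wise multiplication, the MAP-style binding."""
    return [xi * yi for xi, yi in zip(x, y)]

def cosine(x, y):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    return dot / (norm_x * norm_y)

a, b = random_bipolar(), random_bipolar()
c = bind(a, b)

sim_self = cosine(a, a)    # a vector is maximally similar to itself
sim_bound = cosine(a, c)   # binding yields a vector nearly orthogonal to a
recovered = bind(c, b)     # multiplying by b again cancels it (b*b = 1)
print(f"sim(a,a)={sim_self:.2f} sim(a,bind(a,b))={sim_bound:.2f}")
```

Under these assumptions, binding hides `a` from a direct similarity check, yet binding with `b` a second time recovers `a` exactly.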
Note
Choosing an encoding picks the algebra (bind / bundle / similarity). To turn raw
data (scalars, periodic values, or feature vectors) into hypervectors in that
encoding, use a data encoder. See How to Encode Data into Hypervectors. A few encoders are
restricted by family: Thermometer and Density need a discrete family
(MAP_I/MAP_B/BSC/BSDC), Projection needs a family with a normalize step
(MAP/HRR/VTB/MBAT/FHRR), and FractionalPower is defined only for the FHRR and
HRR families.
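The restrictions in the note can be kept as a small lookup for checking a combination before building a pipeline. This is a sketch rather than PyHDC API: the dictionary and function names are hypothetical, the family labels are copied literally from the note, and encoders not named there are assumed to be unrestricted.

```python
# Families each restricted data encoder accepts, transcribed from the note.
# Labels are compared literally, exactly as the note writes them.
ENCODER_FAMILIES = {
    "Thermometer": {"MAP_I", "MAP_B", "BSC", "BSDC"},
    "Density": {"MAP_I", "MAP_B", "BSC", "BSDC"},
    "Projection": {"MAP", "HRR", "VTB", "MBAT", "FHRR"},
    "FractionalPower": {"FHRR", "HRR"},
}

def is_supported(encoder, family):
    """Return True if the note allows this encoder/family pairing.

    Encoders missing from ENCODER_FAMILIES are treated as unrestricted
    (an assumption of this sketch).
    """
    allowed = ENCODER_FAMILIES.get(encoder)
    return allowed is None or family in allowed

print(is_supported("Thermometer", "BSC"))      # discrete family: allowed
print(is_supported("FractionalPower", "BSC"))  # only FHRR and HRR are listed
```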