How to Wrap Existing Arrays as Hypervectors
enc.from_array() wraps a pre-existing NumPy array or PyTorch tensor as a
Hypervector. The typical use cases are loading saved codebooks
from disk and converting feature vectors from other libraries.
Basic usage
import pyhdc
import numpy as np
enc = pyhdc.MAP_C(dimension=10_000)
# Wrap a NumPy array
arr = np.random.uniform(-1, 1, size=10_000).astype(np.float32)
hv = enc.from_array(arr)
print(hv.shape) # (10000,)
print(hv.backend) # numpy
print(hv.encoding) # MAP_C instance
The array’s last dimension must match the encoding’s dimension; otherwise from_array raises DimensionsNotMatchingError:
bad_arr = np.zeros(5_000)
enc.from_array(bad_arr) # raises DimensionsNotMatchingError
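If you would rather check shapes before wrapping than catch the exception, a small guard is enough. The sketch below uses only NumPy; last_dim_matches is a hypothetical helper written for this example, not part of PyHDC.

```python
import numpy as np

def last_dim_matches(arr, dimension):
    """Return True if the last axis of arr has length `dimension`."""
    # 0-d arrays have no last axis, so they can never match
    return arr.ndim >= 1 and arr.shape[-1] == dimension

good = np.zeros(10_000, dtype=np.float32)
bad = np.zeros(5_000, dtype=np.float32)

print(last_dim_matches(good, 10_000))  # True
print(last_dim_matches(bad, 10_000))   # False
```

Because the helper only reads .ndim and .shape, it should also work on PyTorch tensors, which expose the same attributes.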
Load a saved codebook from disk
# Load a codebook that was saved as a NumPy .npy file
# Shape: (num_items, dimension)
data = np.load('codebook.npy') # shape (100, 10000)
enc = pyhdc.MAP_C(dimension=10_000)
# Wrap each row as its own Hypervector
codebook = [enc.from_array(row) for row in data]
# Find the index of the codebook entry most similar to a fresh query
query = enc.generate()
best_idx = max(range(len(codebook)), key=lambda i: query.similarity(codebook[i]))
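For large codebooks, the same nearest-entry lookup can be run on the raw matrix in one vectorized step before anything is wrapped. The sketch below is NumPy-only and assumes cosine similarity is the metric you care about; whether Hypervector.similarity computes the same quantity for MAP_C is not stated here, so treat it as an illustration rather than a drop-in replacement. The random matrix stands in for a loaded codebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for np.load('codebook.npy'): 100 items, 1000 dimensions
data = rng.uniform(-1, 1, size=(100, 1_000)).astype(np.float32)

# Query: a noisy copy of row 42
noise = 0.1 * rng.standard_normal(1_000).astype(np.float32)
query = data[42] + noise

# Cosine similarity of the query against every row at once
norms = np.linalg.norm(data, axis=1) * np.linalg.norm(query)
sims = (data @ query) / norms

best_idx = int(np.argmax(sims))
print(best_idx)  # 42
```

The list-based lookup makes one Python-level similarity call per entry, while this version does a single matrix-vector product.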
Wrap a PyTorch tensor
from_array auto-detects whether the input is a NumPy array or PyTorch
tensor:
import torch
t = torch.randn(10_000, dtype=torch.float32)
enc_torch = pyhdc.MAP_C(dimension=10_000, backend="torch")
hv = enc_torch.from_array(t)
print(hv.backend) # torch
Extract the underlying array
Access .data to get the raw NumPy array or PyTorch tensor back:
arr_back = hv.data # numpy.ndarray or torch.Tensor
You can use this to pass hypervectors to libraries that do not know about PyHDC, such as scikit-learn or matplotlib.
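Libraries such as scikit-learn generally expect a single 2-D array of shape (n_samples, n_features), so a list of hypervectors has to be stacked first. The sketch below assumes only that each hypervector exposes a NumPy array through .data, as described above. The SimpleNamespace objects are stand-ins so that the snippet runs without PyHDC installed.

```python
from types import SimpleNamespace

import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for hypervectors: any object with a `.data` NumPy array
hvs = [
    SimpleNamespace(data=rng.uniform(-1, 1, 10_000).astype(np.float32))
    for _ in range(5)
]

# Stack the raw arrays into one (n_samples, n_features) matrix
X = np.stack([hv.data for hv in hvs])
print(X.shape, X.dtype)  # (5, 10000) float32
```

With the torch backend, .data is a tensor, so convert it first, for example with hv.data.detach().cpu().numpy().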
Dtype notes
The dtype of the wrapped array should match what the encoding expects.
A mismatch generates a warning but does not raise an error. For example,
MAP_C expects float32; wrapping a float64 array still works but may
incur an implicit conversion.
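To avoid the mismatch, you can cast the array explicitly before wrapping it. This sketch uses only NumPy; the cast is ordinary ndarray.astype and nothing specific to PyHDC.

```python
import numpy as np

# NumPy produces float64 by default
arr64 = np.random.default_rng(0).uniform(-1, 1, size=10_000)

# copy=False skips the copy when the array is already float32
arr32 = arr64.astype(np.float32, copy=False)

print(arr64.dtype, arr32.dtype)  # float64 float32
```

Passing arr32 to enc.from_array then matches the float32 that MAP_C expects.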