Function addEmbeddingVectorsToIndex

addEmbeddingVectorsToIndex(documentVectors, options?): Promise<HierarchicalNSW>
VSEARCH: Vector Similarity Embedding Approximation in RAM-Limited Cluster Hierarchy
1. Compile the C++ implementation of hnswlib-node or the NGT algorithm to WASM (JavaScript) for efficient similarity search.
2. Split the vector index with K-means into regional clusters, each sized to fit in RAM. This avoids the cost of popular vector engines that load all the vectors at once and therefore require costly servers with 100 GB of RAM.
3. Store the centroid vector of each cluster in a list in SQL, and export each cluster's binary-quantized data as a base64 string to SQL, S3, etc.
4. Search: embed the query, compare it to each cluster centroid to pick the top clusters, download the base64 strings for those clusters, load each into WASM, find the top neighbors per cluster, and merge the results sorted by distance.
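The routing and merging parts of step 4 can be sketched in plain JavaScript. This is a minimal illustration, not the module's implementation: the helper names (`squaredDistance`, `pickTopClusters`, `mergeResults`), the `{ id, vector }` centroid shape, and the `{ label, distance }` result shape are all assumptions made for the example, and squared Euclidean distance stands in for whatever metric the index actually uses. Downloading the base64 data and searching inside WASM are left out.

```javascript
// Hypothetical helpers for illustration; none of these names come from the library.

// Squared Euclidean distance between two equal-length numeric vectors.
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Rank cluster centroids by distance to the query and return the ids
// of the closest `topClusters` clusters.
function pickTopClusters(queryVector, centroids, topClusters = 2) {
  return centroids
    .map((c) => ({ id: c.id, distance: squaredDistance(queryVector, c.vector) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, topClusters)
    .map((c) => c.id);
}

// Merge the neighbor lists returned by each searched cluster into a
// single list sorted by ascending distance, keeping the best `k`.
function mergeResults(perClusterResults, k) {
  return perClusterResults
    .flat()
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}

// Example data (assumed shapes): three centroids in 2-D.
const centroids = [
  { id: "a", vector: [0, 0] },
  { id: "b", vector: [10, 10] },
  { id: "c", vector: [1, 1] },
];

// Only the two clusters nearest the query would be downloaded and searched.
const chosen = pickTopClusters([0.2, 0.2], centroids, 2);

// Pretend each chosen cluster returned its own nearest neighbors.
const merged = mergeResults(
  [
    [{ label: 1, distance: 0.4 }, { label: 2, distance: 0.9 }],
    [{ label: 7, distance: 0.1 }],
  ],
  2
);
```

Searching only the nearest clusters trades some recall for memory: a true neighbor that sits in an unselected cluster is missed, so raising `topClusters` improves recall at the cost of more downloads and RAM.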
- NGT Algorithm
- NGT Cluster
- https://qdrant.tech/articles/memory-consumption/
- LanceDB
- USearch
- ANN Benchmarks
- Malkov et al. (2016)
Parameters
- documentVectors: string[]
  An array of document texts to be vectorized.
- Optional options: {
  numDimensions: number;
  maxElements: number;
  } = {}
  Optional parameters for vector generation and indexing.
  - numDimensions: number
    The length of each data point vector that will be indexed.
  - maxElements: number
    The maximum number of data points.
Returns Promise<HierarchicalNSW>
The created HNSW index.
- Defined in similarity/similarity-vector.js:144