embeddings-to-graph
convertEmbeddingsToUMAP()
function convertEmbeddingsToUMAP(embeddingsDict, options?): Promise<PlotDataPoint[]>
UMAP: Convert Embeddings to 2D or 3D Graph
UMAP (Uniform Manifold Approximation and Projection) is a dimensionality reduction technique that takes high-dimensional embeddings and converts them into lower-dimensional coordinates for visualization.
- Input: The process starts with high-dimensional embeddings, such as word embeddings, image feature vectors, or any other high-dimensional data representation.
- Dimensionality reduction: UMAP reduces the number of dimensions while trying to preserve the structure of the data, typically down to 2 or 3 dimensions for visualization.
- Topological approach: UMAP draws on topological data analysis and manifold learning. It builds a weighted nearest-neighbor graph of the data in the original high-dimensional space, then optimizes a low-dimensional layout whose graph is as similar to it as possible.
- Output: The result is a set of 2D or 3D coordinates, one per input embedding. These can be drawn as a scatter plot in which each point represents one original high-dimensional data point.
- Preservation of structure: UMAP aims to keep similar items close together in the lower-dimensional space. Local neighborhoods are preserved most faithfully; global structure is retained to a lesser degree, so distances between well-separated clusters should be read with caution.
- Visualization: The resulting UMAP coordinates can reveal clusters, patterns, and relationships that are hard to see in the original high-dimensional space.
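The graph-building step described above can be illustrated with a minimal sketch. This is not the implementation behind `convertEmbeddingsToUMAP()`; it only computes, for each key in a toy embeddings dictionary, its `k` closest neighbors by Euclidean distance, which is the kind of neighbor graph a UMAP-style algorithm starts from. The names `nearestNeighbors`, `euclideanDistance`, and `toyEmbeddings` are hypothetical.

```typescript
// Hypothetical helper types and functions, for illustration only.
type EmbeddingsDict = Record<string, number[]>;

// Euclidean distance between two vectors; missing components count as 0.
function euclideanDistance(a: number[], b: number[]): number {
  const squaredSum = a.reduce((sum, value, i) => sum + (value - (b[i] ?? 0)) ** 2, 0);
  return Math.sqrt(squaredSum);
}

// For every key, list the keys of its k closest neighbors (brute force, O(n^2)).
function nearestNeighbors(embeddings: EmbeddingsDict, k: number): Record<string, string[]> {
  const entries = Object.entries(embeddings);
  const graph: Record<string, string[]> = {};
  for (const [key, vector] of entries) {
    graph[key] = entries
      .filter(([otherKey]) => otherKey !== key)
      .map(([otherKey, otherVector]) => ({
        key: otherKey,
        distance: euclideanDistance(vector, otherVector),
      }))
      .sort((x, y) => x.distance - y.distance)
      .slice(0, k)
      .map((neighbor) => neighbor.key);
  }
  return graph;
}

// Toy 3-dimensional "embeddings" with two obvious groups.
const toyEmbeddings: EmbeddingsDict = {
  cat: [1.0, 0.9, 0.0],
  dog: [0.9, 1.0, 0.1],
  car: [0.0, 0.1, 1.0],
  truck: [0.1, 0.0, 0.9],
};

const neighborGraph = nearestNeighbors(toyEmbeddings, 1);
console.log(neighborGraph);
// { cat: [ 'dog' ], dog: [ 'cat' ], car: [ 'truck' ], truck: [ 'car' ] }
```

A full UMAP run additionally weights these edges and optimizes the low-dimensional layout; the brute-force search here is only practical for small inputs.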
Parameters
| Parameter | Type | Description |
|---|---|---|
| embeddingsDict | {} | The dictionary of embeddings. |
| options? | object | Optional UMAP settings, listed in the rows below. |
| | | [default=2] The number of dimensions for UMAP output. |
| | | [default=0.1] The minimum distance parameter for UMAP. |
| | | [default=15] The number of nearest neighbors for UMAP. |
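As a rough sketch of how the three defaults above could be applied when `options` is omitted or only partly filled in: the option names are not shown in the table, so `nComponents`, `minDist`, and `nNeighbors` below are assumptions borrowed from the naming used by the umap-js library, and `resolveUmapOptions` is a hypothetical helper rather than part of this package.

```typescript
// Assumed option names; only the default values (2, 0.1, 15) come from the table.
interface UmapOptionsSketch {
  nComponents?: number; // number of output dimensions
  minDist?: number;     // minimum distance between points in the layout
  nNeighbors?: number;  // number of nearest neighbors considered per point
}

// Hypothetical helper: fill in any option the caller left out.
function resolveUmapOptions(options: UmapOptionsSketch = {}): Required<UmapOptionsSketch> {
  return {
    nComponents: options.nComponents ?? 2,
    minDist: options.minDist ?? 0.1,
    nNeighbors: options.nNeighbors ?? 15,
  };
}

console.log(resolveUmapOptions());
// { nComponents: 2, minDist: 0.1, nNeighbors: 15 }
console.log(resolveUmapOptions({ nComponents: 3 }));
// { nComponents: 3, minDist: 0.1, nNeighbors: 15 }
```

Using `??` rather than object spread means an option explicitly passed as `undefined` still falls back to its default.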
Returns
Promise<PlotDataPoint[]>
An array of plot data points.
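The fields of `PlotDataPoint` are not listed here, so the following is only a sketch of how reduced coordinates might be paired with their dictionary keys for a scatter plot. The `PlotPointSketch` shape (`label`, `x`, `y`, optional `z`) and the `toPlotPoints` helper are assumptions made for illustration, not the library's actual type.

```typescript
// Assumed point shape; the real PlotDataPoint type may differ.
interface PlotPointSketch {
  label: string;
  x: number;
  y: number;
  z?: number; // only present for 3D output
}

// Hypothetical helper: pair each label with its 2D or 3D coordinate row.
function toPlotPoints(labels: string[], coordinates: number[][]): PlotPointSketch[] {
  if (labels.length !== coordinates.length) {
    throw new Error("labels and coordinates must have the same length");
  }
  return labels.map((label, index) => {
    const [x = 0, y = 0, z] = coordinates[index] ?? [];
    return z === undefined ? { label, x, y } : { label, x, y, z };
  });
}

// Made-up coordinates standing in for a reducer's 2D output.
const points2d = toPlotPoints(["cat", "dog"], [[0.5, 1.5], [0.75, 1.25]]);
console.log(points2d);
// [ { label: 'cat', x: 0.5, y: 1.5 }, { label: 'dog', x: 0.75, y: 1.25 } ]
```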