📖 @allemandi/embed-utils
Fast, type-safe utilities for vector embedding comparison and search.
Works in Node.js and browsers, with support for ESM, CommonJS, and UMD.
✨ Features
- 🔍 Find nearest neighbors by cosine similarity, or Euclidean/Manhattan distance
- 📐 Compute, normalize, and verify vector similarity
- ⚡ Lightweight and fast vector operations
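
For readers unfamiliar with the three metrics named in the feature list, here is a minimal plain-JavaScript sketch of what each one computes. The helper names (`dot`, `cosineSimilarity`, `euclideanDistance`, `manhattanDistance`) are hypothetical and written only for illustration; they are not the library's exports, and the library's own implementation may differ.

```javascript
// Illustrative only: hypothetical helpers, not the library's own code.
// All functions assume both vectors have the same length.
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Cosine of the angle between a and b: 1 = same direction, 0 = orthogonal.
function cosineSimilarity(a, b) {
  const normA = Math.sqrt(dot(a, a));
  const normB = Math.sqrt(dot(b, b));
  return dot(a, b) / (normA * normB); // NaN if either vector is all zeros
}

// Straight-line (L2) distance between a and b.
function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

// Sum of absolute coordinate differences (L1 distance).
function manhattanDistance(a, b) {
  return a.reduce((sum, x, i) => sum + Math.abs(x - b[i]), 0);
}

console.log(cosineSimilarity([1, 0], [0, 1]));  // 0
console.log(euclideanDistance([0, 0], [3, 4])); // 5
console.log(manhattanDistance([0, 0], [3, 4])); // 7
```

Note the difference in direction: a higher cosine similarity means a closer match, while a lower distance means a closer match.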
🛠️ Installation
# Yarn
yarn add @allemandi/embed-utils
# NPM
npm install @allemandi/embed-utils
🚀 Quick Usage Examples
📘 For a complete list of methods and options, see the API docs.
ESM
import { computeCosineSimilarity } from '@allemandi/embed-utils';
CommonJS
const { findNearestNeighbors } = require('@allemandi/embed-utils');
const samples = [
{ embedding: [0.1, 0.2, 0.3], label: 'sports' },
{ embedding: [0.9, 0.8, 0.7], label: 'finance' },
{ embedding: [0.05, 0.1, 0.15], label: 'sports' },
];
const query = [0.09, 0.18, 0.27];
// Find top 2 neighbors with similarity ≥ 0.5
// (default method: cosine similarity)
const resultsCosine = findNearestNeighbors(query, samples, { topK: 2, threshold: 0.5 });
console.log(resultsCosine);
// [ { embedding: [0.1, 0.2, 0.3], label: "sports", similarityScore: 1 },
// { embedding: [0.05, 0.1, 0.15], label: "sports", similarityScore: 1 } ]
// Find top 3 neighbors with Euclidean distance ≤ 1.1
const resultsEuclidean = findNearestNeighbors(query, samples, {
topK: 3,
threshold: 1.1,
method: 'euclidean',
});
console.log(resultsEuclidean.length);
// 2 (only two samples fall within the distance threshold)
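
To see why the Euclidean call above returns two results even though `topK` is 3, the following sketch reproduces the filtering by hand. It assumes, based on the comments in the example, that the threshold acts as a maximum when a distance metric is used. `topKByEuclidean` is a hypothetical helper written for this illustration, not part of the package.

```javascript
// Hypothetical re-creation of the Euclidean example; not the library's code.
const samples = [
  { embedding: [0.1, 0.2, 0.3], label: 'sports' },
  { embedding: [0.9, 0.8, 0.7], label: 'finance' },
  { embedding: [0.05, 0.1, 0.15], label: 'sports' },
];
const query = [0.09, 0.18, 0.27];

function topKByEuclidean(queryVector, items, { topK, threshold }) {
  return items
    // Attach the distance from the query to each sample.
    .map((item) => ({
      ...item,
      distance: Math.hypot(...item.embedding.map((x, i) => x - queryVector[i])),
    }))
    // Assumed semantics: keep only samples within the maximum distance.
    .filter((item) => item.distance <= threshold)
    // Closest first, then cap at topK.
    .sort((a, b) => a.distance - b.distance)
    .slice(0, topK);
}

const nearest = topKByEuclidean(query, samples, { topK: 3, threshold: 1.1 });
console.log(nearest.map((item) => item.label)); // [ 'sports', 'sports' ]
```

The 'finance' sample sits roughly 1.107 away from the query, just outside the 1.1 threshold, so it is dropped before `topK` is applied.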
UMD (Browser)
<script src="https://unpkg.com/@allemandi/embed-utils"></script>
<script>
const vectorToNormalize = [3, 4];
const result = window.allemandi.embedUtils.normalizeVector(vectorToNormalize);
console.log(result);
</script>
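
As a rough guide to what to expect from the browser example, the sketch below shows L2 (unit-length) normalization in plain JavaScript. Whether `normalizeVector` uses exactly this norm and how it treats a zero vector are assumptions here; `l2Normalize` is a hypothetical stand-in, so check the API docs for the actual behavior.

```javascript
// Hypothetical stand-in for illustration; not the library's implementation.
function l2Normalize(vector) {
  const norm = Math.hypot(...vector); // Euclidean length of the vector
  if (norm === 0) return vector.slice(); // assumption: leave zero vectors unchanged
  return vector.map((x) => x / norm);
}

console.log(l2Normalize([3, 4])); // [ 0.6, 0.8 ]
```

The vector `[3, 4]` has length 5, so each component is divided by 5.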
🧪 Tests
The test suite is available in the GitHub repository only; it is not included in the published package.
# Run the test suite with Jest
yarn test
# or
npm test
🔗 Related Projects
Check out these related projects that might interest you:
- Node.js CLI tool for local text classification using word embeddings.
- A minimalist command-line knowledge system with semantic memory capabilities using vector embeddings for information retrieval.
🤝 Contributing
If you have ideas, improvements, or new features:
- Fork the project
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request