Overview
Embeddings convert text into dense numerical vectors that capture semantic meaning. Similar concepts produce similar vectors, enabling semantic search (finding documents by meaning rather than exact keywords).

| Provider | Model | Dimensions | Best For |
|---|---|---|---|
| Built-in | BGE-small-en-v1.5 | 384 | Offline, privacy-first |
| OpenAI | text-embedding-3-small | 1536 | General purpose |
| OpenAI | text-embedding-3-large | 3072 | Highest quality |
| Cohere | embed-english-v3.0 | 1024 | English documents |
| Cohere | embed-multilingual-v3.0 | 1024 | Multi-language |
| Voyage | voyage-3 | 1024 | Code and technical docs |
| Nomic | nomic-embed-text-v1.5 | 768 | Open-source alternative |
Built-in Model (Default)
By default, Memvid uses BGE-small-en-v1.5, a lightweight embedding model that runs locally without any API keys.
Characteristics
- Dimensions: 384
- Size: ~75 MB (downloaded on first use)
- Inference: CPU-based, no GPU required
- Privacy: All processing happens locally
- Offline: Works without internet after initial download
Usage
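A minimal sketch of local usage. Only `put_many` and `find` (and their `embedder` parameter) are documented elsewhere in this guide, so the import path and constructor below are illustrative assumptions; omitting `embedder` falls back to the built-in model.

```python
# Illustrative sketch: the import path and constructor are assumptions;
# put_many/find are the calls documented in this guide.
from memvid import Memvid  # assumed entry point

mem = Memvid("notes.mv")  # assumed store handle

# No embedder argument: the built-in BGE-small-en-v1.5 (384 dims) runs locally.
mem.put_many([
    "Solar panels convert sunlight into electricity.",
    "Wind turbines generate power from moving air.",
])

hits = mem.find("renewable energy sources")  # the same local model embeds the query
```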
OpenAI Embeddings
OpenAI’s embedding models offer excellent quality for general-purpose semantic search.
Setup
Set the `OPENAI_API_KEY` environment variable (see the Environment Variables table below).
CLI Usage
Python SDK
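A minimal sketch using the standard `openai` package directly, to show what the embedder returns; Memvid's internal wiring may differ.

```python
# Standard openai-python (v1) embedding call; requires OPENAI_API_KEY.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
resp = client.embeddings.create(
    model="text-embedding-3-small",
    input=["What is semantic search?"],
)
vector = resp.data[0].embedding
print(len(vector))  # 1536 dimensions for text-embedding-3-small
```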
NVIDIA Embeddings
NVIDIA Integrate provides a fast hosted embedding API with OpenAI-compatible request and response shapes.
Setup
Set `NVIDIA_API_KEY`; set `NVIDIA_BASE_URL` only if you need to override the default endpoint (see the Environment Variables table below).
Python SDK
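A minimal sketch, assuming NVIDIA's usual Integrate endpoint and an example model id (both assumptions; check your model catalog). Because the API is OpenAI-compatible, the standard `openai` client works with a `base_url` override.

```python
# OpenAI-compatible client pointed at NVIDIA Integrate.
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ.get("NVIDIA_BASE_URL", "https://integrate.api.nvidia.com/v1"),
    api_key=os.environ["NVIDIA_API_KEY"],
)
resp = client.embeddings.create(
    model="nvidia/nv-embedqa-e5-v5",  # model id is an assumption; check your catalog
    input=["What is semantic search?"],
    extra_body={"input_type": "query"},  # some NVIDIA retrieval models require this
)
vector = resp.data[0].embedding
```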
Node.js SDK
Model Comparison
| Model | Dimensions | Cost | Quality |
|---|---|---|---|
| text-embedding-3-small | 1536 | $0.02/1M tokens | Good |
| text-embedding-3-large | 3072 | $0.13/1M tokens | Best |
| text-embedding-ada-002 | 1536 | $0.10/1M tokens | Legacy |
Cohere Embeddings
Cohere offers specialized models for English and multilingual content.
Setup
Set the `COHERE_API_KEY` environment variable (see the Environment Variables table below).
Python SDK
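A minimal sketch using the official `cohere` package directly; how Memvid wraps this is not shown here. Note that the v3 embed models require an `input_type`.

```python
# Official cohere package; v3 embed models require an input_type.
import os
import cohere

co = cohere.Client(os.environ["COHERE_API_KEY"])
resp = co.embed(
    texts=["What is semantic search?"],
    model="embed-english-v3.0",
    input_type="search_query",  # use "search_document" when ingesting
)
vector = resp.embeddings[0]  # 1024 floats
```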
Node.js SDK
Model Options
| Model | Dimensions | Best For |
|---|---|---|
| embed-english-v3.0 | 1024 | English documents |
| embed-multilingual-v3.0 | 1024 | 100+ languages |
| embed-english-light-v3.0 | 384 | Faster, lower cost |
| embed-multilingual-light-v3.0 | 384 | Multi-language, lighter |
Voyage Embeddings
Voyage AI specializes in embeddings for code and technical documentation.
Setup
Set the `VOYAGE_API_KEY` environment variable (see the Environment Variables table below).
Python SDK
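A minimal sketch using the official `voyageai` package directly; Memvid's own wiring may differ.

```python
# Official voyageai package; requires VOYAGE_API_KEY.
import os
import voyageai

vo = voyageai.Client(api_key=os.environ["VOYAGE_API_KEY"])
result = vo.embed(
    ["def binary_search(arr, target): ..."],
    model="voyage-3",
    input_type="document",  # use "query" at search time
)
vector = result.embeddings[0]  # 1024 floats
```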
Node.js SDK
Model Options
| Model | Dimensions | Best For |
|---|---|---|
| voyage-3 | 1024 | General purpose |
| voyage-3-lite | 512 | Faster, smaller |
| voyage-code-3 | 1024 | Source code |
HuggingFace Embeddings (Python)
Use any HuggingFace sentence-transformer model locally.
Setup
The example below assumes the standard `sentence-transformers` package is installed.
Usage
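A minimal local sketch using the `sentence-transformers` package directly; wiring the resulting model into Memvid is not shown here.

```python
# Local inference with sentence-transformers; the model downloads on first use.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
embeddings = model.encode([
    "Solar panels convert sunlight into electricity.",
    "Wind turbines generate power from moving air.",
])
print(embeddings.shape)  # (2, 384)
```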
Popular Models
| Model | Dimensions | Size |
|---|---|---|
| all-MiniLM-L6-v2 | 384 | 80 MB |
| all-mpnet-base-v2 | 768 | 420 MB |
| multi-qa-MiniLM-L6-cos-v1 | 384 | 80 MB |
Using External Embeddings with Memvid
The key workflow for external embeddings:

- Pick an embedder (OpenAI, Cohere, Voyage, NVIDIA, etc.)
- Ingest with `put_many(..., embedder=...)` (stores embedding identity metadata)
- Query with `find`/`ask(..., embedder=...)` (keeps dimensions consistent)
Batch Ingestion Example
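A minimal sketch, assuming the same illustrative `Memvid` handle as above and a `provider:model` string for `embedder` (the exact accepted format is an assumption); `put_many` and `find` with `embedder=` are the documented calls.

```python
# Illustrative batch ingestion; the embedder string format is an assumption.
from memvid import Memvid  # assumed entry point

mem = Memvid("docs.mv")  # assumed store handle
chunks = [f"Document chunk {i}" for i in range(1000)]

# Ingest in batches to stay within provider rate limits; the embedder identity
# is stored so later queries remain dimension-consistent.
BATCH = 100
for i in range(0, len(chunks), BATCH):
    mem.put_many(chunks[i:i + BATCH], embedder="openai:text-embedding-3-small")

# Query with the same embedder so query and stored vectors match dimensions.
hits = mem.find("chunk about documents", embedder="openai:text-embedding-3-small")
```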
Vector Compression
For large collections, enable vector compression to reduce storage by roughly 16x.
Environment Variables
| Variable | Description |
|---|---|
| OPENAI_API_KEY | OpenAI API key |
| COHERE_API_KEY | Cohere API key |
| VOYAGE_API_KEY | Voyage AI API key |
| NVIDIA_API_KEY | NVIDIA Integrate API key |
| NVIDIA_BASE_URL | Optional NVIDIA Integrate base URL override |
| MEMVID_MODELS_DIR | Local model cache directory |
| MEMVID_OFFLINE=1 | Skip model downloads |
Choosing an Embedding Model
Decision Matrix
| Requirement | Recommended |
|---|---|
| Privacy/offline | Built-in BGE-small |
| Best quality | OpenAI text-embedding-3-large |
| Cost-effective | OpenAI text-embedding-3-small |
| Multi-language | Cohere embed-multilingual-v3.0 |
| Code/technical | Voyage voyage-code-3 |
| Open-source | HuggingFace all-mpnet-base-v2 |
Performance Considerations
- Dimension count affects storage and search speed
- API latency for external providers (batch when possible)
- Rate limits vary by provider plan
- Consistency: use the same model for ingestion and search
Reranking
Memvid can rerank retrieved candidates using a cross-encoder model (auto-downloaded on first use). In the CLI, reranking is applied during `ask` and can be disabled. For `find`, reranking is handled internally; there is no `--rerank` flag.
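For intuition, the sketch below shows the cross-encoder technique using the `sentence-transformers` `CrossEncoder` class; it illustrates the idea, not Memvid's internal code, and the model name is a common public cross-encoder, not necessarily the one Memvid downloads.

```python
# Conceptual cross-encoder reranking (not Memvid's internal implementation).
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
query = "how do solar panels work"
candidates = [
    "Solar panels convert sunlight into electricity via the photovoltaic effect.",
    "Wind turbines generate power from moving air.",
]

# The cross-encoder scores each (query, candidate) pair jointly, which is
# slower than comparing precomputed embeddings but more accurate.
scores = reranker.predict([(query, c) for c in candidates])
ranked = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
```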