Memvid supports multiple embedding models for semantic (vector) search. You can use the built-in BGE-small model for local, offline operation, or connect to external providers such as OpenAI, Cohere, Voyage, or NVIDIA for higher-quality embeddings.

Overview

Embeddings convert text into dense numerical vectors that capture semantic meaning. Similar concepts produce similar vectors, enabling semantic search (finding documents by meaning rather than exact keywords).
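As a toy illustration of "similar vectors" (plain NumPy, not the Memvid API; the 3-dimensional vectors are made up, while real models produce the dimensions listed below):

import numpy as np

def cosine_similarity(a, b):
    # 1.0 = pointing the same way (similar meaning), near 0.0 = unrelated
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

cat = np.array([0.9, 0.1, 0.0])
kitten = np.array([0.8, 0.2, 0.1])
invoice = np.array([0.0, 0.1, 0.9])

print(cosine_similarity(cat, kitten))   # high score: semantically close
print(cosine_similarity(cat, invoice))  # low score: semantically distant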
| Provider | Model | Dimensions | Best For |
| --- | --- | --- | --- |
| Built-in | BGE-small-en-v1.5 | 384 | Offline, privacy-first |
| OpenAI | text-embedding-3-small | 1536 | General purpose |
| OpenAI | text-embedding-3-large | 3072 | Highest quality |
| Cohere | embed-english-v3.0 | 1024 | English documents |
| Cohere | embed-multilingual-v3.0 | 1024 | Multi-language |
| Voyage | voyage-3 | 1024 | Code and technical docs |
| Nomic | nomic-embed-text-v1.5 | 768 | Open-source alternative |

Built-in Model (Default)

By default, Memvid uses BGE-small-en-v1.5, a lightweight embedding model that runs locally without any API keys.

Characteristics

  • Dimensions: 384
  • Size: ~75 MB (downloaded on first use)
  • Inference: CPU-based, no GPU required
  • Privacy: All processing happens locally
  • Offline: Works without internet after the initial download (see the sketch below)
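For air-gapped use, a minimal sketch (assuming the MEMVID_MODELS_DIR and MEMVID_OFFLINE variables behave as listed in the Environment Variables table below, and that the model cache has been warmed once while online):

import os

# Pin the local model cache and forbid downloads (see Environment Variables below).
os.environ["MEMVID_MODELS_DIR"] = "/opt/memvid/models"
os.environ["MEMVID_OFFLINE"] = "1"

# Prerequisite: run once online (without MEMVID_OFFLINE) so BGE-small (~75 MB)
# is already present in the cache directory above.
from memvid_sdk import create

mem = create("knowledge.mv2", enable_vec=True, enable_lex=True)
mem.put("Doc", "kb", {}, text="Air-gapped content", enable_embedding=True)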

Usage

# CLI: Enable embeddings with built-in model
memvid put knowledge.mv2 --input document.pdf --embedding
# Python SDK
from memvid_sdk import create

mem = create("knowledge.mv2", enable_vec=True, enable_lex=True)
mem.put(
    "Document",                 # title
    "docs",                     # label
    {},                         # metadata
    text="Your content here",
    enable_embedding=True,
    embedding_model="bge-small",
)
// Node.js SDK
import { create } from '@memvid/sdk';

const mem = await create('knowledge.mv2');
await mem.put({ text: 'Your content here', title: 'Document', enableEmbedding: true });

OpenAI Embeddings

OpenAI’s embedding models offer excellent quality for general-purpose semantic search.

Setup

export OPENAI_API_KEY=sk-your-key-here

CLI Usage

# Use OpenAI for embeddings
memvid put knowledge.mv2 --input document.pdf --embedding -m openai-small

# Specify exact model
memvid put knowledge.mv2 --input docs/ --embedding -m openai-large

Python SDK

from memvid_sdk import create
from memvid_sdk.embeddings import OpenAIEmbeddings

# Initialize embedder
embedder = OpenAIEmbeddings(model='text-embedding-3-small')
print(f"Model: {embedder.model_name} ({embedder.dimension} dimensions)")

# Create memory with vector index
mem = create('knowledge.mv2', enable_vec=True, enable_lex=True)

# Store + embed in batch (vector index required for semantic search)
documents = [
    {"title": "Doc 1", "label": "kb", "text": "Machine learning fundamentals..."},
    {"title": "Doc 2", "label": "kb", "text": "Deep neural networks..."},
]
frame_ids = mem.put_many(documents, embedder=embedder)

# Search with query embedding
query = "How do neural networks work?"
results = mem.find(query, k=5, mode="sem", embedder=embedder)

Node.js SDK

import { create, OpenAIEmbeddings } from '@memvid/sdk';

// Initialize embedder (uses OPENAI_API_KEY env var)
const embedder = new OpenAIEmbeddings({ model: 'text-embedding-3-small' });
console.log(`Model: ${embedder.modelName} (${embedder.dimension} dimensions)`);

// Create memory
const mem = await create('knowledge.mv2');

// Store + embed in batch (vector index required for semantic search)
await mem.putMany(
  [
    { title: 'Doc 1', text: 'Machine learning fundamentals...' },
    { title: 'Doc 2', text: 'Deep neural networks...' },
  ],
  { embedder }
);
await mem.seal();

// Query using the same embedder (keeps dimensions consistent)
const results = await mem.find('How do neural networks work?', { mode: 'sem', k: 5, embedder });

Model Comparison

| Model | Dimensions | Cost | Quality |
| --- | --- | --- | --- |
| text-embedding-3-small | 1536 | $0.02/1M tokens | Good |
| text-embedding-3-large | 3072 | $0.13/1M tokens | Best |
| text-embedding-ada-002 | 1536 | $0.10/1M tokens | Legacy |

NVIDIA Embeddings

NVIDIA Integrate provides a fast hosted embedding API with OpenAI-compatible request and response shapes.

Setup

export NVIDIA_API_KEY=nvapi-your-key-here

Python SDK

from memvid_sdk import create
from memvid_sdk.embeddings import NvidiaEmbeddings

mem = create("knowledge.mv2", enable_vec=True, enable_lex=True)
embedder = NvidiaEmbeddings(model="nvidia/nv-embed-v1")  # uses NVIDIA_API_KEY

mem.put_many(
    [{"title": "Doc", "label": "kb", "text": "Vector search with NVIDIA embeddings."}],
    embedder=embedder,
)
res = mem.find("nvidia embeddings", mode="sem", embedder=embedder)

Node.js SDK

import { create, NvidiaEmbeddings } from '@memvid/sdk';

const mem = await create('knowledge.mv2');
const embedder = new NvidiaEmbeddings({ model: 'nvidia/nv-embed-v1' }); // uses NVIDIA_API_KEY
await mem.putMany([{ title: 'Doc', text: 'Vector search with NVIDIA embeddings.' }], { embedder });
const res = await mem.find('nvidia embeddings', { mode: 'sem', embedder });

Cohere Embeddings

Cohere offers specialized models for English and multilingual content.

Setup

export COHERE_API_KEY=your-key-here

Python SDK

from memvid_sdk.embeddings import CohereEmbeddings, get_embedder

# Direct initialization
embedder = CohereEmbeddings(model='embed-english-v3.0')

# Or use factory
embedder = get_embedder('cohere', model='embed-multilingual-v3.0')

# Generate embeddings
embeddings = embedder.embed_documents(['Text 1', 'Text 2'])
query_vec = embedder.embed_query('search query')
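To use a Cohere embedder end to end with a Memvid memory, the same put_many/find pattern shown for OpenAI above should carry over (a sketch; the multilingual model name comes from the table below):

from memvid_sdk import create
from memvid_sdk.embeddings import CohereEmbeddings

embedder = CohereEmbeddings(model='embed-multilingual-v3.0')
mem = create('knowledge.mv2', enable_vec=True, enable_lex=True)

mem.put_many(
    [
        {"title": "Guía", "label": "kb", "text": "Introducción al aprendizaje automático..."},
        {"title": "Guide", "label": "kb", "text": "An introduction to machine learning..."},
    ],
    embedder=embedder,
)

# Query in any supported language; keep using the same embedder at search time.
results = mem.find("¿Qué es el aprendizaje automático?", k=5, mode="sem", embedder=embedder)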

Node.js SDK

import { CohereEmbeddings, getEmbedder } from '@memvid/sdk';

// Direct initialization
const embedder = new CohereEmbeddings({ model: 'embed-english-v3.0' });

// Or use factory
const embedder2 = getEmbedder('cohere', { model: 'embed-multilingual-v3.0' });

const embeddings = await embedder.embedDocuments(['Text 1', 'Text 2']);

Model Options

| Model | Dimensions | Best For |
| --- | --- | --- |
| embed-english-v3.0 | 1024 | English documents |
| embed-multilingual-v3.0 | 1024 | 100+ languages |
| embed-english-light-v3.0 | 384 | Faster, lower cost |
| embed-multilingual-light-v3.0 | 384 | Multi-language, lighter |

Voyage Embeddings

Voyage AI specializes in embeddings for code and technical documentation.

Setup

export VOYAGE_API_KEY=your-key-here

Python SDK

from memvid_sdk.embeddings import VoyageEmbeddings

embedder = VoyageEmbeddings(model='voyage-3')
embeddings = embedder.embed_documents(['def hello(): pass', 'function hello() {}'])

Node.js SDK

import { VoyageEmbeddings } from '@memvid/sdk';

const embedder = new VoyageEmbeddings({ model: 'voyage-code-3' });
const embeddings = await embedder.embedDocuments(['def hello(): pass']);

Model Options

| Model | Dimensions | Best For |
| --- | --- | --- |
| voyage-3 | 1024 | General purpose |
| voyage-3-lite | 512 | Faster, smaller |
| voyage-code-3 | 1024 | Source code |

HuggingFace Embeddings (Python)

Use any HuggingFace sentence-transformer model locally.

Setup

pip install sentence-transformers

Usage

from memvid_sdk.embeddings import get_embedder

# Use any sentence-transformers model
embedder = get_embedder('huggingface', model='all-MiniLM-L6-v2')
print(f"Model: {embedder.model_name} ({embedder.dimension} dimensions)")

embeddings = embedder.embed_documents(['Text 1', 'Text 2'])

| Model | Dimensions | Size |
| --- | --- | --- |
| all-MiniLM-L6-v2 | 384 | 80 MB |
| all-mpnet-base-v2 | 768 | 420 MB |
| multi-qa-MiniLM-L6-cos-v1 | 384 | 80 MB |
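A local sentence-transformers embedder slots into the same ingestion and search workflow as the hosted providers, a sketch reusing the put_many/find pattern from the OpenAI example:

from memvid_sdk import create
from memvid_sdk.embeddings import get_embedder

embedder = get_embedder('huggingface', model='all-mpnet-base-v2')  # runs locally, no API key
mem = create('knowledge.mv2', enable_vec=True, enable_lex=True)

mem.put_many(
    [{"title": "Doc", "label": "kb", "text": "Transformers encode text into dense vectors..."}],
    embedder=embedder,
)
results = mem.find("how is text turned into vectors?", k=3, mode="sem", embedder=embedder)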

Using External Embeddings with Memvid

The key workflow for external embeddings:
  1. Pick an embedder (OpenAI/Cohere/Voyage/NVIDIA/etc.)
  2. Ingest with put_many(..., embedder=...) (stores embedding identity metadata)
  3. Query with find/ask(..., embedder=...) (keeps dimensions consistent)

Batch Ingestion Example

from memvid_sdk import create
from memvid_sdk.embeddings import OpenAIEmbeddings

# Setup
embedder = OpenAIEmbeddings()
mem = create('knowledge.mv2', enable_vec=True, enable_lex=True)

documents = [
    {"title": "Doc 1", "label": "research", "text": "Content 1..."},
    {"title": "Doc 2", "label": "research", "text": "Content 2..."},
]

# Ingest: each document is embedded with the chosen embedder
frame_ids = mem.put_many(documents, embedder=embedder)

# Query with the same embedder so dimensions stay consistent
query = "What is the main finding?"
results = mem.find(query, k=10, mode="sem", embedder=embedder)

Vector Compression

For large collections, enable vector compression to reduce storage by ~16x:
# CLI
memvid put knowledge.mv2 --input docs/ --embedding --vector-compression
# Python
from memvid_sdk import create

mem = create("knowledge.mv2", enable_vec=True, enable_lex=True)
mem.put("Doc", "kb", {}, text="...", enable_embedding=True, vector_compression=True)
This uses Product Quantization (PQ) to compress vectors while maintaining search quality.
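As a back-of-the-envelope check on the ~16x figure (illustrative parameters, not necessarily Memvid's exact PQ configuration):

# Rough storage math for Product Quantization (illustrative only).
dims = 1536                  # e.g. text-embedding-3-small
raw_bytes = dims * 4         # float32 storage: 6,144 bytes per vector
subvectors = dims // 4       # assume 4 dims per subvector, 1-byte code each
pq_bytes = subvectors        # 384 bytes per vector after quantization

print(raw_bytes / pq_bytes)  # 16.0 -> roughly the reduction quoted above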

Environment Variables

| Variable | Description |
| --- | --- |
| OPENAI_API_KEY | OpenAI API key |
| COHERE_API_KEY | Cohere API key |
| VOYAGE_API_KEY | Voyage AI API key |
| NVIDIA_API_KEY | NVIDIA Integrate API key |
| NVIDIA_BASE_URL | Optional NVIDIA Integrate base URL override |
| MEMVID_MODELS_DIR | Local model cache directory |
| MEMVID_OFFLINE=1 | Skip model downloads |

Choosing an Embedding Model

Decision Matrix

| Requirement | Recommended |
| --- | --- |
| Privacy/offline | Built-in BGE-small |
| Best quality | OpenAI text-embedding-3-large |
| Cost-effective | OpenAI text-embedding-3-small |
| Multi-language | Cohere embed-multilingual-v3.0 |
| Code/technical | Voyage voyage-code-3 |
| Open-source | HuggingFace all-mpnet-base-v2 |

Performance Considerations

  • Dimension count affects both storage size and search speed
  • API latency adds up with external providers; batch requests when possible (see the sketch below)
  • Rate limits vary by provider and plan
  • Consistency: use the same model for ingestion and search
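For the latency point above, a sketch of chunked ingestion (the batch size is illustrative; put_many and the embedder behave as shown in the earlier examples):

from memvid_sdk import create
from memvid_sdk.embeddings import OpenAIEmbeddings

embedder = OpenAIEmbeddings(model='text-embedding-3-small')
mem = create('knowledge.mv2', enable_vec=True, enable_lex=True)

documents = [
    {"title": f"Doc {i}", "label": "kb", "text": f"Content {i}..."} for i in range(1000)
]

BATCH_SIZE = 100  # illustrative; tune to your provider's rate limits
for start in range(0, len(documents), BATCH_SIZE):
    mem.put_many(documents[start:start + BATCH_SIZE], embedder=embedder)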

Reranking

Memvid can rerank retrieved candidates using a cross-encoder model (auto-downloaded on first use). In the CLI this is applied during ask and can be disabled:
memvid ask knowledge.mv2 --question "What is machine learning?" --mode hybrid --no-rerank
For find, reranking is handled internally; there is no --rerank flag.

Next Steps