Memvid provides powerful search capabilities combining traditional keyword search with modern semantic understanding.
## Search Modes

Memvid supports three search modes:

| Mode | Engine | Best For |
|------|--------|----------|
| `lex` | BM25 | Exact keywords, technical terms, names |
| `sem` | Vector search | Natural language, concepts, similarity |
| `auto` | Hybrid | General queries, best overall results |
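To see how the modes differ on your own data, here is a minimal sketch that runs one query through each mode (only documented flags are used; the file name and query are placeholders):

```bash
# Run the same query through each mode and compare the top hits
for mode in lex sem auto; do
  echo "== mode: $mode =="
  memvid find knowledge.mv2 --query "token expiry" --mode "$mode" --top-k 3
done
```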
## Basic Search

### The `find` Command

```bash
memvid find knowledge.mv2 --query "your search query"
```
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--query` | Search query string | Required |
| `--mode` | Search mode (`lex`, `sem`, `auto`) | `auto` |
| `--top-k` | Number of results | 8 |
| `--snippet-chars` | Context snippet length | 480 |
| `--json` | Output as JSON | false |
| `--scope` | Filter by URI prefix | All |
| `--uri` | Filter to specific URI | All |
| `--cursor` | Pagination cursor | None |
| `--query-embedding-model` | Override query embedding model (rare; auto-detects when possible) | Auto |
| `--adaptive` | Enable adaptive retrieval (dynamic top-k) | false |
| `--min-relevancy` | Adaptive cutoff threshold | 0.5 |
| `--max-k` | Adaptive max results | 100 |

Note: `-m`/`--embedding-model` is a global flag that selects the default embedding model (not the search mode). Use `--mode` for `lex`/`sem`/`auto`.
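Adaptive retrieval sizes the result set dynamically instead of using a fixed `--top-k`: per the table above, hits are cut off below `--min-relevancy` and capped at `--max-k`. A minimal sketch (file, query, and thresholds are placeholders):

```bash
# Return however many hits clear the 0.6 relevancy bar, up to 20
memvid find knowledge.mv2 --query "token refresh flow" \
  --adaptive --min-relevancy 0.6 --max-k 20
```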
### Time-Travel Options

| Option | Description |
|--------|-------------|
| `--as-of-frame ID` | Show results as of frame ID |
| `--as-of-ts TIMESTAMP` | Show results as of timestamp |
## Search Mode Examples

### Lexical Search

Best for exact matches and technical terms:

```bash
# Find exact keyword
memvid find knowledge.mv2 --query "WebAuthn" --mode lex

# Technical error codes
memvid find knowledge.mv2 --query "ERR_CONNECTION_REFUSED" --mode lex

# Function names
memvid find knowledge.mv2 --query "handleAuthentication" --mode lex

# Date range filtering
memvid find knowledge.mv2 --query "date:[2024-01-01 TO 2024-12-31]" --mode lex
```
### Semantic Search

Best for natural language and conceptual queries:

```bash
# Natural language question
memvid find knowledge.mv2 --query "how do users log in" --mode sem

# Conceptual search
memvid find knowledge.mv2 --query "best practices for security" --mode sem

# Find similar content
memvid find knowledge.mv2 --query "machine learning model training" --mode sem
```

Semantic (`sem`) and hybrid (`auto`) search require query embeddings. Memvid auto-detects the correct embedding runtime from the `.mv2` when vectors are present. Use `--query-embedding-model` (or the global `-m`/`--embedding-model`) only when you need to override it.
### Hybrid Search (Recommended)

Combines both approaches for the best results:

```bash
# General queries
memvid find knowledge.mv2 --query "authentication best practices" --mode auto

# Technical with context
memvid find knowledge.mv2 --query "OAuth2 implementation patterns" --mode auto
```
## Advanced Search

### Filtering Results

Filter by scope or specific URI:

```bash
# Search within a specific URI prefix
memvid find knowledge.mv2 --query "authentication" --scope "mv2://api/"

# Search a specific document
memvid find knowledge.mv2 --query "error handling" --uri "mv2://docs/errors.md"
```
### Limiting Results

```bash
# Get the top 5 results
memvid find knowledge.mv2 --query "performance optimization" --top-k 5

# Get the single best match
memvid find knowledge.mv2 --query "main entry point" --top-k 1

# Longer snippets
memvid find knowledge.mv2 --query "architecture" --snippet-chars 800
```
### JSON Output

For programmatic use:

```bash
memvid find knowledge.mv2 --query "database schema" --json
```

Output:

```json
{
  "query": "database schema",
  "elapsed_ms": 5,
  "engine": "hybrid",
  "total_hits": 12,
  "hits": [
    {
      "rank": 1,
      "frame_id": 124,
      "score": 0.892,
      "uri": "mv2://docs/database.md",
      "title": "Database Design",
      "text": "The schema defines the following tables...",
      "matches": 3,
      "range": [145, 290]
    }
  ],
  "next_cursor": "eyJvZmZzZXQiOjh9"
}
```
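Since the output is plain JSON, it pipes cleanly into `jq`. A small sketch using the field names shown above (the 0.5 score threshold is arbitrary):

```bash
# Keep only confident hits and print them as a ranked list
memvid find knowledge.mv2 --query "database schema" --json \
  | jq -r '.hits[] | select(.score > 0.5) | "\(.rank). \(.title) (\(.uri))"'
```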
For large result sets:

```bash
# First page
memvid find knowledge.mv2 --query "api" --top-k 10 --json

# Next page, using next_cursor from the previous response
memvid find knowledge.mv2 --query "api" --top-k 10 --cursor "eyJvZmZzZXQiOjEwfQ"
```
### Time-Travel Queries

View search results as they were at a specific point in time:

```bash
# Results as they were at frame 100
memvid find knowledge.mv2 --query "config" --as-of-frame 100

# Results as of a specific timestamp
memvid find knowledge.mv2 --query "config" --as-of-ts 1704067200
```
AI-Powered Q&A
The ask command retrieves relevant documents and synthesizes an answer using an LLM.
Basic Usage
# Ask a question with local Ollama model (recommended)
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model "ollama:qwen2.5:1.5b"
# Use cloud providers
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model openai
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model "gemini-2.0-flash"
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model claude
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model "nvidia:meta/llama3-8b-instruct"
### Model Options

| Model | Type | Description |
|-------|------|-------------|
| `ollama:qwen2.5:1.5b` | Local | Recommended: fast, private, no API costs |
| `ollama:qwen2.5:3b` | Local | Higher quality, needs more RAM |
| `ollama:phi3:mini` | Local | Good for reasoning tasks |
| `openai` | Cloud | Uses GPT-4o-mini (requires `OPENAI_API_KEY`) |
| `gemini-2.0-flash` | Cloud | Fast Gemini model (requires `GEMINI_API_KEY`) |
| `claude` | Cloud | Claude Sonnet (requires `ANTHROPIC_API_KEY`) |
| `nvidia:meta/llama3-8b-instruct` | Cloud | NVIDIA Integrate API (requires `NVIDIA_API_KEY`) |

For NVIDIA models, you can also set `NVIDIA_LLM_MODEL` and use `--use-model nvidia`.
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--question` | Question to answer | Required |
| `--use-model` | LLM model (see table above) | None |
| `--top-k` | Documents to retrieve | 8 |
| `--snippet-chars` | Context length per document | 480 |
| `--mode` | Retrieval mode (`lex`, `sem`, `hybrid`) | `hybrid` |
| `--context-only` | Return context without synthesis | false |
| `--mask-pii` | Mask PII before sending to LLM | false |
| `--llm-context-depth` | Override context budget | Auto |
| `--json` | Output as JSON | false |
| `--query-embedding-model` | Override query embedding model (rare; auto-detects when possible) | Auto |
| `--adaptive` | Enable adaptive retrieval (dynamic top-k) | false |
| `--min-relevancy` | Adaptive cutoff threshold | 0.5 |
| `--max-k` | Adaptive max results | 100 |
| `--adaptive-strategy` | Adaptive cutoff strategy | `relative` |
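The adaptive flags work for `ask` the same way they do for `find`: retrieval is cut off at `--min-relevancy` instead of a fixed `--top-k`. A minimal sketch (question and thresholds are placeholders):

```bash
memvid ask knowledge.mv2 \
  --question "How is caching configured?" \
  --adaptive --min-relevancy 0.6 --max-k 50 \
  --use-model "ollama:qwen2.5:1.5b"
```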
### Filtering Options

| Option | Description |
|--------|-------------|
| `--scope` | Filter by URI prefix |
| `--uri` | Filter to specific URI |
| `--start` | Start date filter |
| `--end` | End date filter |
| `--as-of-frame` | Time-travel to frame ID |
| `--as-of-ts` | Time-travel to timestamp |
### Examples

```bash
# Ask with a local Ollama model (recommended)
memvid ask knowledge.mv2 \
  --question "How do I configure authentication?" \
  --use-model "ollama:qwen2.5:1.5b"

# Ask with more context
memvid ask knowledge.mv2 \
  --question "Explain the architecture in detail" \
  --top-k 15 \
  --use-model "ollama:qwen2.5:3b"

# Get just the context, without LLM synthesis
memvid ask knowledge.mv2 \
  --question "What is the architecture?" \
  --context-only

# Mask sensitive data before sending to a cloud LLM
memvid ask knowledge.mv2 \
  --question "What are the contact details?" \
  --use-model openai \
  --mask-pii

# Filter to a specific date range
memvid ask knowledge.mv2 \
  --question "What happened in Q4?" \
  --start "2024-10-01" \
  --end "2024-12-31" \
  --use-model "ollama:qwen2.5:1.5b"

# JSON output with Gemini
memvid ask knowledge.mv2 \
  --question "Summarize the API" \
  --use-model "gemini-2.0-flash" \
  --json
```
### Multi-File Search

Search across multiple memory files:

```bash
# Search multiple files
memvid ask docs.mv2 code.mv2 notes.mv2 \
  --question "How does authentication work?"

# Using glob patterns
memvid ask ./memories/*.mv2 \
  --question "What are the main features?"
```
### JSON Output

```json
{
  "question": "What is the architecture?",
  "answer": "The architecture follows a layered design with...",
  "mode": "hybrid",
  "context_only": false,
  "hits": [
    {
      "rank": 1,
      "frame_id": 124,
      "uri": "mv2://docs/arch.md",
      "title": "Architecture Overview",
      "score": 0.92,
      "text": "The system consists of..."
    }
  ],
  "grounding": {
    "score": 0.85,
    "label": "HIGH",
    "sentence_count": 3,
    "grounded_sentences": 3,
    "has_warning": false
  },
  "follow_up": {
    "needed": false
  },
  "stats": {
    "retrieval_ms": 5,
    "synthesis_ms": 1200,
    "latency_ms": 1205
  }
}
```
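A common pattern is to surface the answer alongside the sources it drew on, for example for display in a UI. A sketch assuming the field names shown above:

```bash
# Print the answer, then a bulleted list of its sources
resp=$(memvid ask knowledge.mv2 \
  --question "What is the architecture?" \
  --use-model "ollama:qwen2.5:1.5b" --json)
echo "$resp" | jq -r '.answer'
echo "$resp" | jq -r '.hits[] | "- \(.title) (\(.uri))"'
```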
Grounding & Hallucination Detection
When using --json, the response includes a grounding object that measures how well the answer is supported by the retrieved context:
| Field | Description |
|---|
score | Grounding score from 0.0 to 1.0 |
label | Quality label: LOW, MEDIUM, or HIGH |
sentence_count | Number of sentences in the answer |
grounded_sentences | Sentences supported by context |
has_warning | True if answer may be hallucinated |
warning_reason | Explanation if warning is present |
# Check grounding quality
memvid ask knowledge.mv2 \
--question "What is the API endpoint?" \
--use-model openai \
--json | jq '.grounding'
Example output for low grounding (a potential hallucination):

```json
{
  "grounding": {
    "score": 0.15,
    "label": "LOW",
    "sentence_count": 2,
    "grounded_sentences": 0,
    "has_warning": true,
    "warning_reason": "Answer appears to be poorly grounded in context"
  },
  "follow_up": {
    "needed": true,
    "reason": "Answer may not be well-supported by the available context",
    "hint": "This memory contains information about different topics. Try asking about those instead.",
    "available_topics": ["API Reference", "Authentication", "Database Schema"],
    "suggestions": [
      "Tell me about API Reference",
      "Tell me about Authentication",
      "What topics are in this memory?"
    ]
  }
}
```

When `follow_up.needed` is true, the answer may not be reliable. Consider using the suggested follow-up questions or rephrasing your query; a scripted version of this check is sketched below.
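For scripted pipelines, you can gate on `follow_up.needed` directly. A hedged sketch using the field names from the examples above (`suggestions` is assumed to be present only when a follow-up is needed, hence the tolerant `[]?`):

```bash
# Print the answer only when it is well-grounded; otherwise surface suggestions
resp=$(memvid ask knowledge.mv2 \
  --question "What is the API endpoint?" \
  --use-model openai --json)
if [ "$(echo "$resp" | jq -r '.follow_up.needed')" = "true" ]; then
  echo "Low-confidence answer. Try instead:" >&2
  echo "$resp" | jq -r '.follow_up.suggestions[]? | "- \(.)"' >&2
else
  echo "$resp" | jq -r '.answer'
fi
```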
## Ground Truth Corrections

The `correct` command stores authoritative corrections that take priority in future retrievals. Use it to fix hallucinations or add verified facts.

### Synopsis

```bash
memvid correct <FILE> <STATEMENT> [OPTIONS]
```
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--source` | Attribution for the correction | None |
| `--topic` | Topics for retrieval matching (can repeat) | None |
| `--boost` | Retrieval priority boost factor | 2.0 |
### Examples

```bash
# Store a correction
memvid correct knowledge.mv2 "Ben Koenig reported to Chloe Nguyen before 2025"

# With source attribution
memvid correct knowledge.mv2 "The API rate limit is 1000 req/min" \
  --source "Engineering Team - Jan 2025"

# With topics for better retrieval
memvid correct knowledge.mv2 "OAuth tokens expire after 24 hours" \
  --topic "authentication" \
  --topic "OAuth" \
  --topic "tokens"

# Higher boost for critical corrections
memvid correct knowledge.mv2 "Production database is db.prod.example.com" \
  --boost 3.0
```
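Corrections often arrive in batches, for example after a review pass. A sketch that loads one verified statement per line from a hypothetical `corrections.txt`:

```bash
# Apply each line of corrections.txt as a correction with shared attribution
while IFS= read -r statement; do
  [ -z "$statement" ] && continue  # skip blank lines
  memvid correct knowledge.mv2 "$statement" \
    --source "Review pass - Jan 2025"
done < corrections.txt
```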
### Verification

After storing a correction, verify that it's retrievable:

```bash
# Search for the correction
memvid find knowledge.mv2 --query "Ben Koenig reported to"

# Ask a question that should use the correction
memvid ask knowledge.mv2 \
  --question "Who did Ben Koenig report to before 2025?" \
  --use-model openai
```

Corrections are stored with a `[Correction]` label and receive boosted retrieval scores, ensuring they appear prominently in search results.
## Vector Search

The `vec-search` command performs direct vector similarity search with pre-computed embeddings.

### Synopsis

```bash
memvid vec-search <FILE> [OPTIONS]
```
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--vector <CSV>` | CSV-formatted vector | None |
| `--embedding <PATH>` | Path to JSON file with embedding | None |
| `--limit <K>` | Number of results | 10 |
| `--json` | JSON output | false |
### Examples

```bash
# Search with a vector file
memvid vec-search project.mv2 --embedding ./query-vec.json --limit 5

# Search with an inline vector
memvid vec-search project.mv2 --vector "0.1,0.2,0.3,..." --limit 10
```
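`vec-search` expects you to bring your own query embedding, which must come from the same model family the `.mv2` was embedded with. A heavily hedged sketch that builds `query-vec.json` with Ollama's embeddings API; note that the bare-array file format and the `nomic-embed-text` model are assumptions here, not documented behavior:

```bash
# Embed the query text locally, then search with the resulting vector
curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "database schema design"}' \
  | jq '.embedding' > query-vec.json

memvid vec-search project.mv2 --embedding ./query-vec.json --limit 5
```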
## Temporal Queries

The `when` command resolves temporal phrases and lists matching frames.

### Synopsis

```bash
memvid when <FILE> --on <PHRASE> [OPTIONS]
```
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--on <PHRASE>` | Temporal phrase to resolve | Required |
| `--tz <ZONE>` | Timezone for phrases | America/Chicago |
| `--limit <N>` | Maximum frames | All |
| `--json` | JSON output | false |
### Examples

```bash
# Frames from "last Monday"
memvid when project.mv2 --on "last Monday"

# Frames from "yesterday"
memvid when project.mv2 --on "yesterday" --tz "America/New_York"

# Frames from "2 weeks ago"
memvid when project.mv2 --on "2 weeks ago" --limit 20
```
## Audit Reports

The `audit` command generates audit reports with full source provenance.

### Synopsis

```bash
memvid audit <FILE> <QUESTION> [OPTIONS]
```
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--out <PATH>`, `-o <PATH>` | Output file | stdout |
| `--format <FORMAT>` | Format: `text`, `markdown`, `json` | `text` |
| `--top-k <K>` | Sources to retrieve | 10 |
| `--snippet-chars <N>` | Max chars per snippet | 500 |
| `--mode <MODE>` | Retrieval mode | `hybrid` |
| `--scope <PREFIX>` | Scope filter | None |
| `--start <DATE>` | Start date | None |
| `--end <DATE>` | End date | None |
| `--use-model <MODEL>` | LLM for synthesis | None |
### Examples

```bash
# Generate an audit report
memvid audit project.mv2 "budget decisions" -o audit.md --format markdown

# JSON audit for compliance
memvid audit project.mv2 "data access" -o audit.json --format json

# With an LLM summary
memvid audit project.mv2 "key decisions" --use-model openai -o report.md
```
### Response (Markdown)

```markdown
# Audit Report: Budget Decisions

Generated: 2024-01-20T15:30:00Z
Query: "budget decisions"
Sources: 8

## Findings

### Source 1: Q4 Budget Meeting

- **URI**: file:///meeting-q4.txt
- **Date**: 2024-01-15
- **Relevance**: 0.95

> The team decided to increase the marketing budget by 15%...

### Source 2: Finance Review

- **URI**: file:///finance-review.pdf
- **Date**: 2024-01-18
- **Relevance**: 0.87

> Budget allocation approved with amendments...
```
## Real-World Examples

### Documentation Search

```bash
# Find installation instructions
memvid find docs.mv2 --query "how to install" --mode auto

# Find API endpoint documentation
memvid find docs.mv2 --query "POST /users endpoint" --mode lex

# Ask about configuration (local model: private)
memvid ask docs.mv2 \
  --question "What environment variables are required?" \
  --use-model "ollama:qwen2.5:1.5b"
```
### Codebase Search

```bash
# Find function implementations
memvid find code.mv2 --query "handleUserLogin" --mode lex

# Find error handling patterns
memvid find code.mv2 --query "try catch error handling" --mode auto

# Understand code architecture (local model keeps code private)
memvid ask code.mv2 \
  --question "How does the authentication flow work?" \
  --use-model "ollama:qwen2.5:1.5b"
```
### Research Search

```bash
# Find papers on a specific topic
memvid find papers.mv2 --query "transformer architecture attention" --mode sem

# Summarize findings (use a larger model for complex analysis)
memvid ask papers.mv2 \
  --question "What are the main approaches to reducing transformer inference cost?" \
  --top-k 15 \
  --use-model "ollama:qwen2.5:3b"
```
## Troubleshooting

### No Results Found

Solutions:

- Try a different search mode: `--mode sem` or `--mode lex`
- Broaden your query terms
- Check that documents have been ingested: `memvid stats knowledge.mv2`
- Verify the lexical index exists: `memvid doctor knowledge.mv2 --rebuild-lex-index`
### Low Relevance Scores

Solutions:

- Use semantic search for natural language queries
- Use lexical search for exact technical terms
- Add more context to your query
- Increase `--top-k` to see more results
### LLM Errors

Error: `Failed to contact LLM provider`

Solutions:

**Option 1: Use a local Ollama model (recommended)**

```bash
# Install Ollama
brew install ollama  # macOS
# or: curl -fsSL https://ollama.com/install.sh | sh  # Linux

# Start the Ollama server
ollama serve &

# Pull a model
ollama pull qwen2.5:1.5b

# Use with memvid
memvid ask knowledge.mv2 --question "..." --use-model "ollama:qwen2.5:1.5b"
```

**Option 2: Set API keys for cloud providers**

```bash
# OpenAI
export OPENAI_API_KEY=your-key
memvid ask knowledge.mv2 --question "..." --use-model openai

# Gemini
export GEMINI_API_KEY=your-key
memvid ask knowledge.mv2 --question "..." --use-model "gemini-2.0-flash"

# Anthropic
export ANTHROPIC_API_KEY=your-key
memvid ask knowledge.mv2 --question "..." --use-model claude
```
## Next Steps