Memvid provides powerful search capabilities combining traditional keyword search with modern semantic understanding.

Search Modes

Memvid supports three search modes:
Mode   Engine          Best For
lex    BM25            Exact keywords, technical terms, names
sem    Vector search   Natural language, concepts, similarity
auto   Hybrid          General queries, best overall results

The find Command

memvid find knowledge.mv2 --query "your search query"

Options

Option                    Description                                                         Default
--query                   Search query string                                                 Required
--mode                    Search mode (lex, sem, auto)                                        auto
--top-k                   Number of results                                                   8
--snippet-chars           Context snippet length                                              480
--json                    Output as JSON                                                      false
--scope                   Filter by URI prefix                                                All
--uri                     Filter to specific URI                                              All
--cursor                  Pagination cursor                                                   None
--query-embedding-model   Override query embedding model (rare; auto-detects when possible)   Auto
--adaptive                Enable adaptive retrieval (dynamic top-k)                           false
--min-relevancy           Adaptive cutoff threshold                                           0.5
--max-k                   Adaptive max results                                                100
-m/--embedding-model is a global flag that selects the default embedding model (not the search mode). Use --mode for lex/sem/auto.
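The adaptive flags can be combined so the engine decides how many results to return instead of a fixed --top-k; a small sketch using the options above (the threshold and cap values here are illustrative):
# Let adaptive retrieval decide the result count, dropping hits below the relevancy cutoff
memvid find knowledge.mv2 --query "rate limiting" --adaptive --min-relevancy 0.6 --max-k 50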

Time-Travel Options

Option                    Description
--as-of-frame <ID>        Show results as of frame ID
--as-of-ts <TIMESTAMP>    Show results as of timestamp

Search Mode Examples

Lexical search (--mode lex) is best for exact matches and technical terms:
# Find exact keyword
memvid find knowledge.mv2 --query "WebAuthn" --mode lex

# Technical error codes
memvid find knowledge.mv2 --query "ERR_CONNECTION_REFUSED" --mode lex

# Function names
memvid find knowledge.mv2 --query "handleAuthentication" --mode lex

# Date range filtering
memvid find knowledge.mv2 --query "date:[2024-01-01 TO 2024-12-31]" --mode lex
Semantic search (--mode sem) is best for natural language and conceptual queries:
# Natural language question
memvid find knowledge.mv2 --query "how do users log in" --mode sem

# Conceptual search
memvid find knowledge.mv2 --query "best practices for security" --mode sem

# Find similar content
memvid find knowledge.mv2 --query "machine learning model training" --mode sem
Semantic (sem) and hybrid (auto) search require query embeddings. Memvid auto-detects the correct embedding runtime from the .mv2 when vectors are present. Use --query-embedding-model (or global -m/--embedding-model) only when you need to override.
Hybrid search (--mode auto) combines both approaches for the best overall results:
# General queries
memvid find knowledge.mv2 --query "authentication best practices" --mode auto

# Technical with context
memvid find knowledge.mv2 --query "OAuth2 implementation patterns" --mode auto

Filtering Results

Filter by scope or specific URI:
# Search within specific URI prefix
memvid find knowledge.mv2 --query "authentication" --scope "mv2://api/"

# Search specific document
memvid find knowledge.mv2 --query "error handling" --uri "mv2://docs/errors.md"

Limiting Results

# Get top 5 results
memvid find knowledge.mv2 --query "performance optimization" --top-k 5

# Get single best match
memvid find knowledge.mv2 --query "main entry point" --top-k 1

# Longer snippets
memvid find knowledge.mv2 --query "architecture" --snippet-chars 800

JSON Output

For programmatic use:
memvid find knowledge.mv2 --query "database schema" --json
Output:
{
  "query": "database schema",
  "elapsed_ms": 5,
  "engine": "hybrid",
  "total_hits": 12,
  "hits": [
    {
      "rank": 1,
      "frame_id": 124,
      "score": 0.892,
      "uri": "mv2://docs/database.md",
      "title": "Database Design",
      "text": "The schema defines the following tables...",
      "matches": 3,
      "range": [145, 290]
    }
  ],
  "next_cursor": "eyJvZmZzZXQiOjh9"
}

Pagination

For large result sets:
# First page
memvid find knowledge.mv2 --query "api" --top-k 10 --json

# Next page using cursor from previous response
memvid find knowledge.mv2 --query "api" --top-k 10 --cursor "eyJvZmZzZXQiOjEwfQ"
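For scripted pagination, feed each response's next_cursor back into --cursor until no cursor is returned; a minimal shell sketch (assumes jq is available and that next_cursor is null or absent on the last page):
# Walk every page of results and print the matching URIs
cursor=""
while :; do
  if [ -z "$cursor" ]; then
    page=$(memvid find knowledge.mv2 --query "api" --top-k 10 --json)
  else
    page=$(memvid find knowledge.mv2 --query "api" --top-k 10 --json --cursor "$cursor")
  fi
  echo "$page" | jq -r '.hits[].uri'
  cursor=$(echo "$page" | jq -r '.next_cursor // empty')
  [ -z "$cursor" ] && break
done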

Time-Travel Queries

View search results at a specific point in time:
# Results as they were at frame 100
memvid find knowledge.mv2 --query "config" --as-of-frame 100

# Results as of a specific timestamp
memvid find knowledge.mv2 --query "config" --as-of-ts 1704067200

AI-Powered Q&A

The ask command retrieves relevant documents and synthesizes an answer using an LLM.

Basic Usage

# Ask a question with local Ollama model (recommended)
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model "ollama:qwen2.5:1.5b"

# Use cloud providers
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model openai
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model "gemini-2.0-flash"
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model claude
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model "nvidia:meta/llama3-8b-instruct"

Model Options

Model                            Type    Description
ollama:qwen2.5:1.5b              Local   Recommended - fast, private, no API costs
ollama:qwen2.5:3b                Local   Higher quality, needs more RAM
ollama:phi3:mini                 Local   Good for reasoning tasks
openai                           Cloud   Uses GPT-4o-mini (requires OPENAI_API_KEY)
gemini-2.0-flash                 Cloud   Fast Gemini model (requires GEMINI_API_KEY)
claude                           Cloud   Claude Sonnet (requires ANTHROPIC_API_KEY)
nvidia:meta/llama3-8b-instruct   Cloud   NVIDIA Integrate API (requires NVIDIA_API_KEY)
For NVIDIA models, you can also set NVIDIA_LLM_MODEL and use --use-model nvidia.
For local models, see Local Models with Ollama for setup instructions.
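If you would rather not repeat the full NVIDIA model string on every call, set the environment variable mentioned above and use the short form:
# Select the NVIDIA model once via environment, then use --use-model nvidia
export NVIDIA_API_KEY=your-key
export NVIDIA_LLM_MODEL="meta/llama3-8b-instruct"
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model nvidia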

Options

Option                    Description                                                         Default
--question                Question to answer                                                  Required
--use-model               LLM model (see table above)                                         None
--top-k                   Documents to retrieve                                               8
--snippet-chars           Context length per document                                         480
--mode                    Retrieval mode (lex, sem, hybrid)                                   hybrid
--context-only            Return context without synthesis                                    false
--mask-pii                Mask PII before sending to LLM                                      false
--llm-context-depth       Override context budget                                             Auto
--json                    Output as JSON                                                      false
--query-embedding-model   Override query embedding model (rare; auto-detects when possible)   Auto
--adaptive                Enable adaptive retrieval (dynamic top-k)                           false
--min-relevancy           Adaptive cutoff threshold                                           0.5
--max-k                   Adaptive max results                                                100
--adaptive-strategy       Adaptive cutoff strategy                                            relative

Filtering Options

Option          Description
--scope         Filter by URI prefix
--uri           Filter to specific URI
--start         Start date filter
--end           End date filter
--as-of-frame   Time-travel to frame ID
--as-of-ts      Time-travel to timestamp

Examples

# Ask with local Ollama model (recommended)
memvid ask knowledge.mv2 \
  --question "How do I configure authentication?" \
  --use-model "ollama:qwen2.5:1.5b"

# Ask with more context
memvid ask knowledge.mv2 \
  --question "Explain the architecture in detail" \
  --top-k 15 \
  --use-model "ollama:qwen2.5:3b"

# Get just the context without LLM synthesis
memvid ask knowledge.mv2 \
  --question "What is the architecture?" \
  --context-only

# Mask sensitive data before sending to cloud LLM
memvid ask knowledge.mv2 \
  --question "What are the contact details?" \
  --use-model openai \
  --mask-pii

# Filter to specific date range
memvid ask knowledge.mv2 \
  --question "What happened in Q4?" \
  --start "2024-10-01" \
  --end "2024-12-31" \
  --use-model "ollama:qwen2.5:1.5b"

# JSON output with Gemini
memvid ask knowledge.mv2 \
  --question "Summarize the API" \
  --use-model "gemini-2.0-flash" \
  --json
Search across multiple memory files:
# Search multiple files
memvid ask docs.mv2 code.mv2 notes.mv2 \
  --question "How does authentication work?"

# Using glob patterns
memvid ask ./memories/*.mv2 \
  --question "What are the main features?"
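The --scope and --uri filters listed above also apply to ask; a sketch that restricts retrieval to one part of the memory (the URI prefix here is illustrative):
# Only pull context from documents under mv2://api/
memvid ask knowledge.mv2 \
  --question "How is rate limiting configured?" \
  --scope "mv2://api/" \
  --use-model "ollama:qwen2.5:1.5b"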

JSON Output

{
  "question": "What is the architecture?",
  "answer": "The architecture follows a layered design with...",
  "mode": "hybrid",
  "context_only": false,
  "hits": [
    {
      "rank": 1,
      "frame_id": 124,
      "uri": "mv2://docs/arch.md",
      "title": "Architecture Overview",
      "score": 0.92,
      "text": "The system consists of..."
    }
  ],
  "grounding": {
    "score": 0.85,
    "label": "HIGH",
    "sentence_count": 3,
    "grounded_sentences": 3,
    "has_warning": false
  },
  "follow_up": {
    "needed": false
  },
  "stats": {
    "retrieval_ms": 5,
    "synthesis_ms": 1200,
    "latency_ms": 1205
  }
}

Grounding & Hallucination Detection

When using --json, the response includes a grounding object that measures how well the answer is supported by the retrieved context:
Field                Description
score                Grounding score from 0.0 to 1.0
label                Quality label: LOW, MEDIUM, or HIGH
sentence_count       Number of sentences in the answer
grounded_sentences   Sentences supported by context
has_warning          True if answer may be hallucinated
warning_reason       Explanation if warning is present
# Check grounding quality
memvid ask knowledge.mv2 \
  --question "What is the API endpoint?" \
  --use-model openai \
  --json | jq '.grounding'
Example output for low grounding (potential hallucination):
{
  "grounding": {
    "score": 0.15,
    "label": "LOW",
    "sentence_count": 2,
    "grounded_sentences": 0,
    "has_warning": true,
    "warning_reason": "Answer appears to be poorly grounded in context"
  },
  "follow_up": {
    "needed": true,
    "reason": "Answer may not be well-supported by the available context",
    "hint": "This memory contains information about different topics. Try asking about those instead.",
    "available_topics": ["API Reference", "Authentication", "Database Schema"],
    "suggestions": [
      "Tell me about API Reference",
      "Tell me about Authentication",
      "What topics are in this memory?"
    ]
  }
}
When follow_up.needed is true, the answer may not be reliable. Consider using the suggested follow-up questions or rephrasing your query.
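In scripts, these fields can gate whether an answer is trusted; a minimal jq sketch assuming the JSON shape shown above:
# Print the answer only when grounding carries no warning; otherwise show the suggestions
resp=$(memvid ask knowledge.mv2 --question "What is the API endpoint?" --use-model openai --json)
if [ "$(echo "$resp" | jq -r '.grounding.has_warning')" = "false" ]; then
  echo "$resp" | jq -r '.answer'
else
  echo "$resp" | jq -r '.follow_up.suggestions[]?'
fi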

Ground Truth Corrections

The correct command stores authoritative corrections that take priority in future retrievals. Use this to fix hallucinations or add verified facts.

Synopsis

memvid correct <FILE> <STATEMENT> [OPTIONS]

Options

Option     Description                                  Default
--source   Attribution for the correction               None
--topic    Topics for retrieval matching (can repeat)   None
--boost    Retrieval priority boost factor              2.0

Examples

# Store a correction
memvid correct knowledge.mv2 "Ben Koenig reported to Chloe Nguyen before 2025"

# With source attribution
memvid correct knowledge.mv2 "The API rate limit is 1000 req/min" \
  --source "Engineering Team - Jan 2025"

# With topics for better retrieval
memvid correct knowledge.mv2 "OAuth tokens expire after 24 hours" \
  --topic "authentication" \
  --topic "OAuth" \
  --topic "tokens"

# Higher boost for critical corrections
memvid correct knowledge.mv2 "Production database is db.prod.example.com" \
  --boost 3.0

Verification

After storing a correction, verify it’s retrievable:
# Search for the correction
memvid find knowledge.mv2 --query "Ben Koenig reported to"

# Ask a question that should use the correction
memvid ask knowledge.mv2 \
  --question "Who did Ben Koenig report to before 2025?" \
  --use-model openai
Corrections are stored with a [Correction] label and receive boosted retrieval scores, ensuring they appear prominently in search results.

Vector Search

The vec-search command performs direct vector similarity search with pre-computed embeddings.

Synopsis

memvid vec-search <FILE> [OPTIONS]

Options

Option               Description                        Default
--vector <CSV>       CSV-formatted vector               None
--embedding <PATH>   Path to JSON file with embedding   None
--limit <K>          Number of results                  10
--json               JSON output                        false

Examples

# Search with vector file
memvid vec-search project.mv2 --embedding ./query-vec.json --limit 5

# Search with inline vector
memvid vec-search project.mv2 --vector "0.1,0.2,0.3,..." --limit 10
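The --embedding file is JSON; assuming it holds a flat array of floats (check your version's docs for the exact schema), the same vector can be passed inline by converting it with jq:
# Convert a JSON array of floats (assumed file format) into the --vector CSV form
memvid vec-search project.mv2 --vector "$(jq -r 'join(",")' ./query-vec.json)" --limit 5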

Temporal Queries

The when command resolves temporal phrases and lists matching frames.

Synopsis

memvid when <FILE> --on <PHRASE> [OPTIONS]

Options

Option          Description                  Default
--on <PHRASE>   Temporal phrase to resolve   Required
--tz <ZONE>     Timezone for phrases         America/Chicago
--limit <N>     Maximum frames               All
--json          JSON output                  false

Examples

# Frames from "last Monday"
memvid when project.mv2 --on "last Monday"

# Frames from "yesterday"
memvid when project.mv2 --on "yesterday" --tz "America/New_York"

# Frames from "2 weeks ago"
memvid when project.mv2 --on "2 weeks ago" --limit 20

Audit Reports

The audit command generates audit reports with full source provenance.

Synopsis

memvid audit <FILE> <QUESTION> [OPTIONS]

Options

Option                    Description                    Default
--out <PATH>, -o <PATH>   Output file                    stdout
--format <FORMAT>         Format: text, markdown, json   text
--top-k <K>               Sources to retrieve            10
--snippet-chars <N>       Max chars per snippet          500
--mode <MODE>             Retrieval mode                 hybrid
--scope <PREFIX>          Scope filter                   None
--start <DATE>            Start date                     None
--end <DATE>              End date                       None
--use-model <MODEL>       LLM for synthesis              None

Examples

# Generate audit report
memvid audit project.mv2 "budget decisions" -o audit.md --format markdown

# JSON audit for compliance
memvid audit project.mv2 "data access" -o audit.json --format json

# With LLM summary
memvid audit project.mv2 "key decisions" --use-model openai -o report.md

Response (Markdown)

# Audit Report: Budget Decisions

Generated: 2024-01-20T15:30:00Z
Query: "budget decisions"
Sources: 8

## Findings

### Source 1: Q4 Budget Meeting
- **URI**: file:///meeting-q4.txt
- **Date**: 2024-01-15
- **Relevance**: 0.95

> The team decided to increase the marketing budget by 15%...

### Source 2: Finance Review
- **URI**: file:///finance-review.pdf
- **Date**: 2024-01-18
- **Relevance**: 0.87

> Budget allocation approved with amendments...

Real-World Examples

Searching documentation:
# Find installation instructions
memvid find docs.mv2 --query "how to install" --mode auto

# Find API endpoint documentation
memvid find docs.mv2 --query "POST /users endpoint" --mode lex

# Ask about configuration (local model - private)
memvid ask docs.mv2 \
  --question "What environment variables are required?" \
  --use-model "ollama:qwen2.5:1.5b"

Searching a codebase:
# Find function implementations
memvid find code.mv2 --query "handleUserLogin" --mode lex

# Find error handling patterns
memvid find code.mv2 --query "try catch error handling" --mode auto

# Understand code architecture (local model - keeps code private)
memvid ask code.mv2 \
  --question "How does the authentication flow work?" \
  --use-model "ollama:qwen2.5:1.5b"

Searching research papers:
# Find papers on specific topic
memvid find papers.mv2 --query "transformer architecture attention" --mode sem

# Summarize findings (use larger model for complex analysis)
memvid ask papers.mv2 \
  --question "What are the main approaches to reducing transformer inference cost?" \
  --top-k 15 \
  --use-model "ollama:qwen2.5:3b"

Troubleshooting

No Results Found

Solutions:
  • Try different search mode: --mode sem or --mode lex
  • Broaden your query terms
  • Check that documents have been ingested: memvid stats knowledge.mv2
  • Rebuild the lexical index if it is missing: memvid doctor knowledge.mv2 --rebuild-lex-index

Low Relevance Scores

Solutions:
  • Use semantic search for natural language queries
  • Use lexical search for exact technical terms
  • Add more context to your query
  • Increase --top-k to see more results

LLM Errors

Error: Failed to contact LLM provider
Solutions:

Option 1: Use a local Ollama model (recommended)
# Install Ollama
brew install ollama  # macOS
# or: curl -fsSL https://ollama.com/install.sh | sh  # Linux

# Start Ollama server
ollama serve &

# Pull a model
ollama pull qwen2.5:1.5b

# Use with memvid
memvid ask knowledge.mv2 --question "..." --use-model "ollama:qwen2.5:1.5b"
Option 2: Set API keys for cloud providers
# OpenAI
export OPENAI_API_KEY=your-key
memvid ask knowledge.mv2 --question "..." --use-model openai

# Gemini
export GEMINI_API_KEY=your-key
memvid ask knowledge.mv2 --question "..." --use-model "gemini-2.0-flash"

# Anthropic
export ANTHROPIC_API_KEY=your-key
memvid ask knowledge.mv2 --question "..." --use-model claude
See Local Models with Ollama for detailed setup instructions.

Next Steps