Memvid provides powerful search capabilities combining traditional keyword search with modern semantic understanding.
## Search Modes

Memvid supports three search modes:

| Mode | Engine | Best For |
|------|--------|----------|
| `lex` | BM25 | Exact keywords, technical terms, names |
| `sem` | Vector search | Natural language, concepts, similarity |
| `auto` | Hybrid | General queries, best overall results |
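To see how the modes differ on your own data, here is a minimal sketch that runs one query through each mode (only documented flags are used; the file name and query are placeholders):

```bash
# Run the same query through each mode and compare the top hits
for mode in lex sem auto; do
  echo "== mode: $mode =="
  memvid find knowledge.mv2 --query "token expiry" --mode "$mode" --top-k 3
done
```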
## Basic Search

### The `find` Command

```bash
memvid find knowledge.mv2 --query "your search query"
```
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--query` | Search query string | Required |
| `--mode` | Search mode (`lex`, `sem`, `auto`) | `auto` |
| `--top-k` | Number of results | 8 |
| `--snippet-chars` | Context snippet length | 480 |
| `--json` | Output as JSON | false |
| `--scope` | Filter by URI prefix | All |
| `--uri` | Filter to specific URI | All |
| `--cursor` | Pagination cursor | None |
| `--query-embedding-model` | Override query embedding model (rare; auto-detects when possible) | Auto |
| `--adaptive` | Enable adaptive retrieval (dynamic top-k) | false |
| `--min-relevancy` | Adaptive cutoff threshold | 0.5 |
| `--max-k` | Adaptive max results | 100 |

Note: `-m`/`--embedding-model` is a global flag that selects the default embedding model (not the search mode). Use `--mode` for `lex`/`sem`/`auto`.
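Adaptive retrieval sizes the result set dynamically instead of using a fixed `--top-k`: per the table above, hits are cut off below `--min-relevancy` and capped at `--max-k`. A minimal sketch (file, query, and thresholds are placeholders):

```bash
# Return however many hits clear the 0.6 relevancy bar, up to 20
memvid find knowledge.mv2 --query "token refresh flow" \
  --adaptive --min-relevancy 0.6 --max-k 20
```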
### Time-Travel Options

| Option | Description |
|--------|-------------|
| `--as-of-frame ID` | Show results as of frame ID |
| `--as-of-ts TIMESTAMP` | Show results as of timestamp |
## Search Mode Examples

### Lexical Search

Best for exact matches and technical terms:

```bash
# Find exact keyword
memvid find knowledge.mv2 --query "WebAuthn" --mode lex

# Technical error codes
memvid find knowledge.mv2 --query "ERR_CONNECTION_REFUSED" --mode lex

# Function names
memvid find knowledge.mv2 --query "handleAuthentication" --mode lex

# Date range filtering
memvid find knowledge.mv2 --query "date:[2024-01-01 TO 2024-12-31]" --mode lex
```
### Semantic Search

Best for natural language and conceptual queries:

```bash
# Natural language question
memvid find knowledge.mv2 --query "how do users log in" --mode sem

# Conceptual search
memvid find knowledge.mv2 --query "best practices for security" --mode sem

# Find similar content
memvid find knowledge.mv2 --query "machine learning model training" --mode sem
```

Semantic (`sem`) and hybrid (`auto`) search require query embeddings. Memvid auto-detects the correct embedding runtime from the `.mv2` when vectors are present. Use `--query-embedding-model` (or the global `-m`/`--embedding-model`) only when you need to override it.
### Hybrid Search (Recommended)

Combines both approaches for the best results:

```bash
# General queries
memvid find knowledge.mv2 --query "authentication best practices" --mode auto

# Technical with context
memvid find knowledge.mv2 --query "OAuth2 implementation patterns" --mode auto
```
## Advanced Search

### Filtering Results

Filter by scope or specific URI:

```bash
# Search within a specific URI prefix
memvid find knowledge.mv2 --query "authentication" --scope "mv2://api/"

# Search a specific document
memvid find knowledge.mv2 --query "error handling" --uri "mv2://docs/errors.md"
```
### Limiting Results

```bash
# Get the top 5 results
memvid find knowledge.mv2 --query "performance optimization" --top-k 5

# Get the single best match
memvid find knowledge.mv2 --query "main entry point" --top-k 1

# Longer snippets
memvid find knowledge.mv2 --query "architecture" --snippet-chars 800
```
### JSON Output

For programmatic use:

```bash
memvid find knowledge.mv2 --query "database schema" --json
```

Output:

```json
{
  "query": "database schema",
  "elapsed_ms": 5,
  "engine": "hybrid",
  "total_hits": 12,
  "hits": [
    {
      "rank": 1,
      "frame_id": 124,
      "score": 0.892,
      "uri": "mv2://docs/database.md",
      "title": "Database Design",
      "text": "The schema defines the following tables...",
      "matches": 3,
      "range": [145, 290]
    }
  ],
  "next_cursor": "eyJvZmZzZXQiOjh9"
}
```
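Since the output is plain JSON, it pipes cleanly into `jq`. A small sketch using the field names shown above (the 0.5 score threshold is arbitrary):

```bash
# Keep only confident hits and print them as a ranked list
memvid find knowledge.mv2 --query "database schema" --json \
  | jq -r '.hits[] | select(.score > 0.5) | "\(.rank). \(.title) (\(.uri))"'
```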
For large result sets:

```bash
# First page
memvid find knowledge.mv2 --query "api" --top-k 10 --json

# Next page, using next_cursor from the previous response
memvid find knowledge.mv2 --query "api" --top-k 10 --cursor "eyJvZmZzZXQiOjEwfQ"
```
### Time-Travel Queries

View search results as they were at a specific point in time:

```bash
# Results as they were at frame 100
memvid find knowledge.mv2 --query "config" --as-of-frame 100

# Results as of a specific timestamp
memvid find knowledge.mv2 --query "config" --as-of-ts 1704067200
```
AI-Powered Q&A
The ask command retrieves relevant documents and synthesizes an answer using an LLM.
Basic Usage
# Ask a question with local Ollama model (recommended)
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model "ollama:qwen2.5:1.5b"
# Use cloud providers
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model openai
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model "gemini-2.0-flash"
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model claude
memvid ask knowledge.mv2 --question "Why is determinism important?" --use-model "nvidia:meta/llama3-8b-instruct"
### Model Options

| Model | Type | Description |
|-------|------|-------------|
| `ollama:qwen2.5:1.5b` | Local | Recommended: fast, private, no API costs |
| `ollama:qwen2.5:3b` | Local | Higher quality, needs more RAM |
| `ollama:phi3:mini` | Local | Good for reasoning tasks |
| `openai` | Cloud | Uses GPT-4o-mini (requires `OPENAI_API_KEY`) |
| `gemini-2.0-flash` | Cloud | Fast Gemini model (requires `GEMINI_API_KEY`) |
| `claude` | Cloud | Claude Sonnet (requires `ANTHROPIC_API_KEY`) |
| `nvidia:meta/llama3-8b-instruct` | Cloud | NVIDIA Integrate API (requires `NVIDIA_API_KEY`) |

For NVIDIA models, you can also set `NVIDIA_LLM_MODEL` and use `--use-model nvidia`.
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--question` | Question to answer | Required |
| `--use-model` | LLM model (see table above) | None |
| `--top-k` | Documents to retrieve | 8 |
| `--snippet-chars` | Context length per document | 480 |
| `--mode` | Retrieval mode (`lex`, `sem`, `hybrid`) | `hybrid` |
| `--context-only` | Return context without synthesis | false |
| `--mask-pii` | Mask PII before sending to LLM | false |
| `--llm-context-depth` | Override context budget | Auto |
| `--json` | Output as JSON | false |
| `--query-embedding-model` | Override query embedding model (rare; auto-detects when possible) | Auto |
| `--adaptive` | Enable adaptive retrieval (dynamic top-k) | false |
| `--min-relevancy` | Adaptive cutoff threshold | 0.5 |
| `--max-k` | Adaptive max results | 100 |
| `--adaptive-strategy` | Adaptive cutoff strategy | `relative` |
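The adaptive flags work for `ask` the same way they do for `find`: retrieval is cut off at `--min-relevancy` instead of a fixed `--top-k`. A minimal sketch (question and thresholds are placeholders):

```bash
memvid ask knowledge.mv2 \
  --question "How is caching configured?" \
  --adaptive --min-relevancy 0.6 --max-k 50 \
  --use-model "ollama:qwen2.5:1.5b"
```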
### Filtering Options

| Option | Description |
|--------|-------------|
| `--scope` | Filter by URI prefix |
| `--uri` | Filter to specific URI |
| `--start` | Start date filter |
| `--end` | End date filter |
| `--as-of-frame` | Time-travel to frame ID |
| `--as-of-ts` | Time-travel to timestamp |
### Examples

```bash
# Ask with a local Ollama model (recommended)
memvid ask knowledge.mv2 \
  --question "How do I configure authentication?" \
  --use-model "ollama:qwen2.5:1.5b"

# Ask with more context
memvid ask knowledge.mv2 \
  --question "Explain the architecture in detail" \
  --top-k 15 \
  --use-model "ollama:qwen2.5:3b"

# Get just the context, without LLM synthesis
memvid ask knowledge.mv2 \
  --question "What is the architecture?" \
  --context-only

# Mask sensitive data before sending to a cloud LLM
memvid ask knowledge.mv2 \
  --question "What are the contact details?" \
  --use-model openai \
  --mask-pii

# Filter to a specific date range
memvid ask knowledge.mv2 \
  --question "What happened in Q4?" \
  --start "2024-10-01" \
  --end "2024-12-31" \
  --use-model "ollama:qwen2.5:1.5b"

# JSON output with Gemini
memvid ask knowledge.mv2 \
  --question "Summarize the API" \
  --use-model "gemini-2.0-flash" \
  --json
```
### Multi-File Search

Search across multiple memory files:

```bash
# Search multiple files
memvid ask docs.mv2 code.mv2 notes.mv2 \
  --question "How does authentication work?"

# Using glob patterns
memvid ask ./memories/*.mv2 \
  --question "What are the main features?"
```
### JSON Output

```json
{
  "question": "What is the architecture?",
  "answer": "The architecture follows a layered design with...",
  "mode": "hybrid",
  "context_only": false,
  "hits": [
    {
      "rank": 1,
      "frame_id": 124,
      "uri": "mv2://docs/arch.md",
      "title": "Architecture Overview",
      "score": 0.92,
      "text": "The system consists of..."
    }
  ],
  "grounding": {
    "score": 0.85,
    "label": "HIGH",
    "sentence_count": 3,
    "grounded_sentences": 3,
    "has_warning": false
  },
  "follow_up": {
    "needed": false
  },
  "stats": {
    "retrieval_ms": 5,
    "synthesis_ms": 1200,
    "latency_ms": 1205
  }
}
```
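A common pattern is to surface the answer alongside the sources it drew on, for example for display in a UI. A sketch assuming the field names shown above:

```bash
# Print the answer, then a bulleted list of its sources
resp=$(memvid ask knowledge.mv2 \
  --question "What is the architecture?" \
  --use-model "ollama:qwen2.5:1.5b" --json)
echo "$resp" | jq -r '.answer'
echo "$resp" | jq -r '.hits[] | "- \(.title) (\(.uri))"'
```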
Grounding & Hallucination Detection
When using --json, the response includes a grounding object that measures how well the answer is supported by the retrieved context:
| Field | Description |
|---|
score | Grounding score from 0.0 to 1.0 |
label | Quality label: LOW, MEDIUM, or HIGH |
sentence_count | Number of sentences in the answer |
grounded_sentences | Sentences supported by context |
has_warning | True if answer may be hallucinated |
warning_reason | Explanation if warning is present |
# Check grounding quality
memvid ask knowledge.mv2 \
--question "What is the API endpoint?" \
--use-model openai \
--json | jq '.grounding'
Example output for low grounding (a potential hallucination):

```json
{
  "grounding": {
    "score": 0.15,
    "label": "LOW",
    "sentence_count": 2,
    "grounded_sentences": 0,
    "has_warning": true,
    "warning_reason": "Answer appears to be poorly grounded in context"
  },
  "follow_up": {
    "needed": true,
    "reason": "Answer may not be well-supported by the available context",
    "hint": "This memory contains information about different topics. Try asking about those instead.",
    "available_topics": ["API Reference", "Authentication", "Database Schema"],
    "suggestions": [
      "Tell me about API Reference",
      "Tell me about Authentication",
      "What topics are in this memory?"
    ]
  }
}
```

When `follow_up.needed` is true, the answer may not be reliable. Consider using the suggested follow-up questions or rephrasing your query; a scripted version of this check is sketched below.
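For scripted pipelines, you can gate on `follow_up.needed` directly. A hedged sketch using the field names from the examples above (`suggestions` is assumed to be present only when a follow-up is needed, hence the tolerant `[]?`):

```bash
# Print the answer only when it is well-grounded; otherwise surface suggestions
resp=$(memvid ask knowledge.mv2 \
  --question "What is the API endpoint?" \
  --use-model openai --json)
if [ "$(echo "$resp" | jq -r '.follow_up.needed')" = "true" ]; then
  echo "Low-confidence answer. Try instead:" >&2
  echo "$resp" | jq -r '.follow_up.suggestions[]? | "- \(.)"' >&2
else
  echo "$resp" | jq -r '.answer'
fi
```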
## Ground Truth Corrections

The `correct` command stores authoritative corrections that take priority in future retrievals. Use it to fix hallucinations or add verified facts.

### Synopsis

```bash
memvid correct <FILE> <STATEMENT> [OPTIONS]
```
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--source` | Attribution for the correction | None |
| `--topic` | Topics for retrieval matching (can repeat) | None |
| `--boost` | Retrieval priority boost factor | 2.0 |
### Examples

```bash
# Store a correction
memvid correct knowledge.mv2 "Ben Koenig reported to Chloe Nguyen before 2025"

# With source attribution
memvid correct knowledge.mv2 "The API rate limit is 1000 req/min" \
  --source "Engineering Team - Jan 2025"

# With topics for better retrieval
memvid correct knowledge.mv2 "OAuth tokens expire after 24 hours" \
  --topic "authentication" \
  --topic "OAuth" \
  --topic "tokens"

# Higher boost for critical corrections
memvid correct knowledge.mv2 "Production database is db.prod.example.com" \
  --boost 3.0
```
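Corrections often arrive in batches, for example after a review pass. A sketch that loads one verified statement per line from a hypothetical `corrections.txt`:

```bash
# Apply each line of corrections.txt as a correction with shared attribution
while IFS= read -r statement; do
  [ -z "$statement" ] && continue  # skip blank lines
  memvid correct knowledge.mv2 "$statement" \
    --source "Review pass - Jan 2025"
done < corrections.txt
```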
### Verification

After storing a correction, verify that it's retrievable:

```bash
# Search for the correction
memvid find knowledge.mv2 --query "Ben Koenig reported to"

# Ask a question that should use the correction
memvid ask knowledge.mv2 \
  --question "Who did Ben Koenig report to before 2025?" \
  --use-model openai
```

Corrections are stored with a `[Correction]` label and receive boosted retrieval scores, ensuring they appear prominently in search results.
## Vector Search

The `vec-search` command performs direct vector similarity search with pre-computed embeddings.

### Synopsis

```bash
memvid vec-search <FILE> [OPTIONS]
```
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--vector <CSV>` | CSV-formatted vector | None |
| `--embedding <PATH>` | Path to JSON file with embedding | None |
| `--limit <K>` | Number of results | 10 |
| `--json` | JSON output | false |
### Examples

```bash
# Search with a vector file
memvid vec-search project.mv2 --embedding ./query-vec.json --limit 5

# Search with an inline vector
memvid vec-search project.mv2 --vector "0.1,0.2,0.3,..." --limit 10
```
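`vec-search` expects you to bring your own query embedding, which must come from the same model family the `.mv2` was embedded with. A heavily hedged sketch that builds `query-vec.json` with Ollama's embeddings API; note that the bare-array file format and the `nomic-embed-text` model are assumptions here, not documented behavior:

```bash
# Embed the query text locally, then search with the resulting vector
curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "database schema design"}' \
  | jq '.embedding' > query-vec.json

memvid vec-search project.mv2 --embedding ./query-vec.json --limit 5
```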
## Temporal Queries

The `when` command resolves temporal phrases and lists matching frames.

### Synopsis

```bash
memvid when <FILE> --on <PHRASE> [OPTIONS]
```
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--on <PHRASE>` | Temporal phrase to resolve | Required |
| `--tz <ZONE>` | Timezone for phrases | America/Chicago |
| `--limit <N>` | Maximum frames | All |
| `--json` | JSON output | false |
### Examples

```bash
# Frames from "last Monday"
memvid when project.mv2 --on "last Monday"

# Frames from "yesterday"
memvid when project.mv2 --on "yesterday" --tz "America/New_York"

# Frames from "2 weeks ago"
memvid when project.mv2 --on "2 weeks ago" --limit 20
```
## Audit Reports

The `audit` command generates audit reports with full source provenance.

### Synopsis

```bash
memvid audit <FILE> <QUESTION> [OPTIONS]
```
### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--out <PATH>`, `-o <PATH>` | Output file | stdout |
| `--format <FORMAT>` | Format: `text`, `markdown`, `json` | `text` |
| `--top-k <K>` | Sources to retrieve | 10 |
| `--snippet-chars <N>` | Max chars per snippet | 500 |
| `--mode <MODE>` | Retrieval mode | `hybrid` |
| `--scope <PREFIX>` | Scope filter | None |
| `--start <DATE>` | Start date | None |
| `--end <DATE>` | End date | None |
| `--use-model <MODEL>` | LLM for synthesis | None |
### Examples

```bash
# Generate an audit report
memvid audit project.mv2 "budget decisions" -o audit.md --format markdown

# JSON audit for compliance
memvid audit project.mv2 "data access" -o audit.json --format json

# With an LLM summary
memvid audit project.mv2 "key decisions" --use-model openai -o report.md
```
### Response (Markdown)

```markdown
# Audit Report: Budget Decisions

Generated: 2024-01-20T15:30:00Z
Query: "budget decisions"
Sources: 8

## Findings

### Source 1: Q4 Budget Meeting

- **URI**: file:///meeting-q4.txt
- **Date**: 2024-01-15
- **Relevance**: 0.95

> The team decided to increase the marketing budget by 15%...

### Source 2: Finance Review

- **URI**: file:///finance-review.pdf
- **Date**: 2024-01-18
- **Relevance**: 0.87

> Budget allocation approved with amendments...
```
## Real-World Examples

### Documentation Search

```bash
# Find installation instructions
memvid find docs.mv2 --query "how to install" --mode auto

# Find API endpoint documentation
memvid find docs.mv2 --query "POST /users endpoint" --mode lex

# Ask about configuration (local model: private)
memvid ask docs.mv2 \
  --question "What environment variables are required?" \
  --use-model "ollama:qwen2.5:1.5b"
```
### Codebase Search

```bash
# Find function implementations
memvid find code.mv2 --query "handleUserLogin" --mode lex

# Find error handling patterns
memvid find code.mv2 --query "try catch error handling" --mode auto

# Understand code architecture (local model keeps code private)
memvid ask code.mv2 \
  --question "How does the authentication flow work?" \
  --use-model "ollama:qwen2.5:1.5b"
```
### Research Search

```bash
# Find papers on a specific topic
memvid find papers.mv2 --query "transformer architecture attention" --mode sem

# Summarize findings (use a larger model for complex analysis)
memvid ask papers.mv2 \
  --question "What are the main approaches to reducing transformer inference cost?" \
  --top-k 15 \
  --use-model "ollama:qwen2.5:3b"
```
## Troubleshooting

### No Results Found

Solutions:

- Try a different search mode: `--mode sem` or `--mode lex`
- Broaden your query terms
- Check that documents have been ingested: `memvid stats knowledge.mv2`
- Verify the lexical index exists: `memvid doctor knowledge.mv2 --rebuild-lex-index`
### Low Relevance Scores

Solutions:

- Use semantic search for natural language queries
- Use lexical search for exact technical terms
- Add more context to your query
- Increase `--top-k` to see more results
### LLM Errors

Error: `Failed to contact LLM provider`

Solutions:

**Option 1: Use a local Ollama model (recommended)**

```bash
# Install Ollama
brew install ollama  # macOS
# or: curl -fsSL https://ollama.com/install.sh | sh  # Linux

# Start the Ollama server
ollama serve &

# Pull a model
ollama pull qwen2.5:1.5b

# Use with memvid
memvid ask knowledge.mv2 --question "..." --use-model "ollama:qwen2.5:1.5b"
```

**Option 2: Set API keys for cloud providers**

```bash
# OpenAI
export OPENAI_API_KEY=your-key
memvid ask knowledge.mv2 --question "..." --use-model openai

# Gemini
export GEMINI_API_KEY=your-key
memvid ask knowledge.mv2 --question "..." --use-model "gemini-2.0-flash"

# Anthropic
export ANTHROPIC_API_KEY=your-key
memvid ask knowledge.mv2 --question "..." --use-model claude
```
## Next Steps