Learn how to create new memory files and ingest documents using the Memvid CLI.
Creating a Memory File
Basic Usage
Create a new .mv2 memory file:
memvid create my-knowledge.mv2
Options
Option | Description | Default
--tier | Capacity tier (free, dev, enterprise) | free
--size | Capacity override (e.g. 15MB, capped at 50MB) | 50MB
--no-lex | Disable lexical/full-text index | Enabled
--no-vector | Disable vector index | Enabled
Examples
# Create a basic memory file
memvid create research.mv2
# Create without lexical index
memvid create notes.mv2 --no-lex
# Create a smaller memory (capacity override)
memvid create small.mv2 --size 15MB
memvid create is capped at 50MB. To go beyond 50MB, create the file and then apply a signed capacity ticket (see memvid tickets sync/apply).
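Because the cap is fixed, a script can check a requested size before calling memvid create. The sketch below is illustrative only: size_to_bytes and fits_create_cap are hypothetical helper names, and treating 1MB as 1024 * 1024 bytes is an assumption (it matches the 536870912-byte figure this page shows for a 512 MB limit).

```shell
# Hypothetical helper: convert a size such as "15MB" to bytes.
# Assumes 1MB = 1024 * 1024 bytes; the CLI's own parsing may differ.
size_to_bytes() {
  case "$1" in
    *MB) echo $(( ${1%MB} * 1024 * 1024 )) ;;
    *KB) echo $(( ${1%KB} * 1024 )) ;;
    *)   echo "unsupported size: $1" >&2; return 1 ;;
  esac
}

# Succeeds only when the requested size fits under the 50MB create cap.
fits_create_cap() {
  bytes=$(size_to_bytes "$1") || return 1
  [ "$bytes" -le $(( 50 * 1024 * 1024 )) ]
}

# Usage (requires the memvid CLI):
# fits_create_cap 15MB && memvid create small.mv2 --size 15MB
```

Sizes above the cap fail the check, which is the point at which a capacity ticket would be needed instead.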
JSON Output
memvid create my-memory.mv2 --json
{
  "path": "my-memory.mv2",
  "size_limit_bytes": 52428800,
  "lex_enabled": true,
  "vec_enabled": true,
  "created_at": "2024-01-15T10:30:00Z"
}
Inspecting a Memory File
The open command shows metadata and manifests of an existing memory file.
Synopsis
memvid open <FILE> [OPTIONS]
Options
Option | Description
--json | Emit JSON output
Examples
# Inspect a memory file
memvid open my-memory.mv2
# Get JSON output for scripting
memvid open my-memory.mv2 --json
Response
Memory File: my-memory.mv2
Version: 2.1.0
Created: 2024-01-15T10:30:00Z
Frames: 1,234
Size: 45.2 MB / 512 MB (8.8%)
Indexes:
Lexical: enabled (12,456 terms)
Vector: enabled (1,234 vectors, 384d)
Time: enabled (1,234 entries)
Tracks:
default: 890 frames
meetings: 234 frames
emails: 110 frames
Memory Binding:
Memory ID: mem_abc123
Bound at: 2024-01-15T10:30:00Z
JSON Output
{
  "path": "my-memory.mv2",
  "version": "2.1.0",
  "created_at": "2024-01-15T10:30:00Z",
  "frame_count": 1234,
  "size_bytes": 47395430,
  "size_limit_bytes": 536870912,
  "indexes": {
    "lex": { "enabled": true, "term_count": 12456 },
    "vec": { "enabled": true, "vector_count": 1234, "dimension": 384 },
    "time": { "enabled": true, "entry_count": 1234 }
  },
  "tracks": {
    "default": 890,
    "meetings": 234,
    "emails": 110
  },
  "binding": {
    "memory_id": "mem_abc123",
    "bound_at": "2024-01-15T10:30:00Z"
  }
}
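For scripting, the JSON output can be reduced to a single usage figure. The following is a minimal sketch that relies only on grep and awk; json_number and usage_percent are hypothetical helper names, and the field names size_bytes and size_limit_bytes are taken from the sample output above. A dedicated JSON parser would be more robust than this pattern match.

```shell
# Print the first numeric value stored under key $1 in JSON text on stdin.
# Pattern matching only; it does not understand nesting or strings.
json_number() {
  grep -o "\"$1\"[[:space:]]*:[[:space:]]*[0-9][0-9]*" | head -n 1 | grep -o '[0-9][0-9]*$'
}

# Read `memvid open --json` output on stdin and print the percentage of
# capacity in use, rounded to one decimal place.
usage_percent() {
  json=$(cat)
  used=$(printf '%s\n' "$json" | json_number size_bytes)
  limit=$(printf '%s\n' "$json" | json_number size_limit_bytes)
  awk -v u="$used" -v l="$limit" 'BEGIN { printf "%.1f\n", u * 100 / l }'
}

# Usage (requires the memvid CLI):
# memvid open my-memory.mv2 --json | usage_percent
```

With the sample values above (47395430 of 536870912 bytes) this prints 8.8, matching the percentage in the text response.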
Ingesting Documents
The put command adds documents to your memory file as frames.
Basic Usage
# Ingest a single file (text-only)
memvid put my-knowledge.mv2 --input document.pdf
# Ingest a directory
memvid put my-knowledge.mv2 --input ./documents/
# Ingest with semantic embeddings (+16x PQ compression)
memvid put my-knowledge.mv2 --input document.pdf --embedding --vector-compression
# Ingest from stdin (text-only by default)
echo "Some text content" | memvid put my-knowledge.mv2
Core Options
Option | Description
--input PATH | Path to file or directory
--uri URI | Custom URI for the frame
--title TITLE | Document title
--timestamp UNIX_TS | POSIX timestamp
--track TRACK | Track/collection name
--kind KIND | Content type metadata
--json | Output as JSON
Option | Description
--tag KEY=VALUE | Add tags (repeatable)
--label LABEL | Add labels (repeatable)
--metadata JSON | Additional metadata as JSON
--no-auto-tag | Disable automatic tag extraction
--no-extract-dates | Disable date extraction
When the CLIP and NER models are installed, the CLI automatically enables visual embeddings for images/PDFs and entity extraction.
Option | Description
--clip | Explicitly enable CLIP visual embeddings
--no-clip | Disable CLIP even when model is available
--logic-mesh | Explicitly enable entity extraction
--no-logic-mesh | Disable entity extraction even when model is available
Install models manually:
memvid models install --clip mobileclip-s2
memvid models install --ner distilbert-ner
Embedding Options
Option | Description
--embedding | Enable semantic embeddings
-m, --embedding-model MODEL | Choose default embedding model (global flag; see below)
--vector-compression | Generate semantic embeddings with 16x compression
--no-embedding | Explicitly disable embeddings
Embedding Model Options:
Model | Description
bge-small | Local fastembed default (384d)
bge-base | Local higher quality (768d)
nomic | Local high accuracy (768d)
gte-large | Local best semantic depth (1024d)
openai-small | OpenAI text-embedding-3-small (1536d)
openai-large | OpenAI text-embedding-3-large (3072d)
openai | Alias for openai-large
openai-ada | OpenAI text-embedding-ada-002 (1536d, legacy)
# Use built-in BGE (default, no API key needed)
memvid put knowledge.mv2 --input docs/ --embedding
# Use OpenAI embeddings
export OPENAI_API_KEY=sk-...
memvid put knowledge.mv2 --input docs/ --embedding -m openai-small
# Use OpenAI large model for higher quality
memvid put knowledge.mv2 --input docs/ --embedding -m openai-large
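Model choice affects how much space vectors take. The sketch below gives a back-of-the-envelope estimate only: the dimensions come from the model table, but storing each dimension as a 4-byte float and applying the 16x factor directly to the raw size are both assumptions, not documented behavior. model_dim and estimate_vector_bytes are hypothetical helper names.

```shell
# Dimensions per model, copied from the embedding model table.
model_dim() {
  case "$1" in
    bge-small) echo 384 ;;
    bge-base|nomic) echo 768 ;;
    gte-large) echo 1024 ;;
    openai-small|openai-ada) echo 1536 ;;
    openai-large|openai) echo 3072 ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Rough size estimate: vectors * dimensions * 4 bytes (float32 is an assumption).
# Pass "compressed" as the third argument to divide by 16.
estimate_vector_bytes() {
  dim=$(model_dim "$2") || return 1
  raw=$(( $1 * dim * 4 ))
  if [ "${3:-}" = "compressed" ]; then
    echo $(( raw / 16 ))
  else
    echo "$raw"
  fi
}

# Usage:
# estimate_vector_bytes 1234 bge-small
# estimate_vector_bytes 1234 bge-small compressed
```

Under these assumptions, 1,234 vectors at 384 dimensions come to 1895424 bytes uncompressed and 118464 bytes compressed. Actual file sizes will include index and metadata overhead.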
Option | Description
--tables | Extract tables from PDF files
--embed-rows | Embed individual table rows for semantic search (default: true)
Duplicate Handling
Option | Description
--update-existing | Replace existing frame with same URI
--allow-duplicate | Allow multiple frames with same URI
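Since duplicates are keyed on the frame URI, re-ingesting a file is only repeatable if the same URI is supplied every time. A minimal sketch, assuming --uri and --update-existing can be combined in a single put call as the option tables suggest; file_uri and reingest are hypothetical helper names.

```shell
# Build a stable file:// URI from a path so repeated runs target the same frame.
file_uri() {
  dir=$(cd "$(dirname "$1")" && pwd) || return 1
  printf 'file://%s/%s\n' "${dir%/}" "$(basename "$1")"
}

# Re-ingest a file, replacing the frame that already carries its URI.
reingest() {
  memvid put "$1" --input "$2" --uri "$(file_uri "$2")" --update-existing
}

# Usage (requires the memvid CLI):
# reingest knowledge.mv2 ./docs/guide.md
```

Deriving the URI from the absolute path keeps it identical regardless of the directory the script is run from.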
Lock Control
Option | Description | Default
--lock-timeout MS | Wait time for lock | 250ms
--force | Force takeover of stale lock | false
Ingesting Different File Types
Memvid automatically detects and processes various file formats:
Text Files
# Plain text
memvid put knowledge.mv2 --input notes.txt --vector-compression
# Markdown
memvid put knowledge.mv2 --input README.md --vector-compression
# HTML
memvid put knowledge.mv2 --input page.html --vector-compression
Documents
# PDF files
memvid put knowledge.mv2 --input report.pdf --vector-compression
# PDF with table extraction
memvid put knowledge.mv2 --input invoice.pdf --tables --vector-compression
# Word documents
memvid put knowledge.mv2 --input document.docx --vector-compression
# Excel spreadsheets
memvid put knowledge.mv2 --input data.xlsx --vector-compression
# PowerPoint presentations
memvid put knowledge.mv2 --input slides.pptx --vector-compression
Media
# Images with EXIF extraction
memvid put knowledge.mv2 --input photo.jpg
# Audio files
memvid put knowledge.mv2 --input recording.mp3 --audio
# Video files (stored without transcoding)
memvid put knowledge.mv2 --input video.mp4 --video
Organize your documents with tracks, tags, and timestamps:
# Add to a specific track
memvid put knowledge.mv2 --input meeting-notes.md --vector-compression --track "meetings"
# Add metadata tags
memvid put knowledge.mv2 --input api-docs.md --vector-compression \
--tag "category=documentation" \
--tag "version=2.0" \
--tag "author=team"
# Add labels
memvid put knowledge.mv2 --input report.pdf --vector-compression \
--label "quarterly" \
--label "finance"
# Set custom timestamp
memvid put knowledge.mv2 --input old-report.pdf --vector-compression \
--timestamp 1686819000
# Combine options
memvid put knowledge.mv2 --input quarterly-report.pdf --vector-compression \
--track "reports" \
--title "Q3 2024 Report" \
--tag "quarter=Q3" \
--tag "year=2024"
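The --timestamp option takes a POSIX timestamp, so a human-readable date has to be converted first. The helper below is a sketch, not part of the CLI: iso_to_unix is a hypothetical name, and it assumes either GNU date or BSD/macOS date is available. The value 1686819000 used in the example above corresponds to 2023-06-15T08:50:00Z.

```shell
# Convert an ISO 8601 UTC time (YYYY-MM-DDTHH:MM:SSZ) to a POSIX timestamp.
# Tries the GNU date form first, then falls back to the BSD/macOS form.
iso_to_unix() {
  date -u -d "$1" +%s 2>/dev/null ||
    date -j -u -f '%Y-%m-%dT%H:%M:%SZ' "$1" +%s
}

# Usage (requires the memvid CLI):
# memvid put knowledge.mv2 --input old-report.pdf \
#   --timestamp "$(iso_to_unix 2023-06-15T08:50:00Z)"
```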
Parallel Ingestion
For large datasets, enable multi-threaded processing:
# Enable parallel ingestion
memvid put knowledge.mv2 --input ./large-dataset/ --vector-compression \
--parallel-segments \
--parallel-threads 8
# Fine-tune parallel settings
memvid put knowledge.mv2 --input ./corpus/ --vector-compression \
--parallel-segments \
--parallel-seg-tokens 4000 \
--parallel-threads 4 \
--parallel-queue-depth 16
Option | Description | Default
--parallel-segments | Enable multi-threaded processing | false
--parallel-threads | Number of worker threads | CPU count - 1
--parallel-queue-depth | Queue size for workers | Auto
--parallel-seg-tokens | Target tokens per segment | Auto
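When the thread count is set explicitly, for example in a CI job, it can be derived from the machine rather than hard-coded. This sketch mirrors the documented default of CPU count - 1; threads_for_cpus is a hypothetical helper, and the floor of one thread on single-core machines is an assumption.

```shell
# Return a worker count for a machine with $1 CPUs: CPU count - 1,
# never less than 1 (the floor is an assumption, not documented behavior).
threads_for_cpus() {
  n=$(( $1 - 1 ))
  if [ "$n" -lt 1 ]; then
    n=1
  fi
  echo "$n"
}

# Usage (requires the memvid CLI; getconf reports the online CPU count):
# memvid put knowledge.mv2 --input ./corpus/ --vector-compression \
#   --parallel-segments \
#   --parallel-threads "$(threads_for_cpus "$(getconf _NPROCESSORS_ONLN)")"
```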
Ingesting from Stdin
Useful for piping data from other commands:
# Pipe text content
echo "Important note to remember" | memvid put knowledge.mv2 --vector-compression
# Pipe from curl
curl -s https://api.example.com/data | memvid put knowledge.mv2 --vector-compression --title "API Response"
# Pipe from another command
cat log.txt | grep "ERROR" | memvid put knowledge.mv2 --vector-compression --track "errors"
Extract structured tables from PDFs (invoices, financial reports, pay stubs):
Basic Usage
# Extract tables from a PDF
memvid put knowledge.mv2 --input invoice.pdf --tables --vector-compression
# Extract tables and embed individual rows for semantic search
memvid put knowledge.mv2 --input financial-report.pdf --tables --embed-rows --vector-compression
Detection Methods
The table extractor uses multiple detection methods:
Method | Best For
Stream | Tables without visible borders, text-based layouts
Lattice | Tables with visible grid lines and borders
LineBased | Columnar data with clear alignment patterns
The extractor automatically tries each method and picks the best results.
After extraction, use the tables command to view and export:
# List all tables in a memory
memvid tables list knowledge.mv2
# Output:
# Found 3 tables:
# - pdf_table_1_page1: 5 rows x 4 cols (LineBased)
# - pdf_table_2_page1: 12 rows x 3 cols (Stream)
# - pdf_table_3_page2: 8 rows x 5 cols (Lattice)
# View a specific table
memvid tables view knowledge.mv2 --table-id pdf_table_1_page1
# Export to CSV
memvid tables export knowledge.mv2 --table-id pdf_table_1_page1 --format csv > data.csv
# Export to JSON
memvid tables export knowledge.mv2 --table-id pdf_table_1_page1 --format json
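To export every table rather than one at a time, the list output can drive a loop. This is a sketch under an assumption: it expects each table line to look like "- pdf_table_1_page1: 5 rows x 4 cols (LineBased)", as in the sample output above, and the real format may differ. table_ids and export_all_tables are hypothetical helper names.

```shell
# Print table IDs from `memvid tables list` output on stdin.
# Assumes lines of the form "- <id>: <rows> rows x <cols> cols (<method>)".
table_ids() {
  sed -n 's/^[#[:space:]]*- \([^:]*\):.*/\1/p'
}

# Export every table in memory file $1 to <id>.csv inside directory $2
# (defaults to the current directory).
export_all_tables() {
  memvid tables list "$1" | table_ids | while read -r id; do
    memvid tables export "$1" --table-id "$id" --format csv > "${2:-.}/$id.csv"
  done
}

# Usage (requires the memvid CLI):
# export_all_tables knowledge.mv2 ./exports
```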
Example: Invoice Processing
# Create memory for invoices
memvid create invoices.mv2
# Ingest invoice with table extraction
memvid put invoices.mv2 --input amazon-invoice.pdf --tables --vector-compression
# Search for specific items
memvid find invoices.mv2 --query "total" --json
# List extracted tables
memvid tables list invoices.mv2
# Export line items to CSV
memvid tables export invoices.mv2 --table-id pdf_table_1_page1 --format csv
Updating Documents
The update command modifies an existing frame.
Synopsis
memvid update <FILE> [OPTIONS]
Options
Option | Description
--frame-id <ID> | Target frame by ID
--uri <URI> | Target frame by URI
--input <PATH> | New payload from file
--set-uri <URI> | Update frame URI
--title <TITLE> | Update title
--timestamp <TS> | Update timestamp
--track <TRACK> | Update track
--kind <KIND> | Update kind
--tag <KEY=VALUE> | Add/update tags
--label <LABEL> | Add/update labels
--metadata <JSON> | Add/update metadata
--embeddings | Recompute embeddings
--json | JSON output
Examples
# Update title
memvid update project.mv2 --frame-id 1234 --title "Updated Title"
# Update content and recompute embeddings
memvid update project.mv2 --uri "file:///doc.txt" \
--input updated-doc.txt \
--embeddings
# Add new tags
memvid update project.mv2 --frame-id 1234 \
--tag "status=reviewed" \
--label approved
Response
Updated frame 1234 in project.mv2
Title: Updated Title
Tags added: status=reviewed
Labels added: approved
Embeddings: recomputed
Deleting Documents
The delete command removes a frame from the memory.
Synopsis
memvid delete <FILE> [OPTIONS]
Options
Option | Description
--frame-id <ID> | Target by frame ID
--uri <URI> | Target by frame URI
--yes | Skip confirmation prompt
--json | JSON output
Examples
# Delete by frame ID
memvid delete project.mv2 --frame-id 1234
# Delete by URI (skip confirmation)
memvid delete project.mv2 --uri "file:///old-doc.txt" --yes
Response
Deleted frame 1234 from project.mv2
URI: file:///old-doc.txt
Title: Old Document
Remote API Ingestion
The api-fetch command fetches remote content from an API and ingests it as frames.
Synopsis
memvid api-fetch <FILE> <CONFIG> [OPTIONS]
Options
Option | Description
--dry-run | Preview without writing
--mode <MODE> | Override configured ingest mode
--uri <URI> | Override base URI
--json | JSON output
{
  "url": "https://api.example.com/documents",
  "method": "GET",
  "headers": {
    "Authorization": "Bearer ${API_TOKEN}"
  },
  "pagination": {
    "type": "cursor",
    "cursor_param": "after",
    "cursor_path": "$.meta.next_cursor"
  },
  "items_path": "$.data",
  "mapping": {
    "title": "$.name",
    "text": "$.content",
    "uri": "$.id"
  }
}
Examples
# Fetch from API
memvid api-fetch project.mv2 ./fetch-config.json
# Dry run to preview
memvid api-fetch project.mv2 ./fetch-config.json --dry-run
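The two commands above can be chained so that a real ingest only runs after a successful preview. A minimal sketch: safe_api_fetch is a hypothetical wrapper, and it assumes the ${API_TOKEN} placeholder in the config is filled from the environment, which this page does not confirm.

```shell
# Run a dry run first and ingest only if the preview succeeds.
# Assumes ${API_TOKEN} in the config is read from the environment.
safe_api_fetch() {
  if [ -z "${API_TOKEN:-}" ]; then
    echo "API_TOKEN is not set" >&2
    return 1
  fi
  memvid api-fetch "$1" "$2" --dry-run && memvid api-fetch "$1" "$2"
}

# Usage (requires the memvid CLI):
# API_TOKEN=... safe_api_fetch project.mv2 ./fetch-config.json
```

Failing early on a missing token avoids sending unauthenticated requests to the remote API.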
Real-World Examples
Documentation Knowledge Base
# Create the memory
memvid create docs.mv2
# Ingest documentation with embeddings
memvid put docs.mv2 --input ./docs/ --vector-compression --track "documentation"
# Add API reference
memvid put docs.mv2 --input ./api-reference/ --vector-compression \
--track "api" \
--tag "type=reference"
Research Paper Archive
# Create the memory
memvid create papers.mv2
# Ingest papers with metadata
for paper in ./papers/*.pdf; do
  memvid put papers.mv2 --input "$paper" --vector-compression \
    --track "research" \
    --tag "source=arxiv"
done
Code Repository
# Create memory for codebase
memvid create codebase.mv2
# Ingest with parallel processing
memvid put codebase.mv2 --input ./src/ --vector-compression \
--parallel-segments \
--track "source"
# Add tests and docs
memvid put codebase.mv2 --input ./tests/ --vector-compression --track "tests"
memvid put codebase.mv2 --input ./docs/ --vector-compression --track "docs"
Troubleshooting
File Locked
Error: File is locked by another process
Solutions:
# Check who holds the lock
memvid who knowledge.mv2
# Request release
memvid nudge knowledge.mv2
# Find process on macOS/Linux
lsof knowledge.mv2
# Wait longer for lock
memvid put knowledge.mv2 --input doc.pdf --lock-timeout 5000
# Force takeover (only if previous writer crashed)
memvid put knowledge.mv2 --input doc.pdf --force
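In unattended scripts, waiting progressively longer is usually safer than reaching for --force. The wrapper below is a sketch: put_with_retry is a hypothetical name and the timeout values are illustrative. It retries on any non-zero exit, because the CLI's exit codes for lock errors are not documented here.

```shell
# Retry a put with a growing --lock-timeout (milliseconds); never uses --force.
# Retries on any failure, since lock-specific exit codes are not assumed.
put_with_retry() {
  file=$1
  input=$2
  for timeout_ms in 250 1000 5000; do
    if memvid put "$file" --input "$input" --lock-timeout "$timeout_ms"; then
      return 0
    fi
    echo "put failed with --lock-timeout $timeout_ms, retrying" >&2
  done
  return 1
}

# Usage (requires the memvid CLI):
# put_with_retry knowledge.mv2 doc.pdf
```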
Capacity Exceeded
Solutions:
# Check current usage
memvid stats knowledge.mv2
# Delete unused frames
memvid delete knowledge.mv2 --frame-id 42 --yes
# Compact the file
memvid doctor knowledge.mv2 --vacuum
Embedding Model Issues
Error: Failed to load embedding model
Solution:
# Set model directory
export MEMVID_MODELS_DIR=~/.memvid/models
# Or use offline mode with pre-cached models
export MEMVID_OFFLINE=1
Next Steps
Search & Ask: Query your memories with lexical, semantic, and hybrid search
Timeline & View: Explore your memories chronologically