FAQ – Security & Compliance

File Security

How are `.mv2` files protected?

Integrity relies on cascading checksums:

Header checksum: Validates file header
TOC checksum: Validates table of contents
Per-segment checksums: Validate each data segment
Time index checksum: Validates timeline data
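
As a rough illustration of how cascading checksums localize damage, the sketch below validates a header, a table of contents, and each segment in turn. The `Section` type, the in-memory layout, and the use of CRC-32 are assumptions made for this example; the documentation above does not state which checksum algorithm or on-disk layout Memvid uses.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Section:
    """A named byte range plus the checksum recorded when it was written."""
    name: str
    payload: bytes
    checksum: int

    @classmethod
    def seal(cls, name: str, payload: bytes) -> "Section":
        # Hypothetical helper: record a CRC-32 alongside the payload.
        return cls(name, payload, zlib.crc32(payload))

    def is_valid(self) -> bool:
        return zlib.crc32(self.payload) == self.checksum


def verify(header: Section, toc: Section, segments: list[Section]) -> list[str]:
    """Return the names of failing sections.

    The header and TOC are checked first; if either is bad, the segment
    list cannot be trusted, so the check stops there.
    """
    for section in (header, toc):
        if not section.is_valid():
            return [section.name]
    return [s.name for s in segments if not s.is_valid()]


header = Section.seal("header", b"MV2 v1")
toc = Section.seal("toc", b"seg0,seg1")
segments = [Section.seal("seg0", b"alpha"), Section.seal("seg1", b"beta")]

segments[1].payload = b"betX"          # simulate a flipped byte
print(verify(header, toc, segments))   # ['seg1']
```

Checking the outer layers first means a damaged segment can be reported by name without treating the whole file as unreadable.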

Confidentiality depends on OS file permissions. Memvid intentionally avoids bundling key management to keep the core simple.
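
Because confidentiality is delegated to the operating system, restricting the file mode is the first line of defense. The following is a minimal sketch for POSIX systems; the helper names are illustrative and not part of Memvid. On Windows, `os.chmod` only toggles the read-only flag, so ACLs would be needed there instead.

```python
import os
import stat


def restrict_to_owner(path: str) -> None:
    """Allow only the file owner to read and write (mode 0o600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def is_owner_only(path: str) -> bool:
    """True when neither group nor other has any permission bit set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

Calling `restrict_to_owner("knowledge.mv2")` after creating a memory file, and asserting `is_owner_only(...)` in deployment checks, keeps the file unreadable to other local accounts.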

Are checksums validated automatically?

Yes. When opening a file, Memvid validates:

Header checksum
TOC integrity
WAL consistency

Deep verification (via `memvid verify --deep`) additionally checks all segment checksums.

What happens if a file is corrupted?

Memvid provides tools to detect and repair corruption:

# Detect issues
memvid verify knowledge.mv2 --deep

# Repair issues
memvid doctor knowledge.mv2 --rebuild-time-index --rebuild-lex-index

The embedded WAL protects against data loss from crashes or power failures.
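
The detect-then-repair sequence can be scripted. The sketch below uses only the two commands shown above; it assumes, without confirmation from this page, that `memvid` exits with a non-zero status when verification or repair fails. The function names and the injectable `run` parameter are illustrative.

```python
import subprocess
from typing import Callable, Sequence


def verify_cmd(path: str) -> list[str]:
    return ["memvid", "verify", path, "--deep"]


def repair_cmd(path: str) -> list[str]:
    return ["memvid", "doctor", path,
            "--rebuild-time-index", "--rebuild-lex-index"]


def _exit_code(cmd: Sequence[str]) -> int:
    # Runs the real CLI; requires `memvid` on PATH.
    return subprocess.run(cmd, check=False).returncode


def verify_then_repair(
    path: str, run: Callable[[Sequence[str]], int] = _exit_code
) -> str:
    """Return 'ok', 'repaired', or 'failed'.

    ASSUMPTION: a zero exit status means the command succeeded.
    """
    if run(verify_cmd(path)) == 0:
        return "ok"
    if run(repair_cmd(path)) != 0:
        return "failed"
    # Re-verify so a repair is never reported without confirmation.
    return "repaired" if run(verify_cmd(path)) == 0 else "failed"
```

Passing a custom `run` callable makes the flow testable without the CLI installed.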

Crash Safety

What ensures data survives crashes?

The embedded Write-Ahead Log (WAL):

All mutations are written to WAL first
WAL is synced to disk (fsync)
Changes are then applied to main data
On recovery, WAL entries that were logged but not yet applied to the main data are replayed
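
The four steps above can be sketched with a toy key-value store. This is not Memvid's implementation: the `TinyWal` class, the JSON-lines log format, and the in-memory dictionary standing in for the main data are all assumptions chosen to keep the example short. A real WAL would also checkpoint and truncate the log.

```python
import json
import os


class TinyWal:
    """Toy key-value store backed by a write-ahead log (illustrative only)."""

    def __init__(self, wal_path: str):
        self.wal_path = wal_path
        self.data: dict[str, str] = {}
        self.recover()

    def put(self, key: str, value: str) -> None:
        record = json.dumps({"key": key, "value": value})
        # 1. Append the mutation to the log first.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(record + "\n")
            wal.flush()
            # 2. Force the log record to stable storage.
            os.fsync(wal.fileno())
        # 3. Only then apply the change to the main data.
        self.data[key] = value

    def recover(self) -> None:
        """4. Replay every complete log record; skip a torn final line."""
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, encoding="utf-8") as wal:
            for line in wal:
                if not line.endswith("\n"):
                    break  # partial write left by a crash mid-append
                entry = json.loads(line)
                self.data[entry["key"]] = entry["value"]
```

Because the log record is durable before the main data changes, a crash between steps 2 and 3 loses nothing: reopening the store replays the record.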

How long does recovery take?

Recovery is fast:

Typical recovery: < 100ms
Large WAL replay (4MB): < 250ms

Are there any single points of failure?

Not outside the file itself. The `.mv2` file is self-contained:

No external databases
No network dependencies
No sidecar files that could be lost

Access Control

How does file locking work?

Memvid uses OS-level file locks:

Writers: Exclusive lock (one at a time)
Readers: Shared lock (multiple concurrent)

# Check who holds the lock
memvid who knowledge.mv2

# Request release
memvid nudge knowledge.mv2
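
To show what shared and exclusive OS-level locks mean in practice, here is a minimal POSIX sketch using Python's `fcntl.flock`. Which locking primitive Memvid actually uses is not stated above, so treat `flock` and the `locked` helper as assumptions for illustration.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def locked(path: str, exclusive: bool):
    """Hold a shared (reader) or exclusive (writer) advisory lock on path."""
    mode = "r+b" if exclusive else "rb"
    with open(path, mode) as handle:
        flag = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
        # LOCK_NB raises BlockingIOError instead of waiting for the holder.
        fcntl.flock(handle, flag | fcntl.LOCK_NB)
        try:
            yield handle
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

Any number of `locked(path, exclusive=False)` holders can coexist, while `locked(path, exclusive=True)` fails with `BlockingIOError` for as long as any other holder exists.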

Can multiple users access the same file?

Yes, but only one can write at a time:

from memvid_sdk import use

# Reader: shared lock, safe alongside other readers
mem = use('basic', 'knowledge.mv2', read_only=True)

# Writer: exclusive lock, only one at a time
mem = use('basic', 'knowledge.mv2')

Data Privacy

Is my data sent anywhere?

Local operations (search, timeline, stats) never send data anywhere. `ask` operations that use an external LLM (openai, claude, gemini) send the retrieved context to that provider. To avoid or limit this:

Use the local model (tinyllama):

memvid ask knowledge.mv2 --question "What is X?"

Use context-only mode:

memvid ask knowledge.mv2 --question "What is X?" --context-only

Enable PII masking:

memvid ask knowledge.mv2 --question "Contact info?" --mask-pii --use-model openai

What does PII masking protect?

The --mask-pii flag masks sensitive information before sending to external LLMs:

| PII Type | Example | Masked As |
| --- | --- | --- |
| Email addresses | `john@example.com` | `[EMAIL]` |
| Phone numbers | `555-123-4567` | `[PHONE]` |
| US Social Security Numbers | `123-45-6789` | `[SSN]` |
| Credit card numbers | `4111-1111-1111-1111` | `[CREDIT_CARD]` |
| IPv4 addresses | `192.168.1.1` | `[IP_ADDRESS]` |
| API keys/tokens | `sk-abc123...` | `[API_KEY]` |
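
For intuition about how this kind of masking works, the sketch below applies one regular expression per PII type from the table. It is not Memvid's implementation: the patterns, the `mask_text` name, and the `sk-` key prefix rule are simplified assumptions, and real detectors handle many more formats.

```python
import re

# Placeholder and pattern pairs, applied in order.
# ASSUMPTION: simplified patterns that cover only the formats in the table.
PATTERNS = [
    ("[EMAIL]", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("[CREDIT_CARD]", re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")),
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[PHONE]", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
    ("[IP_ADDRESS]", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    ("[API_KEY]", re.compile(r"\bsk-[A-Za-z0-9]{6,}")),
]


def mask_text(text: str) -> str:
    """Replace every match of each pattern with its placeholder."""
    for placeholder, pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


print(mask_text("Contact john@example.com or call 555-123-4567"))
# Contact [EMAIL] or call [PHONE]
```

Pattern-based masking misses anything that does not match a known shape, such as names or street addresses, so it reduces exposure rather than eliminating it.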

Using PII Masking

CLI:

memvid ask knowledge.mv2 --question "Contact info?" --mask-pii --use-model openai

Python SDK:

from memvid_sdk import use

mem = use('basic', 'knowledge.mv2')

# Enable PII masking for ask queries
answer = mem.ask(
    "What are the customer contact details?",
    model="openai:gpt-4o",
    mask_pii=True
)
print(answer['answer'])

# Standalone PII masking function
from memvid_sdk import mask_pii

text = "Contact john@example.com or call 555-123-4567"
masked = mask_pii(text)
# Output: "Contact [EMAIL] or call [PHONE]"

Node.js SDK:

import { use, maskPii } from '@memvid/sdk';

const mem = await use('basic', 'knowledge.mv2');

// Enable PII masking for ask queries
const answer = await mem.ask('What are the customer contact details?', {
  model: 'openai:gpt-4o',
  modelApiKey: process.env.OPENAI_API_KEY,
  maskPii: true
});
console.log(answer.answer);

// Standalone PII masking function
const text = 'Contact john@example.com or call 555-123-4567';
const masked = maskPii(text);
// Output: "Contact [EMAIL] or call [PHONE]"

PII masking is applied to the context sent to external LLMs, not to data stored in the memory file. The original data remains intact.

Verification

How do I verify file integrity?

# Basic verification
memvid verify knowledge.mv2

# Deep verification (all checksums)
memvid verify knowledge.mv2 --deep

# Single-file compliance (no sidecars)
memvid verify-single-file knowledge.mv2

What does deep verification check?

| Check | Description |
| --- | --- |
| `HeaderChecksum` | Header integrity |
| `TocIntegrity` | Table of contents valid |
| `WalConsistency` | WAL state consistent |
| `TimeIndexSortOrder` | Time index properly sorted |
| `LexIndexDecode` | Lexical index readable |
| `VecIndexDecode` | Vector index readable |
| `FrameCountConsistency` | Frame counts match |
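
Two of these checks are simple enough to illustrate directly. The sketch below shows what a sort-order check and a frame-count check could look like over plain Python values; the data shapes and function names are hypothetical and say nothing about how Memvid stores its time index or frame counts.

```python
from typing import Sequence


def time_index_sorted(timestamps: Sequence[int]) -> bool:
    """True when timestamps never decrease from one entry to the next."""
    return all(a <= b for a, b in zip(timestamps, timestamps[1:]))


def frame_counts_match(declared: int, frames: Sequence[object]) -> bool:
    """True when the declared frame count equals the frames actually present."""
    return declared == len(frames)


def run_checks(
    timestamps: Sequence[int], declared: int, frames: Sequence[object]
) -> dict[str, bool]:
    """Collect results under the check names used in the table above."""
    return {
        "TimeIndexSortOrder": time_index_sorted(timestamps),
        "FrameCountConsistency": frame_counts_match(declared, frames),
    }
```

Reporting each check by name, rather than a single pass/fail, tells the operator which index to rebuild.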

Best Practices

File Storage

Use appropriate permissions: Restrict file access to authorized users
Regular backups: Copy `.mv2` files to backup storage
Verify after transfer: Run `memvid verify --deep` after copying files
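
Alongside `memvid verify --deep`, a byte-level comparison confirms that a backup copy is identical to its source. A minimal sketch using only the Python standard library follows; the helper names are illustrative. It complements Memvid's own verification and does not replace it, since a faithful copy of an already-damaged file would still match.

```python
import hashlib
import shutil


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_check(src: str, dst: str) -> bool:
    """Copy src to dst, then confirm both files hash to the same value."""
    shutil.copy2(src, dst)
    return sha256_of(src) == sha256_of(dst)
```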

Production Use

Read-only mode: Use for query-only workloads
Monitor capacity: Check utilization before large ingestions
Periodic verification: Run `memvid verify --deep` weekly

Sensitive Data

PII masking: Always enable for external LLM calls
Local models: Use tinyllama for sensitive queries
Context-only mode: Get relevant docs without LLM synthesis
