Best practices for memory architecture design including user vs agent vs session memory patterns, vector vs graph memory tradeoffs, retention strategies, and performance optimization. Use when designing memory systems, architecting AI memory layers, choosing memory types, planning retention strategies, or when user mentions memory architecture, user memory, agent memory, session memory, memory patterns, vector storage, graph memory, or Mem0 architecture.
Additional assets for this skill:
- README.md
- SKILL_SUMMARY.txt
- examples/customer-support-memory-architecture.md
- retention-policy-session.yaml
- scripts/analyze-memory-costs.sh
- scripts/analyze-memory-performance.sh
- scripts/analyze-retention.sh
- scripts/audit-memory-security.sh
- scripts/deduplicate-memories.sh
- scripts/generate-retention-policy.sh
- scripts/suggest-memory-type.sh
- scripts/suggest-storage-architecture.sh
- templates/graph-memory-config.py
- templates/multi-level-memory-pattern.py
- templates/retention-policy.yaml
- templates/vector-only-config.py

Production-ready memory architecture patterns for AI applications using Mem0. This skill provides guidance on designing scalable, performant memory systems with proper isolation, retention strategies, and optimization techniques.
Mem0 provides three distinct memory scopes, each serving different purposes:
User Memory:
Purpose: Long-term personal preferences, profile data, and user characteristics that persist across all interactions.
Use Cases:
Implementation:
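# Default instance; see the vector-only configuration for details
from mem0 import Memory

memory = Memory()

# Add user-level memory
memory.add(
    "User prefers concise responses without technical jargon",
    user_id="customer_bob",
)

# Search user memories
user_context = memory.search(
    "communication style",
    user_id="customer_bob",
)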
Key Characteristics:
Agent Memory:
Purpose: Agent-specific knowledge, behaviors, and learned patterns that apply across all users interacting with this agent.
Use Cases:
Implementation:
# Add agent-level memory
memory.add(
    "When handling refund requests, always check order date first",
    agent_id="support_agent_v2",
)

# Search agent memories
agent_context = memory.search(
    "refund process",
    agent_id="support_agent_v2",
)
Key Characteristics:
Session Memory:
Purpose: Ephemeral context specific to a single conversation or task session.
Use Cases:
Implementation:
# Add session-level memory
memory.add(
    "Current issue: payment failed with error code 402",
    run_id="session_12345_20250115",
)

# Search session memories
session_context = memory.search(
    "current issue",
    run_id="session_12345_20250115",
)
Key Characteristics:
Vector Memory:
How It Works: Embeddings are stored in a vector database and retrieved by semantic similarity search, ranked by cosine distance.
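To make the similarity step concrete, here is a minimal sketch of cosine ranking over a toy in-memory store. The three-dimensional vectors and the `rank_by_cosine` helper are illustrative assumptions, not part of Mem0; a real deployment gets its embeddings from a model and delegates the search to the vector database. Cosine distance is simply `1 - cosine_similarity`, so sorting by highest similarity is the same as sorting by lowest distance.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_by_cosine(query_vec, store, limit=2):
    """Return the `limit` most similar (score, text) pairs, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Hypothetical embeddings; real ones have hundreds of dimensions.
store = {
    "prefers concise answers": [0.9, 0.1, 0.0],
    "lives in Berlin": [0.0, 0.2, 0.9],
    "dislikes technical jargon": [0.8, 0.3, 0.1],
}
top = rank_by_cosine([1.0, 0.2, 0.0], store)
```

Here the two communication-style memories outrank the unrelated location memory because their vectors point in nearly the same direction as the query.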
Strengths:
Weaknesses:
Best For:
Configuration:
from mem0 import Memory
# Default vector-only configuration
memory = Memory()
Graph Memory:
How It Works: Entities and relationships are stored in a graph database (Neo4j or Memgraph), which enables relationship traversal and complex queries.
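As a rough illustration of what relationship traversal means, the sketch below walks a toy set of triples with a breadth-first search. The in-memory adjacency list stands in for the graph database, and the entity names and the `reachable` helper are hypothetical; a production setup would express the same query in the graph store's own query language.

```python
from collections import deque

# Toy (subject, relation, object) triples; a graph store persists these as nodes and edges.
TRIPLES = [
    ("alice", "manages", "bob"),
    ("bob", "owns", "ticket_42"),
    ("ticket_42", "concerns", "billing"),
    ("carol", "owns", "ticket_7"),
]


def build_adjacency(triples):
    """Index outgoing edges by subject."""
    adjacency = {}
    for subject, relation, obj in triples:
        adjacency.setdefault(subject, []).append((relation, obj))
    return adjacency


def reachable(adjacency, start, max_hops):
    """Breadth-first traversal returning {entity: hops} within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for _relation, neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    del seen[start]
    return seen


graph = build_adjacency(TRIPLES)
related = reachable(graph, "alice", max_hops=3)
```

A multi-hop question such as "which topics are connected to Alice's reports?" is answered by following edges, which a pure similarity search cannot do directly.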
Strengths:
Weaknesses:
Best For:
Configuration:
from mem0 import Memory
from mem0.configs.base import MemoryConfig
config = MemoryConfig(
    graph_store={
        "provider": "neo4j",
        "config": {
            "url": "bolt://localhost:7687",
            "username": "neo4j",
            "password": "password",  # placeholder; load real credentials from the environment
        },
    }
)
memory = Memory(config)
Decision Matrix:
| Use Case | Vector | Graph |
|---|---|---|
| User preferences | ✅ Best | ⚠️ Overkill |
| Product recommendations | ✅ Best | ⚠️ Overkill |
| Customer support | ✅ Good | ✅ Better |
| Knowledge management | ⚠️ Limited | ✅ Best |
| Multi-tenant systems | ✅ Good | ✅ Best |
| Team collaboration | ⚠️ Limited | ✅ Best |
Use the retention strategy template:
bash scripts/generate-retention-policy.sh <memory-type> <retention-days>
User Memory:
Agent Memory:
Session Memory:
Run the retention analyzer:
bash scripts/analyze-retention.sh <user_id_or_agent_id>
This script:
Pattern: Combine all three memory types for comprehensive context.
Template: Use templates/multi-level-memory-pattern.py
Architecture:
Query Processing Flow:
1. Retrieve session context (immediate)
2. Retrieve user context (preferences)
3. Retrieve agent context (capabilities)
4. Merge contexts with priority weighting
5. Generate response with full context
Priority Weighting:
Implementation:
# Retrieve all context levels
session_memories = memory.search(query, run_id=run_id)
user_memories = memory.search(query, user_id=user_id)
agent_memories = memory.search(query, agent_id=agent_id)
# Weighted merge
# merge_contexts is not a Mem0 call; the application must define it
context = merge_contexts(
    session=session_memories,
    user=user_memories,
    agent=agent_memories,
    weights={"session": 0.4, "user": 0.35, "agent": 0.25},
)
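The weighted merge can be sketched as follows. This is one possible `merge_contexts`, not the version shipped in the template: it assumes each scope's results have already been unwrapped into a list of dicts with a `memory` text and a `score` where higher means more relevant (some Mem0 versions wrap results in a `{"results": [...]}` dict, and score semantics can vary by vector store). The `limit` parameter is an added assumption.

```python
def merge_contexts(session, user, agent, weights, limit=5):
    """Rank memories from all scopes by scope weight times similarity score."""
    ranked = []
    for scope, memories in (("session", session), ("user", user), ("agent", agent)):
        for item in memories:
            ranked.append({
                "scope": scope,
                "memory": item["memory"],
                "weighted_score": weights[scope] * item["score"],
            })
    # Highest weighted score first, truncated to keep the prompt small.
    ranked.sort(key=lambda entry: entry["weighted_score"], reverse=True)
    return ranked[:limit]


# Hypothetical search results for illustration only.
context = merge_contexts(
    session=[{"memory": "payment failed with error code 402", "score": 0.9}],
    user=[{"memory": "prefers concise responses", "score": 0.8}],
    agent=[{"memory": "check order date first for refunds", "score": 0.6}],
    weights={"session": 0.4, "user": 0.35, "agent": 0.25},
)
```

Multiplying by the scope weight lets a strong user-level match outrank a weak session-level one, instead of always placing session memories first.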
Run the performance analyzer:
bash scripts/analyze-memory-performance.sh <project_name>
Optimization Techniques:
Limit Search Results:
memories = memory.search(query, user_id=user_id, limit=5)
Use Filters to Reduce Search Space:
memories = memory.search(
    query,
    filters={
        "AND": [
            {"user_id": "alex"},
            {"agent_id": "support_agent"},
        ]
    },
)
Cache Frequently Accessed Memories:
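One way to cache frequent lookups is a small time-to-live wrapper around the search call. The `CachedSearch` class below is a hypothetical helper, not a Mem0 feature; it assumes the wrapped function accepts `query`, `user_id`, and `limit` the way `memory.search` is called elsewhere in this document, and the 60-second default TTL is arbitrary.

```python
import time


class CachedSearch:
    """Caches search results per (query, user_id, limit) for a fixed time window."""

    def __init__(self, search_fn, ttl_seconds=60.0, clock=time.monotonic):
        self._search_fn = search_fn
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    def search(self, query, user_id, limit=5):
        key = (query, user_id, limit)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh: skip the backend call
        result = self._search_fn(query, user_id=user_id, limit=limit)
        self._entries[key] = (now, result)
        return result

    def invalidate(self, user_id):
        """Drop cached results for one user, e.g. after adding a memory for them."""
        self._entries = {k: v for k, v in self._entries.items() if k[1] != user_id}
```

Wrapping a real instance would look like `CachedSearch(memory.search)`. Calling `invalidate(user_id)` after each write for that user keeps the cache from serving stale context.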
Batch Operations:
# Add multiple memories in one call
memory.add(messages, user_id=user_id)
For graph memory:
Run the cost analyzer:
bash scripts/analyze-memory-costs.sh <user_id> <date_range>
Cost Optimization Strategies:
Deduplication: Remove similar/redundant memories
bash scripts/deduplicate-memories.sh <user_id>
Archival: Move old memories to cold storage
Compression: Use shorter embeddings for less critical memories
Smart Pruning: Remove low-value memories
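The deduplication strategy can be illustrated with a small near-duplicate filter. This sketch uses character-level similarity from the standard library as a stand-in; the method used by `scripts/deduplicate-memories.sh` is not shown here, and embedding similarity would be needed to catch paraphrases. The `find_near_duplicates` name and the 0.85 threshold are assumptions.

```python
from difflib import SequenceMatcher


def _normalize(text):
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(text.lower().split())


def find_near_duplicates(memories, threshold=0.85):
    """Return (kept, duplicates); the first occurrence of similar texts is kept."""
    kept, duplicates = [], []
    for text in memories:
        normalized = _normalize(text)
        is_duplicate = any(
            SequenceMatcher(None, normalized, _normalize(existing)).ratio() >= threshold
            for existing in kept
        )
        (duplicates if is_duplicate else kept).append(text)
    return kept, duplicates


kept, duplicates = find_near_duplicates([
    "User prefers email contact",
    "user prefers  email contact",
    "Order 123 was refunded on request",
])
```

The comparison is quadratic in the number of memories, so for large stores it is better run per user in an offline job than on the request path.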
Pattern: Ensure complete data isolation between users/organizations.
Implementation:
# Validate access before retrieval
if not user_has_access(current_user_id, requested_user_id):
    raise PermissionError("Access denied")

# Always scope by user_id or org_id
memories = memory.search(
    query,
    filters={"user_id": requested_user_id},
)
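To keep scoping from depending on every call site remembering it, the check can be centralized in a thin wrapper. The `TenantScopedMemory` class and its `access_check` callback are hypothetical names for this sketch; it assumes the backend exposes `search(query, user_id=...)` as `memory.search` does above.

```python
class TenantScopedMemory:
    """Wraps a memory backend so every search is pinned to an authorized user."""

    def __init__(self, backend, current_user_id, access_check):
        self._backend = backend
        self._current_user_id = current_user_id
        self._access_check = access_check  # callable(current_user_id, target_user_id) -> bool

    def search(self, query, requested_user_id=None, **kwargs):
        target = requested_user_id or self._current_user_id
        if target != self._current_user_id and not self._access_check(
            self._current_user_id, target
        ):
            raise PermissionError("Access denied")
        # Discard any caller-supplied user_id so the scope cannot be overridden.
        kwargs.pop("user_id", None)
        return self._backend.search(query, user_id=target, **kwargs)
```

Application code then receives only the wrapper, never the raw memory object, so an unscoped search is not possible by accident.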
Security Checklist:
Run the security audit:
bash scripts/audit-memory-security.sh
Use the decision helper:
bash scripts/suggest-memory-type.sh "<use_case_description>"
Quick Reference:
Use the architecture advisor:
bash scripts/suggest-storage-architecture.sh "<project_description>"
Decision Criteria:
Scripts (all functional, not placeholders):
- scripts/generate-retention-policy.sh - Create retention policy configs
- scripts/analyze-retention.sh - Analyze memory age and access patterns
- scripts/analyze-memory-performance.sh - Performance profiling
- scripts/analyze-memory-costs.sh - Cost analysis and optimization suggestions
- scripts/deduplicate-memories.sh - Find and remove duplicate memories
- scripts/audit-memory-security.sh - Security compliance checking
- scripts/suggest-memory-type.sh - Interactive memory type advisor
- scripts/suggest-storage-architecture.sh - Architecture recommendation tool

Templates:
- templates/multi-level-memory-pattern.py - Complete implementation
- templates/retention-policy.yaml - Retention configuration
- templates/vector-only-config.py - Vector memory setup
- templates/graph-memory-config.py - Graph memory setup
- templates/hybrid-architecture.py - Vector + Graph combined
- templates/cost-optimization-config.yaml - Cost optimization settings

Examples:
- examples/customer-support-memory-architecture.md - Full implementation guide
- examples/multi-agent-collaboration.md - Shared memory patterns
- examples/e-commerce-personalization.md - Product recommendation memory
- examples/healthcare-assistant.md - HIPAA-compliant memory architecture

Slow Memory Retrieval:
High Costs:
Poor Search Results:
Memory Leakage Between Users:
Plugin: mem0
Version: 1.0.0
Last Updated: 2025-10-27