Design and implement long-term memory systems for AI agents using vector stores, knowledge graphs, and hybrid approaches. Includes benchmarks and decision frameworks. Use when building persistent agent memory, implementing RAG, designing knowledge bases, or when user mentions 'memory', 'RAG', 'vector store', 'knowledge graph', 'long-term memory', 'retrieval', or 'embeddings'.

    /plugin marketplace add greyhaven-ai/claude-code-config
    /plugin install knowledge-base@grey-haven-plugins

This skill is limited to using the following tools:

Bundled files:
checklists/memory-selection-checklist.md
reference/architecture-patterns.md

Design and implement long-term memory systems for AI agents.
Memory exists on a spectrum from ephemeral to permanent:

    Ephemeral ◄────────────────────────────────────► Permanent

    Context Window    Short-term    Long-term    Knowledge
    (disappears)      Cache         Memory       Base
                      (session)     (weeks)      (forever)

| Memory Type | Duration | Use Case |
|---|---|---|
| Context window | Single turn | Immediate task context |
| Short-term cache | Session | Conversation history |
| Long-term memory | Weeks/months | User preferences, learnings |
| Knowledge base | Permanent | Facts, documentation, procedures |
Store embeddings, retrieve by semantic similarity.
Pros:
Cons:
Best for: Document search, FAQ systems, code search
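
To make "store embeddings, retrieve by semantic similarity" concrete, here is a minimal sketch of the idea. It is not a production store: `embed` is a toy bag-of-words stand-in for a real embedding model, and `TinyVectorStore` is a hypothetical name introduced only for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store that keeps (text, vector) pairs in a list."""

    def __init__(self):
        self.items = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 3) -> list:
        # Score every stored item against the query and keep the k best
        q = embed(query)
        scored = [(cosine(q, vec), text) for text, vec in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


store = TinyVectorStore()
store.add("reset your password from the account settings page")
store.add("invoices are emailed on the first day of each month")
store.add("the API rate limit is 100 requests per minute")
top = store.search("how do I reset my password", k=1)
print(top)  # ['reset your password from the account settings page']
```

A real deployment would likely swap `embed` for a learned embedding model and the linear scan for an approximate nearest-neighbor index; the add/search shape stays the same.
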
Store entities and relationships explicitly.
Pros:
Cons:
Best for: Domain modeling, reasoning tasks, complex queries
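
As an illustration of storing entities and relationships explicitly, the sketch below keeps (subject, relation, object) triples in an adjacency list and answers multi-hop queries with a breadth-first traversal. `TinyGraph` and the sample facts are hypothetical; a graph database would normally fill this role.

```python
from collections import defaultdict, deque
from typing import Optional


class TinyGraph:
    """Hypothetical triple store holding (subject, relation, object) edges."""

    def __init__(self):
        self.out_edges = defaultdict(list)  # subject -> [(relation, object)]

    def add(self, subject: str, relation: str, obj: str) -> None:
        self.out_edges[subject].append((relation, obj))

    def neighbors(self, entity: str, relation: Optional[str] = None) -> list:
        """Objects linked from `entity`, optionally filtered by relation."""
        return [o for r, o in self.out_edges.get(entity, [])
                if relation is None or r == relation]

    def reachable(self, start: str, max_hops: int = 2) -> set:
        """Entities within `max_hops` edges of `start` (breadth-first)."""
        seen, queue = {start}, deque([(start, 0)])
        while queue:
            node, depth = queue.popleft()
            if depth == max_hops:
                continue  # do not expand beyond the hop limit
            for _, nxt in self.out_edges.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
        return seen - {start}


graph = TinyGraph()
graph.add("alice", "works_at", "acme")
graph.add("alice", "knows", "bob")
graph.add("acme", "located_in", "berlin")
graph.add("berlin", "part_of", "germany")

print(graph.neighbors("alice", "works_at"))  # ['acme']
print(sorted(graph.reachable("alice", 2)))   # ['acme', 'berlin', 'bob']
```

The two-hop query returns "berlin" even though no single stored fact links it to "alice", which is the kind of relational question a pure similarity search cannot answer directly.
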
Knowledge graphs with time-based relationships.
Pros:
Cons:
Best for: Historical analysis, change tracking, audit trails
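
One way to picture time-based relationships is to give each edge a validity interval and query the graph "as of" a date. The sketch below assumes single-valued relations (a new fact closes the previous one); `TemporalEdge`, `TemporalGraph`, and the sample data are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class TemporalEdge:
    subject: str
    relation: str
    obj: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the fact is still current

    def holds_on(self, day: date) -> bool:
        return self.valid_from <= day and (self.valid_to is None or day < self.valid_to)


class TemporalGraph:
    def __init__(self):
        self.edges = []

    def assert_fact(self, subject: str, relation: str, obj: str, start: date) -> None:
        """Record a new fact and close any open fact it supersedes."""
        for edge in self.edges:
            if (edge.subject, edge.relation) == (subject, relation) and edge.valid_to is None:
                edge.valid_to = start
        self.edges.append(TemporalEdge(subject, relation, obj, start))

    def as_of(self, subject: str, relation: str, day: date) -> list:
        """Objects for which the relation held on the given day."""
        return [e.obj for e in self.edges
                if (e.subject, e.relation) == (subject, relation) and e.holds_on(day)]


g = TemporalGraph()
g.assert_fact("alice", "works_at", "acme", date(2022, 1, 1))
g.assert_fact("alice", "works_at", "globex", date(2024, 3, 1))

print(g.as_of("alice", "works_at", date(2023, 6, 1)))  # ['acme']
print(g.as_of("alice", "works_at", date(2024, 6, 1)))  # ['globex']
```

Superseded edges are closed rather than deleted, so the full history remains available for change tracking and audits.
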
Combine vector search with graph traversal to get the strengths of both:

    Query ──▶ Vector Search ──▶ Top K candidates
                                      │
                                      ▼
                                Graph Traversal ──▶ Related entities
                                      │
                                      ▼
                                  Re-ranking ──▶ Final results

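
The three stages in the diagram can be sketched as follows. This is an illustrative toy under stated assumptions: `DOCS`, `LINKS`, the bag-of-words `embed`, and the fixed `graph_bonus` re-ranking rule are all hypothetical choices, not a method the source prescribes.

```python
import math
from collections import Counter

# Hypothetical corpus and graph: document id -> text, document id -> linked ids
DOCS = {
    "d1": "postgres connection pool settings",
    "d2": "pgbouncer deployment guide",
    "d3": "office holiday schedule",
    "d4": "tuning postgres connection limits",
}
LINKS = {"d1": ["d2"], "d4": ["d1"]}


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(query: str, k: int = 2, graph_bonus: float = 0.2) -> list:
    q = embed(query)
    # Stage 1: vector search for the top-k candidates
    scores = {doc_id: cosine(q, embed(text)) for doc_id, text in DOCS.items()}
    candidates = sorted(scores, key=scores.get, reverse=True)[:k]
    # Stage 2: graph traversal, one hop out from each candidate
    related = {n for c in candidates for n in LINKS.get(c, [])}
    # Stage 3: re-rank the union; graph-linked documents get a fixed bonus
    pool = set(candidates) | related
    final = {d: scores[d] + (graph_bonus if d in related else 0.0) for d in pool}
    return sorted(final, key=lambda d: (-final[d], d))


print(hybrid_search("postgres connection"))  # ['d1', 'd4', 'd2']
```

Here "d2" shares no words with the query, so vector search alone would miss it; it enters the results only because the graph links it to a strong candidate.
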
Research benchmarks for memory systems (2024 data):
| System | Recall@10 | Latency (P50) | Cost/Query |
|---|---|---|---|
| Zep | 94.8% | 45ms | $0.0001 |
| MemGPT | 93.4% | 120ms | $0.0003 |
| LangChain Memory | 87.2% | 80ms | $0.0002 |
| Simple RAG | 78.5% | 30ms | $0.00005 |
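
The source does not say how these figures were measured. For readers comparing their own system against the table, the sketch below shows the conventional definitions of Recall@k and P50 (median) latency; the function name and sample data are hypothetical.

```python
import statistics


def recall_at_k(retrieved: list, relevant: set, k: int = 10) -> float:
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


# Hypothetical query: ranked results from the memory system vs. ground truth
retrieved = ["a", "b", "c", "d"]
relevant = {"a", "d", "z"}

r3 = recall_at_k(retrieved, relevant, k=3)  # only "a" is in the top 3 -> 1/3
r4 = recall_at_k(retrieved, relevant, k=4)  # "a" and "d" are found -> 2/3

# P50 latency is the median of the per-query latencies (milliseconds here)
latencies_ms = [30, 45, 50, 120, 44]
p50 = statistics.median(latencies_ms)  # 45
```

Averaging `recall_at_k` over a labeled query set gives a single Recall@10 figure comparable in form to the column above.
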
examples/
reference/
checklists/

This skill complements the knowledge-base agents:

| Agent | Purpose |
|---|---|
| memory-architect | Design memory storage, semantic search |
| knowledge-curator | Create and organize knowledge entries |
| ontology-builder | Map relationships between entries |
| kb-search-analyzer | Search and synthesize from KB |
| kb-entry-creator | Create structured KB entries |
| kb-validator | Validate KB integrity |
| kb-manifest-generator | Generate KB indexes |
| kb-ontology-mapper | Visualize knowledge structure |

    # VectorStore, EntityStore, and extract_entities are placeholders for the
    # embedding store, entity store, and entity-extraction step you plug in.
    class ConversationMemory:
        def __init__(self):
            self.short_term = []            # Last N messages
            self.long_term = VectorStore()  # Semantic search
            self.entities = EntityStore()   # Mentioned entities

        def add_message(self, message: str, role: str):
            # Short-term: sliding window of the 20 most recent messages
            self.short_term.append({"role": role, "content": message})
            if len(self.short_term) > 20:
                self.short_term.pop(0)

            # Long-term: embed and store
            self.long_term.add(message, metadata={"role": role})

            # Entity extraction
            entities = extract_entities(message)
            self.entities.update(entities)

        def retrieve(self, query: str, k: int = 5) -> dict:
            # Combine short-term + relevant long-term
            recent = self.short_term[-5:]
            similar = self.long_term.search(query, k=k)
            entities = self.entities.get_relevant(query)
            return {
                "recent": recent,
                "similar": similar,
                "entities": entities,
            }
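
    from datetime import datetime, timezone

    # EntityRecord is a placeholder for a record type exposing update(),
    # .type, .properties, and .last_mentioned.
    class EntityMemory:
        def __init__(self):
            self.entities = {}       # entity_name -> EntityRecord
            self.relationships = []  # (entity1, relation, entity2)

        def update(self, entity: str, info: dict):
            if entity not in self.entities:
                self.entities[entity] = EntityRecord(entity)
            self.entities[entity].update(info)
            self.entities[entity].last_mentioned = datetime.now(timezone.utc)

        def get_relationships(self, entity: str) -> list:
            # Assumed helper: every triple that mentions the entity on either side
            return [r for r in self.relationships if entity in (r[0], r[2])]

        def get_context(self, entity: str) -> str:
            if entity not in self.entities:
                return ""
            record = self.entities[entity]
            related = self.get_relationships(entity)
            return (
                f"Entity: {entity}\n"
                f"Type: {record.type}\n"
                f"Properties: {record.properties}\n"
                f"Related: {related}\n"
                f"Last mentioned: {record.last_mentioned}\n"
            )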
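
    # LRUCache can be any mapping-style LRU cache (for example cachetools.LRUCache).
    # VectorStore and PersistentStore are placeholders assumed to support
    # key lookup via get() and, for the warm tier, add(key, value).
    class TieredMemory:
        def __init__(self):
            self.hot = LRUCache(100)       # Frequent access
            self.warm = VectorStore()      # Semantic search
            self.cold = PersistentStore()  # Rarely accessed

        def get(self, key: str):
            # Check hot first
            if key in self.hot:
                return self.hot[key]

            # Then warm; compare against None so falsy values still count as hits
            result = self.warm.get(key)
            if result is not None:
                self.hot[key] = result  # Promote to hot
                return result

            # Finally cold
            result = self.cold.get(key)
            if result is not None:
                self.warm.add(key, result)  # Promote to warm
                return result

            return None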

context-management - Managing context in workflows
data-modeling - Designing memory data structures
llm-project-development - Building LLM applications

    # Understand architecture options
    cat reference/architecture-patterns.md

    # See implementation examples
    cat examples/conversation-memory.md

    # Use selection checklist
    cat checklists/memory-selection-checklist.md

Skill Version: 1.0
Key Benchmark: Zep 94.8% recall, 45ms latency
Related Agents: 8 knowledge-base agents
Last Updated: 2025-01-15