Memory
Agentspan provides two memory systems: ConversationMemory for managing chat history and SemanticMemory for long-term knowledge retrieval. They serve different purposes and can be used together.
Import
from agentspan.agents import Agent, ConversationMemory
from agentspan.agents.semantic_memory import SemanticMemory
ConversationMemory
Manages chat history as a list of messages. Messages are prepended to the LLM’s message list at compile time, giving the LLM context from previous interactions.
from agentspan.agents import Agent, AgentRuntime, ConversationMemory
memory = ConversationMemory(max_messages=100)
agent = Agent(
name="assistant",
model="openai/gpt-4o",
instructions="You are a helpful assistant.",
memory=memory,
)
with AgentRuntime() as runtime:
result = runtime.run(agent, "My name is Alice.")
memory.add_user_message("My name is Alice.")
memory.add_assistant_message(result.output['result'])
result2 = runtime.run(agent, "What's my name?")
result2.print_result() # "Your name is Alice."
Parameters
| Field | Type | Default | Description |
|---|---|---|---|
messages | list[dict] | [] | Accumulated conversation messages |
max_messages | int | None | Maximum messages to retain. None = unlimited |
Methods
| Method | Description |
|---|---|
add_user_message(content) | Append a user message |
add_assistant_message(content) | Append an assistant message |
add_system_message(content) | Append a system message |
add_tool_call(tool_name, args, task_reference_name) | Record a tool invocation |
add_tool_result(tool_name, result, task_reference_name) | Record a tool result |
clear() | Clear all history |
Trimming Behavior
When max_messages is set and the message count exceeds it:
- System messages are preserved in their original positions
- Oldest non-system messages are removed first
- Budget:
max_messages - system_countnon-system messages kept (newest)
SemanticMemory
Long-term memory with similarity-based retrieval. Stores facts, preferences, and knowledge recalled by relevance to the current query.
from agentspan.agents.semantic_memory import SemanticMemory
memory = SemanticMemory(max_results=3)
memory.add("Customer prefers email communication.")
memory.add("Account is on the Enterprise plan since March 2021.")
memory.add("Last issue: billing discrepancy on invoice #1042.")
context = memory.get_context("What plan am I on?")
# "Relevant context from memory:\n 1. Account is on the Enterprise plan..."
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
store | MemoryStore | InMemoryStore() | Storage backend |
max_results | int | 5 | Maximum memories to retrieve per query |
session_id | str | None | Optional session scope |
Methods
| Method | Returns | Description |
|---|---|---|
add(content, metadata) | str (entry ID) | Store a memory |
search(query, top_k) | list[str] | Return relevant memory content strings |
search_entries(query, top_k) | list[MemoryEntry] | Return full MemoryEntry objects |
get_context(query) | str | Get memories formatted for prompt injection |
delete(memory_id) | bool | Delete a memory by ID |
clear() | — | Delete all memories |
list_all() | list[MemoryEntry] | Return all stored memories |
Usage: Expose as a Tool (recommended)
The agent decides when to search and what to query:
from agentspan.agents import Agent, AgentRuntime, tool
from agentspan.agents.semantic_memory import SemanticMemory
memory = SemanticMemory(max_results=3)
memory.add("User prefers Python over JavaScript")
memory.add("User is a senior engineer with 10 years experience")
@tool
def get_context(query: str) -> str:
"""Retrieve relevant context from memory."""
return memory.get_context(query)
agent = Agent(
name="assistant",
model="openai/gpt-4o",
tools=[get_context],
)
with AgentRuntime() as runtime:
result = runtime.run(agent, "What language should I use?")
result.print_result()
Usage: Inject into System Prompt
def build_instructions() -> str:
context = memory.get_context("relevant query")
return f"You are a support agent.\n\n{context}"
agent = Agent(
name="support",
model="openai/gpt-4o",
instructions=build_instructions, # callable
)
Storage Backends
The default InMemoryStore uses Jaccard similarity. Non-persistent — suitable for development.
For production, implement MemoryStore:
from agentspan.agents.semantic_memory import MemoryStore, MemoryEntry, SemanticMemory
class PineconeStore(MemoryStore):
def __init__(self, index_name: str, api_key: str):
self.index = pinecone.Index(index_name, api_key=api_key)
def add(self, entry: MemoryEntry) -> str:
embedding = get_embedding(entry.content)
self.index.upsert([(entry.id, embedding, {"content": entry.content})])
return entry.id
def search(self, query: str, top_k: int = 5) -> list[MemoryEntry]:
embedding = get_embedding(query)
results = self.index.query(embedding, top_k=top_k)
return [MemoryEntry(id=r.id, content=r.metadata["content"]) for r in results.matches]
def delete(self, memory_id: str) -> bool:
self.index.delete(ids=[memory_id])
return True
def clear(self) -> None:
self.index.delete(delete_all=True)
def list_all(self) -> list[MemoryEntry]:
...
memory = SemanticMemory(store=PineconeStore("my-index", api_key="..."))
Compatible backends: Pinecone, Weaviate, ChromaDB, Qdrant, Mem0, or any vector search service.
Comparison
| ConversationMemory | SemanticMemory | |
|---|---|---|
| Purpose | Chat history (messages) | Long-term knowledge (facts) |
| Retrieval | All messages, FIFO trimmed | Similarity search |
| Injection | Prepended as LLM messages | Formatted text via tool or instructions |
| Persistence | In-process (lost on restart) | Pluggable backend (can persist) |
| Best for | Multi-turn conversations in a session | Cross-session preferences, user facts |