Best Practices
How to use memory like a pro 💪
You know how to use Memof.ai. Now let's make sure you're using it well. These are battle-tested patterns from production apps handling millions of memories.
Memory Design Patterns
Pattern 1: User-Centric Bots
When to use: Customer-facing apps, personalization, user contexts
# One bot per user
user_bot = client.bots.create({
    'name': f'user_{user_id}',
    'workspace_id': workspace_id,
    'description': f'Personal memories for user {user_id}'
})

# Store everything about this user
client.memories.create({
    'bot_id': user_bot['id'],
    'content': f'User name: {name}, email: {email}, preferences: {prefs}'
})
Pros: ✅ Clear separation, ✅ Easy privacy management, ✅ Scales well
Cons: ❌ Can create many bots
Pattern 2: Shared Knowledge Bot
When to use: FAQs, product information, company knowledge
# One bot for shared knowledge
knowledge_bot = client.bots.create({
    'name': 'Product Knowledge Base',
    'workspace_id': workspace_id
})

# All users search the same knowledge
results = client.memories.search({
    'bot_id': knowledge_bot['id'],
    'query': user_question
})
Pros: ✅ Efficient, ✅ Single source of truth, ✅ Easy updates
Cons: ❌ No user-specific data
Pattern 3: Hybrid Approach
When to use: Complex apps needing both personal and shared context
class ContextManager:
    def __init__(self, user_id, workspace_id):
        # Personal bot
        self.user_bot = self._get_user_bot(user_id, workspace_id)
        # Shared bot
        self.knowledge_bot = self._get_knowledge_bot(workspace_id)

    def get_context(self, query):
        # Search both personal and shared knowledge
        personal = client.memories.search({
            'bot_id': self.user_bot['id'],
            'query': query,
            'limit': 3
        })
        shared = client.memories.search({
            'bot_id': self.knowledge_bot['id'],
            'query': query,
            'limit': 3
        })
        return {
            'personal': personal,
            'shared': shared
        }
Pros: ✅ Best of both worlds, ✅ Flexible
Cons: ❌ More complex, ❌ More API calls
Pattern 4: Session-Based Bots
When to use: Temporary contexts, conversations, short-term memory
# Create a bot for the conversation
session_bot = client.bots.create({
    'name': f'session_{session_id}',
    'workspace_id': workspace_id,
    'description': f'Conversation context for {session_id}'
})

# Store conversation history
for message in conversation:
    client.memories.create({
        'bot_id': session_bot['id'],
        'content': f'{message["role"]}: {message["content"]}'
    })

# Clean up when done (or keep for history):
# client.bots.delete(session_bot['id'])
Pros: ✅ Isolated contexts, ✅ Clean separation
Cons: ❌ Can accumulate old sessions (see the cleanup script under Data Management)
Writing Good Memories
Quality over quantity ✨
✅ DO: Write Natural Language
# Good - Natural and searchable
"Sarah from Acme Corp is the CTO, interested in API integration,
prefers technical documentation, timezone EST"
# Bad - Too structured
"name:Sarah;company:Acme;role:CTO;interest:API;pref:docs;tz:EST"
Why? Semantic search works best with natural language!
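If your data starts out structured, convert it to a sentence before storing. A minimal sketch (the helper and field names are hypothetical, not part of the SDK):

def profile_to_memory(profile):
    """Turn a structured profile into a natural-language memory string."""
    return (
        f"{profile['name']} from {profile['company']} is the {profile['role']}, "
        f"interested in {profile['interest']}, prefers {profile['preference']}, "
        f"timezone {profile['timezone']}"
    )

client.memories.create({
    'bot_id': bot_id,
    'content': profile_to_memory({
        'name': 'Sarah', 'company': 'Acme Corp', 'role': 'CTO',
        'interest': 'API integration', 'preference': 'technical documentation',
        'timezone': 'EST',
    })
})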
✅ DO: Include Context
# Good - Has who, what, when, why
"Customer John reported slow API responses on 2025-01-15 around 3 PM EST.
Issue was traced to database query timeout. Resolved by adding index."
# Bad - Missing context
"slow API fixed"
✅ DO: Be Specific
# Good - Specific and actionable
"User prefers Python 3.11+, FastAPI framework, PostgreSQL database,
deployed on AWS, uses Docker containers"
# Bad - Too vague
"User likes Python"
❌ DON'T: Store Sensitive Data Unencrypted
# Bad - Never store raw passwords/keys!
"User password: hunter2"
"API key: sk_live_..."
# Good - Store references or encrypted versions
"User has password set, last changed 2025-01-15"
"User has API key configured (key ID: key_abc123)"
❌ DON'T: Store Everything
# Bad - Too granular
"User clicked button at 10:30:45"
"User moved mouse"
"User scrolled page"
# Good - Store meaningful events
"User completed onboarding flow, viewed tutorial, enabled notifications"
❌ DON'T: Use Only Keywords
# Bad - Hard to search semantically
"python fastapi docker aws"
# Good - Use sentences
"User's tech stack includes Python with FastAPI framework,
containerized with Docker, and deployed on AWS"
Searching Effectively
Getting the best results 🎯
Use Broad Queries
# Good - Broad concept
results = client.memories.search({
    'bot_id': bot_id,
    'query': 'programming language preferences'
})

# Will find:
# - "User loves Python"
# - "Prefers TypeScript over JavaScript"
# - "Enjoys functional programming"
Adjust Result Limits
# For quick context - just the top results
quick = client.memories.search({
    'bot_id': bot_id,
    'query': 'user preferences',
    'limit': 3  # Just the top 3
})

# For a comprehensive search
comprehensive = client.memories.search({
    'bot_id': bot_id,
    'query': 'project history',
    'limit': 20  # Get more context
})
Combine Multiple Searches
# Search different aspects
tech_stack = client.memories.search({
    'bot_id': bot_id,
    'query': 'technology preferences'
})
work_style = client.memories.search({
    'bot_id': bot_id,
    'query': 'work habits and schedule'
})
projects = client.memories.search({
    'bot_id': bot_id,
    'query': 'current projects'
})

# Combine results for a complete picture
context = {
    'tech': tech_stack,
    'style': work_style,
    'projects': projects
}
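The three searches are independent, so they don't have to run sequentially. A sketch that runs them concurrently with a thread pool (the same approach as the bulk-write example under Performance Optimization):

from concurrent.futures import ThreadPoolExecutor

queries = {
    'tech': 'technology preferences',
    'style': 'work habits and schedule',
    'projects': 'current projects',
}

def run_search(query):
    return client.memories.search({'bot_id': bot_id, 'query': query})

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = {key: executor.submit(run_search, q) for key, q in queries.items()}
    context = {key: future.result() for key, future in futures.items()}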
Performance Optimization
Making it fast ⚡
1. Cache Bot IDs
# Bad - Fetching the bot list every time
def get_user_context(user_id):
    bots = client.bots.list()
    user_bot = find_bot(bots, user_id)  # Slow!
    return client.memories.search({'bot_id': user_bot['id'], ...})

# Good - Cache bot IDs
BOT_CACHE = {}

def get_user_bot_id(user_id):
    if user_id not in BOT_CACHE:
        bots = client.bots.list()
        BOT_CACHE[user_id] = find_bot(bots, user_id)['id']
    return BOT_CACHE[user_id]
2. Batch Operations
# Bad - One request at a time
for item in items:
    client.memories.create({'bot_id': bot_id, 'content': item})

# Good - Parallelize writes with a thread pool
from concurrent.futures import ThreadPoolExecutor

def store_memory(content):
    return client.memories.create({'bot_id': bot_id, 'content': content})

with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(store_memory, items))
3. Use Appropriate Limits
# Don't over-fetch
results = client.memories.search({
    'bot_id': bot_id,
    'query': 'preferences',
    'limit': 5  # Usually enough!
})

# Instead of
all_results = client.memories.search({
    'bot_id': bot_id,
    'query': 'preferences',
    'limit': 100  # Overkill for most cases
})
4. Implement Retry Logic
from time import sleep

def store_with_retry(memory_data, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.memories.create(memory_data)
        except Exception:
            if attempt == max_retries - 1:
                raise
            sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s
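In production you usually only want to retry transient failures and fail fast on everything else. A variant that backs off only on rate limits, reusing the RateLimitError class from the error-handling section below (and the sleep import above):

from memofai.exceptions import RateLimitError

def store_with_retry_selective(memory_data, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.memories.create(memory_data)
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            sleep(2 ** attempt)  # Back off only when throttled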
Production Checklist
Before you go live 🚀
Security
import os
# Good
api_token = os.getenv('MEMOFAI_TOKEN')
# Bad
api_token = 'moa_hardcoded_token_12345' # Never do this!
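For local development, a .env file plus python-dotenv (assuming you have it installed) keeps tokens out of your shell history and your source tree:

from dotenv import load_dotenv

load_dotenv()  # Loads MEMOFAI_TOKEN from a local .env file into the environment
api_token = os.getenv('MEMOFAI_TOKEN')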
Error Handling
from memofai import create_moa_client
from memofai.exceptions import MoaError, RateLimitError

client = create_moa_client(api_token=os.getenv('MEMOFAI_TOKEN'))

memory_data = {'bot_id': bot_id, 'content': content}

try:
    memory = client.memories.create(memory_data)
except RateLimitError:
    # Handle rate limits
    logger.warning("Rate limit hit, queuing for retry")
    queue_for_retry(memory_data)
except MoaError as e:
    # Handle other API errors
    logger.error(f"API error: {e}")
    # Fallback behavior
except Exception:
    # Handle unexpected errors
    logger.exception("Unexpected error")
    # Alert monitoring system
Monitoring
import logging
from time import time

logger = logging.getLogger(__name__)

def search_with_monitoring(bot_id, query):
    start = time()
    try:
        results = client.memories.search({
            'bot_id': bot_id,
            'query': query
        })
        duration = time() - start
        logger.info(f"Search completed in {duration:.2f}s, "
                    f"found {len(results)} results")
        # Track metrics (`metrics` here stands for your app's metrics client)
        metrics.record('memofai.search.duration', duration)
        metrics.record('memofai.search.results', len(results))
        return results
    except Exception as e:
        logger.error(f"Search failed: {e}")
        metrics.increment('memofai.search.errors')
        raise
Data Management
# Cleanup script (run periodically)
from datetime import datetime, timedelta

def cleanup_old_session_bots():
    """Delete session bots older than 7 days."""
    cutoff = datetime.now() - timedelta(days=7)
    bots = client.bots.list()
    for bot in bots:
        if bot['name'].startswith('session_'):
            created = datetime.fromisoformat(bot['created_at'])
            if created < cutoff:
                client.bots.delete(bot['id'])
                logger.info(f"Deleted old session bot: {bot['id']}")
Common Pitfalls
Learn from others' mistakes 🚧
Pitfall 1: Over-Engineering
# ❌ Too complex
class MemoryManager:
    def __init__(self):
        self.cache = {}
        self.queue = []
        self.workers = []
    # ... 200 more lines

# ✅ Keep it simple
def store_memory(bot_id, content):
    return client.memories.create({
        'bot_id': bot_id,
        'content': content
    })
Lesson: Start simple, optimize when needed.
Pitfall 2: Ignoring Rate Limits
# ❌ Hammering the API
for i in range(10000):
    client.memories.create(...)  # Will hit the rate limit!

# ✅ Respect rate limits
from time import sleep

for i in range(10000):
    client.memories.create(...)
    if i % 100 == 0:
        sleep(1)  # Brief pause every 100 requests
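For bursty workloads, pair this with the store_with_retry helper from the performance section above, which backs off automatically when a request does get throttled.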
Pitfall 3: Not Handling Errors
# ❌ No error handling
memory = client.memories.create(data)  # What if it fails?

# ✅ Proper error handling
try:
    memory = client.memories.create(data)
except Exception as e:
    logger.error(f"Failed to store memory: {e}")
    # Fallback or retry logic
Pitfall 4: Storing Too Much
# ❌ Storing everything
for event in user_events:  # 10,000 events!
    client.memories.create({
        'bot_id': bot_id,
        'content': f'User did {event}'
    })

# ✅ Store meaningful summaries
summary = summarize_events(user_events)
client.memories.create({
    'bot_id': bot_id,
    'content': summary  # One meaningful memory
})
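The summarize_events helper is yours to define; it might be an LLM call or a plain aggregation. A minimal counting sketch, just to make the example self-contained:

from collections import Counter

def summarize_events(events):
    """Collapse raw event names into one readable summary line."""
    counts = Counter(events)
    parts = [f"{event} ({count}x)" for event, count in counts.most_common(5)]
    return "Recent user activity: " + ", ".join(parts)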
Testing Strategies
Test before you ship 🧪
Unit Tests
import unittest
from unittest.mock import patch

class TestMemoryManager(unittest.TestCase):
    @patch('memofai.create_moa_client')
    def test_store_memory(self, mock_client):
        # Mock the client
        mock_client.return_value.memories.create.return_value = {
            'id': 'mem_123',
            'content': 'test'
        }
        # Test your code
        manager = MemoryManager()
        result = manager.store('test content')
        self.assertEqual(result['id'], 'mem_123')
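The MemoryManager here is your own wrapper, not part of the SDK. A minimal version consistent with the test above (note the module-level lookup, which is what makes @patch('memofai.create_moa_client') take effect):

import os
import memofai

class MemoryManager:
    def __init__(self, bot_id='bot_test'):
        # Resolved through the module so the test's patch replaces it
        self.client = memofai.create_moa_client(api_token=os.getenv('MEMOFAI_TOKEN'))
        self.bot_id = bot_id

    def store(self, content):
        return self.client.memories.create({
            'bot_id': self.bot_id,
            'content': content
        })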
Integration Tests
def test_full_workflow():
    """Test the complete workflow against the real API."""
    # Use the test environment
    client = create_moa_client(
        api_token=os.getenv('MEMOFAI_TEST_TOKEN'),
        environment='sandbox'  # Use the sandbox!
    )

    # Create a test workspace
    workspace = client.workspaces.create({'name': 'Test Workspace'})
    bot = None
    try:
        # Test operations
        bot = client.bots.create({
            'name': 'Test Bot',
            'workspace_id': workspace['id']
        })
        memory = client.memories.create({
            'bot_id': bot['id'],
            'content': 'Test memory'
        })
        results = client.memories.search({
            'bot_id': bot['id'],
            'query': 'test'
        })
        assert len(results) > 0
    finally:
        # Cleanup (guard against a failure before the bot was created)
        if bot is not None:
            client.bots.delete(bot['id'])
        client.workspaces.delete(workspace['id'])
Environment Management
Different environments for different stages 🌍
import os

# Known API endpoints per environment (for reference)
ENVIRONMENTS = {
    'development': 'http://127.0.0.1:8000',
    'sandbox': 'https://sandbox-api.memof.ai',
    'production': 'https://api.memof.ai'
}

env = os.getenv('APP_ENV', 'development')
token_key = f'MEMOFAI_TOKEN_{env.upper()}'

client = create_moa_client(
    api_token=os.getenv(token_key),
    environment=env
)
Scaling Considerations
For when you get popular 📈
1. Database Caching
# Cache bot IDs in your own database
class User(Model):
    id = Column(Integer, primary_key=True)
    memofai_bot_id = Column(String)  # Cache the bot ID

def get_user_bot_id(user):
    if not user.memofai_bot_id:
        # Create and cache
        bot = client.bots.create({
            'name': f'user_{user.id}',
            'workspace_id': WORKSPACE_ID
        })
        user.memofai_bot_id = bot['id']
        db.session.commit()
    return user.memofai_bot_id
2. Async Workers
# Use task queues for heavy operations
from celery import Celery

celery = Celery('tasks')

@celery.task
def store_memories_async(bot_id, memories):
    """Store memories in the background."""
    for memory in memories:
        client.memories.create({
            'bot_id': bot_id,
            'content': memory
        })

# In your API
@app.post('/store')
def store_endpoint(memories: list):
    store_memories_async.delay(bot_id, memories)
    return {'status': 'queued'}
3. Load Balancing
# Distribute requests across multiple clients
from random import choice

clients = [
    create_moa_client(api_token=token1),
    create_moa_client(api_token=token2),
    create_moa_client(api_token=token3),
]

def get_client():
    return choice(clients)

# Usage
client = get_client()
results = client.memories.search(...)
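random.choice balances only statistically over many requests. If you want strict round-robin instead, itertools.cycle is a drop-in alternative:

from itertools import cycle

client_pool = cycle(clients)

def get_client():
    # Note: add a lock if this is called from multiple threads
    return next(client_pool)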
TL;DR Quick Reference
The cheat sheet 📋
DO ✅:
- Write natural language memories
- Include context (who, what, when, why)
- Cache bot IDs
- Handle errors gracefully
- Use appropriate search limits
- Test in sandbox first
- Store tokens in environment variables
DON'T ❌:
- Store sensitive data unencrypted
- Ignore rate limits
- Over-engineer initially
- Store every single event
- Hardcode API tokens
- Skip error handling
Need More Help?
Ready to optimize? → Troubleshooting Guide