Best Practices

How to memory like a pro πŸ†

You know how to use Memof.ai. Now let's make sure you're using it well. These are battle-tested patterns from production apps handling millions of memories.

Memory Design Patterns

Pattern 1: User-Centric Bots

When to use: Customer-facing apps, personalization, user contexts

# One bot per user
user_bot = client.bots.create({
    'name': f'user_{user_id}',
    'workspace_id': workspace_id,
    'description': f'Personal memories for user {user_id}'
})

# Store everything about this user
client.memories.create({
    'bot_id': user_bot['id'],
    'content': f'User name: {name}, email: {email}, preferences: {prefs}'
})

Pros: βœ… Clear separation, βœ… Easy privacy management, βœ… Scales well
Cons: ❌ Can create many bots

Pattern 2: Shared Knowledge Bot

When to use: FAQs, product information, company knowledge

# One bot for shared knowledge
knowledge_bot = client.bots.create({
    'name': 'Product Knowledge Base',
    'workspace_id': workspace_id
})

# All users search the same knowledge
results = client.memories.search({
    'bot_id': knowledge_bot['id'],
    'query': user_question
})

Pros: βœ… Efficient, βœ… Single source of truth, βœ… Easy updates
Cons: ❌ No user-specific data

Pattern 3: Hybrid Approach

When to use: Complex apps needing both personal and shared context

class ContextManager:
    def __init__(self, user_id, workspace_id):
        # Personal bot
        self.user_bot = self._get_user_bot(user_id, workspace_id)
        # Shared bot
        self.knowledge_bot = self._get_knowledge_bot(workspace_id)
    
    def get_context(self, query):
        # Search both personal and shared knowledge
        personal = client.memories.search({
            'bot_id': self.user_bot['id'],
            'query': query,
            'limit': 3
        })
        
        shared = client.memories.search({
            'bot_id': self.knowledge_bot['id'],
            'query': query,
            'limit': 3
        })
        
        return {
            'personal': personal,
            'shared': shared
        }

Pros: βœ… Best of both worlds, βœ… Flexible
Cons: ❌ More complex, ❌ More API calls

Pattern 4: Session-Based Bots

When to use: Temporary contexts, conversations, short-term memory

# Create bot for conversation
session_bot = client.bots.create({
    'name': f'session_{session_id}',
    'workspace_id': workspace_id,
    'description': f'Conversation context for {session_id}'
})

# Store conversation history
for message in conversation:
    client.memories.create({
        'bot_id': session_bot['id'],
        'content': f'{message["role"]}: {message["content"]}'
    })

# Clean up when the session ends (or keep the bot for history)
client.bots.delete(session_bot['id'])

Pros: βœ… Isolated contexts, βœ… Clean separation
Cons: ❌ Can accumulate old sessions

Writing Good Memories

Quality over quantity πŸ’Ž

βœ… DO: Write Natural Language

# Good - Natural and searchable
"Sarah from Acme Corp is the CTO, interested in API integration, 
prefers technical documentation, timezone EST"

# Bad - Too structured
"name:Sarah;company:Acme;role:CTO;interest:API;pref:docs;tz:EST"

Why? Semantic search works best with natural language!

βœ… DO: Include Context

# Good - Has who, what, when, why
"Customer John reported slow API responses on 2025-01-15 around 3 PM EST. 
Issue was traced to database query timeout. Resolved by adding index."

# Bad - Missing context
"slow API fixed"

βœ… DO: Be Specific

# Good - Specific and actionable
"User prefers Python 3.11+, FastAPI framework, PostgreSQL database, 
deployed on AWS, uses Docker containers"

# Bad - Too vague
"User likes Python"

❌ DON'T: Store Sensitive Data Unencrypted

# Bad - Never store raw passwords/keys!
"User password: hunter2"
"API key: sk_live_..."

# Good - Store references or encrypted versions
"User has password set, last changed 2025-01-15"
"User has API key configured (key ID: key_abc123)"

❌ DON'T: Store Everything

# Bad - Too granular
"User clicked button at 10:30:45"
"User moved mouse"
"User scrolled page"

# Good - Store meaningful events
"User completed onboarding flow, viewed tutorial, enabled notifications"

❌ DON'T: Use Only Keywords

# Bad - Hard to search semantically
"python fastapi docker aws"

# Good - Use sentences
"User's tech stack includes Python with FastAPI framework, 
containerized with Docker, and deployed on AWS"
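If your data starts out structured, a small formatting helper can produce the natural-language form that searches well. This is a hypothetical helper, not part of the Memof.ai SDK; the field names are just examples:

```python
# Hypothetical helper: turn a structured profile dict into a
# natural-language memory string (not part of the Memof.ai SDK).
def profile_to_memory(profile):
    parts = []
    if 'name' in profile and 'company' in profile:
        parts.append(f"{profile['name']} from {profile['company']}")
    if 'role' in profile:
        parts.append(f"works as {profile['role']}")
    if 'interests' in profile:
        parts.append(f"is interested in {', '.join(profile['interests'])}")
    return ', '.join(parts)

memory_text = profile_to_memory({
    'name': 'Sarah',
    'company': 'Acme Corp',
    'role': 'CTO',
    'interests': ['API integration', 'technical documentation']
})
# Store memory_text instead of the raw dict
```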

Searching Effectively

Getting the best results 🎯

Use Broad Queries

# Good - Broad concept
results = client.memories.search({
    'bot_id': bot_id,
    'query': 'programming language preferences'
})

# Will find:
# - "User loves Python"
# - "Prefers TypeScript over JavaScript"
# - "Enjoys functional programming"

Adjust Result Limits

# For quick context - top results
quick = client.memories.search({
    'bot_id': bot_id,
    'query': 'user preferences',
    'limit': 3  # Just the top 3
})

# For comprehensive search
comprehensive = client.memories.search({
    'bot_id': bot_id,
    'query': 'project history',
    'limit': 20  # Get more context
})

Combine Multiple Searches

# Search different aspects
tech_stack = client.memories.search({
    'bot_id': bot_id,
    'query': 'technology preferences'
})

work_style = client.memories.search({
    'bot_id': bot_id,
    'query': 'work habits and schedule'
})

projects = client.memories.search({
    'bot_id': bot_id,
    'query': 'current projects'
})

# Combine results for complete picture
context = {
    'tech': tech_stack,
    'style': work_style,
    'projects': projects
}
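Before handing the combined context to your LLM, it can help to merge the lists and drop duplicates. This sketch assumes each result dict carries an 'id' key; adjust to the actual response shape of your SDK version:

```python
# Merge several search result lists, dropping duplicates by 'id'
# (assumes each result dict has an 'id' key -- check your SDK's
# actual response shape before relying on this).
def merge_results(*result_lists):
    seen = set()
    merged = []
    for results in result_lists:
        for item in results:
            if item['id'] not in seen:
                seen.add(item['id'])
                merged.append(item)
    return merged

# context_memories = merge_results(tech_stack, work_style, projects)
```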

Performance Optimization

Making it fast ⚑

1. Cache Bot IDs

# Bad - Fetching bot every time
def get_user_context(user_id):
    bots = client.bots.list()
    user_bot = find_bot(bots, user_id)  # Slow!
    return client.memories.search({'bot_id': user_bot['id'], ...})

# Good - Cache bot IDs
BOT_CACHE = {}

def get_user_bot_id(user_id):
    if user_id not in BOT_CACHE:
        bots = client.bots.list()
        BOT_CACHE[user_id] = find_bot(bots, user_id)['id']
    return BOT_CACHE[user_id]

2. Batch Operations

# Bad - One at a time
for item in items:
    client.memories.create({'bot_id': bot_id, 'content': item})

# Good - Batch when possible
from concurrent.futures import ThreadPoolExecutor

def store_memory(content):
    return client.memories.create({'bot_id': bot_id, 'content': content})

with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(store_memory, items))

3. Use Appropriate Limits

# Don't over-fetch
results = client.memories.search({
    'bot_id': bot_id,
    'query': 'preferences',
    'limit': 5  # Usually enough!
})

# Instead of
all_results = client.memories.search({
    'bot_id': bot_id,
    'query': 'preferences',
    'limit': 100  # Overkill for most cases
})

4. Implement Retry Logic

from time import sleep

def store_with_retry(memory_data, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.memories.create(memory_data)
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            sleep(2 ** attempt)  # Exponential backoff

Production Checklist

Before you go live πŸš€

Security

  • Store API tokens in environment variables, not in code
  • Use least-privilege tokens (if available in your plan)
  • Don't store sensitive data (passwords, credit cards, etc.)
  • Implement rate limiting on your end
  • Log API errors for debugging

import os

# Good
api_token = os.getenv('MEMOFAI_TOKEN')

# Bad
api_token = 'moa_hardcoded_token_12345'  # Never do this!
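The "rate limiting on your end" item can be a simple sliding-window limiter. This is a framework-agnostic sketch, independent of the SDK; tune max_calls and period to your plan's actual limits:

```python
import time
from collections import deque

class RateLimiter:
    """Allow at most max_calls within a rolling window of `period` seconds."""
    def __init__(self, max_calls, period=60.0):
        self.max_calls = max_calls
        self.period = period
        self.calls = deque()  # timestamps of recent calls

    def acquire(self):
        now = time.monotonic()
        # Drop timestamps that have left the window
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest call leaves the window
            wait = self.period - (now - self.calls[0])
            if wait > 0:
                time.sleep(wait)
            self.calls.popleft()
        self.calls.append(time.monotonic())

limiter = RateLimiter(max_calls=60, period=60.0)
# Call limiter.acquire() before each client.memories.create(...)
```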

Error Handling

from memofai import create_moa_client
from memofai.exceptions import MoaError, RateLimitError

client = create_moa_client(api_token=os.getenv('MEMOFAI_TOKEN'))

memory_data = {
    'bot_id': bot_id,
    'content': content
}

try:
    memory = client.memories.create(memory_data)
except RateLimitError:
    # Handle rate limits
    logger.warning("Rate limit hit, queuing for retry")
    queue_for_retry(memory_data)
except MoaError as e:
    # Handle other API errors
    logger.error(f"API error: {e}")
    # Fallback behavior
except Exception as e:
    # Handle unexpected errors
    logger.exception("Unexpected error")
    # Alert monitoring system

Monitoring

import logging
from time import time

logger = logging.getLogger(__name__)

def search_with_monitoring(bot_id, query):
    start = time()
    try:
        results = client.memories.search({
            'bot_id': bot_id,
            'query': query
        })
        duration = time() - start
        
        logger.info(f"Search completed in {duration:.2f}s, "
                   f"found {len(results)} results")
        
        # Track metrics with your metrics client (e.g. StatsD or Prometheus)
        metrics.record('memofai.search.duration', duration)
        metrics.record('memofai.search.results', len(results))
        
        return results
    except Exception as e:
        logger.error(f"Search failed: {e}")
        metrics.increment('memofai.search.errors')
        raise

Data Management

  • Regular cleanup of old/unused bots
  • Archive old memories if needed
  • Monitor storage usage
  • Backup critical bot IDs and workspace IDs

# Cleanup script (run periodically)
from datetime import datetime, timedelta

def cleanup_old_session_bots():
    """Delete session bots older than 7 days"""
    cutoff = datetime.now() - timedelta(days=7)
    
    bots = client.bots.list()
    for bot in bots:
        if bot['name'].startswith('session_'):
            created = datetime.fromisoformat(bot['created_at'])
            if created < cutoff:
                client.bots.delete(bot['id'])
                logger.info(f"Deleted old session bot: {bot['id']}")

Common Pitfalls

Learn from others' mistakes 🚧

Pitfall 1: Over-Engineering

# ❌ Too complex
class MemoryManager:
    def __init__(self):
        self.cache = {}
        self.queue = []
        self.workers = []
        # ... 200 more lines

# βœ… Keep it simple
def store_memory(bot_id, content):
    return client.memories.create({
        'bot_id': bot_id,
        'content': content
    })

Lesson: Start simple, optimize when needed.

Pitfall 2: Ignoring Rate Limits

# ❌ Hammering the API
for i in range(10000):
    client.memories.create(...)  # Will hit rate limit!

# βœ… Respect rate limits
from time import sleep

for i in range(10000):
    client.memories.create(...)
    if (i + 1) % 100 == 0:
        sleep(1)  # Brief pause after every 100 requests

Pitfall 3: Not Handling Errors

# ❌ No error handling
memory = client.memories.create(data)  # What if it fails?

# βœ… Proper error handling
try:
    memory = client.memories.create(data)
except Exception as e:
    logger.error(f"Failed to store memory: {e}")
    # Fallback or retry logic

Pitfall 4: Storing Too Much

# ❌ Storing everything
for event in user_events:  # 10,000 events!
    client.memories.create({
        'bot_id': bot_id,
        'content': f'User did {event}'
    })

# βœ… Store meaningful summaries
summary = summarize_events(user_events)
client.memories.create({
    'bot_id': bot_id,
    'content': summary  # One meaningful memory
})
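summarize_events is your own function; one minimal (and admittedly naive) sketch just counts event types. A production version might apply a template per event type or use an LLM:

```python
from collections import Counter

def summarize_events(events):
    """Naive summary: count event types instead of storing each event.

    Assumes each event is a dict with a 'type' key (illustrative shape).
    """
    counts = Counter(event['type'] for event in events)
    parts = [f"{count}x {etype}" for etype, count in counts.most_common()]
    return "User activity summary: " + ", ".join(parts)

summary = summarize_events([
    {'type': 'page_view'},
    {'type': 'page_view'},
    {'type': 'purchase'},
])
```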

Testing Strategies

Test before you ship πŸ§ͺ

Unit Tests

import unittest
from unittest.mock import Mock, patch

class TestMemoryManager(unittest.TestCase):
    @patch('memofai.create_moa_client')
    def test_store_memory(self, mock_client):
        # Mock the client
        mock_client.return_value.memories.create.return_value = {
            'id': 'mem_123',
            'content': 'test'
        }
        
        # Test your code (MemoryManager here is your own wrapper
        # that builds its client via create_moa_client)
        manager = MemoryManager()
        result = manager.store('test content')
        
        self.assertEqual(result['id'], 'mem_123')

Integration Tests

def test_full_workflow():
    """Test the complete workflow with real API"""
    # Use test environment
    client = create_moa_client(
        api_token=os.getenv('MEMOFAI_TEST_TOKEN'),
        environment='sandbox'  # Use sandbox!
    )
    
    # Create test workspace
    workspace = client.workspaces.create({'name': 'Test Workspace'})
    bot = None

    try:
        # Test operations
        bot = client.bots.create({
            'name': 'Test Bot',
            'workspace_id': workspace['id']
        })

        memory = client.memories.create({
            'bot_id': bot['id'],
            'content': 'Test memory'
        })

        results = client.memories.search({
            'bot_id': bot['id'],
            'query': 'test'
        })

        assert len(results) > 0

    finally:
        # Cleanup (the bot may not exist if creation failed)
        if bot is not None:
            client.bots.delete(bot['id'])
        client.workspaces.delete(workspace['id'])

Environment Management

Different environments for different stages 🌍

import os

ENVIRONMENTS = {
    'development': 'http://127.0.0.1:8000',
    'sandbox': 'https://sandbox-api.memof.ai',
    'production': 'https://api.memof.ai'
}

env = os.getenv('APP_ENV', 'development')
token_key = f'MEMOFAI_TOKEN_{env.upper()}'

client = create_moa_client(
    api_token=os.getenv(token_key),
    environment=env
)
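Wrapping the lookup in a helper makes misconfiguration fail fast at startup instead of at the first API call. The helper below is a sketch using the names from the snippet above; resolve_env is not part of the SDK:

```python
import os

def resolve_env(env_name, environments):
    """Return (env_name, token) for a known environment, failing fast otherwise."""
    if env_name not in environments:
        raise ValueError(f"Unknown environment: {env_name!r}")
    token = os.getenv(f'MEMOFAI_TOKEN_{env_name.upper()}')
    if not token:
        raise RuntimeError(f"Missing MEMOFAI_TOKEN_{env_name.upper()}")
    return env_name, token

# env, token = resolve_env(os.getenv('APP_ENV', 'development'), ENVIRONMENTS)
# client = create_moa_client(api_token=token, environment=env)
```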

Scaling Considerations

For when you get popular πŸ“ˆ

1. Database Caching

# Cache bot IDs in your database (SQLAlchemy-style model shown;
# imports for Model, Column, etc. omitted)
class User(Model):
    id = Column(Integer, primary_key=True)
    memofai_bot_id = Column(String)  # Cache the bot ID
    
def get_user_bot_id(user):
    if not user.memofai_bot_id:
        # Create and cache
        bot = client.bots.create({
            'name': f'user_{user.id}',
            'workspace_id': WORKSPACE_ID
        })
        user.memofai_bot_id = bot['id']
        db.session.commit()
    
    return user.memofai_bot_id

2. Async Workers

# Use task queues for heavy operations
from celery import Celery

celery = Celery('tasks')

@celery.task
def store_memories_async(bot_id, memories):
    """Store memories in background"""
    for memory in memories:
        client.memories.create({
            'bot_id': bot_id,
            'content': memory
        })

# In your API (FastAPI-style endpoint shown)
@app.post('/store')
def store_endpoint(memories: list):
    # bot_id would be resolved for the authenticated user
    store_memories_async.delay(bot_id, memories)
    return {'status': 'queued'}

3. Load Balancing

# Distribute requests across multiple clients
from random import choice

clients = [
    create_moa_client(api_token=token1),
    create_moa_client(api_token=token2),
    create_moa_client(api_token=token3),
]

def get_client():
    return choice(clients)

# Use
client = get_client()
results = client.memories.search(...)

TL;DR Quick Reference

The cheat sheet πŸ“

DO βœ…:

  • Write natural language memories
  • Include context (who, what, when, why)
  • Cache bot IDs
  • Handle errors gracefully
  • Use appropriate search limits
  • Test in sandbox first
  • Store tokens in environment variables

DON'T ❌:

  • Store sensitive data unencrypted
  • Ignore rate limits
  • Over-engineer initially
  • Store every single event
  • Hardcode API tokens
  • Skip error handling

Need More Help?


Ready to optimize? β†’ Troubleshooting Guide