Memory and Threads

Persist conversations across sessions with thread providers, semantic recall, and embeddings.

AvatarLayer's memory system persists conversation history across sessions using pluggable thread providers. Optionally, semantic recall retrieves relevant past messages via vector embeddings for long-term context.

Setup

Add a memory config to your session:

```typescript
import {
  AvatarSession,
  OpenAIAdapter,
  ElevenLabsAdapter,
  VRMLocalRenderer,
  LocalStorageThreadProvider,
} from "avatarlayer";

const session = new AvatarSession({
  llm: new OpenAIAdapter({ apiKey: "sk-..." }),
  tts: new ElevenLabsAdapter({ apiKey: "..." }),
  renderer: new VRMLocalRenderer({ modelUrl: "/models/avatar.vrm" }),
  memory: {
    provider: new LocalStorageThreadProvider(),
    maxMessages: 50,
  },
});
```

When the session starts, it loads (or creates) a thread and populates message history. All messages sent and received are automatically persisted.

MemoryConfig

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `provider` | `ThreadProvider` | required | Storage backend for threads and messages |
| `threadId` | `string` | | Resume a specific thread. If omitted, a new thread is created. |
| `resourceId` | `string` | | Group threads by resource (e.g. a user ID) |
| `maxMessages` | `number` | | Max recent messages to include in the LLM context window |
| `semanticRecall` | `SemanticRecallConfig` | | Enable vector-based recall of relevant past messages |
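Together, `threadId` and `resourceId` let you resume a particular conversation for a particular user. A sketch, assuming the ID values come from your own app state (the literal IDs below are placeholders):

```typescript
import {
  AvatarSession,
  OpenAIAdapter,
  ElevenLabsAdapter,
  VRMLocalRenderer,
  IndexedDBThreadProvider,
} from "avatarlayer";

const session = new AvatarSession({
  llm: new OpenAIAdapter({ apiKey: "sk-..." }),
  tts: new ElevenLabsAdapter({ apiKey: "..." }),
  renderer: new VRMLocalRenderer({ modelUrl: "/models/avatar.vrm" }),
  memory: {
    provider: new IndexedDBThreadProvider({ dbName: "avatarlayer" }),
    threadId: "thread-abc-123", // resume this thread instead of creating a new one
    resourceId: "user-42",      // group this user's threads together
    maxMessages: 50,
  },
});
```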

Thread providers

InMemoryThreadProvider

Stores threads in memory. Data is lost on page refresh. Useful for development.

```typescript
import { InMemoryThreadProvider } from "avatarlayer";

const provider = new InMemoryThreadProvider();
```

LocalStorageThreadProvider

Persists threads to localStorage. Good for prototyping and single-user apps.

```typescript
import { LocalStorageThreadProvider } from "avatarlayer";

const provider = new LocalStorageThreadProvider({ prefix: "avatarlayer-threads" });
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `prefix` | `string` | `"avatarlayer-threads"` | localStorage key prefix |

IndexedDBThreadProvider

Persists threads to IndexedDB. Handles larger data volumes than localStorage.

```typescript
import { IndexedDBThreadProvider } from "avatarlayer";

const provider = new IndexedDBThreadProvider({ dbName: "avatarlayer" });
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `dbName` | `string` | `"avatarlayer"` | IndexedDB database name |
| `version` | `number` | `1` | Database version |

NeonThreadProvider

Server-side thread provider backed by Neon Postgres. Supports vector operations for semantic recall.

```typescript
import { NeonThreadProvider } from "avatarlayer";

const provider = new NeonThreadProvider({
  connectionString: process.env.NEON_DATABASE_URL!,
  schema: "avatarlayer",    // optional
  dimensions: 1536,          // optional, must match your embedding model
});
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `connectionString` | `string` | required | Neon Postgres connection string |
| `schema` | `string` | `"avatarlayer"` | Database schema name |
| `dimensions` | `number` | `1536` | Vector dimensions (must match embedding model) |

Vector thread providers

For client-side semantic recall, use vector-capable thread providers:

LocalStorageVectorThreadProvider

```typescript
import { LocalStorageVectorThreadProvider } from "avatarlayer";

const provider = new LocalStorageVectorThreadProvider({ prefix: "avatarlayer-threads" });
```

InMemoryVectorThreadProvider

```typescript
import { InMemoryVectorThreadProvider } from "avatarlayer";

const provider = new InMemoryVectorThreadProvider();
```

Both implement `VectorThreadProvider`, which extends `ThreadProvider` with `upsertVectors`, `queryVectors`, and `createVectorIndex` methods.
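To make the vector methods concrete, here is a minimal standalone sketch of the query side: a brute-force cosine-similarity index with upsert and top-K search. The `VectorRecord`/`VectorHit` shapes and method signatures are illustrative assumptions, not AvatarLayer's actual types:

```typescript
// Illustrative sketch only: shapes and signatures are assumptions,
// not the real avatarlayer VectorThreadProvider API.
type VectorRecord = { id: string; vector: number[]; metadata?: Record<string, unknown> };
type VectorHit = { id: string; score: number; metadata?: Record<string, unknown> };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

class TinyVectorIndex {
  private records = new Map<string, VectorRecord>();

  // Insert or overwrite records by id.
  upsertVectors(records: VectorRecord[]): void {
    for (const r of records) this.records.set(r.id, r);
  }

  // Brute-force cosine-similarity search, best matches first.
  queryVectors(query: number[], topK: number, threshold = 0): VectorHit[] {
    return Array.from(this.records.values())
      .map((r) => ({ id: r.id, score: cosineSimilarity(query, r.vector), metadata: r.metadata }))
      .filter((h) => h.score >= threshold)
      .sort((a, b) => b.score - a.score)
      .slice(0, topK);
  }
}

const index = new TinyVectorIndex();
index.upsertVectors([
  { id: "m1", vector: [1, 0] },
  { id: "m2", vector: [0.9, 0.1] },
  { id: "m3", vector: [0, 1] },
]);
const hits = index.queryVectors([1, 0], 2, 0.5);
```

A real provider persists the vectors alongside the thread's messages, but the ranking logic is the same idea.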

Semantic recall

When configured, semantic recall retrieves relevant past messages using vector similarity and injects them into the LLM context:

```typescript
import {
  LocalStorageVectorThreadProvider,
  OpenAIEmbeddingProvider,
} from "avatarlayer";

const session = new AvatarSession({
  // ...other config
  memory: {
    provider: new LocalStorageVectorThreadProvider(),
    maxMessages: 50,
    semanticRecall: {
      embedder: new OpenAIEmbeddingProvider({ apiKey: "sk-..." }),
      topK: 5,
      messageRange: 5,
      threshold: 0.5,
      scope: "thread",
      timeDecayWeight: 0.2,
      indexName: "memory_messages",
    },
  },
});
```

SemanticRecallConfig

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `embedder` | `EmbeddingProvider` | required | Embedding model for vectorization |
| `topK` | `number` | `5` | Number of vector hits to retrieve |
| `messageRange` | `number` | `5` | Messages before/after each hit to include as context |
| `threshold` | `number` | `0.5` | Minimum cosine similarity to include a result |
| `scope` | `"thread" \| "resource"` | `"thread"` | Search within the current thread or across all threads for the resource |
| `timeDecayWeight` | `number` | `0.2` | Weight for time-based recency boost (0 disables) |
| `indexName` | `string` | `"memory_messages"` | Name of the vector index |
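The table implies that recall results are ranked by a mix of similarity and recency. The library's exact formula is not documented here; one plausible reading of `timeDecayWeight` is a linear blend of cosine similarity and an exponential recency term, sketched below (the blend form and the half-life parameter are both assumptions):

```typescript
// Hypothetical scoring sketch: NOT avatarlayer's documented formula.
// Blends similarity with an exponential recency boost; weight 0 leaves
// the similarity score untouched, matching "0 disables" in the table.
function blendedScore(
  similarity: number,
  ageMs: number,
  timeDecayWeight: number,
  halfLifeMs = 24 * 3600 * 1000, // assumed half-life of one day
): number {
  const recency = Math.pow(0.5, ageMs / halfLifeMs); // 1 for brand-new, decays toward 0
  return (1 - timeDecayWeight) * similarity + timeDecayWeight * recency;
}
```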

Embedding providers

OpenAI:

```typescript
import { OpenAIEmbeddingProvider } from "avatarlayer";

const embedder = new OpenAIEmbeddingProvider({
  apiKey: "sk-...",
  model: "text-embedding-3-small",  // optional
  dimensions: 1536,                  // optional
});
```

Transformers.js (local):

```typescript
import { TransformersEmbeddingProvider } from "avatarlayer/local";

const embedder = new TransformersEmbeddingProvider({
  model: "Xenova/all-MiniLM-L6-v2",  // optional
});
```

See Local ML for more on-device options.

Thread management

Switch between threads or create new ones at runtime:

```typescript
// Resume an existing thread by ID
await session.switchThread("thread-abc-123");

// Or start a fresh thread
await session.newThread({ title: "New conversation" });
```
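For a thread-picker flow, you can combine the provider's `listThreads` with `switchThread`. A sketch using the session from Setup; passing `resourceId` to `listThreads` and reading `thread.id` are assumptions about the `ListThreadsOptions` and `Thread` shapes:

```typescript
import { LocalStorageThreadProvider } from "avatarlayer";

const provider = new LocalStorageThreadProvider();

// List this user's existing threads and resume one, or start fresh.
// The resourceId filter and `id` field are assumptions for illustration.
const threads = await provider.listThreads({ resourceId: "user-42" });

if (threads.length > 0) {
  await session.switchThread(threads[0].id);
} else {
  await session.newThread({ title: "First conversation" });
}
```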

Events

| Event | Payload | Description |
| --- | --- | --- |
| `thread-change` | `Thread` | Active thread changed |
| `history-loaded` | `ChatMessage[]` | Persisted messages loaded into history |
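Assuming the session exposes a Node-style `on(event, handler)` subscription API (not shown elsewhere in this section), wiring these events might look like:

```typescript
// Assumed subscription API; check the session's actual event interface.
session.on("thread-change", (thread) => {
  console.log("Active thread is now:", thread.id);
});

session.on("history-loaded", (messages) => {
  console.log(`Restored ${messages.length} persisted messages`);
});
```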

ThreadProvider interface

```typescript
interface ThreadProvider {
  getThread(threadId: string): Promise<Thread | null>;
  saveThread(thread: Thread): Promise<Thread>;
  updateThread(threadId: string, patch: { title?: string; metadata?: Record<string, unknown> }): Promise<Thread>;
  deleteThread(threadId: string): Promise<void>;
  listThreads(opts?: ListThreadsOptions): Promise<Thread[]>;
  getMessages(threadId: string, opts?: GetMessagesOptions): Promise<ChatMessage[]>;
  saveMessages(threadId: string, messages: ChatMessage[]): Promise<void>;
  deleteMessages(threadId: string, messageIds: string[]): Promise<void>;
}
```