Custom Adapters

Implement your own LLM, TTS, or renderer adapters.

AvatarLayer's provider model is interface-based. Implement any of the provider interfaces to add your own integrations.

Custom LLM

Implement the LLMProvider interface:

import type {
  LLMProvider,
  ChatMessage,
  LLMChunk,
  LLMOptions,
} from "avatarlayer";

class MyLLM implements LLMProvider {
  readonly id = "my-llm";

  async *chat(
    messages: ChatMessage[],
    opts?: LLMOptions,
  ): AsyncIterable<LLMChunk> {
    const response = await fetch("https://my-llm-api.com/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages, model: opts?.model }),
      signal: opts?.signal,
    });

    if (!response.ok || !response.body) {
      throw new Error(`LLM request failed: ${response.status}`);
    }

    const reader = response.body.getReader();
    const decoder = new TextDecoder();

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters split across chunks intact
      yield { text: decoder.decode(value, { stream: true }), done: false };
    }

    yield { text: "", done: true };
  }
}

Key requirements:

  • The chat method must be an async generator yielding LLMChunk objects
  • The final chunk should have done: true
  • Respect opts.signal for cancellation support
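To see the contract in action, here is a minimal sketch of how a consumer (such as AvatarSession) drives the generator — accumulating chunk text until `done: true`. The `fakeChat` generator and `collect` helper below are illustrative stand-ins, not avatarlayer APIs; only the `LLMChunk` shape mirrors the interface above.

```typescript
// Local stand-in for the avatarlayer LLMChunk shape (assumed).
type LLMChunk = { text: string; done: boolean };

// A tiny in-memory generator with the same shape as MyLLM.chat.
async function* fakeChat(): AsyncIterable<LLMChunk> {
  for (const text of ["Hel", "lo"]) yield { text, done: false };
  yield { text: "", done: true };
}

// Consume the stream the way a session would: accumulate until done.
async function collect(stream: AsyncIterable<LLMChunk>): Promise<string> {
  let out = "";
  for await (const chunk of stream) {
    out += chunk.text;
    if (chunk.done) break;
  }
  return out;
}

collect(fakeChat()).then((t) => console.log(t)); // "Hello"
```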

Custom TTS

Implement the TTSProvider interface:

import type { TTSProvider, TTSOptions } from "avatarlayer";

class MyTTS implements TTSProvider {
  readonly id = "my-tts";

  async synthesize(text: string, opts?: TTSOptions): Promise<Blob> {
    const response = await fetch("https://my-tts-api.com/synthesize", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        text,
        voice: opts?.voiceId ?? "default",
      }),
      signal: opts?.signal,
    });

    if (!response.ok) {
      throw new Error(`TTS request failed: ${response.status}`);
    }

    return response.blob();
  }
}

Key requirements:

  • Return an audio Blob that the browser can decode (MP3, WAV, OGG, etc.)
  • Respect opts.signal for cancellation support
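The cancellation requirement means a pending synthesize() call should reject with an AbortError when the signal fires. A minimal sketch of that behavior, using a simulated slow request in place of a real TTS backend (`SlowTTS` is a hypothetical stand-in, not avatarlayer code):

```typescript
// Local stand-in for the avatarlayer TTSOptions shape (assumed).
type TTSOptions = { voiceId?: string; signal?: AbortSignal };

class SlowTTS {
  async synthesize(text: string, opts?: TTSOptions): Promise<Blob> {
    // Simulate a slow network request that honors cancellation.
    await new Promise<void>((resolve, reject) => {
      const t = setTimeout(resolve, 1000);
      opts?.signal?.addEventListener("abort", () => {
        clearTimeout(t);
        reject(new DOMException("Aborted", "AbortError"));
      });
    });
    return new Blob([text], { type: "audio/wav" });
  }
}

const ctrl = new AbortController();
const pending = new SlowTTS().synthesize("hello", { signal: ctrl.signal });
ctrl.abort(); // interruption: the pending synthesis rejects
pending.catch((err) => console.log(err.name)); // "AbortError"
```

When you delegate to fetch as in MyTTS above, passing opts.signal through gives you this behavior for free.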

Custom renderer

Implement the AvatarRenderer interface:

import type { AvatarRenderer, AvatarControl } from "avatarlayer";

class MyRenderer implements AvatarRenderer {
  readonly id = "my-renderer";
  readonly type = "local" as const;

  private container: HTMLElement | null = null;
  private audio: HTMLAudioElement | null = null;
  private resolve: (() => void) | null = null;

  async mount(container: HTMLElement): Promise<void> {
    this.container = container;
    // Set up your rendering surface (canvas, video, etc.)
  }

  update(control: Partial<AvatarControl>): void {
    // Apply avatar state changes (face, emotion, etc.)
  }

  async speak(audio: Blob): Promise<void> {
    return new Promise((resolve) => {
      this.resolve = resolve;
      const url = URL.createObjectURL(audio);
      this.audio = new Audio(url);
      this.audio.onended = () => {
        URL.revokeObjectURL(url);
        this.resolve = null;
        resolve();
      };
      // If playback fails (e.g. autoplay blocked), resolve instead of hanging
      this.audio.play().catch(() => {
        URL.revokeObjectURL(url);
        this.resolve = null;
        resolve();
      });
    });
  }

  interrupt(): void {
    if (this.audio) {
      this.audio.pause();
      URL.revokeObjectURL(this.audio.src); // release the blob URL created in speak()
      this.audio = null;
    }
    if (this.resolve) {
      this.resolve();
      this.resolve = null;
    }
  }

  unmount(): void {
    this.interrupt();
    this.container = null;
  }
}

Key requirements:

  • mount() must resolve once the renderer is ready to display
  • speak() must resolve when audio playback finishes
  • interrupt() must stop playback immediately and resolve any pending speak() promise
  • unmount() must clean up all resources
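The trickiest of these requirements is the interplay between speak() and interrupt(): a pending speak() promise must settle when interrupt() is called, or the session will hang waiting for playback that never finishes. A stripped-down mock (no real audio; the method names mirror AvatarRenderer, but this is not avatarlayer code) isolates that contract:

```typescript
// Minimal mock exercising the speak()/interrupt() contract described above.
class MockRenderer {
  private resolve: (() => void) | null = null;

  speak(): Promise<void> {
    // Hold the resolver so interrupt() can settle the promise early.
    return new Promise((resolve) => {
      this.resolve = resolve;
    });
  }

  interrupt(): void {
    this.resolve?.();
    this.resolve = null;
  }
}

const r = new MockRenderer();
const speaking = r.speak();
r.interrupt(); // must settle the pending speak() promise
speaking.then(() => console.log("speak resolved")); // "speak resolved"
```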

Optional: speakText

If your renderer handles TTS internally (as HeyGen's does), implement speakText; AvatarSession will then skip the external TTS provider:

async speakText(text: string, signal?: AbortSignal): Promise<void> {
  // Send text to your service, wait for speech to complete
}
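The dispatch this enables can be sketched as a capability check: if the renderer exposes speakText, send it text directly; otherwise synthesize audio first. The `Renderer` type, `hasSpeakText` guard, and `say` function below are hypothetical illustrations of that logic, not avatarlayer internals:

```typescript
// Local stand-in shape for a renderer with optional built-in TTS (assumed).
type Renderer = {
  speak(audio: Blob): Promise<void>;
  speakText?(text: string, signal?: AbortSignal): Promise<void>;
};

// Narrow the type when speakText is present.
function hasSpeakText(
  r: Renderer,
): r is Renderer & { speakText(text: string, signal?: AbortSignal): Promise<void> } {
  return typeof r.speakText === "function";
}

// Prefer built-in TTS; fall back to an external synthesize step.
async function say(
  r: Renderer,
  text: string,
  synthesize: (t: string) => Promise<Blob>,
): Promise<void> {
  if (hasSpeakText(r)) return r.speakText(text);
  return r.speak(await synthesize(text));
}
```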

Using custom adapters

const session = new AvatarSession({
  llm: new MyLLM(),
  tts: new MyTTS(),
  renderer: new MyRenderer(),
});

Custom adapters are first-class citizens — they work with all session features including interruption, runtime swaps, and React bindings.