VRM (Local)

Render 3D VRM avatars locally in the browser with Three.js.

The VRMLocalRenderer renders a 3D VRM model in the browser using Three.js and @pixiv/three-vrm. It includes automatic blink animation, expression presets, and lip-sync (RMS-based or viseme-based).

Installation

npm install three @pixiv/three-vrm

Usage

import { VRMLocalRenderer } from "avatarlayer/renderers";

const renderer = new VRMLocalRenderer({
  modelUrl: "/models/avatar.vrm",
  idleAnimationUrl: "/animations/idle.fbx",  // optional
  visemeLipSync: true,                        // optional
  renderer: "webgpu",                         // optional, default "webgl"
});

Constructor options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| modelUrl | string | required | URL to the .vrm model file |
| idleAnimationUrl | string | | Optional URL to an idle animation (FBX/GLB) |
| visemeLipSync | boolean | | Enable viseme-based lip-sync instead of RMS amplitude mapping |
| renderer | 'webgl' \| 'webgpu' | 'webgl' | Rendering backend. When 'webgpu', uses Three.js WebGPURenderer with MToonNodeMaterial. Falls back to WebGL automatically when WebGPU is unavailable. |

How it works

When mounted, the renderer:

  1. Resolves the rendering backend (WebGL or WebGPU), falling back to WebGL if needed
  2. Creates a Three.js scene with camera and lighting
  3. Loads the VRM model via GLTFLoader with the VRM plugin (using MToonNodeMaterial for WebGPU)
  4. Starts an automatic blink loop (random interval, 3-6 seconds)
  5. Optionally loads an idle animation from the provided URL
  6. Registers with a shared render pool that drives all VRM renderers from a single requestAnimationFrame loop
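
The blink loop in step 4 can be sketched as two small helpers. This is a minimal illustration, not the renderer's actual code: the names nextBlinkDelayMs and blinkWeight, the uniform distribution, and the 150 ms blink duration are all assumptions. Only the 3-6 second range comes from the description above.

```typescript
// Range taken from the text above: a blink every 3 to 6 seconds.
const BLINK_MIN_MS = 3000;
const BLINK_MAX_MS = 6000;

// Hypothetical helper: delay before the next blink, drawn uniformly
// from the range (the distribution is an assumption).
function nextBlinkDelayMs(random: () => number = Math.random): number {
  return BLINK_MIN_MS + random() * (BLINK_MAX_MS - BLINK_MIN_MS);
}

// Hypothetical blink envelope: rises from 0 to 1 and back to 0 over
// `durationMs`. The 150 ms default is illustrative only.
function blinkWeight(elapsedMs: number, durationMs: number = 150): number {
  if (elapsedMs <= 0 || elapsedMs >= durationMs) return 0;
  return Math.sin(Math.PI * (elapsedMs / durationMs));
}
```

In a render loop, the value returned by blinkWeight could be written to the model's blink expression on every frame (for example through the expression manager that @pixiv/three-vrm exposes on a loaded VRM), and nextBlinkDelayMs would be called once each blink completes.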

When speak(audio) is called:

  1. The audio blob is played through an <audio> element
  2. A LipSyncEngine analyzes the audio in real time via AnalyserNode
  3. Audio is mapped to VRM mouth expressions: by default, RMS amplitude drives the mouth-open expression (Aa preset); when visemeLipSync is enabled, viseme weights are used instead
  4. The promise resolves when the audio ends
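
The RMS path in step 3 can be illustrated with a short sketch. It assumes the samples are time-domain floats in the range -1 to 1, as filled by AnalyserNode.getFloatTimeDomainData(). The function names and the gain, noiseFloor, and factor values are hypothetical tuning choices, not the LipSyncEngine's real parameters.

```typescript
// Root-mean-square amplitude of a block of time-domain samples.
function computeRms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  const sumSquares = samples.reduce((acc, s) => acc + s * s, 0);
  return Math.sqrt(sumSquares / samples.length);
}

// Hypothetical mapping from RMS to a 0..1 mouth-open weight.
// `gain` and `noiseFloor` are illustrative values, not library defaults.
function rmsToMouthOpen(
  rms: number,
  gain: number = 4,
  noiseFloor: number = 0.02,
): number {
  if (rms < noiseFloor) return 0; // treat near-silence as a closed mouth
  return Math.min(1, (rms - noiseFloor) * gain);
}

// Simple exponential smoothing so the mouth does not jitter between frames.
function smoothWeight(
  previous: number,
  target: number,
  factor: number = 0.5,
): number {
  return previous + (target - previous) * factor;
}
```

On each frame, the smoothed weight would be applied to the Aa expression. Smoothing is a design choice added here for illustration; the source does not state whether the engine smooths its output.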

Avatar control

The VRM renderer responds to update() calls for fine-grained control. In the example below, the control state is sent through session.updateControl():

session.updateControl({
  avatar: {
    face: {
      mouth: { jawOpen: 0.5, smile: 0, mouthPucker: 0 },
      eyes: { blinkL: 0, blinkR: 0, gazeX: 0, gazeY: 0 },
    },
    emotion: {
      label: "happy",
      intensity: 0.8,
      valence: 0.7,
      arousal: 0.5,
    },
  },
});

Supported expression presets: happy, sad, angry, surprised, relaxed, neutral.
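
One way to picture how an emotion block could translate into expression weights is the sketch below. It is an assumption-laden illustration: the EmotionControl type, the helper names, and the choice to ignore unsupported labels are hypothetical, and valence and arousal are not used here. Only the preset names come from the list above.

```typescript
// Preset names taken from the list above.
const EXPRESSION_PRESETS = [
  "happy", "sad", "angry", "surprised", "relaxed", "neutral",
] as const;
type ExpressionPreset = (typeof EXPRESSION_PRESETS)[number];

// Hypothetical subset of the emotion control shape shown earlier.
interface EmotionControl {
  label: string;
  intensity: number;
}

function isExpressionPreset(label: string): label is ExpressionPreset {
  return (EXPRESSION_PRESETS as readonly string[]).indexOf(label) !== -1;
}

// Clamp to [0, 1]; non-finite input is treated as 0.
function clamp01(value: number): number {
  return Number.isFinite(value) ? Math.min(1, Math.max(0, value)) : 0;
}

// Returns one weight per preset. Assumption: an unsupported label
// leaves every weight at zero rather than raising an error.
function emotionToWeights(
  emotion: EmotionControl,
): Record<ExpressionPreset, number> {
  const weights: Record<ExpressionPreset, number> = {
    happy: 0, sad: 0, angry: 0, surprised: 0, relaxed: 0, neutral: 0,
  };
  if (isExpressionPreset(emotion.label)) {
    weights[emotion.label] = clamp01(emotion.intensity);
  }
  return weights;
}
```

With the control example above, emotionToWeights({ label: "happy", intensity: 0.8 }) yields a weight of 0.8 for happy and 0 for every other preset.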

Resize handling

The renderer automatically responds to container resize via ResizeObserver, updating the camera aspect ratio and display canvas dimensions.
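
A resize handler of this kind might look like the following sketch. The CameraLike and CanvasLike interfaces are stand-ins for a Three.js PerspectiveCamera and an HTML canvas so that the snippet stays self-contained. The applyResize name and the pixelRatio parameter are assumptions, not part of the documented API.

```typescript
// Minimal structural stand-ins for THREE.PerspectiveCamera and a canvas.
interface CameraLike {
  aspect: number;
  updateProjectionMatrix(): void;
}
interface CanvasLike {
  width: number;
  height: number;
}

// Hypothetical handler body: update the camera aspect ratio and the
// display canvas backing-store size from the container's CSS size.
function applyResize(
  camera: CameraLike,
  canvas: CanvasLike,
  cssWidth: number,
  cssHeight: number,
  pixelRatio: number = 1,
): void {
  // Skip zero-sized containers (for example while hidden) to avoid
  // a division by zero in the aspect ratio.
  if (cssWidth <= 0 || cssHeight <= 0) return;
  camera.aspect = cssWidth / cssHeight;
  camera.updateProjectionMatrix();
  canvas.width = Math.round(cssWidth * pixelRatio);
  canvas.height = Math.round(cssHeight * pixelRatio);
}
```

In the browser, a function like this would typically be called from a ResizeObserver callback, passing the observed entry's contentRect width and height.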

WebGL / WebGPU backends

The renderer supports both WebGL (default) and WebGPU rendering backends. When renderer: 'webgpu' is set and the browser supports WebGPU, the renderer uses Three.js WebGPURenderer with MToonNodeMaterial for VRM materials. When WebGPU is unavailable, it falls back to WebGL automatically.
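
The fallback rule can be expressed as a small pure function. This is a sketch under assumptions: resolveBackend and NavigatorLike are hypothetical names, and the check only tests for the presence of navigator.gpu. A real implementation may also need to request an adapter, since navigator.gpu.requestAdapter() can resolve to null on unsupported hardware.

```typescript
type Backend = "webgl" | "webgpu";

// Minimal shape of `navigator` for feature detection. In browsers that
// implement WebGPU, `navigator.gpu` is defined.
interface NavigatorLike {
  gpu?: unknown;
}

// Hypothetical resolver: honor a "webgpu" request only when the
// browser exposes WebGPU, otherwise fall back to WebGL.
function resolveBackend(
  requested: Backend | undefined,
  nav: NavigatorLike,
): Backend {
  if (requested === "webgpu" && nav.gpu != null) {
    return "webgpu";
  }
  return "webgl"; // default, and fallback when WebGPU is unavailable
}
```

Passing the navigator-like object as a parameter keeps the function testable outside a browser.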

Shared GPU context

When multiple VRMLocalRenderer instances are active simultaneously — for example, in an AvatarStage with several characters — they transparently share GPU contexts and a single render loop. Each renderer maintains its own Three.js scene, camera, and VRM model, but a module-level render pool drives all of them from one requestAnimationFrame callback. WebGL and WebGPU entries use separate shared renderers, so you can mix backends in the same page.

On each frame, the pool renders each scene into the shared GPU canvas, then blits the result to that renderer's 2D display canvas via drawImage(). This collapses N GPU contexts into one per backend, avoiding the resource limits and performance costs that multiple contexts cause on mobile devices.

This optimization is fully transparent — the AvatarRenderer interface is unchanged and client code does not need to opt in. A single renderer works identically; the pool activates automatically when any VRMLocalRenderer is mounted.
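
The pooling idea can be sketched in a few lines. This is an illustrative model, not the library's internal implementation: RenderPool, PoolEntry, and the injected scheduler are hypothetical names, and the actual rendering and drawImage() calls are left to the entry callbacks.

```typescript
// One entry per mounted renderer (hypothetical shape).
interface PoolEntry {
  // Render this entry's scene into the shared GPU canvas.
  renderToShared(): void;
  // Copy the shared canvas into this entry's 2D display canvas,
  // for example with ctx.drawImage(sharedCanvas, 0, 0).
  blitToDisplay(): void;
}

type Scheduler = (callback: () => void) => void;

class RenderPool {
  private readonly entries = new Set<PoolEntry>();
  private readonly schedule: Scheduler;
  private running = false;

  constructor(schedule: Scheduler) {
    this.schedule = schedule;
  }

  // Adds an entry and starts the loop if needed. Returns an unregister function.
  register(entry: PoolEntry): () => void {
    this.entries.add(entry);
    if (!this.running) {
      this.running = true;
      this.schedule(this.frame);
    }
    return () => {
      this.entries.delete(entry);
    };
  }

  // A single frame callback drives every registered entry.
  private frame = (): void => {
    if (this.entries.size === 0) {
      this.running = false; // stop looping when nothing is mounted
      return;
    }
    this.entries.forEach((entry) => {
      entry.renderToShared();
      entry.blitToDisplay();
    });
    this.schedule(this.frame);
  };
}
```

In the browser, the scheduler would be something like (cb) => requestAnimationFrame(cb). Injecting it keeps the loop deterministic in tests, and rendering then blitting one entry at a time is what lets several display canvases share one GPU canvas.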