VRM (Local)
Render 3D VRM avatars locally in the browser with Three.js.
The VRMLocalRenderer renders a 3D VRM model in the browser using Three.js and @pixiv/three-vrm. It includes automatic blink animation, expression presets, and lip-sync (RMS-based or viseme-based).
Installation
npm install three @pixiv/three-vrm

Usage
import { VRMLocalRenderer } from "avatarlayer/renderers";
const renderer = new VRMLocalRenderer({
  modelUrl: "/models/avatar.vrm",
  idleAnimationUrl: "/animations/idle.fbx", // optional
  visemeLipSync: true, // optional
  renderer: "webgpu", // optional, default "webgl"
});

Constructor options
| Option | Type | Default | Description |
|---|---|---|---|
| modelUrl | string | required | URL to the .vrm model file |
| idleAnimationUrl | string | — | Optional URL to an idle animation (FBX/GLB) |
| visemeLipSync | boolean | — | Enable viseme-based lip-sync instead of RMS amplitude mapping |
| renderer | 'webgl' \| 'webgpu' | 'webgl' | Rendering backend. When 'webgpu', uses Three.js WebGPURenderer with MToonNodeMaterial. Falls back to WebGL automatically when WebGPU is unavailable. |
How it works
When mounted, the renderer:
- Resolves the rendering backend (WebGL or WebGPU), falling back to WebGL if needed
- Creates a Three.js scene with camera and lighting
- Loads the VRM model via GLTFLoader with the VRM plugin (using MToonNodeMaterial for WebGPU)
- Starts an automatic blink loop (random interval, 3-6 seconds)
- Optionally loads an idle animation from the provided URL
- Registers with a shared render pool that drives all VRM renderers from a single requestAnimationFrame loop
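As a rough illustration of the blink step in the list above, the sketch below picks a random delay in the 3-6 second range and computes a blink weight over time. The helper names, the triangle-shaped curve, and the 150 ms blink duration are assumptions made for this example; they are not taken from the avatarlayer source.

```typescript
// Hypothetical helpers for illustration only; not part of the avatarlayer API.
const BLINK_MIN_MS = 3000;
const BLINK_MAX_MS = 6000;

/** Picks the delay until the next blink, uniformly within [3 s, 6 s). */
function nextBlinkDelayMs(random: () => number = Math.random): number {
  return BLINK_MIN_MS + random() * (BLINK_MAX_MS - BLINK_MIN_MS);
}

/**
 * Blink weight over a single blink: rises from 0 to 1, then falls back to 0.
 * `tMs` is the time since the blink started. The 150 ms default duration is
 * an assumed value, not one documented by the library.
 */
function blinkWeight(tMs: number, durationMs: number = 150): number {
  if (tMs <= 0 || tMs >= durationMs) return 0;
  const half = durationMs / 2;
  return tMs <= half ? tMs / half : (durationMs - tMs) / half;
}
```

With @pixiv/three-vrm, a weight like this would typically be applied to the model's blink expression on each frame; how VRMLocalRenderer does that internally is not specified here.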
When speak(audio) is called:
- The audio blob is played through an <audio> element
- A LipSyncEngine analyzes the audio in real time via AnalyserNode
- Audio is mapped to the VRM mouth-open expression, either via RMS amplitude (Aa preset) or viseme weights when visemeLipSync is enabled
- The promise resolves when the audio ends
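The RMS path described above can be sketched as a pair of pure functions: one computes the root-mean-square of a buffer of time-domain samples, the other scales and clamps it to an expression weight. The function names and the gain of 4 are assumptions chosen for illustration; the actual LipSyncEngine may smooth or scale the signal differently.

```typescript
// Hypothetical helpers for illustration only; not the LipSyncEngine implementation.

/** Root-mean-square of time-domain samples, each expected in [-1, 1]. */
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

/**
 * Maps an audio frame to a mouth-open weight in [0, 1].
 * `gain` is an assumed tuning constant: speech RMS is usually well below 1,
 * so the raw value is amplified before clamping.
 */
function mouthOpenWeight(samples: Float32Array, gain: number = 4): number {
  return Math.min(1, rms(samples) * gain);
}
```

In a browser, the sample buffer would usually be filled on each frame with AnalyserNode.getFloatTimeDomainData(buffer) before calling mouthOpenWeight(buffer).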
Avatar control
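The VRM renderer responds to update() calls for fine-grained control of the face and emotion state. In the example below, the control state is sent through session.updateControl():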
session.updateControl({
  avatar: {
    face: {
      mouth: { jawOpen: 0.5, smile: 0, mouthPucker: 0 },
      eyes: { blinkL: 0, blinkR: 0, gazeX: 0, gazeY: 0 },
    },
    emotion: {
      label: "happy",
      intensity: 0.8,
      valence: 0.7,
      arousal: 0.5,
    },
  },
});

Supported expression presets: happy, sad, angry, surprised, relaxed, neutral.
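One plausible way to turn an emotion label and intensity into per-preset weights is sketched below. The function name and the behavior for unknown labels (all weights zero) are assumptions for this example; the renderer's real mapping, including how it uses valence and arousal, is not documented here.

```typescript
// Hypothetical mapping for illustration only; not the renderer's actual logic.
const PRESETS = ["happy", "sad", "angry", "surprised", "relaxed", "neutral"] as const;
type Preset = (typeof PRESETS)[number];

/**
 * Returns a weight for every supported preset: the matching preset gets the
 * intensity clamped to [0, 1], all others get 0. Unknown labels produce all
 * zeros (an assumed fallback).
 */
function emotionToPresetWeights(label: string, intensity: number): Record<Preset, number> {
  const clamped = Math.min(1, Math.max(0, intensity));
  const weights = {} as Record<Preset, number>;
  for (const preset of PRESETS) {
    weights[preset] = preset === label ? clamped : 0;
  }
  return weights;
}
```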
Resize handling
The renderer automatically responds to container resize via ResizeObserver, updating the camera aspect ratio and display canvas dimensions.
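The arithmetic behind a resize can be sketched as a small pure function. The name computeViewSize and the use of a device pixel ratio are assumptions for this example; the source only states that the camera aspect ratio and display canvas dimensions are updated.

```typescript
// Hypothetical helper for illustration only; not part of the avatarlayer API.
interface ViewSize {
  aspect: number;      // value for the camera's aspect ratio
  pixelWidth: number;  // backing-store width of the display canvas
  pixelHeight: number; // backing-store height of the display canvas
}

/**
 * Derives camera aspect and canvas pixel dimensions from the container's CSS
 * size. Returns null for a hidden or collapsed container so the caller can
 * skip the update instead of dividing by zero.
 */
function computeViewSize(
  cssWidth: number,
  cssHeight: number,
  pixelRatio: number = 1,
): ViewSize | null {
  if (cssWidth <= 0 || cssHeight <= 0) return null;
  return {
    aspect: cssWidth / cssHeight,
    pixelWidth: Math.round(cssWidth * pixelRatio),
    pixelHeight: Math.round(cssHeight * pixelRatio),
  };
}
```

In a browser, a ResizeObserver callback could pass entry.contentRect.width and entry.contentRect.height to this function, then assign the result to a Three.js PerspectiveCamera via camera.aspect followed by camera.updateProjectionMatrix().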
WebGL / WebGPU backends
The renderer supports both WebGL (default) and WebGPU rendering backends. When renderer: 'webgpu' is set and the browser supports WebGPU, the renderer uses Three.js WebGPURenderer with MToonNodeMaterial for VRM materials. When WebGPU is unavailable, it falls back to WebGL automatically.
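The fallback rule can be expressed as a minimal sketch. The function names are hypothetical, and the detection shown here only checks for the presence of navigator.gpu; a production check would likely also await navigator.gpu.requestAdapter(), which can resolve to null even when the property exists.

```typescript
// Hypothetical helpers for illustration only; not part of the avatarlayer API.
type Backend = "webgl" | "webgpu";

/** Feature check: `navigator.gpu` is the WebGPU entry point in supporting browsers. */
function detectWebGPU(nav: { gpu?: unknown } | undefined): boolean {
  return nav !== undefined && nav.gpu !== undefined && nav.gpu !== null;
}

/**
 * Resolves the backend to use. WebGPU is selected only when it was requested
 * and is available; every other combination resolves to WebGL.
 */
function resolveBackend(requested: Backend | undefined, hasWebGPU: boolean): Backend {
  return requested === "webgpu" && hasWebGPU ? "webgpu" : "webgl";
}
```

In a browser this might be called as resolveBackend(options.renderer, detectWebGPU(navigator)), where options is the constructor options object.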
Shared GPU context
When multiple VRMLocalRenderer instances are active simultaneously — for example, in an AvatarStage with several characters — they transparently share GPU contexts and a single render loop. Each renderer maintains its own Three.js scene, camera, and VRM model, but a module-level render pool drives all of them from one requestAnimationFrame callback. WebGL and WebGPU entries use separate shared renderers, so you can mix backends in the same page.
On each frame the pool renders each scene into the shared GPU canvas, then blits the result to that renderer's 2D display canvas via drawImage(). This collapses N GPU contexts into one per backend, avoiding the resource limits and performance costs that multiple contexts cause on mobile devices.
This optimization is fully transparent — the AvatarRenderer interface is unchanged and client code does not need to opt in. A single renderer works identically; the pool activates automatically when any VRMLocalRenderer is mounted.
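The single-loop idea can be illustrated with a stripped-down pool. This is a sketch under assumptions, not the library's implementation: the RenderPool and PoolEntry names are invented here, the scheduler is injected so the class runs outside a browser, and the GPU render plus drawImage() blit is reduced to one renderFrame() call per entry.

```typescript
// Hypothetical sketch for illustration only; not the avatarlayer render pool.
type FrameCallback = (timeMs: number) => void;
type Scheduler = (cb: FrameCallback) => number;

interface PoolEntry {
  /** Renders this entry's scene and copies the result to its display canvas. */
  renderFrame(timeMs: number): void;
}

class RenderPool {
  private entries = new Set<PoolEntry>();
  private running = false;
  private schedule: Scheduler;

  constructor(schedule: Scheduler) {
    this.schedule = schedule;
  }

  /** Registers an entry and starts the loop if it is not already running. */
  add(entry: PoolEntry): void {
    this.entries.add(entry);
    if (!this.running) {
      this.running = true;
      this.schedule(this.tick);
    }
  }

  remove(entry: PoolEntry): void {
    this.entries.delete(entry);
  }

  // One callback drives every entry; the loop stops once the pool is empty.
  private tick = (timeMs: number): void => {
    if (this.entries.size === 0) {
      this.running = false;
      return;
    }
    this.entries.forEach((entry) => entry.renderFrame(timeMs));
    this.schedule(this.tick);
  };
}
```

In a browser the pool would be created as new RenderPool((cb) => requestAnimationFrame(cb)), so any number of mounted entries shares one animation-frame callback.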