Overview

OmniHuman V1.5 (by ByteDance) generates full-body human animation videos from a single image and an audio file. Unlike lipsync models that only animate the mouth, OmniHuman animates the entire body, including gestures and posture.

| Model ID | Input | Resolution | Credits | Approx. cost |
|---|---|---|---|---|
| omnihuman-v1.5 | Image + audio | 720p or 1080p | 100 | $1.00 |

Quick start

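import { createVarg } from "vargai/ai"

// Create a client; the API key is read from the VARG_API_KEY environment variable.
const varg = createVarg({ apiKey: process.env.VARG_API_KEY! })

// Generate a video from one image and one audio track.
const result = await varg.videoModel("omnihuman-v1.5").generate({
  imageUrl: "https://example.com/full-body-portrait.jpg",
  audioUrl: "https://example.com/speech.mp3",
})

// Print the URL of the generated video.
console.log(result.video.url)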

Parameters

files (array, required)
Two files: one image (full body preferred) and one audio file.

provider_options (object)
  • resolution: "720p" or "1080p" (default "1080p").
  • turbo_mode: boolean. Enables faster generation at slightly lower quality.
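
As a rough sketch of how these parameters might fit together, the helper below assembles a request body and applies the documented default resolution. Only the names `files`, `provider_options`, `resolution`, and `turbo_mode` come from the parameter list above. The `buildOmniHumanRequest` function, the `model` field, the representation of `files` as a pair of URL strings, and the `false` default for `turbo_mode` are assumptions made for illustration, not part of the documented API.

```typescript
// Hypothetical request shape. Field names `files` and `provider_options`
// come from the parameter list; everything else is an assumption.
type Resolution = "720p" | "1080p";

interface ProviderOptions {
  resolution?: Resolution;
  turbo_mode?: boolean;
}

interface OmniHumanRequest {
  model: "omnihuman-v1.5";
  files: [string, string]; // assumed order: [image URL, audio URL]
  provider_options: Required<ProviderOptions>;
}

// Build a request body, filling in the documented default resolution ("1080p").
// Defaulting turbo_mode to false is an assumption; no default is documented.
function buildOmniHumanRequest(
  imageUrl: string,
  audioUrl: string,
  options: ProviderOptions = {},
): OmniHumanRequest {
  if (!imageUrl || !audioUrl) {
    throw new Error("Both an image URL and an audio URL are required");
  }
  return {
    model: "omnihuman-v1.5",
    files: [imageUrl, audioUrl],
    provider_options: {
      resolution: options.resolution ?? "1080p",
      turbo_mode: options.turbo_mode ?? false,
    },
  };
}
```

Validating the two inputs before sending anything avoids paying for a request that cannot succeed.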

Pricing

| Model | Credits | USD |
|---|---|---|
| omnihuman-v1.5 | 100 | $1.00 |
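
For budgeting a batch of renders, the arithmetic implied by the table can be sketched as follows. The sketch assumes the 100-credit price applies per generation and that credits convert to USD at the table's ratio (100 credits to $1.00); neither is stated explicitly here, and `estimateCost` is a hypothetical helper, not part of the SDK.

```typescript
// Figures taken from the pricing table: 100 credits, about $1.00.
// Assumption: the price is charged once per generation.
const CREDITS_PER_GENERATION = 100;
const CREDITS_PER_USD = 100; // derived from the table; actual billing may differ

// Estimate the credits and approximate USD cost for a number of generations.
function estimateCost(generations: number): { credits: number; usd: number } {
  if (!Number.isInteger(generations) || generations < 0) {
    throw new RangeError("generations must be a non-negative integer");
  }
  const credits = generations * CREDITS_PER_GENERATION;
  return { credits, usd: credits / CREDITS_PER_USD };
}
```

Under these assumptions, `estimateCost(25)` gives 2,500 credits, or about $25.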

Tips

  • Full-body images work best. OmniHuman animates hands, arms, and posture, so use an image that shows the full body or at least the upper body.
  • Turbo mode trades quality for speed. It is good for previewing before final renders.
  • 1080p resolution is recommended for production content; use 720p for faster iteration.
  • Quality is variable. Results depend heavily on the pose and composition of the input image.
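
The preview-then-final workflow suggested by these tips could be captured as two option presets. This is a minimal sketch: `RenderStage` and `optionsFor` are hypothetical names, and only the `resolution` and `turbo_mode` keys and their allowed values come from the Parameters section.

```typescript
// Two stages of a typical workflow: a quick draft, then the production render.
type RenderStage = "preview" | "final";

interface StageOptions {
  resolution: "720p" | "1080p";
  turbo_mode: boolean;
}

// Map a workflow stage to provider options, following the tips:
// previews use turbo mode at 720p, final renders use 1080p without turbo.
function optionsFor(stage: RenderStage): StageOptions {
  return stage === "preview"
    ? { resolution: "720p", turbo_mode: true }
    : { resolution: "1080p", turbo_mode: false };
}
```

Iterating on the input image with the preview preset and switching to the final preset only once the pose and composition look right keeps slow, full-quality renders to a minimum.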

VEED Fabric

Faster and simpler. Face-only animation.

Sync V2 Pro

Higher quality lip sync on existing video.