
Overview

Grok Imagine is xAI’s video generation model family. It supports text-to-video, image-to-video, and video editing, all with native audio generation. Output is available at 480p and 720p resolutions.
Model ID            Mode                      Credits   ~Cost
grok-imagine        Text/Image-to-video       100       $1.00
grok-imagine-edit   Video-to-video editing    100       $1.00

Quick start

import { createVarg } from "vargai/ai"

const varg = createVarg({ apiKey: process.env.VARG_API_KEY! })

const result = await varg.videoModel("grok-imagine").generate({
  prompt: "a coffee cup steaming on a desk, morning light, cozy atmosphere",
  duration: 5,
  aspectRatio: "16:9",
})

console.log(result.video.url)

Capabilities

Video editing

Edit an existing video with a text prompt. The input video is resized to a maximum of 854x480 and truncated to 8 seconds.
const result = await varg.videoModel("grok-imagine-edit").generate({
  prompt: "add falling snow to the scene",
  videoUrl: "https://example.com/outdoor-scene.mp4",
})
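
To gauge how much of a source clip survives these limits, the arithmetic can be sketched as follows. This is a rough estimate rather than the service's actual resizing logic: it assumes the resize preserves aspect ratio and never upscales, and `estimateEditInput` and `ClipInfo` are hypothetical names used only for illustration.

```typescript
// Limits stated for grok-imagine-edit inputs.
const MAX_WIDTH = 854
const MAX_HEIGHT = 480
const MAX_SECONDS = 8

interface ClipInfo {
  width: number
  height: number
  seconds: number
}

// Hypothetical helper: estimates the clip the model actually receives.
// Assumption: aspect ratio is preserved and smaller clips are not upscaled.
function estimateEditInput(clip: ClipInfo): ClipInfo {
  const scale = Math.min(1, MAX_WIDTH / clip.width, MAX_HEIGHT / clip.height)
  return {
    width: Math.round(clip.width * scale),
    height: Math.round(clip.height * scale),
    seconds: Math.min(clip.seconds, MAX_SECONDS),
  }
}

// A 1920x1080 clip that runs for 12 seconds.
const estimate = estimateEditInput({ width: 1920, height: 1080, seconds: 12 })
console.log(estimate)
```

Under these assumptions, a 1920x1080, 12-second clip reaches the model at roughly 853x480 and 8 seconds, so fine detail and the final 4 seconds of runtime would not be available to the edit.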

Native audio generation

Generate synchronized audio alongside the video:
cURL
curl -X POST https://api.varg.ai/v1/video \
  -H "Authorization: Bearer $VARG_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-imagine",
    "prompt": "a crackling campfire under the stars, ambient forest sounds",
    "duration": 5,
    "provider_options": { "varg": { "generate_audio": true } }
  }'
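
The same request can be sketched in TypeScript with the standard `fetch` API. The endpoint, headers, and body fields mirror the cURL call above. The response shape is not documented in this section, so the sketch returns the parsed JSON as `unknown`. `buildAudioRequest` and `generateWithAudio` are hypothetical helper names.

```typescript
// Builds the same JSON body as the cURL example.
function buildAudioRequest(prompt: string, duration = 5) {
  return {
    model: "grok-imagine",
    prompt,
    duration,
    provider_options: { varg: { generate_audio: true } },
  }
}

// Sends the request. The response schema is an assumption, so it stays untyped.
async function generateWithAudio(prompt: string, apiKey: string): Promise<unknown> {
  const response = await fetch("https://api.varg.ai/v1/video", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildAudioRequest(prompt)),
  })
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`)
  }
  return response.json()
}
```

Keeping the body builder separate from the network call makes it possible to inspect or log the payload before spending credits.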

Parameters

prompt (string, required): Text description of the video or edit to apply.
duration (number, default: 5): Video duration in seconds. Range: 1-15 seconds.
aspect_ratio (string, default: "16:9"): Output aspect ratio.
files (array): For image-to-video: [{ url: "image.jpg" }]. For video editing: [{ url: "video.mp4" }].
provider_options (object): Set { "varg": { "generate_audio": true } } to enable native audio. Also accepts resolution: "480p" or "720p".
Video editing limits: Input video for grok-imagine-edit is resized to a maximum of 854x480 and truncated to 8 seconds. Trim or downscale source clips beforehand if you need control over what the model receives.
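
As a sketch of how these parameters fit together, the helper below assembles a request body for image-to-video and rejects durations outside the documented 1-15 second range before anything is sent. `buildVideoRequest` and `VideoRequest` are hypothetical names and the image URL is a placeholder. Only the field names, defaults, and range come from the parameter list above.

```typescript
interface VideoRequest {
  model: "grok-imagine" | "grok-imagine-edit"
  prompt: string
  duration: number
  aspect_ratio: string
  files?: { url: string }[]
}

// Hypothetical helper: applies the documented defaults and range check.
function buildVideoRequest(
  model: VideoRequest["model"],
  prompt: string,
  options: { duration?: number; aspectRatio?: string; fileUrl?: string } = {},
): VideoRequest {
  const duration = options.duration ?? 5 // documented default
  if (!(duration >= 1 && duration <= 15)) {
    throw new RangeError(`duration must be 1-15 seconds, got ${duration}`)
  }
  const request: VideoRequest = {
    model,
    prompt,
    duration,
    aspect_ratio: options.aspectRatio ?? "16:9", // documented default
  }
  if (options.fileUrl) {
    // Image-to-video and video editing both pass their source through `files`.
    request.files = [{ url: options.fileUrl }]
  }
  return request
}

// Image-to-video: the source image goes in `files`.
const imageToVideo = buildVideoRequest("grok-imagine", "slow zoom on the skyline", {
  fileUrl: "https://example.com/skyline.jpg",
  duration: 8,
})
console.log(JSON.stringify(imageToVideo, null, 2))
```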

Pricing

Model               Credits   USD
grok-imagine        100       $1.00
grok-imagine-edit   100       $1.00
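
Both models are listed at 100 credits for about $1.00, which implies roughly $0.01 per credit. A minimal sketch of estimating the spend for a batch follows, assuming the flat per-generation price shown in the table. `estimateCost` is a hypothetical helper, and the per-credit rate is derived from the table rather than taken from an official rate.

```typescript
// Credits per generation, from the pricing table.
const CREDITS_PER_VIDEO: Record<string, number> = {
  "grok-imagine": 100,
  "grok-imagine-edit": 100,
}

// Assumption: $1.00 per 100 credits, as implied by the table.
const USD_PER_CREDIT = 1.0 / 100

// Hypothetical helper: total credits and approximate USD for a batch.
function estimateCost(model: string, count: number) {
  const perVideo = CREDITS_PER_VIDEO[model]
  if (perVideo === undefined) {
    throw new Error(`Unknown model: ${model}`)
  }
  const credits = perVideo * count
  // Round to cents to avoid floating-point noise in the total.
  return { credits, usd: Number((credits * USD_PER_CREDIT).toFixed(2)) }
}

console.log(estimateCost("grok-imagine", 12))
```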

Tips

  • Native audio makes this model useful for ambient and atmospheric videos where sound matters.
  • Video editing is good for adding effects (weather, lighting changes) to existing footage.
  • Resolution defaults to 480p for speed. Set resolution: "720p" in provider options for higher quality.
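
The resolution tip can be sketched as a small options builder. Placing `resolution` next to `generate_audio` inside the `varg` object is an assumption based on the `provider_options` description, and `buildProviderOptions` is a hypothetical helper.

```typescript
type Resolution = "480p" | "720p"

// Hypothetical helper. Assumption: `resolution` sits alongside
// `generate_audio` inside the `varg` provider options object.
// Unset fields are omitted so the service's own defaults apply.
function buildProviderOptions(opts: { audio?: boolean; resolution?: Resolution }) {
  const varg: { generate_audio?: boolean; resolution?: Resolution } = {}
  if (opts.audio !== undefined) varg.generate_audio = opts.audio
  if (opts.resolution !== undefined) varg.resolution = opts.resolution
  return { varg }
}

// Higher quality output with native audio enabled.
const highQuality = buildProviderOptions({ audio: true, resolution: "720p" })
console.log(JSON.stringify(highQuality))
```

The resulting object would be passed as the `provider_options` field of the request body.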

Sora 2

Also supports video transformation via remix mode.

Seedance 2

Higher quality video editing with ByteDance’s model.

Grok Imagine Image

The image generation counterpart, and the cheapest image model.