Overview
Grok Imagine is xAI’s video generation model family. It supports text-to-video, image-to-video, and video editing with native audio generation. Output is available at 480p and 720p resolutions.

| Model ID | Mode | Credits | ~Cost |
|---|---|---|---|
| grok-imagine | Text/Image-to-video | 100 | $1.00 |
| grok-imagine-edit | Video-to-video editing | 100 | $1.00 |
Quick start
Capabilities
Video editing
Edit an existing video with a text prompt. The input video is resized to a maximum of 854x480 and truncated to 8 seconds.
Native audio generation
Generate synchronized audio alongside the video by setting `generate_audio` to `true` in the provider options.
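
As a rough illustration, the provider options for native audio could be assembled like this. The `{ varg: { generate_audio: true } }` shape and the `"480p"` / `"720p"` values come from this page; the helper name `buildProviderOptions`, its arguments, and the placement of `resolution` inside the `varg` object are assumptions made for the sketch, not a confirmed API.

```typescript
// Hypothetical helper: assembles the provider options object described on this page.
type Resolution = "480p" | "720p";

interface VargOptions {
  generate_audio?: boolean;
  resolution?: Resolution;
}

interface ProviderOptions {
  varg: VargOptions;
}

function buildProviderOptions(
  generateAudio: boolean,
  resolution?: Resolution,
): ProviderOptions {
  const varg: VargOptions = {};
  if (generateAudio) {
    // Documented switch for native audio.
    varg.generate_audio = true;
  }
  if (resolution !== undefined) {
    // Assumption: resolution sits next to generate_audio inside "varg".
    varg.resolution = resolution;
  }
  return { varg };
}

// Example: native audio at 720p.
const audioOptions = buildProviderOptions(true, "720p");
console.log(JSON.stringify(audioOptions));
// prints {"varg":{"generate_audio":true,"resolution":"720p"}}
```

Leaving out the second argument omits `resolution` entirely, so the documented 480p default applies.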
Parameters
Text description of the video or edit to apply.
Video duration in seconds. Range: 1-15 seconds.
Output aspect ratio.
For image-to-video: `[{ url: "image.jpg" }]`. For video editing: `[{ url: "video.mp4" }]`.
Set `{ "varg": { "generate_audio": true } }` to enable native audio. Supports `resolution`: `"480p"` or `"720p"`.
Pricing

| Model | Credits | USD |
|---|---|---|
| grok-imagine | 100 | $1.00 |
| grok-imagine-edit | 100 | $1.00 |
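
To budget a batch of generations, the table can be turned into a small estimator. This is a sketch only: it assumes the listed price is a flat per-generation charge (the table does not say whether duration or resolution changes it) and that 100 credits correspond to $1.00, as the rows above suggest. The function name `estimateCost` is hypothetical.

```typescript
// Credits per generation, copied from the pricing table.
const CREDITS_PER_GENERATION = new Map<string, number>([
  ["grok-imagine", 100],
  ["grok-imagine-edit", 100],
]);

// Assumption: 100 credits ≈ $1.00, inferred from the table rows.
const CREDITS_PER_USD = 100;

interface CostEstimate {
  credits: number;
  usd: number;
}

function estimateCost(modelId: string, generations: number): CostEstimate {
  const perGeneration = CREDITS_PER_GENERATION.get(modelId);
  if (perGeneration === undefined) {
    throw new Error(`Unknown model: ${modelId}`);
  }
  if (!Number.isInteger(generations) || generations < 0) {
    throw new Error("generations must be a non-negative integer");
  }
  const credits = perGeneration * generations;
  // Divide rather than multiply by 0.01 to keep whole-credit totals exact.
  return { credits, usd: credits / CREDITS_PER_USD };
}

// Example: twelve text-to-video generations.
const batch = estimateCost("grok-imagine", 12);
console.log(`${batch.credits} credits, about $${batch.usd.toFixed(2)}`);
// prints 1200 credits, about $12.00
```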
Tips
- Native audio makes this model useful for ambient and atmospheric videos where sound matters.
- Video editing is good for adding effects (weather, lighting changes) to existing footage.
- Resolution defaults to 480p for speed. Set `resolution: "720p"` in provider options for higher quality.
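
When preparing footage for editing, it can help to preview how the stated input limits (resized to max 854x480, truncated to 8 seconds) would affect a clip. The sketch below assumes the resize preserves aspect ratio, fits the frame inside an 854x480 box, and never upscales; the page does not specify any of this, so the service may behave differently, particularly for portrait video. `previewEditInput` and `VideoInfo` are hypothetical names.

```typescript
interface VideoInfo {
  width: number;
  height: number;
  durationSeconds: number;
}

// Limits stated in the Video editing section.
const MAX_EDIT_WIDTH = 854;
const MAX_EDIT_HEIGHT = 480;
const MAX_EDIT_SECONDS = 8;

function previewEditInput(input: VideoInfo): VideoInfo {
  // Assumption: scale down uniformly to fit the box, and never scale up.
  const scale = Math.min(
    1,
    MAX_EDIT_WIDTH / input.width,
    MAX_EDIT_HEIGHT / input.height,
  );
  return {
    width: Math.round(input.width * scale),
    height: Math.round(input.height * scale),
    // Anything past the limit is cut off.
    durationSeconds: Math.min(input.durationSeconds, MAX_EDIT_SECONDS),
  };
}

// Example: a 12-second 1080p clip.
const preview = previewEditInput({ width: 1920, height: 1080, durationSeconds: 12 });
console.log(preview);
```

Under these assumptions a 1920x1080 clip comes out at roughly 853x480, and only its first 8 seconds would be edited.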
Related models
- Sora 2: Also supports video transformation via remix mode.
- Seedance 2: Higher-quality video editing with ByteDance’s model.
- Grok Imagine Image: The image generation counterpart and the cheapest image model.