### Model Details
Veo 3.1 Image-to-Video transforms static imagery into dynamic, high-definition video sequences. By combining a single reference image with a descriptive text prompt, the model generates realistic motion that preserves the visual style, subject identity, and lighting of the source. It produces cinema-grade video at 24 frames per second in either 720p or 1080p resolution.
A standout feature of Veo 3.1 is its ability to generate synchronized native audio alongside the video. Whether animating a landscape with ambient nature sounds or a character with specific dialogue and sound effects, the model interprets audio cues within the prompt to create an immersive experience. The input image serves as the initial frame, ensuring a seamless transition from static to motion.
### Usage Guidelines
To achieve the best results, provide a high-quality initial image and a detailed prompt covering the subject, action, camera movement, and desired audio atmosphere.
- **Prompting**: Describe the action clearly (e.g., "The cat yawns and stretches") and include audio cues (e.g., "faint purring sound").
- **Resolution & Aspect Ratio**: Choose between `720p` and `1080p`, and select standard `16:9` or vertical `9:16` aspect ratios to fit your platform.
- **Duration**: Generate clips of `4`, `6`, or `8` seconds.
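The guidelines above can be checked locally before a request is sent. The following is a minimal sketch, not part of the ModelRunner client: `buildPrompt` and `validateInput` are hypothetical helpers introduced here for illustration. The sketch assumes every field is passed explicitly; the API itself may apply defaults or accept other values.

```javascript
// Allowed values as listed in the usage guidelines.
const RESOLUTIONS = ["720p", "1080p"];
const ASPECT_RATIOS = ["16:9", "9:16"];
const DURATIONS = [4, 6, 8];

// Trim whitespace and trailing periods so parts can be joined cleanly.
const clean = (s) => s.trim().replace(/\.+$/, "");

// Hypothetical helper: combines subject, action, camera movement, and an
// optional audio cue into a single prompt string.
function buildPrompt({ subject, action, camera, audio }) {
  const visual = [subject, action, camera]
    .filter((part) => typeof part === "string")
    .map(clean)
    .filter(Boolean)
    .join(". ");
  const audioCue = typeof audio === "string" ? clean(audio) : "";
  return audioCue ? `${visual}. Audio: ${audioCue}.` : `${visual}.`;
}

// Hypothetical helper: returns a list of problems, empty when the input
// matches the documented options. Treating every field as required is an
// assumption of this sketch.
function validateInput(input) {
  const errors = [];
  if (typeof input.prompt !== "string" || input.prompt.trim() === "") {
    errors.push("prompt must be a non-empty string");
  }
  if (typeof input.image_url !== "string" || input.image_url.trim() === "") {
    errors.push("image_url must be a non-empty string");
  }
  if (!RESOLUTIONS.includes(input.resolution)) {
    errors.push(`resolution must be one of: ${RESOLUTIONS.join(", ")}`);
  }
  if (!ASPECT_RATIOS.includes(input.aspect_ratio)) {
    errors.push(`aspect_ratio must be one of: ${ASPECT_RATIOS.join(", ")}`);
  }
  if (!DURATIONS.includes(input.duration)) {
    errors.push(`duration must be one of: ${DURATIONS.join(", ")}`);
  }
  return errors;
}

// Example: a vertical 6-second clip. The image URL is a placeholder.
const exampleInput = {
  prompt: buildPrompt({
    subject: "A tabby cat on a sunny windowsill",
    action: "The cat yawns and stretches",
    camera: "Slow push-in",
    audio: "faint purring sound",
  }),
  image_url: "https://example.com/cat.jpg",
  resolution: "720p",
  aspect_ratio: "9:16",
  duration: 6,
};
console.log(validateInput(exampleInput)); // prints []
```

Returning a list of messages rather than throwing on the first problem lets a caller report every invalid field at once.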
### Example Usage
To run the model with the ModelRunner JavaScript client, use the following code:
```javascript
import { modelrunner } from "@modelrunner/client";

const result = await modelrunner.subscribe('google/veo-3.1-image-to-video', {
  input: {
    prompt: "A cinematic shot of a majestic lion roaring in the savannah. Audio: loud roar and wind blowing through grass.",
    image_url: "https://example.com/lion_image.jpg",
    negative_prompt: "cartoon, drawing, low quality, blurry",
    resolution: "1080p",
    aspect_ratio: "16:9",
    duration: 8
  }
});
```
