Video Generation
Generate videos with AI providers such as Veo, Sora, and fal.ai.
The Video Generation node creates short videos from text prompts, images, or existing videos. It supports multiple AI providers and generation modes.
Inputs & Outputs
| Direction | Handle | Position | Type | Description |
|---|---|---|---|---|
| Input | start-frame | 20% | 🟣 Image | Starting image for image-to-video or interpolation |
| Input | end-frame | 35% | 🟣 Image | Ending image for video interpolation |
| Input | references | 50% | 🟣 Image | Style or content reference images |
| Input | prompt-input | 65% | 🟢 Text | Text description of the desired video |
| Output | video-output | — | 🟠 Video | The generated video |
| Output | first-frame-out | — | 🟣 Image | First frame extracted from the video |
| Output | last-frame-out | — | 🟣 Image | Last frame extracted from the video |
Not all inputs are used in every mode. The node adapts its interface based on the selected provider and connected inputs.
Generation Modes
Text to Video
Generate a video entirely from a text prompt — no image inputs needed.
How to use: Connect only a Text Prompt to the prompt-input handle.
Available on: Veo, Sora, fal.ai
Image to Video
Animate a single image into a video clip. The image becomes the starting frame and the AI generates motion from there.
How to use: Connect an image to the start-frame handle, plus a Text Prompt describing the desired motion.
Available on: Veo, Sora, fal.ai
Video Interpolation
Generate a video that transitions between two keyframe images. The AI creates smooth motion from the first image to the second.
How to use: Connect images to both start-frame and end-frame handles.
Available on: Veo only (duration is fixed at 8 seconds)
Video Remix
Remix an existing video with a new prompt to change its style or content while maintaining the original motion.
How to use: Connect a video source and a Text Prompt describing the desired changes.
Available on: Sora only
Mode Auto-Detection
The node automatically selects the appropriate mode based on your connections:
Veo:
- Both start-frame and end-frame connected → Video Interpolation
- Only start-frame or end-frame connected → Image to Video
- No image inputs → Text to Video
Sora:
- Remix video connected → Video Remix
- start-frame connected → Image to Video
- No image inputs → Text to Video
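The detection rules above can be sketched as a small function. This is an illustration only, not the node's actual implementation: the function name, the provider keys, and the mode strings are hypothetical, and it assumes the rules are checked in the order listed. No rules are listed for fal.ai, so the sketch raises an error for it rather than guessing.

```python
def detect_mode(provider: str, start_frame: bool = False,
                end_frame: bool = False, remix_video: bool = False) -> str:
    """Return the mode implied by which inputs are connected.

    Provider keys and mode strings are illustrative names, not real identifiers.
    """
    if provider == "veo":
        if start_frame and end_frame:
            return "video-interpolation"
        if start_frame or end_frame:
            return "image-to-video"
        return "text-to-video"
    if provider == "sora":
        # Assumption: rules are evaluated in the listed order, so remix wins.
        if remix_video:
            return "video-remix"
        if start_frame:
            return "image-to-video"
        return "text-to-video"
    # Only Veo and Sora rules are documented in this section.
    raise ValueError(f"no auto-detection rules listed for provider: {provider!r}")


print(detect_mode("veo", start_frame=True, end_frame=True))  # video-interpolation
```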
Provider Comparison
| Feature | Veo 3.1 | Sora | fal.ai |
|---|---|---|---|
| Duration | 4, 6, or 8 seconds | 5, 8, 10, 15, or 20 seconds | 5, 6, or 10 seconds |
| Aspect Ratio | 16:9, 9:16 | Size-based (see below) | 16:9, 9:16, 1:1 |
| Resolution | 720p HD, 1080p Full HD | — | — |
| Audio | Yes (generated soundtrack) | No | No |
| Reference Images | Up to 3 (Asset) or 1 (Style) | — | — |
| Modes | Text-to-video, Image-to-video, Interpolation | Text-to-video, Image-to-video, Remix | Text-to-video, Image-to-video |
Sora Size Options
Instead of aspect ratios, Sora uses explicit dimensions:
| Size | Aspect Ratio |
|---|---|
| 1280×720 | 16:9 |
| 720×1280 | 9:16 |
| 1792×1024 | ≈16:9 (HD) |
| 1024×1792 | ≈9:16 (HD) |
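If you build workflows programmatically, the comparison and size tables above can be encoded as lookup data and used to catch invalid settings early. The following is a rough sketch: the function name, provider keys, and parameter names are assumptions made for illustration, and only the allowed values come from the tables.

```python
from typing import List, Optional, Tuple

# Allowed values copied from the tables above; the keys are illustrative.
ALLOWED_DURATIONS = {
    "veo": {4, 6, 8},
    "sora": {5, 8, 10, 15, 20},
    "fal.ai": {5, 6, 10},
}
ALLOWED_ASPECT_RATIOS = {
    "veo": {"16:9", "9:16"},
    "fal.ai": {"16:9", "9:16", "1:1"},
}
SORA_SIZES = {(1280, 720), (720, 1280), (1792, 1024), (1024, 1792)}


def validate_settings(provider: str, duration: int,
                      aspect_ratio: Optional[str] = None,
                      size: Optional[Tuple[int, int]] = None) -> List[str]:
    """Return a list of problems; an empty list means the settings look valid."""
    if provider not in ALLOWED_DURATIONS:
        return [f"unknown provider: {provider}"]
    problems = []
    if duration not in ALLOWED_DURATIONS[provider]:
        problems.append(f"{provider} does not offer {duration}-second videos")
    if provider == "sora":
        # Sora takes explicit pixel dimensions instead of an aspect ratio.
        if size not in SORA_SIZES:
            problems.append(f"unsupported Sora size: {size}")
    elif aspect_ratio not in ALLOWED_ASPECT_RATIOS[provider]:
        problems.append(f"{provider} does not offer aspect ratio {aspect_ratio}")
    return problems


print(validate_settings("sora", 6, size=(1280, 720)))
```

Returning a list rather than raising on the first problem lets a caller report every invalid setting at once.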
Veo Reference Types
When using Veo with reference images, you can choose:
| Type | Max Images | Use Case |
|---|---|---|
| Asset | 3 | Content reference — the AI incorporates elements from the images |
| Style | 1 | Style reference — the AI matches the visual style and mood |
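The image limits in the table above amount to a simple check. The sketch below assumes hypothetical names (`check_references`, the lowercase type keys); only the limits of 3 and 1 come from the table.

```python
# Maximum reference images per Veo reference type, from the table above.
MAX_REFERENCE_IMAGES = {"asset": 3, "style": 1}


def check_references(reference_type: str, image_count: int) -> None:
    """Raise ValueError if the image count exceeds the limit for the type."""
    if reference_type not in MAX_REFERENCE_IMAGES:
        raise ValueError(f"unknown reference type: {reference_type}")
    limit = MAX_REFERENCE_IMAGES[reference_type]
    if image_count > limit:
        raise ValueError(
            f"{reference_type} references allow at most {limit} image(s), "
            f"got {image_count}"
        )


check_references("asset", 3)  # within the limit, so nothing is raised
```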
Configuration
| Setting | Description | Availability |
|---|---|---|
| Model | AI model to use for generation | All providers |
| Duration | Video length in seconds | All providers |
| Aspect Ratio | Output dimensions | Veo, fal.ai |
| Size | Output pixel dimensions | Sora |
| Resolution | 720p or 1080p | Veo only |
| Audio | Generate soundtrack with the video | Veo only |
| Reference Type | Asset (content) or Style (mood) | Veo only |
Frame Outputs
The Video Generation node also exposes the first and last frames of the generated video as separate image outputs. This is useful for:
- Chaining videos — use the last frame of one video as the start frame of the next
- Creating thumbnail images from video content
- Feeding frames back into Image Generation for further refinement
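As an illustration of chaining, the connections for a multi-shot sequence can be described as a list of edges from each node's last-frame-out to the next node's start-frame. The edge dictionary format and the node IDs below are hypothetical and are not taken from the product's file format; only the two handle names come from the Inputs & Outputs table.

```python
from typing import Dict, List


def chain_edges(node_ids: List[str]) -> List[Dict[str, str]]:
    """Link consecutive Video Generation nodes: last frame feeds the next start frame."""
    return [
        {
            "source": current,
            "source_handle": "last-frame-out",
            "target": following,
            "target_handle": "start-frame",
        }
        for current, following in zip(node_ids, node_ids[1:])
    ]


# Three shots need two connections: shot-1 to shot-2, and shot-2 to shot-3.
for edge in chain_edges(["shot-1", "shot-2", "shot-3"]):
    print(edge["source"], "->", edge["target"])
```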
Credits
Video generation costs vary by model, duration, and resolution. Longer videos and higher resolutions cost more credits. Credits are deducted before generation starts and refunded automatically on failure.
See Credit System for details.
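The deduct-then-refund behavior described above follows a common pattern, sketched below. This is not the platform's billing code: the function name, the exception, and the cost value are made up for illustration, and real prices are defined by the Credit System.

```python
from typing import Callable, Optional, Tuple


class InsufficientCredits(Exception):
    """Raised when the balance cannot cover the generation cost."""


def run_with_credits(balance: int, cost: int,
                     generate: Callable[[], str]) -> Tuple[int, Optional[str]]:
    """Deduct credits up front, then refund them if generation fails.

    Returns the new balance and the result (None on failure).
    """
    if cost > balance:
        raise InsufficientCredits(f"need {cost} credits, have {balance}")
    balance -= cost  # deducted before generation starts
    try:
        result = generate()
    except Exception:
        balance += cost  # automatic refund on failure
        return balance, None
    return balance, result


def fake_generation() -> str:
    # Stand-in for a real provider call; always succeeds.
    return "video.mp4"


print(run_with_credits(100, 30, fake_generation))  # (70, 'video.mp4')
```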
Tips
- Start with shorter durations (4–5 seconds) to iterate on prompts before committing to longer videos
- Use Image to Video mode for more predictable results — the starting image anchors the output
- For Veo, enable audio generation to get a matching soundtrack (enabled by default)
- Connect the last-frame-out to another Video Generation node's start-frame to create multi-shot sequences