Video Generation

Generate videos using AI providers such as Veo, Sora, and fal.ai.

The Video Generation node creates short videos from text prompts, images, or existing videos. It supports multiple AI providers and generation modes.

Inputs & Outputs

| Direction | Handle | Position | Type | Description |
| --- | --- | --- | --- | --- |
| Input | start-frame | 20% | 🟣 Image | Starting image for image-to-video or interpolation |
| Input | end-frame | 35% | 🟣 Image | Ending image for video interpolation |
| Input | references | 50% | 🟣 Image | Style or content reference images |
| Input | prompt-input | 65% | 🟢 Text | Text description of the desired video |
| Output | video-output | | 🟠 Video | The generated video |
| Output | first-frame-out | | 🟣 Image | First frame extracted from the video |
| Output | last-frame-out | | 🟣 Image | Last frame extracted from the video |

Not all inputs are used in every mode. The node adapts its interface based on the selected provider and connected inputs.

Generation Modes

Text to Video

Generate a video entirely from a text prompt — no image inputs needed.

How to use: Connect only a Text Prompt to the prompt-input handle.

Available on: Veo, Sora, fal.ai

Image to Video

Animate a single image into a video clip. The image becomes the starting frame and the AI generates motion from there.

How to use: Connect an image to the start-frame handle, plus a Text Prompt describing the desired motion.

Available on: Veo, Sora, fal.ai

Video Interpolation

Generate a video that transitions between two keyframe images. The AI creates smooth motion from the first image to the second.

How to use: Connect images to both start-frame and end-frame handles.

Available on: Veo only (the duration is fixed at 8 seconds in this mode)

Video Remix

Remix an existing video with a new prompt to change its style or content while maintaining the original motion.

How to use: Connect a video source and a Text Prompt describing the desired changes.

Available on: Sora only

Mode Auto-Detection

The node automatically selects the appropriate mode based on your connections:

Veo:

  • Both start-frame and end-frame connected → Video Interpolation
  • Only start-frame or end-frame connected → Image to Video
  • No image inputs → Text to Video

Sora:

  • Remix video connected → Video Remix
  • start-frame connected → Image to Video
  • No image inputs → Text to Video
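
The rules above can be sketched as a small decision function. This is only an illustration of the documented behavior, not the node's actual implementation: the provider strings, the `Mode` enum, and the `remix-video` handle name are assumptions (the page does not name the remix input), and the sketch assumes the rules are checked in the order listed.

```python
from enum import Enum
from typing import Set


class Mode(Enum):
    TEXT_TO_VIDEO = "Text to Video"
    IMAGE_TO_VIDEO = "Image to Video"
    INTERPOLATION = "Video Interpolation"
    REMIX = "Video Remix"


def detect_mode(provider: str, connected: Set[str]) -> Mode:
    """Pick a generation mode from the provider and the connected handles."""
    has_start = "start-frame" in connected
    has_end = "end-frame" in connected

    if provider == "veo":
        if has_start and has_end:
            return Mode.INTERPOLATION
        if has_start or has_end:
            return Mode.IMAGE_TO_VIDEO
        return Mode.TEXT_TO_VIDEO

    if provider == "sora":
        # "remix-video" is a hypothetical handle name used for illustration.
        if "remix-video" in connected:
            return Mode.REMIX
        if has_start:
            return Mode.IMAGE_TO_VIDEO
        return Mode.TEXT_TO_VIDEO

    # The page documents auto-detection rules for Veo and Sora only.
    raise ValueError(f"no auto-detection rules documented for {provider!r}")
```

For example, `detect_mode("veo", {"start-frame", "end-frame", "prompt-input"})` returns `Mode.INTERPOLATION`, while the same call with only `prompt-input` connected returns `Mode.TEXT_TO_VIDEO`.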

Provider Comparison

| Feature | Veo 3.1 | Sora | fal.ai |
| --- | --- | --- | --- |
| Duration | 4, 6, or 8 seconds | 5, 8, 10, 15, or 20 seconds | 5, 6, or 10 seconds |
| Aspect Ratio | 16:9, 9:16 | Size-based (see below) | 16:9, 9:16, 1:1 |
| Resolution | 720p HD, 1080p Full HD | | |
| Audio | Yes (generated soundtrack) | No | No |
| Reference Images | Up to 3 (Asset) or 1 (Style) | | |
| Modes | Text-to-video, Image-to-video, Interpolation | Text-to-video, Image-to-video, Remix | Text-to-video, Image-to-video |
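
Because each provider accepts a different set of durations, a workflow that switches providers may need to map a requested length onto a supported one. The sketch below shows one way to do that with the values from the comparison above. The `nearest_duration` helper and the provider keys are hypothetical names, and the tie-breaking rule (prefer the shorter clip) is an assumption, not documented node behavior.

```python
# Supported durations in seconds, copied from the provider comparison.
ALLOWED_DURATIONS = {
    "veo": (4, 6, 8),
    "sora": (5, 8, 10, 15, 20),
    "fal.ai": (5, 6, 10),
}


def nearest_duration(provider: str, requested: int) -> int:
    """Return the supported duration closest to the requested one.

    Ties go to the shorter (cheaper) clip; this rule is an assumption.
    """
    options = ALLOWED_DURATIONS[provider]
    return min(options, key=lambda d: (abs(d - requested), d))
```

With these tables, a 12-second request maps to 10 seconds on Sora, and a 5-second request on Veo falls back to 4 seconds.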

Sora Size Options

Instead of aspect ratios, Sora uses explicit dimensions:

| Size | Aspect Ratio |
| --- | --- |
| 1280×720 | 16:9 |
| 720×1280 | 9:16 |
| 1792×1024 | 16:9 HD |
| 1024×1792 | 9:16 HD |
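
If the rest of a workflow thinks in aspect ratios, the size options above can be treated as a lookup. This is a minimal sketch: the `sora_size` helper is hypothetical, and the `"WIDTHxHEIGHT"` string format is an assumption about how a size might be passed along, not something this page specifies.

```python
# (aspect ratio label, HD?) -> (width, height), taken from the size options.
SORA_SIZES = {
    ("16:9", False): (1280, 720),
    ("9:16", False): (720, 1280),
    ("16:9", True): (1792, 1024),
    ("9:16", True): (1024, 1792),
}


def sora_size(aspect_ratio: str, hd: bool = False) -> str:
    """Translate an aspect ratio label and HD flag into a size string."""
    try:
        width, height = SORA_SIZES[(aspect_ratio, hd)]
    except KeyError:
        raise ValueError(f"Sora has no size for aspect ratio {aspect_ratio!r}")
    return f"{width}x{height}"
```

A 1:1 request raises `ValueError` here, since no square size is listed for Sora.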

Veo Reference Types

When using Veo with reference images, you can choose:

| Type | Max Images | Use Case |
| --- | --- | --- |
| Asset | 3 | Content reference: the AI incorporates elements from the images |
| Style | 1 | Style reference: the AI matches the visual style and mood |
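
The per-type limits can be checked before a job is submitted. The sketch below is illustrative only: `check_references` and the lowercase type keys are hypothetical, and whether the node rejects or silently trims extra images is not stated here, so the sketch simply raises an error.

```python
from typing import List, Sequence

# Maximum number of reference images per Veo reference type.
MAX_REFERENCES = {"asset": 3, "style": 1}


def check_references(reference_type: str, images: Sequence[str]) -> List[str]:
    """Return the images unchanged if they fit the limit for the given type."""
    limit = MAX_REFERENCES[reference_type]
    if len(images) > limit:
        raise ValueError(
            f"{reference_type} references allow at most {limit} image(s), "
            f"got {len(images)}"
        )
    return list(images)
```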

Configuration

| Setting | Description | Availability |
| --- | --- | --- |
| Model | AI model to use for generation | All providers |
| Duration | Video length in seconds | All providers |
| Aspect Ratio | Output dimensions | Veo, fal.ai |
| Size | Output pixel dimensions | Sora |
| Resolution | 720p or 1080p | Veo only |
| Audio | Generate soundtrack with the video | Veo only |
| Reference Type | Asset (content) or Style (mood) | Veo only |
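
The availability column determines which settings appear for a given provider. As a rough sketch of that filtering (the data structure and the `settings_for` helper are hypothetical, not the node's real configuration schema):

```python
from typing import List

ALL_PROVIDERS = {"veo", "sora", "fal.ai"}

# Setting name -> providers that expose it, in the order of the table above.
SETTING_AVAILABILITY = {
    "Model": ALL_PROVIDERS,
    "Duration": ALL_PROVIDERS,
    "Aspect Ratio": {"veo", "fal.ai"},
    "Size": {"sora"},
    "Resolution": {"veo"},
    "Audio": {"veo"},
    "Reference Type": {"veo"},
}


def settings_for(provider: str) -> List[str]:
    """List the settings available for a provider, in table order."""
    if provider not in ALL_PROVIDERS:
        raise ValueError(f"unknown provider {provider!r}")
    return [
        name
        for name, providers in SETTING_AVAILABILITY.items()
        if provider in providers
    ]
```

Under these assumptions, `settings_for("sora")` yields Model, Duration, and Size, while Veo gets every setting except Size.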

Frame Outputs

The Video Generation node also exposes the first and last frames of the generated video as separate image outputs (first-frame-out and last-frame-out). This is useful for:

  • Chaining videos — use the last frame of one video as the start frame of the next
  • Creating thumbnail images from video content
  • Feeding frames back into Image Generation for further refinement
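
Chaining is done visually by wiring nodes together, but the data flow can be sketched in code. Everything below is hypothetical: `VideoResult`, `generate_sequence`, and the stand-in `fake_generate` function only model how each shot's last frame becomes the next shot's start frame.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class VideoResult:
    video: str
    first_frame: str
    last_frame: str


def generate_sequence(
    prompts: List[str],
    generate: Callable[[str, Optional[str]], VideoResult],
    start_frame: Optional[str] = None,
) -> List[VideoResult]:
    """Generate one clip per prompt, feeding each last frame into the next clip."""
    results = []
    frame = start_frame
    for prompt in prompts:
        result = generate(prompt, frame)
        results.append(result)
        frame = result.last_frame  # becomes the next clip's start frame
    return results


def fake_generate(prompt: str, start_frame: Optional[str]) -> VideoResult:
    """Stand-in for a real generation call; returns labelled placeholders."""
    return VideoResult(
        video=f"video({prompt})",
        first_frame=start_frame or f"first({prompt})",
        last_frame=f"last({prompt})",
    )
```

Calling `generate_sequence(["sunrise", "noon"], fake_generate)` produces two results in which the second clip's first frame is the first clip's last frame.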

Credits

Video generation costs vary by model, duration, and resolution. Longer videos and higher resolutions cost more credits. Credits are deducted before generation starts and refunded automatically on failure.

See Credit System for details.
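
The deduct-then-refund behavior described above follows a common pattern, sketched here for illustration. The `CreditAccount` class and `run_with_credits` function are hypothetical names, not part of the product, and the actual credit amounts per model are not given on this page.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class CreditAccount:
    def __init__(self, balance: int) -> None:
        self.balance = balance

    def deduct(self, amount: int) -> None:
        if amount > self.balance:
            raise ValueError("insufficient credits")
        self.balance -= amount

    def refund(self, amount: int) -> None:
        self.balance += amount


def run_with_credits(account: CreditAccount, cost: int, job: Callable[[], T]) -> T:
    """Charge up front, run the job, and refund the charge if the job fails."""
    account.deduct(cost)  # credits leave the balance before generation starts
    try:
        return job()
    except Exception:
        account.refund(cost)  # automatic refund on failure
        raise
```

A successful job leaves the balance reduced by `cost`; a job that raises leaves the balance unchanged and re-raises the error.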

Tips

  • Start with shorter durations (4–5 seconds) to iterate on prompts before committing to longer videos
  • Use Image to Video mode for more predictable results — the starting image anchors the output
  • For Veo, leave audio generation on (it is enabled by default) to get a matching soundtrack
  • Connect the last-frame-out to another Video Generation node's start-frame to create multi-shot sequences