Video generation

`marmot video` generates short video clips via OpenRouter or Vercel AI Gateway, routing to Veo, Sora, Kling, Hailuo, Seedance, Wan, and others.

```sh
marmot video <prompt> [flags…]
```

Providers: openrouter, vercel. On first run, marmot looks for an available API key (`OPENROUTER_API_KEY`, then `AI_GATEWAY_API_KEY`) and auto-configures the matching provider as the default. openai, anthropic, cloudflare, and ollama don't currently route video generation.
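The first-match key scan above can be sketched as follows. This is an illustrative sketch, not marmot's actual code; only the environment-variable names and their precedence come from the docs, and `pick_provider` is a hypothetical name:

```python
import os

# Order matters: OPENROUTER_API_KEY wins over AI_GATEWAY_API_KEY.
KEY_TO_PROVIDER = [
    ("OPENROUTER_API_KEY", "openrouter"),
    ("AI_GATEWAY_API_KEY", "vercel"),
]

def pick_provider(env=None):
    """Return the first provider whose API key is set, else None."""
    env = os.environ if env is None else env
    for var, provider in KEY_TO_PROVIDER:
        if env.get(var):
            return provider
    return None
```

With both keys present, openrouter is chosen; with neither, no default is configured.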

The default model is `google/veo-3.1-lite`, the cheapest tier (~$0.03/sec at 720p without audio). Override with `--model` or via `marmot setup`.

Video generation is async and slow: clips routinely take 1–5 minutes to render. marmot handles polling internally; you'll see a spinner until the bytes arrive. The default per-attempt timeout is raised to 600 seconds for this verb.
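Internally this is an ordinary poll-until-done loop against the provider's job status. A minimal sketch under assumed names (the `fetch_status` callback and the status strings are illustrative, not marmot's real internals; the 600-second budget is the documented default):

```python
import time

def poll_until_done(fetch_status, timeout_s=600, interval_s=5, sleep=time.sleep):
    """Poll fetch_status() until the job reaches a terminal state or the
    per-attempt timeout elapses. Returns the final status payload."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status["state"] in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"video not ready after {timeout_s}s")
        sleep(interval_s)
```

The injectable `sleep` keeps the loop testable without real waits.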

Cost

Per-second pricing varies widely by model. As of the launch lineup:

| Model | $/sec | 8s clip |
|---|---|---|
| `google/veo-3.1-lite` (default, 720p no-audio) | $0.03 | $0.24 |
| `google/veo-3.1-lite` (1080p with audio) | $0.08 | $0.64 |
| `google/veo-3.1-fast` | $0.10 | $0.80 |
| `minimax/hailuo-2.3` | $0.082 | $0.65 |
| `google/veo-3.1` (1080p) | $0.75 | $6.00 |
| `openai/sora-2-pro` | varies | — |

Audio adds roughly 50–100% to per-second pricing on models where it can be toggled. The default is `--no-audio`, which keeps typical usage in the pennies-per-second tier; use `--audio` to opt in.
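The pricing model above reduces to rate × duration, with an audio surcharge on toggleable models. A hypothetical estimator (the multiplier range models the ~50–100% surcharge quoted above; exact multipliers vary per model):

```python
def estimate_cost(rate_per_sec, duration_s, audio=False, audio_multiplier=1.5):
    """Estimate clip cost in dollars. audio_multiplier stands in for the
    ~50-100% audio surcharge on models that support toggling it."""
    cost = rate_per_sec * duration_s
    if audio:
        cost *= audio_multiplier
    return round(cost, 2)

# Default tier, 8s clip: $0.03/sec * 8s = $0.24
estimate_cost(0.03, 8)
```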

Output

Default behavior is TTY-aware, mirroring `marmot image`:

| Invocation | Output |
|---|---|
| `marmot video '...'` (terminal) | Writes auto-named file in CWD (e.g. `openrouter-video-20260505123456.mp4`), prints the path on stdout. |
| `marmot video '...' > out.mp4` | Writes raw video bytes to stdout (auto-binary, `--n 1` only). |
| `marmot video '...' \| something` | Same: bytes on stdout. |
| `marmot video '...' -o boat.mp4` | Writes to `boat.mp4`, prints the path. |
| `marmot video '...' --binary` | Forces raw bytes regardless of TTY. |
| `marmot video '...' --b64` | JSON envelope with inline base64. |
| `marmot video '...' --json` | Writes file, emits full JSON envelope. |
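The auto-generated filename follows a `<provider>-video-<timestamp>` pattern, as in the example above. A sketch of how such a name could be derived (illustrative only; marmot's actual implementation may differ):

```python
from datetime import datetime

def auto_name(provider, now=None, ext="mp4"):
    """Build an auto-name like openrouter-video-20260505123456.mp4."""
    now = now or datetime.now()
    return f"{provider}-video-{now:%Y%m%d%H%M%S}.{ext}"
```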

Multi-clip (`--n` > 1) always writes files and emits one path per line (or the JSON envelope under `--json`).
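Batch paths come from the `{i}` template accepted by `--output`. A hypothetical expansion (1-based indexing here is an assumption; marmot's actual numbering may differ):

```python
def expand_output_paths(template, n):
    """Expand an {i} output template into n concrete paths (1-based here)."""
    if n > 1 and "{i}" not in template:
        raise ValueError("multi-clip output needs an {i} placeholder")
    return [template.replace("{i}", str(i + 1)) for i in range(n)]
```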

Examples

```sh
# Cheapest: 4-second 720p no-audio clip via Veo 3.1 Lite (default)
marmot video 'a wooden boat sailing at sunset'

# Specific dimensions and length
marmot video 'a hummingbird in slow motion' --aspect 9:16 --duration 6

# With audio
marmot video 'a busy market in Tokyo' --audio --resolution 1080p

# Image-to-video (single reference image)
marmot video 'gentle camera pan' --image ./still.jpg

# First-frame + last-frame conditioning (Veo, Kling, Seedance, Wan)
marmot video 'morph between these' --image ./start.jpg --image ./end.jpg

# Pick a different model
marmot video 'cinematic timelapse' --model openai/sora-2-pro
marmot video 'cheap test clip' --model minimax/hailuo-2.3

# Pipe a generated prompt
marmot 'write a vivid one-line video prompt about a marmot' | marmot video
```

Flags

For cross-cutting flags (`--provider`, `--api-key`, `--retries`, `--timeout`) see Common flags. Video-specific:

| Flag | Description |
|---|---|
| `--model <id>` | Video model slug. Defaults to the provider's default (`google/veo-3.1-lite` for both openrouter and vercel). |
| `-o, --output <path>` | Output path. `{i}` template for batches (e.g. `./clip-{i}.mp4`). |
| `-p, --prompt-file <path>` | Prompt from a file (merges with positional and stdin). |
| `--aspect <W:H>` | Aspect ratio. `16:9` (default), `9:16`, `1:1`. Wider/taller ratios are model-specific. |
| `--resolution <res>` | Resolution label (`720p`, `1080p`, `4k`) or `WxH`. Default depends on model and tier. |
| `--duration <sec>` | Clip length in seconds. Most models accept 4–15s; check the model docs. |
| `--fps <n>` | Frames per second. Honored by Wan and Seedance; ignored elsewhere. |
| `--audio` / `--no-audio` | Toggle synced audio. Default `--no-audio`. Some models (full Veo, Sora, Wan, Seedance-1.5) always emit audio regardless of the flag. |
| `--image <path>` | Reference image. Repeatable up to 2: position 1 = first-frame conditioning (or single image-to-video), position 2 = last frame for models that support it. |
| `--n <count>` | Number of clips. Most models cap at 1 per call; the AI SDK batches when needed. |
| `--seed <int>` | Reproducibility seed. |
| `--binary` | Force raw video bytes to stdout. Requires `--n 1`. |
| `--b64` | JSON envelope with inline base64; no file written. |
| `--json` | Emit JSON envelope on stdout (default prints just the path). |

Provider notes

OpenRouter routes the broadest video catalog: Veo (3.1 + fast + lite), Sora 2 Pro, Kling v3.0 (pro/std/o1), MiniMax Hailuo 2.3, ByteDance Seedance (2.0/fast/1.5-pro), and Alibaba Wan (2.6/2.7). Discoverable via `marmot cache refresh openrouter` then `marmot models --modality video`.

Vercel AI Gateway routes the same providers through Vercel's proxy. Pricing tracks Vercel's gateway markup (similar to direct pricing in most cases). Discoverable via the gateway's `/v1/models` endpoint.
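Once a `/v1/models` payload is in hand, filtering it down to video models is a short scan. A sketch against a hypothetical response shape (the `modality` field name and `data` envelope are assumptions; gateway schemas vary, so check the actual response):

```python
def video_models(models_payload):
    """Pick out model ids whose advertised modality includes video."""
    return [
        m["id"]
        for m in models_payload.get("data", [])
        if "video" in m.get("modality", "")
    ]
```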

OpenAI direct for Sora is not wired through marmot today; `--provider openai` errors. Use `--provider openrouter --model openai/sora-2-pro` to reach Sora via OpenRouter.

OpenAI's Videos API (and the `openai/sora-2-pro` route) is scheduled for deprecation on Sep 24, 2026.