Video generation
marmot video generates short video clips via OpenRouter or Vercel AI Gateway, routing to Veo, Sora, Kling, Hailuo, Seedance, Wan, and others.
marmot video <prompt> [flags…]
Providers: openrouter, vercel. On first run, marmot detects an available API key (OPENROUTER_API_KEY, then AI_GATEWAY_API_KEY, in that order) and auto-configures the matching provider as the default. openai, anthropic, cloudflare, and ollama don't currently route video generation.
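The first-run key detection can be sketched as a simple environment check (an assumed reconstruction of the logic described above, not marmot's actual source; detect_provider is a hypothetical helper):

```shell
# Sketch of the key-detection order: OPENROUTER_API_KEY wins over
# AI_GATEWAY_API_KEY; neither set means no video-capable provider.
detect_provider() {
  if [ -n "${OPENROUTER_API_KEY:-}" ]; then
    echo openrouter
  elif [ -n "${AI_GATEWAY_API_KEY:-}" ]; then
    echo vercel
  else
    echo 'no video-capable API key found' >&2
    return 1
  fi
}

export OPENROUTER_API_KEY=sk-or-test
detect_provider   # prints: openrouter
```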
The default model is google/veo-3.1-lite, the cheapest tier (~$0.03/sec at 720p without audio). Override with --model or via marmot setup.
Video generation is async and slow — clips routinely take 1–5 minutes to render. marmot handles polling internally; you'll see a spinner until the bytes arrive. The default per-attempt timeout is bumped to 600 seconds for this verb.
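The internal flow is roughly submit, poll, download. A structural sketch of that polling shape in plain shell (this is not marmot's code; check_status is a stub standing in for the real job-status call):

```shell
# Stub: real code would query the provider's job-status endpoint
# and sleep a few seconds between polls.
check_status() { echo done; }

poll_until_done() {
  attempts=0
  status=pending
  while [ "$status" != "done" ] && [ "$attempts" -lt 120 ]; do
    status=$(check_status)
    attempts=$((attempts + 1))
  done
  echo "$status after $attempts poll(s)"
}

poll_until_done
```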
Cost
Per-second pricing varies widely by model. As of the launch lineup:
| Model | $/sec | 8s clip |
|---|---|---|
| google/veo-3.1-lite (default, 720p no-audio) | $0.03 | $0.24 |
| google/veo-3.1-lite (1080p with audio) | $0.08 | $0.64 |
| google/veo-3.1-fast | $0.10 | $0.80 |
| minimax/hailuo-2.3 | $0.082 | $0.65 |
| google/veo-3.1 (1080p) | $0.75 | $6.00 |
| openai/sora-2-pro | varies | — |
Audio adds ~50–100% to per-second pricing on models that support toggling. The default is --no-audio to keep typical usage in the pennies-per-second tier. Use --audio to opt in.
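The 8s column is just rate times duration; a quick sanity check of two rows from the table:

```shell
# cost = $/sec * seconds
awk 'BEGIN { printf "%.2f\n", 0.03 * 8 }'   # veo-3.1-lite, 8s -> 0.24
awk 'BEGIN { printf "%.2f\n", 0.75 * 8 }'   # veo-3.1 1080p, 8s -> 6.00
```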
Output
Default behavior is TTY-aware, mirroring marmot image:
| Invocation | Output |
|---|---|
| marmot video '...' (terminal) | Writes auto-named file in CWD (e.g. openrouter-video-20260505123456.mp4), prints the path on stdout. |
| marmot video '...' > out.mp4 | Writes raw video bytes to stdout (auto-binary, n=1 only). |
| marmot video '...' \| something | Same — bytes on stdout. |
| marmot video '...' -o boat.mp4 | Writes to boat.mp4, prints the path. |
| marmot video '...' --binary | Forces raw bytes regardless. |
| marmot video '...' --b64 | JSON envelope with inline base64. |
| marmot video '...' --json | Writes file, emits full JSON envelope. |
Multi-clip (--n > 1) always writes files and emits one path per line (or the JSON envelope under --json).
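One path per line makes batch output easy to post-process in a pipeline. The paths here are simulated with printf (nothing is actually generated); swap in a real multi-clip marmot invocation:

```shell
# Consume one generated path per line, as multi-clip mode emits them.
printf 'clip-0.mp4\nclip-1.mp4\nclip-2.mp4\n' | while IFS= read -r clip; do
  echo "would transcode: $clip"
done
```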
Examples
# Cheapest: 4-second 720p no-audio clip via Veo 3.1 Lite (default)
marmot video 'a wooden boat sailing at sunset'
# Specific dimensions and length
marmot video 'a hummingbird in slow motion' --aspect 9:16 --duration 6
# With audio
marmot video 'a busy market in Tokyo' --audio --resolution 1080p
# Image-to-video (single reference image)
marmot video 'gentle camera pan' --image ./still.jpg
# First-frame + last-frame conditioning (Veo, Kling, Seedance, Wan)
marmot video 'morph between these' --image ./start.jpg --image ./end.jpg
# Pick a different model
marmot video 'cinematic timelapse' --model openai/sora-2-pro
marmot video 'cheap test clip' --model minimax/hailuo-2.3
# Pipe a generated prompt
marmot 'write a vivid one-line video prompt about a marmot' | marmot video
Flags
For cross-cutting flags (--provider, --api-key, --retries, --timeout) see Common flags. Video-specific:
| Flag | Description |
|---|---|
| --model <id> | Video model slug. Defaults to the provider's default (google/veo-3.1-lite for openrouter and vercel). |
| -o, --output <path> | Output path. {i} template for batches (e.g. ./clip-{i}.mp4). |
| -p, --prompt-file <path> | Prompt from a file (merges with positional and stdin). |
| --aspect <W:H> | Aspect ratio. 16:9 (default), 9:16, 1:1. Wider/taller ratios are model-specific. |
| --resolution <res> | Resolution label (720p, 1080p, 4k) or WxH. Default depends on model and tier. |
| --duration <sec> | Clip length in seconds. Most models accept 4–15s; check the model docs. |
| --fps <n> | Frames per second. Honored by Wan and Seedance; ignored elsewhere. |
| --audio / --no-audio | Toggle synced audio. Default --no-audio. Some models (full Veo, Sora, Wan, Seedance-1.5) always emit audio regardless of the flag. |
| --image <path> | Reference image. Repeatable up to 2: position 1 = first-frame conditioning (or single-image-to-video), position 2 = last-frame for models that support it. |
| --n <count> | Number of clips. Most models cap at 1 per call; the AI SDK batches when needed. |
| --seed <int> | Reproducibility seed. |
| --binary | Force raw video bytes to stdout. Requires --n 1. |
| --b64 | JSON envelope with inline base64; no file written. |
| --json | Emit JSON envelope on stdout (default prints just the path). |
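The -o template expansion is easy to mimic in plain shell. This assumes {i} is the zero-based clip index, which is a guess from the docs; marmot performs this substitution itself:

```shell
# Expand a clip-{i}.mp4 style template for three clips.
template='clip-{i}.mp4'
for i in 0 1 2; do
  printf '%s\n' "$template" | sed "s/{i}/$i/"
done
```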
Provider notes
OpenRouter routes the broadest video catalog: Veo (3.1 + fast + lite), Sora 2 Pro, Kling v3.0 (pro/std/o1), MiniMax Hailuo 2.3, ByteDance Seedance (2.0/fast/1.5-pro), Alibaba Wan (2.6/2.7). Discoverable via marmot cache refresh openrouter then marmot models --modality video.
Vercel AI Gateway routes the same providers via Vercel's proxy. Pricing tracks Vercel's gateway markup (similar to direct in most cases). Discoverable via the gateway's /v1/models endpoint.
OpenAI direct for Sora is not wired through marmot today — --provider openai errors. Use --provider openrouter --model openai/sora-2-pro to access Sora via OpenRouter.
OpenAI's Videos API (and the sora-2-pro route) is scheduled for deprecation on Sep 24, 2026.