---
name: nano-to-grok-video
description: >
  Two-step pipeline to generate high-quality images with Nano Banana 2 (Gemini Flash Image)
  and animate them into videos with the Grok Imagine API. Use when the user wants to create AI
  videos, animate a landscape or portrait, run image-to-video, or combine Nano Banana and Grok
  for a creative pipeline. Triggers: generate a video, animate this image, nano banana to video,
  grok video, image to video, create a video of.
---

# Nano Banana 2 → Grok Imagine Video Pipeline

Two-step creative pipeline: generate a still image with Nano Banana 2, then animate it into a video with Grok Imagine.

## Requirements

- `GEMINI_API_KEY` — for Nano Banana 2 image generation
- `XAI_API_KEY` — for Grok Imagine video generation
- `uv` — Python runner (install: `brew install uv`)

## Step 1: Generate Image (Nano Banana 2)

```bash
uv run /usr/local/lib/node_modules/openclaw/skills/nano-banana-pro/scripts/generate_image.py \
  --prompt "your image description" \
  --filename "output.png" \
  --resolution 1K
```

## Step 2: Animate to Video (Grok Imagine)

```bash
uv run {baseDir}/scripts/generate_video.py \
  --prompt "describe the motion and camera" \
  --image output.png \
  --duration 8 \
  --aspect-ratio 16:9 \
  --output output-video.mp4
```

Options:
- `--duration`: clip length in seconds, 3–10 (default: 8)
- `--aspect-ratio`: `16:9` | `9:16` | `1:1`
- `--resolution`: `720p` | `480p`
- `--image`: local path or public URL (omit for text-to-video)
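The option constraints above can be sketched as an `argparse` parser. This is a minimal illustration, not the actual contents of `generate_video.py`: the helper names (`duration_seconds`, `build_parser`, `mode`) are hypothetical, and the defaults for `--aspect-ratio` and `--resolution` are assumptions, since this document only states a default for `--duration`.

```python
import argparse

# Allowed values as listed in the options above.
ASPECT_RATIOS = ("16:9", "9:16", "1:1")
RESOLUTIONS = ("720p", "480p")


def duration_seconds(value: str) -> int:
    """Parse --duration and enforce the documented 3-10 second range."""
    try:
        seconds = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"duration must be an integer, got {value!r}")
    if not 3 <= seconds <= 10:
        raise argparse.ArgumentTypeError("duration must be between 3 and 10 seconds")
    return seconds


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(description="Sketch of the video options")
    parser.add_argument("--prompt", required=True)
    parser.add_argument("--image", default=None,
                        help="local path or public URL; omit for text-to-video")
    parser.add_argument("--duration", type=duration_seconds, default=8)
    # Assumed defaults: the document does not state them for these two flags.
    parser.add_argument("--aspect-ratio", choices=ASPECT_RATIOS, default="16:9")
    parser.add_argument("--resolution", choices=RESOLUTIONS, default="720p")
    parser.add_argument("--output", default=None)
    return parser


def mode(args: argparse.Namespace) -> str:
    """Image-to-video when --image is given, text-to-video otherwise."""
    return "image-to-video" if args.image else "text-to-video"
```

Validating the range in a `type` callable makes an out-of-range duration fail at parse time, before any API call is made.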

## Full Pipeline (one command)

```bash
bash {baseDir}/scripts/pipeline.sh \
  "image prompt for Nano Banana" \
  "animation/motion prompt for Grok" \
  output-name
```
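The contents of `pipeline.sh` are not shown here, but a plausible reading is that it chains Step 1 and Step 2. The Python sketch below illustrates that chaining under stated assumptions: the function names are hypothetical, the `<output-name>.png` and `<output-name>.mp4` naming is a guess, and the relative `scripts/generate_video.py` path stands in for `{baseDir}/scripts/generate_video.py`.

```python
import subprocess
from typing import List, Tuple

# Image script path copied from the Step 1 example.
IMAGE_SCRIPT = (
    "/usr/local/lib/node_modules/openclaw/skills/"
    "nano-banana-pro/scripts/generate_image.py"
)


def build_pipeline_commands(
    image_prompt: str,
    motion_prompt: str,
    output_name: str,
    video_script: str = "scripts/generate_video.py",  # stand-in for {baseDir}
) -> Tuple[List[str], List[str]]:
    """Build the two commands; file naming here is an assumption."""
    image_file = f"{output_name}.png"
    video_file = f"{output_name}.mp4"
    image_cmd = [
        "uv", "run", IMAGE_SCRIPT,
        "--prompt", image_prompt,
        "--filename", image_file,
        "--resolution", "1K",
    ]
    video_cmd = [
        "uv", "run", video_script,
        "--prompt", motion_prompt,
        "--image", image_file,  # the still from step 1 feeds step 2
        "--output", video_file,
    ]
    return image_cmd, video_cmd


def run_pipeline(image_prompt: str, motion_prompt: str, output_name: str) -> None:
    """Run both steps in order; check=True stops if the image step fails."""
    for cmd in build_pipeline_commands(image_prompt, motion_prompt, output_name):
        subprocess.run(cmd, check=True)
```

Passing arguments as a list rather than a shell string avoids quoting problems when prompts contain spaces or quotes.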

## Text-to-Video (skip image step)

```bash
uv run {baseDir}/scripts/generate_video.py \
  --prompt "A rocket launching from Mars at sunset, cinematic" \
  --duration 10 \
  --aspect-ratio 16:9
```

## Prompt Strategy

See `references/prompt-guide.md` for:
- How to write animatable image prompts
- Motion/camera vocabulary for Grok
- Platform aspect ratio guide
- Example pipelines (cultural landscapes, portraits, urban scenes)

## Notes

- Video generation is asynchronous and typically takes 1–4 minutes; the script polls for completion automatically.
- Generated videos are served from temporary URLs, so the script downloads each one to a local `.mp4` automatically.
- The script prints a `MEDIA:` line so OpenClaw can auto-attach the video on Telegram.
- Default model: `grok-imagine-video` (720p, up to 10 s, with native audio).
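
The polling behaviour mentioned in the notes can be sketched as a generic loop. This is only an illustration of the pattern, not the script's actual code: `poll_until_done`, the status dictionary shape (`state`, `url`, `error`), and the interval and timeout values are all assumptions, and the real API response format may differ. The status lookup is injected as a callable, so no specific HTTP endpoint is implied.

```python
import time
from typing import Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, str]],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status until it reports completion, then return the video URL.

    The {"state": ..., "url": ...} shape is hypothetical.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        state = status.get("state")
        if state == "done":
            return status["url"]
        if state == "failed":
            raise RuntimeError(status.get("error", "video generation failed"))
        if clock() >= deadline:
            raise TimeoutError(f"video not ready after {timeout_s} seconds")
        sleep(interval_s)  # wait before asking again
```

Because the returned URL is temporary, a caller would download it to a local file right away rather than store the link.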
