seedream-image-generator

eze-is/seedream-image-generator

Generate images using the Doubao SeeDream API based on text prompts. Use this skill when users request AI-generated images, artwork, illustrations, or visual content creation. The skill handles API calls, downloads generated images to the project's /pic folder, and supports batch generation of up to 4 sequential images.

SKILL.md


name: seedream-image-generator
description: Generate images using the Doubao SeeDream API based on text prompts. Use this skill when users request AI-generated images, artwork, illustrations, or visual content creation. The skill handles API calls, downloads generated images to the project's /pic folder, and supports batch generation of up to 4 sequential images.

SeeDream Image Generator

Overview

This skill enables image generation using the Doubao SeeDream 4.0 API (model: doubao-seedream-4-0-250828). It converts text prompts into high-quality images and automatically downloads them to the project's /pic folder. The skill supports single and batch image generation (up to 4 sequential images), customizable image sizes, and watermark control.

When to Use This Skill

Use this skill when users request:

  • "Generate an image of [description]"
  • "Create artwork showing [scene/concept]"
  • "Make an illustration of [subject]"
  • "Generate 4 seasonal variations of [scene]"
  • Any request involving AI image generation, visual content creation, or artwork generation

Quick Start

Prerequisites

Before generating images, obtain the user's Volcano Engine (火山引擎) ARK API key.

IMPORTANT: The skill does not include a pre-configured key, so ask the user for their ARK API key before generating images unless they have already provided one.

Example prompt to user:

"To generate images, I need your Volcano Engine ARK API key. You can find it at: https://console.volcengine.com/ark/region:ark+cn-beijing/apikey

Please provide your ARK_API_KEY."

Basic Workflow

  1. Receive user request for image generation
  2. Request API key from the user if not already provided
  3. Clarify requirements:
    • Prompt/description
    • Number of images (1-4)
    • Image size preference
    • Watermark preference
  4. Execute generation using the generate_image.py script
  5. Report results with file paths and preview the images if possible

Image Generation Tasks

Single Image Generation

Generate a single image based on a text prompt.

Example user request:

"Generate an image of a futuristic city at sunset with flying cars"

Script usage:

python scripts/generate_image.py "futuristic city at sunset with flying cars" \
    --api-key "YOUR_ARK_API_KEY" \
    --size "2K"

Parameters:

  • prompt (required): Text description of the desired image
  • --api-key: ARK API key (can also use ARK_API_KEY environment variable)
  • --size: Image size, options include:
    • "2K" (default)
    • "1024x1024"
    • "2048x2048"
    • "3104x1312" (widescreen)
    • Custom dimensions in format "{width}x{height}"
  • --no-watermark: Disable watermark (watermark enabled by default)
  • --output-dir: Custom output directory (defaults to project_root/pic)
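
The parameters listed above could be declared roughly as follows. This is a minimal sketch, not the actual contents of generate_image.py: the function name build_parser and the help strings are assumptions, and only the flag names and defaults come from the list above.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI mirroring the documented flags; the real script may differ."""
    parser = argparse.ArgumentParser(description="Generate images with SeeDream")
    parser.add_argument("prompt", help="Text description of the desired image")
    # Fall back to the ARK_API_KEY environment variable when the flag is absent.
    parser.add_argument("--api-key", default=os.environ.get("ARK_API_KEY"),
                        help="ARK API key")
    parser.add_argument("--size", default="2K",
                        help='"2K" or "{width}x{height}", e.g. "2048x2048"')
    # The watermark is on by default, so the flag only switches it off.
    parser.add_argument("--no-watermark", action="store_true",
                        help="Disable the watermark")
    parser.add_argument("--output-dir", default=None,
                        help="Output directory (defaults to project_root/pic)")
    return parser
```

Calling build_parser().parse_args(["a cat", "--size", "1024x1024"]) yields a namespace whose size is "1024x1024" and whose no_watermark is False.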

Batch Image Generation

Generate multiple sequential/related images (2-4 images) with a single prompt.

Example user request:

"Generate 4 images showing the same garden courtyard through all four seasons"

Script usage:

python scripts/generate_image.py \
    "同一庭院一角的四季变迁,统一风格展现四季独特色彩、元素与氛围" \
    --api-key "YOUR_ARK_API_KEY" \
    --size "2048x2048" \
    --max-images 4

Additional parameter:

  • --max-images: Number of images to generate (1-4, default: 1)

Note: When --max-images is greater than 1, the API automatically uses sequential image generation to create related/coherent images.
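
To illustrate how --max-images might translate into an API request, the sketch below assembles keyword arguments for an OpenAI-style images.generate call. The function name build_request is hypothetical, and the field names sequential_image_generation, sequential_image_generation_options, and watermark are assumptions about the Ark API that this document does not confirm; check the Volcano Engine reference before relying on them.

```python
MODEL = "doubao-seedream-4-0-250828"  # model name given in the overview


def build_request(prompt: str, size: str = "2K", max_images: int = 1,
                  watermark: bool = True) -> dict:
    """Assemble keyword arguments for an OpenAI-style images.generate call."""
    if not 1 <= max_images <= 4:
        raise ValueError("max_images must be between 1 and 4")
    # Non-standard fields travel in extra_body (field names are assumptions).
    extra = {"watermark": watermark}
    if max_images > 1:
        # Ask for related images only when more than one is requested.
        extra["sequential_image_generation"] = "auto"
        extra["sequential_image_generation_options"] = {"max_images": max_images}
    return {
        "model": MODEL,
        "prompt": prompt,
        "size": size,
        "response_format": "url",
        "extra_body": extra,
    }
```

With the openai package installed, the result could be passed as client.images.generate(**build_request(...)), where client is an OpenAI instance pointed at the Ark endpoint. The host ark.cn-beijing.volces.com appears in the troubleshooting section; the exact base URL path is not stated in this document.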

Advanced Prompt Engineering

For high-quality results, guide users to provide detailed prompts that include:

  1. Subject: Main focus of the image
  2. Style: Art style, rendering technique (e.g., "photorealistic", "oil painting", "anime style")
  3. Lighting: Lighting conditions and atmosphere
  4. Color: Color palette or dominant colors
  5. Composition: Perspective, framing, depth of field
  6. Details: Specific elements to include

Example of a well-crafted prompt:

"Interstellar scene with a massive black hole, a vintage train emerging from it half-destroyed,
strong visual impact, cinematic, apocalyptic atmosphere, dynamic motion, contrasting colors,
octane render, ray tracing, motion blur, depth of field, surrealism, deep blue tones,
detailed color layers shaping the subject, realistic textures, dark background with dramatic
lighting creating atmosphere, artistic fantasy feel, exaggerated wide-angle perspective,
lens flare, reflections, extreme lighting and shadows, strong gravity, consuming effect"
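
The six components above can also be collected separately and joined into one prompt string. This is an illustrative helper only; PromptParts and compose_prompt are hypothetical names and are not part of the skill's script.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptParts:
    """One field per component of the checklist, in the same order."""
    subject: str
    style: str = ""
    lighting: str = ""
    color: str = ""
    composition: str = ""
    details: str = ""


def compose_prompt(parts: PromptParts) -> str:
    """Join the non-empty components, in checklist order, with commas."""
    values = (getattr(parts, f.name).strip() for f in fields(parts))
    return ", ".join(v for v in values if v)
```

For example, compose_prompt(PromptParts(subject="vintage train leaving a black hole", style="cinematic", color="deep blue tones")) skips the empty fields and returns the three supplied phrases separated by commas.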

Script Details

Location

scripts/generate_image.py

Key Features

  • Automatic project root detection (looks for .git, .claude, package.json, etc.)
  • Creates /pic folder if it doesn't exist
  • Timestamps filenames to prevent overwrites (format: seedream_YYYYMMDD_HHMMSS.png)
  • Downloads images directly from API response URLs
  • Prints usage statistics (tokens, generated images count)
  • Error handling for API calls and downloads
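
Project root detection and folder creation could work along these lines. This is a sketch under stated assumptions: the marker list holds only the names mentioned above (the script may check others), and falling back to the starting directory when no marker is found is a guess about the behavior.

```python
from pathlib import Path

# Markers named in the feature list; the real script may look for more.
ROOT_MARKERS = (".git", ".claude", "package.json")


def find_project_root(start: Path) -> Path:
    """Walk upward from start and return the first directory holding a marker."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in ROOT_MARKERS):
            return candidate
    return start  # assumed fallback when no marker is found


def ensure_pic_dir(root: Path) -> Path:
    """Create root/pic if needed and return its path."""
    pic = root / "pic"
    pic.mkdir(parents=True, exist_ok=True)
    return pic
```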

Output Format

  • Images are saved to: {project_root}/pic/seedream_{timestamp}.png
  • For batch generation: {project_root}/pic/seedream_{timestamp}_1.png, _2.png, etc.
  • Default format: PNG
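
The naming scheme above can be expressed as a small helper. The function name output_paths is hypothetical; only the seedream_{timestamp}.png pattern and the _1, _2 suffixes for batches come from this document.

```python
from datetime import datetime
from pathlib import Path


def output_paths(pic_dir: Path, count: int, now: datetime) -> list[Path]:
    """Return target paths following the documented naming scheme."""
    stamp = now.strftime("%Y%m%d_%H%M%S")
    if count == 1:
        return [pic_dir / f"seedream_{stamp}.png"]
    # Batch results share one timestamp and get a 1-based index suffix.
    return [pic_dir / f"seedream_{stamp}_{i}.png" for i in range(1, count + 1)]
```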

Requirements

The script requires the following Python packages:

  • openai (for API client)
  • requests (for image downloading)

Install with:

pip install openai requests

Workflow Decision Tree

User requests image generation
    ↓
Do we have ARK API key?
    ├─ No → Request API key from user → Store for session
    └─ Yes → Continue
    ↓
Single image or multiple images?
    ├─ Single (default)
    │   └─ Run: generate_image.py "prompt" --api-key KEY --size SIZE
    └─ Multiple (2-4 images)
        └─ Run: generate_image.py "prompt" --api-key KEY --max-images N
    ↓
Script executes:
    1. Calls SeeDream API
    2. Receives image URL(s)
    3. Downloads to /pic folder
    4. Reports file paths and statistics
    ↓
Inform user of results
    ├─ Success → Show file paths, offer to view images
    └─ Failure → Report error, suggest troubleshooting
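
The branch between single and multiple images in the tree above amounts to choosing which flags to pass. A minimal sketch, assuming the script is invoked from the skill directory; build_command is a hypothetical helper, not part of the skill.

```python
import sys


def build_command(prompt: str, api_key: str, size: str = "2K",
                  max_images: int = 1, watermark: bool = True) -> list[str]:
    """Build the argv list for generate_image.py from the clarified requirements."""
    if not api_key:
        raise ValueError("ARK API key is required")
    if not 1 <= max_images <= 4:
        raise ValueError("max_images must be between 1 and 4")
    cmd = [sys.executable, "scripts/generate_image.py", prompt,
           "--api-key", api_key, "--size", size]
    if max_images > 1:
        # Multiple-image branch of the decision tree.
        cmd += ["--max-images", str(max_images)]
    if not watermark:
        cmd.append("--no-watermark")
    return cmd
```

The resulting list can be handed to subprocess.run(cmd, check=True); passing a list rather than a shell string avoids quoting problems with prompts that contain spaces or punctuation.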

Troubleshooting

Common Issues

"ARK API key is required"

  • Ensure the user has provided their API key
  • Verify the key is correctly passed via --api-key or ARK_API_KEY environment variable
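
A key lookup that produces this error might look like the sketch below. The precedence (command-line value first, then the ARK_API_KEY environment variable) is an assumption, and resolve_api_key is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(cli_value: Optional[str],
                    env: Mapping[str, str] = os.environ) -> str:
    """Return the key from --api-key, else from ARK_API_KEY, else fail."""
    key = (cli_value or env.get("ARK_API_KEY") or "").strip()
    if not key:
        raise ValueError("ARK API key is required")
    return key
```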

"Error calling API"

  • Check API key validity
  • Verify network connectivity to ark.cn-beijing.volces.com
  • Ensure the prompt doesn't violate content policies
  • Check API quota/limits

"Error downloading image"

  • Check network connectivity
  • Verify the image URL is accessible
  • Ensure sufficient disk space in output directory

Module not found errors

  • Install required dependencies: pip install openai requests
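
Before running the script, the two dependencies can be checked without importing them. This is an optional pre-flight sketch; missing_packages is a hypothetical helper.

```python
from importlib.util import find_spec

REQUIRED = ("openai", "requests")  # packages listed under Requirements


def missing_packages(names=REQUIRED) -> list[str]:
    """Return the top-level packages that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]
```

If the returned list is not empty, installing the listed names with pip should resolve the error.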

Best Practices

  1. Confirm the API key first - Ask for it unless the user has already provided one; don't assume it is configured
  2. Clarify image requirements - Ask about size, quantity, and style preferences
  3. Optimize prompts - Help users craft detailed, descriptive prompts for better results
  4. Batch generation for variations - Suggest --max-images when users want variations or sequences
  5. Inform about output location - Always tell users where images are saved
  6. Preview results - After generation, offer to display or describe the generated images
  7. Respect content policies - Ensure prompts comply with API content guidelines

Example Interactions

Example 1: Simple request

User: "Generate a sunset over mountains"
Claude: "I'll generate that image for you. First, I need your Volcano Engine ARK API key..."
[User provides key]
Claude: [Executes generate_image.py]
Claude: "✅ Image generated successfully! Saved to: /project/pic/seedream_20250112_143022.png"

Example 2: Batch generation

User: "Create 4 images of a coffee shop in different seasons"
Claude: "I'll generate 4 seasonal variations of a coffee shop. Using your API key..."
[Executes with --max-images 4]
Claude: "✅ Generated 4 images:
- /project/pic/seedream_20250112_143530_1.png (Spring)
- /project/pic/seedream_20250112_143530_2.png (Summer)
- /project/pic/seedream_20250112_143530_3.png (Autumn)
- /project/pic/seedream_20250112_143530_4.png (Winter)"

Example 3: Custom specifications

User: "Generate a 2048x2048 image of a cyberpunk street without watermark"
Claude: [Executes with --size "2048x2048" --no-watermark]
Claude: "✅ Image generated (2048x2048, no watermark)
Saved to: /project/pic/seedream_20250112_144015.png"

README

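Seedream Image Generator

A Claude Skills project that lets AI generate high-quality images from your text descriptions.

Introduction

The Seedream Image Generator integrates Volcano Engine's Doubao Seedream 4.0 API to turn your text descriptions into high-quality images. The skill supports single and batch image generation (up to 4 images), downloads results automatically to the project's /pic folder, and offers flexible size and watermark options.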

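Core Features

🎨 Intelligent image generation

  • Generate high-quality images from text descriptions
  • By describing what you want to the agent in natural language, you can:
    • Choose from several preset sizes (2K, 1024x1024, 2048x2048, etc.)
    • Specify custom dimensions
    • Choose whether to add a watermark

📦 Batch generation

  • Generate several related images in one request
  • Automatically create image sequences (such as seasonal changes or different angles)
  • Smart file naming that avoids overwrites

💾 Automatic management

  • Detects the project root automatically
  • Creates the /pic output folder automatically
  • Uses timestamped names so every file is unique
  • Downloads and saves the generated images automatically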

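Use Cases

Use this skill when you need to:

  • 🖼️ Generate images, illustrations, or artwork from a text description
  • 🎨 Create visual content for projects, presentations, or designs
  • 📸 Generate several variations of the same scene (such as seasonal changes or different styles)
  • 🎭 Quickly produce concept art, diagrams, or creative visuals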

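Installation and Usage

Installing the skill

Copy this project into the .claude/skills/ folder under your working directory, and remember to delete the README:

<your project root>/
└── .claude/
    └── skills/
        └── seedream-image-generator/    # this skill package
            ├── scripts/
            │   └── generate_image.py
            └── SKILL.md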

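Using the skill

In Claude Code, send the following instruction to activate the skill:

Use seedream-image-generator to create an image

The AI agent will automatically:

  • Read the skill configuration and guidance
  • Create the necessary directory structure when needed (such as the pic/ folder)
  • Generate images according to your requirements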

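Preparation

Important: Before first use, you need to obtain a Volcano Engine ARK API key.

How to get one:

  1. Visit: https://console.volcengine.com/ark/region:ark+cn-beijing/apikey
  2. Create or view your ARK API key

The AI agent will ask for your API key the first time you use the skill.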

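Generating images

Once setup is complete, tell the AI what image you want:

Example 1: Simple request

Generate an image of a futuristic city at sunset, with flying cars

Example 2: Batch generation

Generate 4 images showing the same garden courtyard through the four seasons

Example 3: Custom specifications

Generate a 2048x2048 image of a cyberpunk street, without a watermark

The AI will:

  • Understand your requirements
  • Call the Seedream API to generate the images
  • Download them automatically to the pic/ folder
  • Tell you the paths of the generated files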

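Project Structure

Skill package structure (located in .claude/skills/seedream-image-generator/)

seedream-image-generator/
├── scripts/             # core script
│   └── generate_image.py
└── SKILL.md            # detailed skill documentation (the AI agent reads this file)

User data directory (located in the project root)

The AI agent creates the following structure in your project root:

<project root>/
└── pic/                # storage directory for generated images (created automatically by the AI agent)
    ├── seedream_20250112_143022.png
    ├── seedream_20250112_143530_1.png
    └── ...

Important: All generated images are stored in the pic/ folder in the project root, which makes them easy to manage and back up.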

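Workflow

  1. Receive the request: understand the image the user wants to generate
  2. Confirm the API key: make sure the user's ARK API key has been obtained
  3. Clarify requirements:
    • Image description/prompt
    • Number of images (1-4)
    • Image size preference
    • Watermark preference
  4. Run generation: call the generate_image.py script
  5. Report results: show the file paths and generation statistics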

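Feature Details

Parameters:

  • prompt (required): Text description of the image
  • --api-key: ARK API key (can also use the ARK_API_KEY environment variable)
  • --size: Image size, options include:
    • "2K" (default)
    • "1024x1024"
    • "2048x2048"
    • "3104x1312" (widescreen)
    • Custom dimensions in the format "{width}x{height}"
  • --no-watermark: Disable the watermark (watermark enabled by default)
  • --output-dir: Custom output directory (defaults to project_root/pic)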

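Maintenance and Updates

Updating dependencies

The script requires the following Python packages:

  • openai (for the API client)
  • requests (for image downloading)

If you run into a module error, tell the AI:

Please install the required Python dependencies

The AI agent will install them automatically: pip install openai requests

Changing the output directory

To change where images are saved, specify the location in your request:

Generate an image and save it to the custom_images folder

Alternatively, use the --output-dir parameter directly.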

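Notes

  • API key: The AI agent will ask for your ARK API key on first use. Make sure the key is valid and has sufficient quota.
  • Network connection: Generating and downloading images requires network access to ark.cn-beijing.volces.com
  • Content policy: Make sure prompts comply with the API's content policy
  • File storage: All generated images are stored in the pic/ folder in the project root, not inside the skill package directory
  • Skill location: Make sure the skill package is located in the .claude/skills/seedream-image-generator/ directory

More Information

For detailed technical documentation and usage instructions, see the SKILL.md file.


Let AI turn your creative ideas into visual reality and generate high-quality images with ease.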