# Gemini Image Generation

Gemini Image Generation integrates Google's Gemini image models for text-to-image generation and context-aware image editing. It supports both the Gemini API and Google Cloud Vertex AI.

## Setup Instructions

You can use either the Gemini API (recommended for most users) or Vertex AI with a service account.

### Option 1: Gemini API (Recommended)

1. Get your API key from [Google AI Studio](https://aistudio.google.com/app/apikey)
2. Set the `GEMINI_API_KEY` environment variable in your `.env` file:

```bash
GEMINI_API_KEY=your_api_key_here
```

### Option 2: Vertex AI (For Enterprise/GCP Users)

1. Create a service account in Google Cloud Console with Vertex AI permissions
2. Download the service account JSON key file
3. Place the JSON file at the default path (`api/data/auth.json`), or point to it with `GOOGLE_SERVICE_KEY_FILE`:

```bash
# Path to your service account JSON file (default: api/data/auth.json)
GOOGLE_SERVICE_KEY_FILE=/path/to/service-account.json

# Optional: Set the location (default: global)
GOOGLE_CLOUD_LOCATION=us-central1
```

When neither `GEMINI_API_KEY` nor `GOOGLE_KEY` is configured, the tool automatically falls back to Vertex AI using the service account file.
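
The fallback rule can be sketched roughly as follows. This is an illustration only, not the tool's actual implementation: the function name `resolve_auth` and the returned dictionary shape are hypothetical, and checking `GEMINI_API_KEY` before `GOOGLE_KEY` is an assumption. The defaults come from the comments in the configuration block above.

```python
import os
from typing import Mapping, Optional


def resolve_auth(env: Optional[Mapping[str, str]] = None) -> dict:
    """Pick a backend following the documented fallback rule (hypothetical helper)."""
    env = os.environ if env is None else env

    # Either variable selects the Gemini API.
    # Assumption: GEMINI_API_KEY is checked before GOOGLE_KEY.
    api_key = env.get("GEMINI_API_KEY") or env.get("GOOGLE_KEY")
    if api_key:
        return {"backend": "gemini_api", "api_key": api_key}

    # No API key configured: fall back to Vertex AI with the documented defaults.
    return {
        "backend": "vertex_ai",
        "key_file": env.get("GOOGLE_SERVICE_KEY_FILE", "api/data/auth.json"),
        "location": env.get("GOOGLE_CLOUD_LOCATION", "global"),
    }
```

Calling `resolve_auth({})` in this sketch yields the Vertex AI branch with `api/data/auth.json` and `global`, which mirrors the defaults listed above.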

## Configuration Options

### Model Selection

You can choose which Gemini image model to use via the `GEMINI_IMAGE_MODEL` environment variable:

```bash
# Default model
GEMINI_IMAGE_MODEL=gemini-2.5-flash-image

# Or use the newer Gemini 3 Pro Image model
GEMINI_IMAGE_MODEL=gemini-3-pro-image-preview
```

### Available Models

| Model | Description |
|-------|-------------|
| `gemini-2.5-flash-image` | Default model, fast and efficient |
| `gemini-3-pro-image-preview` | Higher quality, more detailed generations |
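
As a rough sketch of how the model setting might be resolved, the snippet below reads `GEMINI_IMAGE_MODEL` and falls back to the documented default. The helper names `resolve_model` and `is_documented_model` are hypothetical, and treating an empty value as unset is an assumption.

```python
import os
from typing import Mapping, Optional

DEFAULT_MODEL = "gemini-2.5-flash-image"
# Models listed in the table above; other values may also be accepted.
DOCUMENTED_MODELS = {"gemini-2.5-flash-image", "gemini-3-pro-image-preview"}


def resolve_model(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured model, or the default when unset or blank."""
    env = os.environ if env is None else env
    model = (env.get("GEMINI_IMAGE_MODEL") or "").strip()
    return model or DEFAULT_MODEL


def is_documented_model(model: str) -> bool:
    """True if the model appears in this page's table."""
    return model in DOCUMENTED_MODELS
```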

## Features

### Core Capabilities

- **Text-to-Image Generation**: Create images from detailed text descriptions
- **Image Context Support**: Use existing images as context/inspiration for new generations
- **Image Editing**: Generate new images based on modifications to existing ones
- **Safety Filtering**: Built-in content safety with user-friendly error messages

### Parameters

The Gemini Image Gen tool accepts the following parameters:

- **prompt** (required) – A detailed text description of the desired image, up to 32,000 characters
- **image_ids** (optional) – Array of image IDs to use as visual context for generation
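
To make the parameter constraints concrete, here is a minimal validation sketch. It is not the tool's own validation code: the function name `validate_params` is hypothetical, and treating image IDs as strings is an assumption, since the text only says "array of image IDs".

```python
from typing import Any, Dict, List

MAX_PROMPT_CHARS = 32_000  # limit stated for the prompt parameter


def validate_params(params: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    errors: List[str] = []

    prompt = params.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("prompt is required and must be a non-empty string")
    elif len(prompt) > MAX_PROMPT_CHARS:
        errors.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters")

    image_ids = params.get("image_ids")
    if image_ids is not None:
        # Assumption: IDs are strings.
        is_string_list = isinstance(image_ids, list) and all(
            isinstance(item, str) for item in image_ids
        )
        if not is_string_list:
            errors.append("image_ids must be an array of strings")

    return errors
```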

## Best Practices

### Prompt Writing

1. **Be specific and detailed** in your descriptions
2. **Start with the image type**: photo, oil painting, watercolor, illustration, cartoon, drawing, vector, render, etc.
3. **Include key elements**:
   - Subject matter and composition
   - Style and artistic approach
   - Lighting and atmosphere
   - Color palette preferences
   - Technical specifications
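
One way to apply this structure consistently is to assemble prompts from named parts. The helper below is purely illustrative: `build_prompt` and its argument names are hypothetical and not part of the tool. It puts the image type first and appends whichever key elements are supplied.

```python
def build_prompt(
    image_type: str,
    subject: str,
    style: str = "",
    lighting: str = "",
    palette: str = "",
    technical: str = "",
) -> str:
    """Compose a prompt that starts with the image type, then adds details."""
    # Lead with the image type, e.g. "Photo of ..." or "Oil painting of ...".
    opening = f"{image_type.strip().capitalize()} of {subject.strip()}"
    # Keep only the optional elements that were actually provided.
    details = [part.strip() for part in (style, lighting, palette, technical) if part.strip()]
    return ", ".join([opening, *details]) + "."


prompt = build_prompt(
    "photo",
    "a traditional red bridge over a koi pond",
    lighting="warm, diffused golden hour lighting",
    palette="rich colors",
)
print(prompt)
```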

### Image Editing Tips

When editing existing images:

1. **Include the original image ID** in the `image_ids` array
2. **Use direct editing instructions**:
   - "Remove the background from this image"
   - "Add sunglasses to the person in this image"
   - "Change the color of the car to red"
3. **Don't reconstruct the original prompt** – use simple, direct modification instructions

## Usage Examples

### Basic Image Generation

> A serene Japanese garden at golden hour, featuring a traditional red bridge over a koi pond. Cherry blossom trees frame the scene with soft pink petals falling. Photorealistic style with warm, diffused lighting and rich colors.

### Image with Context

When you have an existing image and want to create something inspired by it:

1. Reference the image ID in the `image_ids` parameter
2. Describe what you want: "Create a winter version of this landscape scene with snow-covered trees and a frozen lake"

### Image Editing

To modify an existing image:

1. Include the image ID in `image_ids`
2. Describe the change: "Remove the person from the background of this image"
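
For illustration, the two image-based workflows above might translate into tool inputs shaped like the ones below. The parameter names `prompt` and `image_ids` come from the Parameters section; the helper `image_request` and the ID `img_123` are hypothetical placeholders, and the exact format of real image IDs is not specified here.

```python
import json
from typing import Dict, List, Union


def image_request(image_id: str, instruction: str) -> Dict[str, Union[str, List[str]]]:
    """Build tool input that pairs a direct instruction with a source image."""
    return {"prompt": instruction, "image_ids": [image_id]}


# Context-based generation: a new image inspired by an existing one.
context_input = image_request(
    "img_123",  # hypothetical ID
    "Create a winter version of this landscape scene with snow-covered trees and a frozen lake",
)

# Editing: a simple, direct modification instruction.
edit_input = image_request(
    "img_123",  # hypothetical ID
    "Remove the person from the background of this image",
)

print(json.dumps(edit_input, indent=2))
```

Both inputs have the same shape; only the instruction differs, which is why the prompt should describe the change rather than restate the original image.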

## Error Handling

### Common Issues

| Error | Solution |
|-------|----------|
| "Image blocked by content safety filters" | Modify your prompt to avoid content that violates safety policies |
| "No image was generated" | Try a different prompt or simplify your request |
| "GEMINI_API_KEY or service account required" | Ensure you've configured either the API key or Vertex AI credentials |
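
If you post-process tool output yourself, a small lookup like the following can attach the suggested fix to a known error. This is a sketch under assumptions: it presumes errors surface as text containing the messages in the table, and `suggest_fix` is a hypothetical helper, not part of the tool.

```python
from typing import Optional

# Messages and fixes copied from the table above.
KNOWN_ERRORS = {
    "Image blocked by content safety filters": "Modify your prompt to avoid content that violates safety policies",
    "No image was generated": "Try a different prompt or simplify your request",
    "GEMINI_API_KEY or service account required": "Ensure you've configured either the API key or Vertex AI credentials",
}


def suggest_fix(error_text: str) -> Optional[str]:
    """Return the documented fix if the text contains a known error message."""
    lowered = error_text.lower()
    for message, fix in KNOWN_ERRORS.items():
        if message.lower() in lowered:
            return fix
    return None  # Unknown error: no documented suggestion.
```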

### Safety Filtering

Gemini includes built-in safety filters. If your image is blocked:

- Review your prompt for potentially problematic content
- Try rephrasing to be more specific about artistic intent
- Avoid requests for harmful, violent, or explicit content

## Technical Details

### Storage Integration

Generated images are automatically saved using your configured file strategy (local, S3, Azure, or Firebase). The framework handles this step: the tool returns the image data, and the agent callback system persists it as a message attachment.

### Image Format

- Output format defaults to PNG, configurable via the app's `imageOutputType` setting
- Images include unique identifiers for reference in subsequent requests

## Rate Limits

Rate limits depend on your API tier:

- **Gemini API**: Check [Google AI Studio](https://aistudio.google.com/) for current limits
- **Vertex AI**: Based on your Google Cloud project quotas

