Gemini Image Generation

Gemini Image Generation is a tool that uses Google’s Gemini image models for text-to-image generation and for editing or extending existing images supplied as context. It works with both the Gemini API and Google Cloud Vertex AI.

Setup Instructions

You can use either the Gemini API (recommended for most users) or Vertex AI with a service account.

Option 1: Gemini API (Recommended)

Get your API key from Google AI Studio
Set the GEMINI_API_KEY environment variable in your .env file:

GEMINI_API_KEY=your_api_key_here

Option 2: Vertex AI (For Enterprise/GCP Users)

Create a service account in Google Cloud Console with Vertex AI permissions
Download the service account JSON key file
Configure the environment variables:

# Path to your service account JSON file
GOOGLE_SERVICE_KEY_FILE=/path/to/service-account.json

# Optional: set the location (defaults to global when unset)
GOOGLE_CLOUD_LOCATION=us-central1
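
The two setup options above can be thought of as a small credential lookup. The following is a minimal sketch of that lookup, not the tool's actual implementation: the `GeminiAuth` class and `resolve_auth` function are hypothetical names, and the choice to prefer the API key when both options are configured is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class GeminiAuth:
    """Hypothetical container describing which credentials were found."""
    mode: str                       # "api_key" or "vertex"
    api_key: Optional[str] = None
    key_file: Optional[str] = None
    location: Optional[str] = None


def resolve_auth(env: Mapping[str, str] = os.environ) -> GeminiAuth:
    """Pick an auth mode from environment-style settings.

    Assumption: the API key takes precedence when both options are set.
    """
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if api_key:
        return GeminiAuth(mode="api_key", api_key=api_key)

    key_file = env.get("GOOGLE_SERVICE_KEY_FILE", "").strip()
    if key_file:
        # "global" is the documented default location.
        location = env.get("GOOGLE_CLOUD_LOCATION", "").strip() or "global"
        return GeminiAuth(mode="vertex", key_file=key_file, location=location)

    # Same wording as the error listed under Common Issues.
    raise RuntimeError("GEMINI_API_KEY or service account required")
```

Passing a plain dictionary instead of `os.environ` makes the lookup easy to check without touching the real environment.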

Configuration Options

Model Selection

You can choose which Gemini image model to use via the GEMINI_IMAGE_MODEL environment variable. Set only one of the following values:

# Default model
GEMINI_IMAGE_MODEL=gemini-2.5-flash-image

# Or use the newer Gemini 3 Pro Image model
GEMINI_IMAGE_MODEL=gemini-3-pro-image-preview

Available Models

Model	Description
`gemini-2.5-flash-image`	Default model, fast and efficient
`gemini-3-pro-image-preview`	Higher quality, more detailed generations
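
As an illustration of how the default applies, here is a small sketch that reads the variable and falls back to the default model when it is unset. The helper names `resolve_model` and `is_documented_model` are hypothetical, and whether the tool itself rejects other model names is not stated here, so the sketch only reports whether a name appears in the table above.

```python
import os
from typing import Mapping

DEFAULT_MODEL = "gemini-2.5-flash-image"
DOCUMENTED_MODELS = (DEFAULT_MODEL, "gemini-3-pro-image-preview")


def resolve_model(env: Mapping[str, str] = os.environ) -> str:
    """Return the configured model, or the default when the variable is unset or blank."""
    model = env.get("GEMINI_IMAGE_MODEL", "").strip()
    return model or DEFAULT_MODEL


def is_documented_model(model: str) -> bool:
    """True when the model name is one of the two listed in the table above."""
    return model in DOCUMENTED_MODELS


if __name__ == "__main__":
    chosen = resolve_model()
    print(chosen, "(documented)" if is_documented_model(chosen) else "(not in the documented list)")
```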

Features

Core Capabilities

Text-to-Image Generation: Create images from detailed text descriptions
Image Context Support: Use existing images as context/inspiration for new generations
Image Editing: Generate new images based on modifications to existing ones
Safety Filtering: Built-in content safety with user-friendly error messages
Multi-Storage Support: Compatible with local, S3, Azure, and Firebase storage strategies

Parameters

The Gemini Image Gen tool accepts the following parameters:

prompt (required) – A detailed text description of the desired image, up to 32,000 characters
image_ids (optional) – Array of image IDs to use as visual context for generation
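
The parameter rules above can be expressed as a short validation step. This is a sketch under stated assumptions rather than the tool's own code: `build_request` is a hypothetical helper, and treating image IDs as non-empty strings is an assumption.

```python
from typing import Any, Dict, List, Optional

MAX_PROMPT_CHARS = 32_000  # documented upper bound for the prompt


def build_request(prompt: str, image_ids: Optional[List[str]] = None) -> Dict[str, Any]:
    """Validate the two documented parameters and return them as a dict."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")

    request: Dict[str, Any] = {"prompt": prompt}
    if image_ids:
        # Assumption: IDs are non-empty strings.
        if not all(isinstance(i, str) and i for i in image_ids):
            raise ValueError("image_ids must be a list of non-empty strings")
        request["image_ids"] = list(image_ids)
    return request
```

Omitting `image_ids` yields a plain text-to-image request; supplying it adds visual context.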

Best Practices

Prompt Writing

Be specific and detailed in your descriptions
Start with the image type: photo, oil painting, watercolor, illustration, cartoon, drawing, vector, render, etc.
Include key elements:
- Subject matter and composition
- Style and artistic approach
- Lighting and atmosphere
- Color palette preferences
- Technical specifications
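
One way to apply this checklist consistently is to assemble prompts from named parts. The sketch below is purely illustrative: `compose_prompt` and its argument names are hypothetical, and the output wording is one possible phrasing, not a format the model requires.

```python
from typing import Optional


def compose_prompt(
    image_type: str,
    subject: str,
    *,
    style: Optional[str] = None,
    lighting: Optional[str] = None,
    palette: Optional[str] = None,
    technical: Optional[str] = None,
) -> str:
    """Build a prompt that leads with the image type, then adds optional details."""
    sentences = [f"{image_type.strip()} of {subject.strip()}"]
    details = [
        ("Style", style),
        ("Lighting", lighting),
        ("Color palette", palette),
        ("Technical details", technical),
    ]
    for label, value in details:
        if value and value.strip():
            sentences.append(f"{label}: {value.strip()}")
    return ". ".join(sentences) + "."


if __name__ == "__main__":
    print(compose_prompt(
        "Watercolor",
        "a traditional red bridge over a koi pond",
        lighting="warm, diffused golden-hour light",
        palette="soft pinks and deep greens",
    ))
```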

Image Editing Tips

When editing existing images:

Include the original image ID in the image_ids array
Use direct editing instructions:
- “Remove the background from this image”
- “Add sunglasses to the person in this image”
- “Change the color of the car to red”
Don’t reconstruct the original prompt – use simple, direct modification instructions

Usage Examples

Basic Image Generation

A serene Japanese garden at golden hour, featuring a traditional red bridge over a koi pond. Cherry blossom trees frame the scene with soft pink petals falling. Photorealistic style with warm, diffused lighting and rich colors.

Image with Context

When you have an existing image and want to create something inspired by it:

Reference the image ID in the image_ids parameter
Describe what you want: “Create a winter version of this landscape scene with snow-covered trees and a frozen lake”

Image Editing

To modify an existing image:

Include the image ID in image_ids
Describe the change: “Remove the person from the background of this image”
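
To make the difference between the three usage patterns concrete, the sketch below prints the argument objects each one might pass, using only the two documented parameters. The image ID shown is a hypothetical placeholder; real IDs come from images already present in the conversation.

```python
import json

# Hypothetical placeholder; substitute the ID of a real uploaded or generated image.
EXAMPLE_IMAGE_ID = "example-image-id"

# Basic generation: prompt only.
generate_args = {
    "prompt": (
        "A serene Japanese garden at golden hour, featuring a traditional "
        "red bridge over a koi pond. Photorealistic style with warm, "
        "diffused lighting and rich colors."
    ),
}

# Image with context: describe the new image and reference the source image.
context_args = {
    "prompt": "Create a winter version of this landscape scene with "
              "snow-covered trees and a frozen lake",
    "image_ids": [EXAMPLE_IMAGE_ID],
}

# Image editing: a short, direct instruction plus the image to modify.
edit_args = {
    "prompt": "Remove the person from the background of this image",
    "image_ids": [EXAMPLE_IMAGE_ID],
}

for name, args in (("generate", generate_args), ("context", context_args), ("edit", edit_args)):
    print(name, json.dumps(args))
```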

Error Handling

Common Issues

Error	Solution
“Image blocked by content safety filters”	Modify your prompt to avoid content that violates safety policies
“No image was generated”	Try a different prompt or simplify your request
“GEMINI_API_KEY or service account required”	Ensure you’ve configured either the API key or Vertex AI credentials
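
If you surface these errors in your own tooling, the table above can be turned into a simple lookup. This is a sketch only: `suggest_fix` is a hypothetical helper, and matching by case-insensitive substring is an assumption about how the messages appear in practice.

```python
from typing import Optional

# Messages and advice copied from the Common Issues table.
ERROR_GUIDANCE = {
    "Image blocked by content safety filters":
        "Modify your prompt to avoid content that violates safety policies",
    "No image was generated":
        "Try a different prompt or simplify your request",
    "GEMINI_API_KEY or service account required":
        "Ensure you've configured either the API key or Vertex AI credentials",
}


def suggest_fix(message: str) -> Optional[str]:
    """Return the documented advice for a known error message, else None."""
    lowered = message.lower()
    for known, advice in ERROR_GUIDANCE.items():
        if known.lower() in lowered:
            return advice
    return None
```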

Safety Filtering

Gemini includes built-in safety filters. If your image is blocked:

Review your prompt for potentially problematic content
Try rephrasing to be more specific about artistic intent
Avoid requests for harmful, violent, or explicit content

Technical Details

Storage Integration

Generated images are automatically saved using your configured file strategy:

Local: Saved to client/public/images/{userId}/
S3/Azure/Firebase: Uploaded to your configured cloud storage
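
For the local strategy, the destination follows directly from the documented pattern. The sketch below builds such a path; `local_image_path` is a hypothetical helper, the file name is a made-up example (the real naming scheme is not described here), and the checks against path separators are an added precaution rather than documented behavior.

```python
from pathlib import PurePosixPath

LOCAL_IMAGE_ROOT = PurePosixPath("client/public/images")


def local_image_path(user_id: str, filename: str) -> PurePosixPath:
    """Return client/public/images/{userId}/{filename} for the local strategy."""
    for label, value in (("user_id", user_id), ("filename", filename)):
        # Reject empty values and anything that could escape the user's folder.
        if not value or value in (".", "..") or "/" in value or "\\" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return LOCAL_IMAGE_ROOT / user_id / filename


if __name__ == "__main__":
    # "img_example.png" is a hypothetical file name.
    print(local_image_path("user123", "img_example.png"))
```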

Image Format

Output format: PNG
Each generated image is assigned a unique identifier, which you can pass in image_ids in subsequent requests

Rate Limits

Rate limits depend on your API tier:

Gemini API: Check Google AI Studio for current limits
Vertex AI: Based on your Google Cloud project quotas
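
When a request is rejected because of a rate limit, a common client-side response is to retry with exponential backoff. The sketch below shows that general pattern and is not something this tool is documented to do: `RateLimitError` is a hypothetical stand-in for whatever error your client raises, and the attempt count and delays are arbitrary example values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a client's rate-limit error."""


def with_backoff(
    call: Callable[[], T],
    *,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on RateLimitError with doubling delays plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            delay = base_delay * (2 ** attempt)
            # Up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so the delay can be replaced in tests; in normal use the default `time.sleep` applies.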
