Anthropic Vertex AI

Anthropic Vertex AI Object Structure

LibreChat supports running Anthropic Claude models through Google Cloud Vertex AI. This allows you to use Claude models with your existing Google Cloud infrastructure, billing, and credentials.

For quick setup using environment variables, see the Anthropic configuration guide.

Benefits

  • Unified Billing: Use your existing Google Cloud billing account
  • Enterprise Features: Access Google Cloud’s enterprise security and compliance features
  • Regional Compliance: Deploy in specific regions to meet data residency requirements
  • Existing Infrastructure: Leverage your current GCP service accounts and IAM policies

Prerequisites

Before configuring Anthropic Vertex AI, ensure you have:

  1. Google Cloud Project with the Vertex AI API enabled
  2. Service Account with the Vertex AI User role (roles/aiplatform.user)
  3. Claude models enabled in your Vertex AI Model Garden
  4. Service Account Key (JSON file) downloaded and accessible to LibreChat
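The first, second, and fourth prerequisites can be provisioned from the command line. A sketch using standard gcloud commands; the project ID and service account name are placeholders, and enabling Claude models in the Model Garden (step 3) is done in the Google Cloud console:

```shell
PROJECT_ID="my-gcp-project"   # placeholder — your project ID
SA_NAME="librechat-vertex"    # placeholder — any service account name

# 1. Enable the Vertex AI API in the project
gcloud services enable aiplatform.googleapis.com --project="$PROJECT_ID"

# 2. Create a service account and grant it the Vertex AI User role
gcloud iam service-accounts create "$SA_NAME" --project="$PROJECT_ID"
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

# 4. Download a JSON key file that LibreChat can read
gcloud iam service-accounts keys create service-account.json \
  --iam-account="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
```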

Example Configuration

Example Anthropic Vertex AI Configuration
endpoints:
  anthropic:
    streamRate: 20
    titleModel: "claude-3.5-haiku"  # Use the visible model name (key from models config)
 
    vertex:
      region: "us-east5"
      # serviceKeyFile: "/path/to/service-account.json"  # Optional, defaults to api/data/auth.json
      # projectId: "${VERTEX_PROJECT_ID}"  # Optional, auto-detected from service key
 
      # Model mapping: visible name -> Vertex AI deployment name
      models:
        claude-opus-4.5:
          deploymentName: claude-opus-4-5@20251101
        claude-sonnet-4:
          deploymentName: claude-sonnet-4-20250514
        claude-3.7-sonnet:
          deploymentName: claude-3-7-sonnet-20250219
        claude-3.5-sonnet:
          deploymentName: claude-3-5-sonnet-v2@20241022
        claude-3.5-haiku:
          deploymentName: claude-3-5-haiku@20241022

Note: The Anthropic endpoint supports all Shared Endpoint Settings, including streamRate, titleModel, titleMethod, titlePrompt, titlePromptTemplate, and titleEndpoint.
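For instance, a minimal fragment combining a few of these shared settings with a vertex section (values are illustrative, not recommendations):

```yaml
endpoints:
  anthropic:
    streamRate: 20
    titleModel: "claude-3.5-haiku"   # visible model name from your models config
    titleMethod: "completion"
    vertex:
      region: "us-east5"
```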


vertex

The vertex object contains all Vertex AI-specific configuration options.

region

Key: region
Type: String
Description: The Google Cloud region where your Vertex AI endpoint is deployed. Must be a region where Claude models are available on Vertex AI.

Default: us-east5

Available Regions:

  • global (recommended for most use cases)
  • us-east5
  • us-central1
  • europe-west1
  • europe-west4
  • asia-southeast1

Tip: The global region is recommended as it provides automatic routing to the nearest available region. Use specific regions only if you have data residency requirements.

Example:

endpoints / anthropic / vertex / region
region: "global"

projectId

Key: projectId
Type: String
Description: The Google Cloud Project ID. Supports environment variable references. Optional; if not specified, it is auto-detected from the service account key file.

Default: Auto-detected from service key file

Example:

endpoints / anthropic / vertex / projectId
projectId: "${GOOGLE_PROJECT_ID}"

serviceKeyFile

Key: serviceKeyFile
Type: String
Description: Path to the Google Cloud service account key JSON file. Can be absolute or relative to the LibreChat root directory.

Default: api/data/auth.json (or GOOGLE_SERVICE_KEY_FILE environment variable)

Example:

endpoints / anthropic / vertex / serviceKeyFile
serviceKeyFile: "/etc/secrets/gcp-service-account.json"

models

The models field defines the available Claude models and maps user-friendly names to Vertex AI deployment IDs. This works similarly to Azure OpenAI model mapping.

Format Options

You can configure models in three ways:

Option 1: Simple Array

Use the actual Vertex AI model IDs directly. These will be shown as-is in the UI:

Simple array format
models:
  - "claude-sonnet-4-20250514"
  - "claude-3-7-sonnet-20250219"
  - "claude-3-5-haiku@20241022"

Option 2: Object with Custom Names (Recommended)

Map user-friendly names to Vertex AI deployment names:

Object format with custom names
models:
  claude-opus-4.5:           # Visible in UI
    deploymentName: claude-opus-4-5@20251101  # Actual Vertex AI model ID
  claude-sonnet-4:
    deploymentName: claude-sonnet-4-20250514
  claude-3.5-haiku:
    deploymentName: claude-3-5-haiku@20241022

Option 3: Mixed Format with Default

Set a default deployment name and use boolean values for models that inherit it:

Mixed format
deploymentName: claude-sonnet-4-20250514  # Default deployment
models:
  claude-sonnet-4: true  # Uses default deploymentName
  claude-3.5-haiku:
    deploymentName: claude-3-5-haiku@20241022  # Override for this model

Model Object Properties

Key: deploymentName
Type: String
Description: The actual Vertex AI model ID used for API calls. Required for each model unless using boolean `true` with a group-level default.

Example:

Model with deploymentName
models:
  claude-sonnet-4:
    deploymentName: claude-sonnet-4-20250514

Environment Variable Alternative

For simpler setups, you can configure Vertex AI using environment variables instead of YAML:

.env
# Enable Vertex AI mode
ANTHROPIC_USE_VERTEX=true
 
# Vertex AI region (optional, defaults to us-east5)
ANTHROPIC_VERTEX_REGION=global
 
# Path to service account key (optional, defaults to api/data/auth.json)
GOOGLE_SERVICE_KEY_FILE=/path/to/service-account.json

Note: When using environment variables, model mapping is not available. All known Claude models will be included automatically.


Complete Examples

Basic Setup

Minimal configuration using defaults (Vertex AI is enabled by the presence of the vertex section):

Basic Vertex AI Setup
endpoints:
  anthropic:
    vertex:
      region: us-east5

This uses:

  • Region: us-east5
  • Service key: api/data/auth.json (or GOOGLE_SERVICE_KEY_FILE env var)
  • Project ID: Auto-detected from service key
  • Models: All known Claude models
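If you run LibreChat in Docker, the service account key must also be visible inside the container. A hypothetical docker-compose override that mounts a host key file at the default path — the host filename and the container service name (`api`) are assumptions; adjust them to your deployment:

```yaml
# docker-compose.override.yml — sketch only
services:
  api:
    volumes:
      - ./gcp-service-account.json:/app/api/data/auth.json:ro
```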

Production Setup with Model Mapping

Full configuration with custom model names and titles:

Production Vertex AI Setup
endpoints:
  anthropic:
    streamRate: 20
    titleModel: "haiku"
    titleMethod: "completion"
 
    vertex:
      region: "global"
      serviceKeyFile: "${GOOGLE_SERVICE_KEY_FILE}"
 
      models:
        opus:
          deploymentName: claude-opus-4-5@20251101
        sonnet:
          deploymentName: claude-sonnet-4-20250514
        haiku:
          deploymentName: claude-3-5-haiku@20241022

Multi-Region Setup

You can only configure one region per deployment. For multi-region needs, consider using separate LibreChat instances or custom endpoints.


Troubleshooting

Common Errors

"Could not load the default credentials"

  • Ensure the service account key file exists at the specified path
  • Check file permissions (must be readable by the LibreChat process)
  • Verify the JSON file is valid and not corrupted
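These checks can be scripted. A sketch, assuming python3 is available on the host and using the default key path from this page:

```shell
# Path falls back to the LibreChat default used on this page
KEY_FILE="${GOOGLE_SERVICE_KEY_FILE:-api/data/auth.json}"

if [ -r "$KEY_FILE" ]; then
  echo "key file is readable"
  # Confirm the file parses as JSON and carries the fields service-account auth needs
  python3 -c 'import json, sys
k = json.load(open(sys.argv[1]))
need = {"type", "project_id", "private_key", "client_email"}
print("fields ok" if need <= k.keys() else "fields missing")' "$KEY_FILE"
else
  echo "key file missing or unreadable: $KEY_FILE"
fi
```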

"Permission denied" or "403 Forbidden"

  • Verify the service account has the Vertex AI User role
  • Ensure Claude models are enabled in your Vertex AI Model Garden
  • Check that the service account belongs to the correct project

"Model not found"

  • Check that the model ID in deploymentName is correct
  • Verify the model is available in your selected region
  • Ensure the model is enabled in your Vertex AI Model Garden

Region Issues

"Invalid region" or "Region not supported"

  • Use one of the supported regions listed above
  • Try using global region which provides automatic routing
  • Check Google Cloud’s documentation for the latest list of regions where Claude is available

"Model not available in region"

  • Not all Claude models are available in all regions
  • Try switching to global region for automatic routing to an available region
  • Check the Vertex AI Model Garden to see which models are available in your region
  • Consider using a different region that has broader model availability (e.g., us-east5)

Latency issues

  • If you’re experiencing high latency, try using a region geographically closer to your users
  • The global region automatically routes to the nearest available region
  • For production workloads with strict latency requirements, test different regions and choose the one with best performance for your use case

Verifying Setup

  1. Ensure your service account key is valid:

    gcloud auth activate-service-account --key-file=/path/to/key.json
    gcloud auth list
  2. Test Vertex AI access:

    gcloud ai models list --region=us-east5
  3. Verify Claude model access:

    curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      "https://us-east5-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku@20241022:rawPredict" \
      -d '{"anthropic_version": "vertex-2023-10-16", "max_tokens": 100, "messages": [{"role": "user", "content": "Hello"}]}'

Notes

  • Vertex AI and direct Anthropic API are mutually exclusive. When a vertex configuration section is present, the ANTHROPIC_API_KEY environment variable is ignored.
  • Web search functionality is fully supported with Vertex AI.
  • Prompt caching is supported via automatic header filtering for Vertex AI compatibility.
  • Function calling and tool use work the same as with the direct Anthropic API.