Bedrock Inference Profiles

This guide explains how to configure and use AWS Bedrock custom (application) inference profiles with LibreChat, so that model requests are routed through your own profiles for cost allocation, cross-region load balancing, and compliance controls.

Overview

AWS Bedrock inference profiles allow you to create custom routing configurations for foundation models. When you create a custom (application) inference profile, AWS generates a unique ARN that doesn't contain model name information:

arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456
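
The ARN carries only a partition, region, account ID, and an opaque profile ID, so nothing in it identifies the underlying model. A minimal Python sketch that splits such an ARN makes this visible. The function name `parse_profile_arn` and the regular expression are illustrative assumptions based on the example above, not part of LibreChat or the AWS SDK.

```python
import re

# Illustrative pattern for application inference profile ARNs. This is an
# assumption derived from the example ARN, not an official AWS grammar.
_PROFILE_ARN = re.compile(
    r"^arn:(?P<partition>aws[\w-]*):bedrock:(?P<region>[a-z0-9-]+):"
    r"(?P<account>\d{12}):application-inference-profile/(?P<profile_id>[A-Za-z0-9]+)$"
)


def parse_profile_arn(arn: str) -> dict:
    """Split an application inference profile ARN into its named parts."""
    match = _PROFILE_ARN.match(arn)
    if match is None:
        raise ValueError(f"Not an application inference profile ARN: {arn!r}")
    return match.groupdict()


parts = parse_profile_arn(
    "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456"
)
print(parts)  # no key here names the model, which is why a mapping is needed
```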

LibreChat's inference profile mapping feature allows you to:

  1. Map friendly model IDs to custom inference profile ARNs
  2. Route requests through your custom profiles while maintaining model capability detection
  3. Use environment variables for secure ARN management

Why Use Custom Inference Profiles?

| Benefit | Description |
| --- | --- |
| Cross-Region Load Balancing | Automatically distribute requests across multiple AWS regions |
| Cost Allocation | Tag and track costs per application or team |
| Throughput Management | Configure dedicated throughput for your applications |
| Compliance | Route requests through specific regions for data residency |
| Monitoring | Track usage per inference profile in CloudWatch |

Prerequisites

Before you begin, ensure you have:

  1. AWS Account with Bedrock access enabled
  2. AWS CLI installed and configured
  3. IAM Permissions:
    • bedrock:CreateInferenceProfile
    • bedrock:ListInferenceProfiles
    • bedrock:GetInferenceProfile
    • bedrock:InvokeModel / bedrock:InvokeModelWithResponseStream
  4. LibreChat with Bedrock endpoint configured (see AWS Bedrock Setup)
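
The permissions listed above could be collected into a single IAM policy document. The sketch below builds one in Python. The function name `build_invoke_policy` and the statement IDs are hypothetical. It also grants invoke access on the underlying foundation model ARNs in each region, which is an assumption based on AWS's general guidance for inference profiles, so verify it against your own setup before relying on it.

```python
import json


def build_invoke_policy(profile_arn: str, model_ids: list[str], regions: list[str]) -> dict:
    """Build an IAM policy for managing and invoking one inference profile."""
    # Foundation model ARNs have an empty account field (note the "::").
    model_arns = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for region in regions
        for model_id in model_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ManageInferenceProfiles",
                "Effect": "Allow",
                "Action": [
                    "bedrock:CreateInferenceProfile",
                    "bedrock:ListInferenceProfiles",
                    "bedrock:GetInferenceProfile",
                ],
                # Left as "*" for brevity; tighten for production use.
                "Resource": "*",
            },
            {
                "Sid": "InvokeThroughProfile",
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": [profile_arn, *model_arns],
            },
        ],
    }


policy = build_invoke_policy(
    "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123",
    model_ids=["anthropic.claude-3-7-sonnet-20250219-v1:0"],
    regions=["us-east-1", "us-west-2"],
)
print(json.dumps(policy, indent=2))
```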

Creating Custom Inference Profiles

Important: Custom inference profiles can only be created via API (AWS CLI, SDK, etc.) and cannot be created from the AWS Console.

Method 1: AWS CLI

Step 1: List Available System Inference Profiles

# List all inference profiles
aws bedrock list-inference-profiles --region us-east-1
 
# Filter for Claude models
aws bedrock list-inference-profiles --region us-east-1 \
  --query "inferenceProfileSummaries[?contains(inferenceProfileId, 'claude')]"

Step 2: Create a Custom Inference Profile

# Get the system inference profile ARN to copy from
export SOURCE_PROFILE_ARN=$(aws bedrock list-inference-profiles --region us-east-1 \
  --query "inferenceProfileSummaries[?inferenceProfileId=='us.anthropic.claude-3-7-sonnet-20250219-v1:0'].inferenceProfileArn" \
  --output text)
 
# Create your custom inference profile
aws bedrock create-inference-profile \
  --inference-profile-name "MyApp-Claude-3-7-Sonnet" \
  --description "Custom inference profile for my application" \
  --model-source copyFrom="$SOURCE_PROFILE_ARN" \
  --region us-east-1

Step 3: Verify Creation

# List your custom profiles
aws bedrock list-inference-profiles --type-equals APPLICATION --region us-east-1
 
# Get details of a specific profile
aws bedrock get-inference-profile \
  --inference-profile-identifier "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123" \
  --region us-east-1
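
The same verification can be done from Python. The sketch below collects every APPLICATION profile and follows `nextToken` pagination by hand. The helper name `list_application_profiles` is hypothetical, and it takes the Bedrock client as a parameter so it can be exercised without AWS credentials.

```python
def list_application_profiles(client) -> list[dict]:
    """Collect every APPLICATION inference profile summary across all pages."""
    summaries = []
    kwargs = {"typeEquals": "APPLICATION"}
    while True:
        page = client.list_inference_profiles(**kwargs)
        summaries.extend(page.get("inferenceProfileSummaries", []))
        token = page.get("nextToken")
        if not token:
            return summaries
        # Request the next page on the following iteration.
        kwargs["nextToken"] = token


# Usage with real credentials (requires boto3):
#   import boto3
#   client = boto3.client("bedrock", region_name="us-east-1")
#   for summary in list_application_profiles(client):
#       print(summary["inferenceProfileName"], summary["inferenceProfileArn"])
```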

Method 2: Python Script

import boto3
 
AWS_REGION = 'us-east-1'
 
def create_inference_profile(profile_name: str, source_model_id: str):
    """
    Create a custom inference profile for LibreChat.
 
    Args:
        profile_name: Name for your custom profile
        source_model_id: The system inference profile ID to copy from
                        (e.g., 'us.anthropic.claude-3-7-sonnet-20250219-v1:0')
    """
    bedrock = boto3.client('bedrock', region_name=AWS_REGION)
 
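    # Page through all profiles so the source profile is not missed
    # when the listing spans more than one page.
    source_arn = None
    kwargs = {}
    while source_arn is None:
        page = bedrock.list_inference_profiles(**kwargs)
        for profile in page['inferenceProfileSummaries']:
            if profile['inferenceProfileId'] == source_model_id:
                source_arn = profile['inferenceProfileArn']
                break
        if 'nextToken' not in page:
            break
        kwargs['nextToken'] = page['nextToken']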
 
    if not source_arn:
        raise ValueError(f"Source profile {source_model_id} not found")
 
    response = bedrock.create_inference_profile(
        inferenceProfileName=profile_name,
        description=f'Custom inference profile for {profile_name}',
        modelSource={'copyFrom': source_arn},
        tags=[
            {'key': 'Application', 'value': 'LibreChat'},
            {'key': 'Environment', 'value': 'Production'}
        ]
    )
 
    print(f"Created profile: {response['inferenceProfileArn']}")
    return response['inferenceProfileArn']
 
if __name__ == "__main__":
    create_inference_profile(
        "LibreChat-Claude-3-7-Sonnet",
        "us.anthropic.claude-3-7-sonnet-20250219-v1:0"
    )
    create_inference_profile(
        "LibreChat-Claude-Sonnet-4-5",
        "us.anthropic.claude-sonnet-4-5-20250929-v1:0"
    )

Configuring LibreChat

librechat.yaml Configuration

Add the bedrock endpoint configuration to your librechat.yaml. For the full field reference, see AWS Bedrock Object Structure.

endpoints:
  bedrock:
    # List the models you want available in the UI
    models:
      - 'us.anthropic.claude-3-7-sonnet-20250219-v1:0'
      - 'us.anthropic.claude-sonnet-4-5-20250929-v1:0'
      - 'global.anthropic.claude-opus-4-5-20251101-v1:0'
    # Map model IDs to their custom inference profile ARNs
    inferenceProfiles:
      # Using environment variable (recommended for security)
      'us.anthropic.claude-3-7-sonnet-20250219-v1:0': '${BEDROCK_CLAUDE_37_PROFILE}'
      # Using direct ARN
      'us.anthropic.claude-sonnet-4-5-20250929-v1:0': 'arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123'
      # Another env variable example
      'global.anthropic.claude-opus-4-5-20251101-v1:0': '${BEDROCK_OPUS_45_PROFILE}'
    # Optional: Configure available regions for cross-region inference
    availableRegions:
      - 'us-east-1'
      - 'us-west-2'
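
To make the mapping semantics concrete, here is a minimal Python sketch of how a model ID might be resolved to an ARN, including the `${VAR}` placeholder form. It illustrates the lookup described in this guide and is not LibreChat's actual implementation. The function name `resolve_profile` and its handling of unset variables are assumptions.

```python
import os
import re

# Matches a value that consists solely of one ${VAR_NAME} placeholder.
_PLACEHOLDER = re.compile(r"^\$\{(\w+)\}$")


def resolve_profile(model_id, inference_profiles, env=None):
    """Return the ARN mapped to model_id, or model_id itself when unmapped."""
    env = os.environ if env is None else env
    value = inference_profiles.get(model_id)
    if value is None:
        return model_id  # no mapping: the plain model ID is used
    match = _PLACEHOLDER.match(value)
    if match:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"Environment variable {name} is not set")
        return env[name]
    return value  # a direct ARN is returned unchanged


profiles = {
    "us.anthropic.claude-3-7-sonnet-20250219-v1:0": "${BEDROCK_CLAUDE_37_PROFILE}",
}
env = {
    "BEDROCK_CLAUDE_37_PROFILE": (
        "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123"
    ),
}
print(resolve_profile("us.anthropic.claude-3-7-sonnet-20250219-v1:0", profiles, env))
print(resolve_profile("us.amazon.nova-pro-v1:0", profiles, env))
```

The second call shows a model without a mapping, which falls through to its plain model ID.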

Environment Variables

Add your Bedrock region, AWS authentication settings, and inference profile ARNs to your .env file:

#===================================#
# AWS Bedrock Configuration         #
#===================================#
 
BEDROCK_AWS_DEFAULT_REGION=us-east-1
 
# Option 1: Use an AWS profile
BEDROCK_AWS_PROFILE=your-profile-name
 
# Option 2: Omit BEDROCK_AWS_PROFILE and Bedrock-specific static credentials
# to use the AWS SDK default credential provider chain.
 
# Option 3: Static Bedrock credentials, if profiles or IAM roles are not suitable
# BEDROCK_AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
# BEDROCK_AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
# BEDROCK_AWS_SESSION_TOKEN=your-session-token
 
# Option 4: Bedrock API key (bearer auth)
# BEDROCK_AWS_BEARER_TOKEN=your-bedrock-api-key
 
# Inference Profile ARNs
BEDROCK_CLAUDE_37_PROFILE=arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123
BEDROCK_OPUS_45_PROFILE=arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/def456

Setting Up Logging

To verify that your inference profiles are being used correctly, enable AWS Bedrock model invocation logging.

1. Create CloudWatch Log Group

aws logs create-log-group \
  --log-group-name /aws/bedrock/model-invocations \
  --region us-east-1

2. Create IAM Role for Bedrock Logging

Create the trust policy file (bedrock-logging-trust.json):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "bedrock.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "YOUR_ACCOUNT_ID"
        },
        "ArnLike": {
          "aws:SourceArn": "arn:aws:bedrock:us-east-1:YOUR_ACCOUNT_ID:*"
        }
      }
    }
  ]
}
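
If you prefer not to edit the YOUR_ACCOUNT_ID placeholders by hand, the same trust policy can be generated. The following is a small sketch in Python. The function name `build_trust_policy` is hypothetical, and the output mirrors the JSON document above.

```python
import json


def build_trust_policy(account_id: str, region: str = "us-east-1") -> dict:
    """Return the Bedrock logging trust policy for one account and region."""
    if not (account_id.isascii() and account_id.isdigit() and len(account_id) == 12):
        raise ValueError("account_id must be a 12-digit AWS account ID")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {
                        "aws:SourceArn": f"arn:aws:bedrock:{region}:{account_id}:*"
                    },
                },
            }
        ],
    }


# Print the document; redirect the output to bedrock-logging-trust.json.
print(json.dumps(build_trust_policy("123456789012"), indent=2))
```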

Create the role:

aws iam create-role \
  --role-name BedrockLoggingRole \
  --assume-role-policy-document file://bedrock-logging-trust.json

Attach CloudWatch Logs permissions:

aws iam put-role-policy \
  --role-name BedrockLoggingRole \
  --policy-name BedrockLoggingPolicy \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": [
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ],
        "Resource": "arn:aws:logs:us-east-1:YOUR_ACCOUNT_ID:log-group:/aws/bedrock/model-invocations:*"
      }
    ]
  }'

Create S3 bucket for large data (required):

aws s3 mb s3://bedrock-logs-YOUR_ACCOUNT_ID --region us-east-1
 
aws iam put-role-policy \
  --role-name BedrockLoggingRole \
  --policy-name BedrockS3Policy \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": ["s3:PutObject"],
        "Resource": "arn:aws:s3:::bedrock-logs-YOUR_ACCOUNT_ID/*"
      }
    ]
  }'

3. Enable Model Invocation Logging

aws bedrock put-model-invocation-logging-configuration \
  --logging-config '{
    "cloudWatchConfig": {
      "logGroupName": "/aws/bedrock/model-invocations",
      "roleArn": "arn:aws:iam::YOUR_ACCOUNT_ID:role/BedrockLoggingRole",
      "largeDataDeliveryS3Config": {
        "bucketName": "bedrock-logs-YOUR_ACCOUNT_ID",
        "keyPrefix": "large-data"
      }
    },
    "textDataDeliveryEnabled": true,
    "imageDataDeliveryEnabled": true,
    "embeddingDataDeliveryEnabled": true
  }' \
  --region us-east-1

Verify logging is enabled:

aws bedrock get-model-invocation-logging-configuration --region us-east-1

Verifying Your Configuration

View Logs via CLI

After making a request through LibreChat, check the logs:

# Tail logs in real-time
aws logs tail /aws/bedrock/model-invocations --follow --region us-east-1
 
# View recent logs
aws logs tail /aws/bedrock/model-invocations --since 5m --region us-east-1

What to Look For

In the log output, look for the modelId field:

{
  "timestamp": "2026-01-16T16:56:15Z",
  "accountId": "123456789012",
  "region": "us-east-1",
  "requestId": "a8b9d8c9-87b3-41ea-8a02-e8bfdba7782f",
  "operation": "ConverseStream",
  "modelId": "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123",
  "inferenceRegion": "us-west-2"
}

Success indicators:

  • modelId shows your custom inference profile ARN (contains application-inference-profile)
  • inferenceRegion may vary (shows cross-region routing is working)

If mapping isn't working:

  • modelId will show the raw model ID instead of the ARN
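
The check described above can be scripted. This sketch assumes each log entry is a JSON object with a `modelId` field, as in the sample. The function name `uses_application_profile` is hypothetical.

```python
import json


def uses_application_profile(log_line: str) -> bool:
    """Return True when the entry's modelId is an application inference profile ARN."""
    entry = json.loads(log_line)
    model_id = entry.get("modelId", "")
    return model_id.startswith("arn:") and ":application-inference-profile/" in model_id


sample = json.dumps({
    "operation": "ConverseStream",
    "modelId": "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123",
    "inferenceRegion": "us-west-2",
})
print(uses_application_profile(sample))
```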

View Logs via AWS Console

  1. Open CloudWatch in the AWS Console
  2. Navigate to Logs > Log groups
  3. Select /aws/bedrock/model-invocations
  4. Click on the latest log stream
  5. Search for your inference profile ID

Monitoring Usage

CloudWatch Metrics

View Bedrock metrics in CloudWatch:

aws cloudwatch list-metrics --namespace AWS/Bedrock --region us-east-1

AWS Console

  1. Bedrock Console > Inference profiles > Application tab
  2. Click on your custom profile
  3. View invocation metrics and usage statistics

Troubleshooting

Common Issues

| Issue | Cause | Solution |
| --- | --- | --- |
| Model not recognized | Missing model in models array | Add the model ID to models in librechat.yaml |
| ARN not being used | Model ID doesn't match | Ensure the model ID in inferenceProfiles exactly matches the entry in models |
| Env variable not resolved | Typo or not set | Check the .env file and ensure the variable name matches ${VAR_NAME} |
| Access Denied | Missing IAM permissions | Add bedrock:InvokeModel* permissions for the inference profile ARN and the underlying foundation model ARNs |
| Model access denied | Model agreement missing or propagating | Accept the Bedrock model agreement and wait for availability to propagate |
| Profile not found | Wrong region | Create the profile in the same region LibreChat uses (BEDROCK_AWS_DEFAULT_REGION) |

Model Access Agreement Propagation

Creating an application inference profile does not automatically enable the underlying foundation model in your AWS account. If model access was just enabled, AWS may also need a short propagation window before requests through the inference profile succeed.

This can appear as an AccessDeniedException even when the inference profile exists and your IAM role has bedrock:InvokeModel permissions. The error may mention aws-marketplace:ViewSubscriptions, aws-marketplace:Subscribe, or ask you to try again after a few minutes.

Check the underlying model availability before debugging the LibreChat mapping:

aws bedrock get-foundation-model-availability \
  --region us-east-1 \
  --model-id anthropic.claude-sonnet-4-5-20250929-v1:0

Look for:

  • agreementAvailability.status set to AVAILABLE
  • authorizationStatus set to AUTHORIZED
  • entitlementAvailability set to AVAILABLE
  • regionAvailability set to AVAILABLE

If the agreement is missing, accept the model agreement in the Bedrock console or with an AWS principal that can manage Bedrock model agreements and Marketplace subscriptions. After it changes to AVAILABLE, wait a couple of minutes and retry invoking the application inference profile.
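
The four checks listed above can be expressed as a small Python helper that takes the command's JSON output as a dictionary. This is a sketch that assumes the response has the shape implied by the field names in this section. The function name `availability_problems` is hypothetical.

```python
def availability_problems(response: dict) -> list[str]:
    """Return one message per availability field that is not in its ready state."""
    observed = {
        "agreementAvailability.status": response.get("agreementAvailability", {}).get("status"),
        "authorizationStatus": response.get("authorizationStatus"),
        "entitlementAvailability": response.get("entitlementAvailability"),
        "regionAvailability": response.get("regionAvailability"),
    }
    # Every field should read AVAILABLE except authorizationStatus.
    expected = {"authorizationStatus": "AUTHORIZED"}
    return [
        f"{field} is {value!r}, expected {expected.get(field, 'AVAILABLE')!r}"
        for field, value in observed.items()
        if value != expected.get(field, "AVAILABLE")
    ]


ready = {
    "agreementAvailability": {"status": "AVAILABLE"},
    "authorizationStatus": "AUTHORIZED",
    "entitlementAvailability": "AVAILABLE",
    "regionAvailability": "AVAILABLE",
}
print(availability_problems(ready))
```

An empty list means all four fields are in their expected state.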

Debug Checklist

  1. Model ID is in the models array
  2. Model ID in inferenceProfiles exactly matches (case-sensitive)
  3. Environment variable is set (if using ${VAR} syntax)
  4. AWS credentials have permission to invoke the inference profile
  5. The underlying foundation model agreement is AVAILABLE in Bedrock
  6. LibreChat has been restarted after config changes
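
The first three checklist items lend themselves to an automated check. The sketch below operates on the already-parsed `models` list and `inferenceProfiles` mapping from librechat.yaml, plus a dictionary of environment variables. The function name `check_mapping` and the message wording are hypothetical.

```python
def check_mapping(models, inference_profiles, env):
    """Return a list of problems found in the inferenceProfiles mapping."""
    problems = []
    lowered = {m.lower() for m in models}
    for model_id, value in inference_profiles.items():
        if model_id not in models:
            if model_id.lower() in lowered:
                problems.append(f"{model_id}: differs from the models entry only by case")
            else:
                problems.append(f"{model_id}: not listed under models")
        # Values of the form ${NAME} must resolve to a non-empty variable.
        if value.startswith("${") and value.endswith("}"):
            name = value[2:-1]
            if not env.get(name):
                problems.append(f"{model_id}: environment variable {name} is unset or empty")
    return problems


models = ["us.anthropic.claude-3-7-sonnet-20250219-v1:0"]
profiles = {
    "US.anthropic.claude-3-7-sonnet-20250219-v1:0": "${BEDROCK_CLAUDE_37_PROFILE}",
}
for problem in check_mapping(models, profiles, env={}):
    print(problem)
```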

Verify Config Loading

Check that your config is being read correctly by examining the server logs when LibreChat starts.

Complete Example

librechat.yaml

version: 1.3.5
 
endpoints:
  bedrock:
    models:
      - 'us.anthropic.claude-3-7-sonnet-20250219-v1:0'
      - 'us.anthropic.claude-sonnet-4-5-20250929-v1:0'
      - 'global.anthropic.claude-opus-4-5-20251101-v1:0'
      - 'us.amazon.nova-pro-v1:0'
    inferenceProfiles:
      'us.anthropic.claude-3-7-sonnet-20250219-v1:0': '${BEDROCK_CLAUDE_37_PROFILE}'
      'us.anthropic.claude-sonnet-4-5-20250929-v1:0': '${BEDROCK_SONNET_45_PROFILE}'
      'global.anthropic.claude-opus-4-5-20251101-v1:0': '${BEDROCK_OPUS_45_PROFILE}'
    availableRegions:
      - 'us-east-1'
      - 'us-west-2'

.env

# AWS Bedrock
BEDROCK_AWS_DEFAULT_REGION=us-east-1
BEDROCK_AWS_PROFILE=your-profile-name
# Or use a Bedrock API key instead:
# BEDROCK_AWS_BEARER_TOKEN=your-bedrock-api-key
 
# Inference Profiles
BEDROCK_CLAUDE_37_PROFILE=arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123
BEDROCK_SONNET_45_PROFILE=arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/def456
BEDROCK_OPUS_45_PROFILE=arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/ghi789

Quick Setup Script

#!/bin/bash
 
REGION="us-east-1"
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
 
# Create inference profiles
for MODEL in "us.anthropic.claude-3-7-sonnet-20250219-v1:0" "us.anthropic.claude-sonnet-4-5-20250929-v1:0"; do
  PROFILE_NAME="LibreChat-${MODEL//[.:]/-}"
  SOURCE_ARN=$(aws bedrock list-inference-profiles --region $REGION \
    --query "inferenceProfileSummaries[?inferenceProfileId=='$MODEL'].inferenceProfileArn" \
    --output text)
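  if [ -n "$SOURCE_ARN" ]; then
    echo "Creating profile for $MODEL..."
    aws bedrock create-inference-profile \
      --inference-profile-name "$PROFILE_NAME" \
      --model-source copyFrom="$SOURCE_ARN" \
      --region "$REGION"
  else
    # Report models that have no matching system inference profile
    echo "Warning: no system inference profile found for $MODEL, skipping" >&2
  fi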
done
 
# List created profiles
echo ""
echo "Your custom inference profiles:"
aws bedrock list-inference-profiles --type-equals APPLICATION --region $REGION \
  --query "inferenceProfileSummaries[].{Name:inferenceProfileName,ARN:inferenceProfileArn}" \
  --output table
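
After the script runs, the new ARNs still need to be copied into .env. One way to do that is to turn the profile summaries into .env lines, sketched below in Python. The input is assumed to be the `inferenceProfileSummaries` list from the JSON output of `aws bedrock list-inference-profiles`. The helper names and the `BEDROCK_..._PROFILE` naming scheme are hypothetical, so rename the variables to match your librechat.yaml.

```python
import re


def env_var_name(profile_name: str) -> str:
    """Derive a hypothetical variable name such as BEDROCK_MYAPP_PROFILE."""
    slug = re.sub(r"[^A-Za-z0-9]+", "_", profile_name).strip("_").upper()
    return f"BEDROCK_{slug}_PROFILE"


def to_env_lines(summaries: list[dict]) -> list[str]:
    """Format each profile summary as a NAME=ARN line for the .env file."""
    return [
        f"{env_var_name(s['inferenceProfileName'])}={s['inferenceProfileArn']}"
        for s in summaries
    ]


summaries = [
    {
        "inferenceProfileName": "MyApp-Claude-3-7-Sonnet",
        "inferenceProfileArn": (
            "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123"
        ),
    }
]
print("\n".join(to_env_lines(summaries)))
```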
