Bedrock Inference Profiles
This guide explains how to configure and use AWS Bedrock custom inference profiles with LibreChat, allowing you to route model requests through custom application inference profiles for better control, cost allocation, and cross-region load balancing.
Overview
AWS Bedrock inference profiles allow you to create custom routing configurations for foundation models. When you create a custom (application) inference profile, AWS generates a unique ARN that doesn't contain model name information:
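For example, an application inference profile ARN has the general shape below (the account ID and profile ID are placeholders); note that nothing in it identifies the underlying model:

```text
arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456
```

Compare this to a foundation model ID such as `anthropic.claude-3-5-sonnet-20240620-v1:0`, where the model name is visible.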
LibreChat's inference profile mapping feature allows you to:
- Map friendly model IDs to custom inference profile ARNs
- Route requests through your custom profiles while maintaining model capability detection
- Use environment variables for secure ARN management
Why Use Custom Inference Profiles?
| Benefit | Description |
|---|---|
| Cross-Region Load Balancing | Automatically distribute requests across multiple AWS regions |
| Cost Allocation | Tag and track costs per application or team |
| Throughput Management | Configure dedicated throughput for your applications |
| Compliance | Route requests through specific regions for data residency |
| Monitoring | Track usage per inference profile in CloudWatch |
Prerequisites
Before you begin, ensure you have:
- AWS Account with Bedrock access enabled
- AWS CLI installed and configured
- IAM Permissions:
  - `bedrock:CreateInferenceProfile`
  - `bedrock:ListInferenceProfiles`
  - `bedrock:GetInferenceProfile`
  - `bedrock:InvokeModel` / `bedrock:InvokeModelWithResponseStream`
- LibreChat with Bedrock endpoint configured (see AWS Bedrock Setup)
Creating Custom Inference Profiles
Important: Custom inference profiles can only be created via API (AWS CLI, SDK, etc.) and cannot be created from the AWS Console.
Method 1: AWS CLI (Recommended)
Step 1: List Available System Inference Profiles
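List the system-defined (cross-region) profiles available in your region; adjust `--region` as needed:

```bash
aws bedrock list-inference-profiles \
  --type-equals SYSTEM_DEFINED \
  --region us-east-1
```

Note the `inferenceProfileArn` of the profile you want to copy from.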
Step 2: Create a Custom Inference Profile
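A sketch that copies a system-defined profile into a new application profile; the profile name, tags, and source ARN are examples you should replace:

```bash
aws bedrock create-inference-profile \
  --inference-profile-name "my-app-claude-sonnet" \
  --description "Claude Sonnet profile for my application" \
  --model-source copyFrom="arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.anthropic.claude-3-5-sonnet-20240620-v1:0" \
  --tags key=project,value=librechat \
  --region us-east-1
```

The command returns the new profile's `inferenceProfileArn`; save it for your `.env` file.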
Step 3: Verify Creation
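List your application profiles to confirm the new one exists:

```bash
aws bedrock list-inference-profiles \
  --type-equals APPLICATION \
  --region us-east-1
```

You can also fetch a single profile with `aws bedrock get-inference-profile --inference-profile-identifier <your-profile-arn>`.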
Method 2: Python Script
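If you prefer boto3, here is a minimal sketch of the same flow; the profile name, tags, and source ARN are placeholders:

```python
import boto3

# The "bedrock" client exposes the control-plane APIs (not "bedrock-runtime")
client = boto3.client("bedrock", region_name="us-east-1")

response = client.create_inference_profile(
    inferenceProfileName="my-app-claude-sonnet",
    description="Claude Sonnet profile for my application",
    modelSource={
        # Copy from a system-defined inference profile (or a foundation model ARN)
        "copyFrom": "arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.anthropic.claude-3-5-sonnet-20240620-v1:0"
    },
    tags=[{"key": "project", "value": "librechat"}],
)

# Save this ARN for librechat.yaml / .env
print(response["inferenceProfileArn"])
```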
Configuring LibreChat
librechat.yaml Configuration
Add the bedrock endpoint configuration to your librechat.yaml. For full field reference, see AWS Bedrock Object Structure.
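A minimal sketch, assuming the `models` and `inferenceProfiles` fields described in the AWS Bedrock Object Structure reference; the model ID and environment variable name are examples:

```yaml
endpoints:
  bedrock:
    # Model IDs users will see; capability detection is based on these IDs
    models:
      - "anthropic.claude-3-5-sonnet-20240620-v1:0"
    # Map each model ID to the application inference profile ARN to invoke
    inferenceProfiles:
      "anthropic.claude-3-5-sonnet-20240620-v1:0": "${BEDROCK_SONNET_PROFILE_ARN}"
```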
Environment Variables
Add your AWS credentials and inference profile ARNs to your .env file:
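For example (the ARN variable name is your choice; it just needs to match what you reference in `librechat.yaml`):

```bash
# AWS credentials used by LibreChat's Bedrock endpoint
BEDROCK_AWS_ACCESS_KEY_ID=your-access-key-id
BEDROCK_AWS_SECRET_ACCESS_KEY=your-secret-access-key
BEDROCK_AWS_DEFAULT_REGION=us-east-1

# Custom inference profile ARN
BEDROCK_SONNET_PROFILE_ARN=arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456
```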
Setting Up Logging
To verify that your inference profiles are being used correctly, enable AWS Bedrock model invocation logging.
1. Create CloudWatch Log Group
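Create the log group that will receive invocation logs:

```bash
aws logs create-log-group \
  --log-group-name /aws/bedrock/model-invocations \
  --region us-east-1
```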
2. Create IAM Role for Bedrock Logging
Create the trust policy file (bedrock-logging-trust.json):
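This allows the Bedrock service to assume the role:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "bedrock.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```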
Create the role:
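The role name here is an example; use your own naming convention:

```bash
aws iam create-role \
  --role-name BedrockInvocationLogging \
  --assume-role-policy-document file://bedrock-logging-trust.json
```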
Attach CloudWatch Logs permissions:
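An inline policy sketch granting write access to the log group; the account ID and region are placeholders:

```bash
aws iam put-role-policy \
  --role-name BedrockInvocationLogging \
  --policy-name bedrock-cloudwatch-logs \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
        "Resource": "arn:aws:logs:us-east-1:123456789012:log-group:/aws/bedrock/model-invocations:*"
      }
    ]
  }'
```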
Create S3 bucket for large data (required):
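Bucket names are globally unique, so pick your own:

```bash
aws s3 mb s3://my-bedrock-invocation-logs --region us-east-1
```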
3. Enable Model Invocation Logging
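A sketch wiring the log group, role, and bucket together; replace the ARNs and bucket name with your own:

```bash
aws bedrock put-model-invocation-logging-configuration \
  --logging-config '{
    "cloudWatchConfig": {
      "logGroupName": "/aws/bedrock/model-invocations",
      "roleArn": "arn:aws:iam::123456789012:role/BedrockInvocationLogging",
      "largeDataDeliveryS3Config": { "bucketName": "my-bedrock-invocation-logs" }
    },
    "textDataDeliveryEnabled": true,
    "imageDataDeliveryEnabled": true,
    "embeddingDataDeliveryEnabled": true
  }' \
  --region us-east-1
```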
Verify logging is enabled:
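The response should echo back the configuration you just set:

```bash
aws bedrock get-model-invocation-logging-configuration --region us-east-1
```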
Verifying Your Configuration
View Logs via CLI
After making a request through LibreChat, check the logs:
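For example, to tail recent entries (assumes the log group name from above):

```bash
aws logs tail /aws/bedrock/model-invocations --since 15m --follow
```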
What to Look For
In the log output, look for the `modelId` field:
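An abbreviated example of a healthy entry (most fields omitted; values are placeholders):

```json
{
  "modelId": "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456",
  "inferenceRegion": "us-east-2"
}
```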
Success indicators:
- `modelId` shows your custom inference profile ARN (contains `application-inference-profile`)
- `inferenceRegion` may vary (shows cross-region routing is working)

If mapping isn't working:
- `modelId` will show the raw model ID instead of the ARN
View Logs via AWS Console
- Open CloudWatch in the AWS Console
- Navigate to Logs > Log groups
- Select `/aws/bedrock/model-invocations`
- Click on the latest log stream
- Search for your inference profile ID
Monitoring Usage
CloudWatch Metrics
View Bedrock metrics in CloudWatch:
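For example, total invocations per hour via the CLI; this is a sketch assuming the `AWS/Bedrock` namespace's `ModelId` dimension and placeholder ARN and dates:

```bash
aws cloudwatch get-metric-statistics \
  --namespace AWS/Bedrock \
  --metric-name Invocations \
  --dimensions Name=ModelId,Value="arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456" \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-02T00:00:00Z \
  --period 3600 \
  --statistics Sum
```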
AWS Console
- Bedrock Console > Inference profiles > Application tab
- Click on your custom profile
- View invocation metrics and usage statistics
Troubleshooting
Common Issues
| Issue | Cause | Solution |
|---|---|---|
| Model not recognized | Missing model in `models` array | Add the model ID to `models` in `librechat.yaml` |
| ARN not being used | Model ID doesn't match | Ensure the model ID in `inferenceProfiles` exactly matches what's in `models` |
| Env variable not resolved | Typo or not set | Check the `.env` file and ensure the variable name matches `${VAR_NAME}` |
| Access Denied | Missing IAM permissions | Add `bedrock:InvokeModel*` permissions for the inference profile ARN |
| Profile not found | Wrong region | Ensure the profile exists in the same region LibreChat is configured to use |
Debug Checklist
- Model ID is in the `models` array
- Model ID in `inferenceProfiles` exactly matches (case-sensitive)
- Environment variable is set (if using `${VAR}` syntax)
- AWS credentials have permission to invoke the inference profile
- LibreChat has been restarted after config changes
Verify Config Loading
Check that your config is being read correctly by examining the server logs when LibreChat starts.
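For example, with a Docker Compose deployment (an assumption; adapt to how you run LibreChat):

```bash
docker compose logs api | grep -i bedrock
```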
Complete Example
librechat.yaml
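A fuller sketch with two mapped models; as above, the field placement assumes the structure from the AWS Bedrock Object Structure reference:

```yaml
endpoints:
  bedrock:
    models:
      - "anthropic.claude-3-5-sonnet-20240620-v1:0"
      - "anthropic.claude-3-haiku-20240307-v1:0"
    inferenceProfiles:
      "anthropic.claude-3-5-sonnet-20240620-v1:0": "${BEDROCK_SONNET_PROFILE_ARN}"
      "anthropic.claude-3-haiku-20240307-v1:0": "${BEDROCK_HAIKU_PROFILE_ARN}"
```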
.env
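The matching environment variables (values are placeholders):

```bash
BEDROCK_AWS_ACCESS_KEY_ID=your-access-key-id
BEDROCK_AWS_SECRET_ACCESS_KEY=your-secret-access-key
BEDROCK_AWS_DEFAULT_REGION=us-east-1
BEDROCK_SONNET_PROFILE_ARN=arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456
BEDROCK_HAIKU_PROFILE_ARN=arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/ghi789jkl012
```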
Quick Setup Script
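A sketch that creates a profile and prints the line to paste into `.env`; the profile name, source ARN, and variable name are placeholders:

```bash
#!/usr/bin/env bash
set -euo pipefail

REGION="us-east-1"
PROFILE_NAME="my-app-claude-sonnet"
SOURCE_ARN="arn:aws:bedrock:${REGION}:123456789012:inference-profile/us.anthropic.claude-3-5-sonnet-20240620-v1:0"

# Create the application inference profile and capture its ARN
PROFILE_ARN=$(aws bedrock create-inference-profile \
  --inference-profile-name "$PROFILE_NAME" \
  --model-source copyFrom="$SOURCE_ARN" \
  --region "$REGION" \
  --query inferenceProfileArn \
  --output text)

echo "Add this to your .env:"
echo "BEDROCK_SONNET_PROFILE_ARN=${PROFILE_ARN}"
```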
Related Resources
- AWS Bedrock Inference Profiles Documentation
- AWS Bedrock Object Structure - YAML config field reference
- AWS Bedrock Setup - Basic Bedrock configuration
- AWS Bedrock Model Invocation Logging