Docs
Configuration
librechat.yaml
Custom AI Endpoints
Azure OpenAI

Azure OpenAI

Azure OpenAI Integration for LibreChat

LibreChat boasts compatibility with Azure OpenAI API services, treating the endpoint as a first-class citizen. To properly utilize Azure OpenAI within LibreChat, it’s crucial to configure the librechat.yaml file according to your specific needs. This document guides you through the essential setup process which allows seamless use of multiple deployments and models with as much flexibility as needed.

Example

Here’s a quick snapshot of what a comprehensive configuration might look like, including many of the options and features discussed below.

librechat.yaml
endpoints:
  azureOpenAI:
    # Endpoint-level configuration
    titleModel: "llama-70b-chat"
    plugins: true
    assistants: true
    groups:
    # Group-level configuration
    - group: "my-resource-westus"
      apiKey: "${WESTUS_API_KEY}"
      instanceName: "my-resource-westus"
      version: "2024-03-01-preview"
      # Model-level configuration
      models:
        gpt-4-vision-preview:
          deploymentName: gpt-4-vision-preview
          version: "2024-03-01-preview"
        gpt-3.5-turbo:
          deploymentName: gpt-35-turbo
        gpt-4-1106-preview:
          deploymentName: gpt-4-1106-preview
    # Group-level configuration
    - group: "mistral-inference"
      apiKey: "${AZURE_MISTRAL_API_KEY}"
      baseURL: "https://Mistral-large-vnpet-serverless.region.inference.ai.azure.com/v1/chat/completions"
      serverless: true
      # Model-level configuration
      models:
        mistral-large: true
    # Group-level configuration
    - group: "my-resource-sweden"
      apiKey: "${SWEDEN_API_KEY}"
      instanceName: "my-resource-sweden"
      deploymentName: gpt-4-1106-preview
      version: "2024-03-01-preview"
      assistants: true
      # Model-level configuration
      models:
        gpt-4-turbo: true

Here’s another working example configured according to the specifications of the Azure OpenAI Endpoint Configuration Docs:

Each level of configuration is extensively detailed in their respective sections:

  1. Endpoint-level config

  2. Group-level config

  3. Model-level config

Setup

  1. Open librechat.yaml for Editing: Use your preferred text editor or IDE to open and edit the librechat.yaml file.

    • Optional: use a remote or custom file path with the following environment variable:
    .env
    CONFIG_PATH="/alternative/path/to/librechat.yaml"
  2. Configure Azure OpenAI Settings: Follow the detailed structure outlined below to populate your Azure OpenAI settings appropriately. This includes specifying API keys, instance names, model groups, and other essential configurations.

  3. Make sure to Remove Legacy Settings: If you are using any of the legacy configurations, be sure to remove. The LibreChat server will also detect these and remind you.

  4. Save Your Changes: After accurately inputting your settings, save the librechat.yaml file.

  5. Restart LibreChat: For the changes to take effect, restart your LibreChat application. This ensures that the updated configurations are loaded and utilized.

Required Fields

To properly integrate Azure OpenAI with LibreChat, specific fields must be accurately configured in your librechat.yaml file. These fields are validated through a combination of custom and environmental variables to ensure the correct setup. Here are the detailed requirements based on the validation process:

Endpoint-Level Configuration

Here’s the conversion of the provided settings into the new option table format:

Global Azure Settings:

Title and Conversation Settings:

KeyTypeDescriptionExample
titleModelstringSpecifies the model to use for generating conversation titles. If not provided, the default model is set as `gpt-3.5-turbo`, which will result in no titles if lacking this model. You can also set this to dynamically use the current model by setting it to `current_model`.titleModel:
pluginsbooleanEnables the use of plugins through Azure. Set to `true` to activate Plugins endpoint support through your Azure config. Default: `false`.plugins:false
assistantsbooleanEnables the use of assistants through Azure. Set to `true` to activate Assistants endpoint through your Azure config. Default: `false`. Note: this requires an assistants-compatible region.assistants:false
summarizebooleanEnables conversation summarization for all Azure models. Set to `true` to activate summarization. Default: `false`.summarize:false
summaryModelstringSpecifies the model to use for generating conversation summaries. If not provided, the default behavior is to use the first model in the `default` array of the first group.summaryModel:
titleConvobooleanEnables conversation title generation for all Azure models. Set to `true` to activate title generation. Default: `false`.titleConvo:false
titleMethodstringSpecifies the method to use for generating conversation titles. Valid options are `"completion"` and `"functions"`. If not provided, the default behavior is to use the `"completion"` method.titleMethod:completion

Group Configuration:

KeyTypeDescriptionExample
groupsarraySpecifies the list of Azure OpenAI model groups. Each group represents a set of models with shared configurations. The groups field is an array of objects, where each object defines the settings for a specific group. This is a required field at the endpoint level, and at least one group must be defined. The group-level configurations are detailed in the Group-Level Configuration section.# groups:[]

Custom Order (Optional):

KeyTypeDescriptionExample
customOrdernumberAllows you to specify a custom order for the Azure endpoint in the user interface. Higher numbers will appear lower in the list. If not provided, the default order is determined by the order in which the endpoints are defined in the `librechat.yaml` file.customOrder:

Please note that the customOrder option is commented out, as it was mentioned as optional in the original text.

Here’s an example of how you can configure these endpoint-level settings in your librechat.yaml file:

librechat.yaml
endpoints:
  azureOpenAI:
    titleModel: "gpt-3.5-turbo-1106"
    plugins: true
    assistants: true
    summarize: true
    summaryModel: "gpt-3.5-turbo-1106"
    titleConvo: true
    titleMethod: "functions"
    groups:
      # ... (group-level and model-level configurations)

Group-Level Configuration

This is a breakdown of the fields configurable as defined for the Custom Config (librechat.yaml) file. For more information on each field, see the Azure OpenAI section in the Custom Config Docs.

Group-Level Configuration: Group Identification:

KeyTypeDescriptionExample
groupstringUnique identifier name for a group of models. Duplicate group names are not allowed and will result in validation errors.group: default

Authentication:

KeyTypeDescriptionExample
apiKeystringMust be a valid API key for Azure OpenAI services. It could be a direct key string or an environment variable reference (e.g., ${WESTUS_API_KEY}).apiKey: ${AZURE_API_KEY}

Azure OpenAI Instance:

KeyTypeDescriptionExample
instanceNamestringName of the Azure OpenAI instance. This field can also support environment variable references.instanceName: ${AZURE_OPENAI_INSTANCE}

Deployment Configuration:

KeyTypeDescriptionExample
deploymentNamestringThe deployment name at the group level is optional but required if any model within the group is set to true.deploymentName: my-deployment
versionstringThe Azure OpenAI API version at the group level is optional but required if any model within the group is set to true.version: 2023-03-15-preview

Advanced Settings:

KeyTypeDescriptionExample
baseURLstringCustom base URL for the Azure OpenAI API requests. Environment variable references are supported. This is optional and can be used for advanced routing scenarios.baseURL: https://my-custom-base-url.com
additionalHeadersobjectSpecifies any extra headers for Azure OpenAI API requests as key-value pairs. Environment variable references can be included as values.additionalHeaders: {Authorization: ${AUTH_HEADER}}
serverlessbooleanSpecifies if the group is a serverless inference chat completions endpoint from Azure Model Catalog, for which only a model identifier, baseURL, and apiKey are needed. For more info, see serverless inference endpoints.serverless: true
addParamsobjectAdds or overrides additional parameters for Azure OpenAI API requests. Useful for specifying API-specific options as key-value pairs.addParams: {temperature: 0.7}
dropParamsarrayAllows for the exclusion of certain default parameters from Azure OpenAI API requests. Useful for APIs that do not accept or recognize specific parameters. This should be specified as a list of strings.dropParams: [top_p, stop]
forcePromptbooleanDictates whether to send a prompt parameter instead of messages in the request body. This option is useful when needing to format the request in a manner consistent with OpenAI API expectations, particularly for scenarios preferring a single text payload.forcePrompt: true

Model Configuration:

KeyTypeDescriptionExample
modelsobjectSpecifies the mapping of model identifiers to their configurations within the group. The keys represent the model identifiers, which must match the corresponding OpenAI model names. The values can be either boolean (true) or objects containing model-specific settings. If a model is set to true, it inherits the group-level deploymentName and version. If a model is configured as an object, it can have its own deploymentName and version. This field is required, and at least one model must be defined within each group. More info heremodels: {gpt-3.5-turbo: true, text-davinci-003: {}}

Here’s an example of a group-level configuration in the librechat.yaml file

librechat.yaml
endpoints:
  azureOpenAI:
    # ... (endpoint-level configurations)
    groups:
      - group: "my-resource-group"
        apiKey: "${AZURE_API_KEY}"
        instanceName: "my-instance"
        deploymentName: "gpt-35-turbo"
        version: "2023-03-15-preview"
        baseURL: "https://my-instance.openai.azure.com/"
        additionalHeaders:
          CustomHeader: "HeaderValue"
        addParams:
          max_tokens: 2048
          temperature: 0.7
        dropParams:
          - "frequency_penalty"
          - "presence_penalty"
        forcePrompt: false
        models:
        # ... (model-level configurations)

Model-Level Configuration

Within each group, the models field contains a mapping of model identifiers to their configurations:

Model Identification:

KeyTypeDescriptionExample
Model IdentifierstringMust match the corresponding OpenAI model name. Can be a partial match.gpt-3.5-turbo: true

Model Configuration:

KeyTypeDescriptionExample
Model Configurationboolean/objectBoolean true: Uses the group-level deploymentName and version. Object: Specifies model-specific deploymentName and version. If not provided, inherits from the group.text-davinci-003: {deploymentName: my-model-deployment, version: 2023-03-15-preview}
deploymentNamestringThe deployment name for this specific model.deploymentName: my-model-deployment
versionstringThe Azure OpenAI API version for this specific model.version: 2023-03-15-preview

Serverless Inference Endpoints:

KeyTypeDescriptionExample
Serverless Inference EndpointsnoteFor serverless models, set the model to true.gpt-4: true
  • The model identifier must match its corresponding OpenAI model name in order for it to properly reflect its known context limits and/or function in the case of vision. For example, if you intend to use gpt-4-vision, it must be configured like so:
librechat.yaml
endpoints:
  azureOpenAI:
    # ... (endpoint-level configurations)
    groups:
    # ... (group-level configurations)
    - group: "example_group"
    models:
     # Model identifiers must match OpenAI Model name (can be a partial match)
      gpt-4-vision-preview:
      # Object setting: must include at least "deploymentName" and/or "version"
        deploymentName: "arbitrary-deployment-name"
        version: "2024-02-15-preview" # version can be any that supports vision
      # Boolean setting, must be "true"
      gpt-4-turbo: true
  • See Model Deployments for more examples.

  • If a model is set to true, it implies using the group-level deploymentName and version for this model. Both must be defined at the group level in this case.

  • If a model is configured as an object, it can specify its own deploymentName and version. If these are not provided, the model inherits the group’s deploymentName and version.

  • If the group represents a serverless inference endpoint, the singular model should be set to true to add it to the models list.

Special Considerations

  1. Unique Names: Both model and group names must be unique across the entire configuration. Duplicate names lead to validation failures.

  2. Missing Required Fields: Lack of required deploymentName or version either at the group level (for boolean-flagged models) or within the models’ configurations (if not inheriting or explicitly specified) will result in validation errors, unless the group represents a serverless inference endpoint.

  3. Environment Variable References: The configuration supports environment variable references (e.g., ${VARIABLE_NAME}). Ensure that all referenced variables are present in your environment to avoid runtime errors. The absence of defined environment variables referenced in the config will cause errors.${INSTANCE_NAME} and ${DEPLOYMENT_NAME} are unique placeholders, and do not correspond to environment variables, but instead correspond to the instance and deployment name of the currently selected model. It is not recommended you use INSTANCE_NAME and DEPLOYMENT_NAME as environment variable names to avoid any potential conflicts.

  4. Error Handling: Any issues in the config, like duplicate names, undefined environment variables, or missing required fields, will invalidate the setup and generate descriptive error messages aiming for prompt resolution. You will not be allowed to run the server with an invalid configuration.

  5. Model identifiers: An unknown model (to the project) can be used as a model identifier, but it must match a known model to reflect its known context length, which is crucial for message/token handling; e.g., gpt-7000 will be valid but default to a 4k token limit, whereas gpt-4-turbo will be recognized as having a 128k context limit.

Applying these setup requirements thoughtfully will ensure a correct and efficient integration of Azure OpenAI services with LibreChat through the librechat.yaml configuration. Always validate your configuration against the latest schema definitions and guidelines to maintain compatibility and functionality.

Model Deployments

The list of models available to your users are determined by the model groupings specified in your azureOpenAI endpoint config.

For example:

librechat.yaml
# Example Azure OpenAI Object Structure
endpoints:
  azureOpenAI:
    groups:
      - group: "my-westus" # arbitrary name
        apiKey: "${WESTUS_API_KEY}"
        instanceName: "actual-instance-name" # name of the resource group or instance
        version: "2023-12-01-preview"
        models:
          gpt-4-vision-preview:
            deploymentName: gpt-4-vision-preview
            version: "2024-02-15-preview"
          gpt-3.5-turbo: true
      - group: "my-eastus"
        apiKey: "${EASTUS_API_KEY}"
        instanceName: "actual-eastus-instance-name"
        deploymentName: gpt-4-turbo
        version: "2024-02-15-preview"
        models:
          gpt-4-turbo: true

The above configuration would enable gpt-4-vision-preview, gpt-3.5-turbo and gpt-4-turbo for your users in the order they were defined.

Using Assistants with Azure

To enable use of Assistants with Azure OpenAI, there are 2 main steps.

  1. Set the assistants field, under the azureOpenAI endpoint, i.e, at the Endpoint-level to true, like so:
librechat.yaml
endpoints:
  azureOpenAI:
  # Enable use of Assistants with Azure
    assistants: true
  1. Add the assistants field to groups compatible with Azure’s Assistants API integration.
librechat.yaml
endpoints:
  azureOpenAI:
    assistants: true
    groups:
      - group: "my-sweden-group"
        apiKey: "${SWEDEN_API_KEY}"
        instanceName: "actual-instance-name"
      # Mark this group as assistants compatible
        assistants: true
      # version must be "2024-02-15-preview" or later
        version: "2024-03-01-preview"
        models:
          # ... (model-level configuration)

Notes:

  • For credentials, rely on custom envrionment variables specified at each assistants-compatible group configuration.

  • If you mark multiple regions as assistants-compatible, assistants you create will be aggregated across regions to the main assistant selection list.

  • Files you upload to Azure OpenAI, whether at the message or assistant level, will only be available in the region the current assistant’s model is part of.

    • For this reason, it’s recommended you use only one region or resource group for Azure OpenAI Assistants, or you will experience an error.
    • Uploading to “OpenAI” is the default behavior for official code_interpeter and retrieval capabilities.
  • Downloading files that assistants generate will soon be supported.

  • As of May 19th 2024, retrieval and streaming are not yet supported through Azure OpenAI.

    • To avoid any errors with retrieval while it’s not supported, it’s recommended to disable the capability altogether through the azureAssistants endpoint config:
    librechat.yaml
    endpoints:
      azureOpenAI:
        # ...rest
     
      azureAssistants:
      # "retrieval" omitted.
        capabilities: ["code_interpreter", "actions", "tools"]
    • By default, all capabilities, except retrieval, are enabled.

Using Plugins with Azure

To use the Plugins endpoint with Azure OpenAI, you need a deployment supporting function calling. Otherwise, you need to set “Functions” off in the Agent settings. When you are not using “functions” mode, it’s recommend to have “skip completion” off as well, which is a review step of what the agent generated.

To use Azure with the Plugins endpoint, make sure the field plugins is set to true in your Azure OpenAI endpoing config:

librechat.yaml
# Example Azure OpenAI Object Structure
endpoints:
  azureOpenAI:
    plugins: true # <------- Set this
    groups:
    # omitted for brevity

Configuring the plugins field will configure Plugins to use Azure models.

NOTE: The current configuration through librechat.yaml uses the primary model you select from the frontend for Plugin use, which is not usually how it works without Azure, where instead the “Agent” model is used. The Agent model setting can be ignored when using Plugins through Azure.

Using a Specified Base URL with Azure

The base URL for Azure OpenAI API requests can be dynamically configured. This is useful for proxying services such as Cloudflare AI Gateway, or if you wish to explicitly override the baseURL handling of the app.

LibreChat will use the baseURL field for your Azure model grouping, which can include placeholders for the Azure OpenAI API instance and deployment names.

In the configuration, the base URL can be customized like so:

librechat.yaml
# librechat.yaml file, under an Azure group:
endpoints:
  azureOpenAI:
    groups:
      - group: "group-with-custom-base-url"
      baseURL: "https://example.azure-api.net/${INSTANCE_NAME}/${DEPLOYMENT_NAME}"
 
# OR
      baseURL: "https://${INSTANCE_NAME}.openai.azure.com/openai/deployments/${DEPLOYMENT_NAME}"
 
# Cloudflare example
      baseURL: "https://gateway.ai.cloudflare.com/v1/ACCOUNT_TAG/GATEWAY/azure-openai/${INSTANCE_NAME}/${DEPLOYMENT_NAME}"

NOTE: ${INSTANCE_NAME} and ${DEPLOYMENT_NAME} are unique placeholders, and do not correspond to environment variables, but instead correspond to the instance and deployment name of the currently selected model. It is not recommended you use INSTANCE_NAME and DEPLOYMENT_NAME as environment variable names to avoid any potential conflicts.

You can also omit the placeholders completely and simply construct the baseURL with your credentials:

librechat.yaml
      baseURL: "https://gateway.ai.cloudflare.com/v1/ACCOUNT_TAG/GATEWAY/azure-openai/my-secret-instance/my-deployment"

Lastly, you can specify the entire baseURL through a custom environment variable

librechat.yaml
      baseURL: "${MY_CUSTOM_BASEURL}"

Enabling Auto-Generated Titles with Azure

To enable titling for Azure, set titleConvo to true.

librechat.yaml
# Example Azure OpenAI Object Structure
endpoints:
  azureOpenAI:
    titleConvo: true # <------- Set this
    groups:
    # omitted for brevity

You can also specify the model to use for titling, with titleModel provided you have configured it in your group(s).

titleModel
    titleModel: "gpt-3.5-turbo"

Note: “gpt-3.5-turbo” is the default value, so you can omit it if you want to use this exact model and have it configured. If not configured and titleConvo is set to true, the titling process will result in an error and no title will be generated. You can also set this to dynamically use the current model by setting it to current_model.

titleModel
    titleModel: "current_model"

Using GPT-4 Vision with Azure

To use Vision (image analysis) with Azure OpenAI, you need to make sure gpt-4-vision-preview is a specified model in one of your groupings

This will work seamlessly as it does with the OpenAI endpoint (no need to select the vision model, it will be switched behind the scenes)

Generate images with Azure OpenAI Service (DALL-E)

Model IDFeature AvailabilityMax Request (characters)
dalle2East US1000
dalle3Sweden Central4000
  • First you need to create an Azure resource that hosts DALL-E
    • At the time of writing, dall-e-3 is available in the SwedenCentral region, dall-e-2 in the EastUS region.
  • Then, you need to deploy the image generation model in one of the above regions.
  • Configure your environment variables based on Azure credentials:

Here’s the updated layout for the DALL-E configuration options:

DALL-E:

API Keys:

KeyTypeDescriptionExample
DALLE_API_KEYstringThe OpenAI API key for DALL-E 2 and DALL-E 3 services.# DALLE_API_KEY=

API Keys (Version Specific):

KeyTypeDescriptionExample
DALLE3_API_KEYstringThe OpenAI API key for DALL-E 3.# DALLE3_API_KEY=
DALLE2_API_KEYstringThe OpenAI API key for DALL-E 2.# DALLE2_API_KEY=

System Prompts:

KeyTypeDescriptionExample
DALLE3_SYSTEM_PROMPTstringThe system prompt for DALL-E 3.# DALLE3_SYSTEM_PROMPT="Your DALL-E-3 System Prompt here"
DALLE2_SYSTEM_PROMPTstringThe system prompt for DALL-E 2.# DALLE2_SYSTEM_PROMPT="Your DALL-E-2 System Prompt here"

Reverse Proxy Settings:

KeyTypeDescriptionExample
DALLE_REVERSE_PROXYstringThe reverse proxy URL for DALL-E API requests.# DALLE_REVERSE_PROXY=

Base URLs:

KeyTypeDescriptionExample
DALLE3_BASEURLstringThe base URL for DALL-E 3 API endpoints.# DALLE3_BASEURL=https://<AZURE_OPENAI_API_INSTANCE_NAME>.openai.azure.com/openai/deployments/<DALLE3_DEPLOYMENT_NAME>/
DALLE2_BASEURLstringThe base URL for DALL-E 2 API endpoints.# DALLE2_BASEURL=https://<AZURE_OPENAI_API_INSTANCE_NAME>.openai.azure.com/openai/deployments/<DALLE2_DEPLOYMENT_NAME>/

Azure OpenAI Integration (Optional):

KeyTypeDescriptionExample
DALLE3_AZURE_API_VERSIONstringThe API version for DALL-E 3 with Azure OpenAI service.# DALLE3_AZURE_API_VERSION=the-api-version # e.g.: 2023-12-01-preview
DALLE2_AZURE_API_VERSIONstringThe API version for DALL-E 2 with Azure OpenAI service.# DALLE2_AZURE_API_VERSION=the-api-version # e.g.: 2023-12-01-preview

Remember to replace placeholder text with actual prompts or instructions and provide your actual API keys if you choose to include them directly in the file (though managing sensitive keys outside of the codebase is a best practice). Always review and respect OpenAI’s usage policies when embedding API keys in software.

Note: if you have PROXY set, it will be used for DALL-E calls also, which is universal for the app.

Serverless Inference Endpoints

Through the librechat.yaml file, you can configure Azure AI Studio serverless inference endpoints to access models from the Azure Model Catalog. Only a model identifier, baseURL, and apiKey are needed along with the serverless field to indicate the special handling these endpoints need.

librechat.yaml
endpoints:
  azureOpenAI:
    groups:
# serverless examples
    - group: "mistral-inference"
      apiKey: "${AZURE_MISTRAL_API_KEY}" # arbitrary env var name
      baseURL: "https://Mistral-large-vnpet-serverless.region.inference.ai.azure.com/v1/chat/completions"
      serverless: true
      models:
        mistral-large: true
    - group: "llama-70b-chat"
      apiKey: "${AZURE_LLAMA2_70B_API_KEY}" # arbitrary env var name
      baseURL: "https://Llama-2-70b-chat-qmvyb-serverless.region.inference.ai.azure.com/v1/chat/completions"
      serverless: true
      models:
        llama-70b-chat: true
    - group: "phi-3-medium-128k-instruct"
      apiKey: "${AZURE_PHI3_MEDIUM_API_KEY}" # arbitrary env var name
      baseURL: "https://Phi-3-medium-128k-instruct-abcde-serverless.eastus2.inference.ai.azure.com/v1/chat/completions"
      serverless: true
      models:
        phi-3-medium-128k-instruct: true

Notes:

  • Make sure to add the appropriate suffix for your deployment, either “/v1/chat/completions” or “/v1/completions”
  • If using “/v1/completions” (without “chat”), you need to set the forcePrompt field to true in your group config.
  • Compatibility with LibreChat relies on parity with OpenAI API specs, which at the time of writing, are typically “Pay-as-you-go” or “Models as a Service” (MaaS) deployments on Azure AI Studio, that are OpenAI-SDK-compatible with either v1/completions or v1/chat/completions endpoint handling.
  • All models that offer serverless deployments (“Serverless APIs”) are compatible from the Azure model catalog. You can filter by “Serverless API” under Deployment options and “Chat completion” under inference tasks to see the full list; however, real time endpoint models have not been tested.
  • These serverless inference endpoint/models are likely not compatible with OpenAI function calling, which enables the use of Plugins. As they have yet been tested, they are available on the Plugins endpoint, although they are not expected to work.