LibreChat is joining ClickHouse to power the open-source Agentic Data Stack 🎉 Learn more

配置 LibreChat YAML AI 端点 vLLM

vLLM

在 LibreChat 中将 vLLM 配置为自定义 endpoint。

vLLM 是一个用于 LLM 的高吞吐量、内存高效的推理和服务引擎。它提供了一个兼容 OpenAI 的 API，因此你可以在本地运行它，并将 LibreChat 指向你自己的服务器。

配置

本地 vLLM 部署不需要身份验证，因此 API key 仅作为占位符。将 baseURL 指向您正在运行的 vLLM 服务器。在 librechat.yaml 中的 endpoints.custom 下添加该 endpoint：

    - name: "vLLM"
      apiKey: "vllm"
      baseURL: "http://127.0.0.1:8023/v1"
      models:
        default: ['google/gemma-3-27b-it']
        fetch: true
      titleConvo: true
      titleModel: "current_model"
      titleMessageRole: "user"
      summarize: false
      summaryModel: "current_model"

注意事项

此示例连接到端口 8023 上的本地 vLLM 服务器，并以 Gemma 3 27B 作为默认模型。请将 baseURL 设置为您服务器运行的地址。
当设置 fetch: true 时，LibreChat 会加载 vLLM 服务器上可用的完整模型列表，因此 default 仅作为初始选择。
titleMessageRole: "user" 会覆盖用于生成标题的默认 system 角色。由于某些本地模型会拒绝 system 消息角色，因此将标题提示词作为 user 消息发送可以避免错误。

这篇指南怎么样？

在 GitHub 上编辑

TrueFoundry AI Gateway

在 LibreChat 中将 TrueFoundry AI Gateway 配置为自定义 endpoint。

Vultr Cloud Inference

在 LibreChat 中将 Vultr Cloud Inference 配置为自定义 endpoint。

本页内容

配置注意事项