For the Google Endpoint, you can either use the Generative Language API (for Gemini models), or the Vertex AI API (for Gemini, PaLM2 & Codey models).
The Generative Language API uses an API key, which you can get from Google AI Studio.
For Vertex AI, you need a Service Account JSON key file, with appropriate access configured.
Instructions for both are given below.
Generative Language API (Gemini)
See here for Gemini API pricing and rate limits
⚠️ While Google models are free, Google uses your input/output to help improve the model, with data de-identified from your Google Account and API key. ⚠️ During this period, your messages “may be accessible to trained reviewers.”
To use Gemini models through Google AI Studio, you'll need an API key. If you don't already have one, create a key in Google AI Studio.
Get an API key here: aistudio.google.com
Once you have your key, you can provide it in your .env file, which allows all users of your instance to use it.
Alternatively, you can make users provide it from the frontend by setting the following:
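For example, your .env entry might look like either of the following (the key value is a placeholder; use the key from Google AI Studio):

```bash
# Shared key: all users of the instance use this key
GOOGLE_KEY=your-api-key-here

# Or, require each user to supply their own key from the frontend
# GOOGLE_KEY=user_provided
```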
Some reverse proxies do not support the X-goog-api-key header. You can configure LibreChat to use the Authorization header instead:
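A sketch of the relevant .env setting (the `GOOGLE_AUTH_HEADER` variable name is assumed here; check your LibreChat version's environment variable reference):

```bash
# Send the key as "Authorization: Bearer <key>" instead of the X-goog-api-key header
GOOGLE_AUTH_HEADER=true
```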
Since fetching the models list isn't yet supported, you should set the models you want to use in the .env file.
For your convenience, these are the latest models as of 5/18/24 that can be used with the Generative Language API:
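A sketch of the corresponding .env entry (the model IDs below are examples from that period; substitute whichever current model IDs you want to expose):

```bash
GOOGLE_MODELS=gemini-1.5-flash-latest,gemini-1.5-pro-latest,gemini-1.0-pro,gemini-pro,gemini-pro-vision
```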
Notes:
- A `gemini-pro` model or `gemini-pro-vision` is required in your list for attaching images.
- Using LibreChat, PaLM2 and Codey models can only be accessed through Vertex AI, not the Generative Language API.
- Only models that support the `generateContent` method can be used natively with LibreChat + the Gen AI API.
- Selecting `gemini-pro-vision` for messages with attachments is not necessary, as it will be switched behind the scenes for you.
- Since `gemini-pro-vision` does not accept non-attachment messages, messages without attachments are automatically switched to use `gemini-pro` (otherwise, Google responds with an error).
- With the Google endpoint, you cannot use both Vertex AI and the Generative Language API at the same time. You must choose one or the other.
- Some PaLM/Codey models and `gemini-pro-vision` may fail when `maxOutputTokens` is set to a high value. If you encounter this issue, try reducing the value through the conversation parameters.
Setting GOOGLE_KEY=user_provided in your .env file sets both the Vertex AI Service Account JSON key file and the Generative Language API key to be provided from the frontend like so:
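The .env line in question:

```bash
GOOGLE_KEY=user_provided
```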
Vertex AI
See here for Vertex AI API pricing and rate limits
To set up Google LLMs (via Google Cloud Vertex AI), first sign up for Google Cloud: cloud.google.com
You can usually get $300 starting credit, which makes this option free for 90 days.
- Once signed up, enable the Vertex AI API on Google Cloud:
  - Go to the Vertex AI page on the Google Cloud console
  - Click on "Enable API" if prompted
- Create a Service Account with the Vertex AI role:
  - Click here to create a Service Account
  - Select or create a project
  - Enter a service account ID (required); name and description are optional
  - Click on "Create and Continue" to grant at least the "Vertex AI User" role
  - Click on "Continue/Done"
- Create a JSON key to save in your project directory:
  - Go back to the Service Accounts page
  - Select your service account
  - Click on "Keys"
  - Click on "Add Key" and then "Create new key"
  - Choose JSON as the key type and click on "Create"
  - Download the key file and rename it 'auth.json'
  - Save it within the project directory, in `/api/data/`
Alternative: Using GOOGLE_SERVICE_KEY_FILE
Instead of saving the key file to /api/data/auth.json, you can use the GOOGLE_SERVICE_KEY_FILE environment variable to specify the path to your service account key file. This provides more flexibility in how you manage your credentials. See the environment variable section below for more details.
Saving your JSON key file in the project directory allows all users of your LibreChat instance to use it.
Alternatively, you can make users provide it from the frontend by setting the following:
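The .env setting for frontend-provided credentials:

```bash
GOOGLE_KEY=user_provided
```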
You can also specify the service account key file using the GOOGLE_SERVICE_KEY_FILE environment variable:
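For example (the path is a placeholder; point it at wherever you saved your key file):

```bash
GOOGLE_SERVICE_KEY_FILE=/path/to/auth.json
```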
This is particularly useful for features that require Vertex AI authentication, such as OCR capabilities.
You can also specify the Google Cloud location for Vertex AI API requests:
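A sketch of the relevant .env entry (`us-central1` is an example region; use the region where your project runs):

```bash
GOOGLE_LOC=us-central1
```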
Since fetching the models list isn't yet supported, you should set the models you want to use in the .env file.
For your convenience, these are the latest models as of 5/18/24 that can be used with Vertex AI:
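A sketch of the corresponding .env entry (the model IDs below are examples; substitute the Vertex AI model IDs you want to expose):

```bash
GOOGLE_MODELS=gemini-1.5-pro-preview-0409,gemini-pro,gemini-pro-vision,chat-bison,text-bison,codechat-bison
```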
If you are using Docker
If you're using Docker and want to provide the auth.json file, you will also need to mount the volume in docker-compose.override.yml:
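A sketch of such an override (the service name and the in-container path `/app/api/data` are assumptions based on the default LibreChat layout; adjust to match your compose file):

```yaml
version: '3.4'

services:
  api:
    volumes:
      # Mount the host directory containing auth.json into the container
      - ./api/data:/app/api/data
```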
Google Safety Settings
To set safety settings for both Vertex AI and Generative Language API, you can set the following in your .env file:
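A sketch of these settings (variable names follow LibreChat's Google safety-setting convention; valid thresholds include `BLOCK_NONE`, `BLOCK_ONLY_HIGH`, `BLOCK_MEDIUM_AND_ABOVE`, and `BLOCK_LOW_AND_ABOVE`):

```bash
GOOGLE_SAFETY_SEXUALLY_EXPLICIT=BLOCK_ONLY_HIGH
GOOGLE_SAFETY_HATE_SPEECH=BLOCK_ONLY_HIGH
GOOGLE_SAFETY_HARASSMENT=BLOCK_ONLY_HIGH
GOOGLE_SAFETY_DANGEROUS_CONTENT=BLOCK_ONLY_HIGH
```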
You can also exclude safety settings by setting the following in your .env file, which will use the provider defaults. This can be helpful if you are having issues with specific safety settings.
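A sketch of the exclusion setting (the `GOOGLE_EXCLUDE_SAFETY_SETTINGS` variable name is assumed here; check your LibreChat version's environment variable reference):

```bash
# Omit safety settings from requests entirely, falling back to provider defaults
GOOGLE_EXCLUDE_SAFETY_SETTINGS=true
```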
NOTE: You do not have access to the BLOCK_NONE setting by default.
To use this restricted HarmBlockThreshold setting, you will need to either:
- (a) Get access through an allowlist via your Google account team
- (b) Switch your account type to monthly invoiced billing following these instructions: https://cloud.google.com/billing/docs/how-to/invoiced-billing
Notes:
- The Google endpoint supports all Shared Endpoint Settings via the `librechat.yaml` configuration file, including `streamRate`, `titleModel`, `titleMethod`, `titlePrompt`, `titlePromptTemplate`, and `titleEndpoint`.