# OCR Config Object Structure (/docs/configuration/librechat_yaml/object_structure/ocr)

## Overview

The `ocr` object allows you to configure Optical Character Recognition (OCR) settings for the application, enabling the extraction of text from images. This section provides a detailed breakdown of the `ocr` object structure.

There are 4 main fields under `ocr`:

  - `mistralModel`
  - `apiKey`
  - `baseURL`
  - `strategy`

**Notes:**

- If using the Mistral OCR API, you don't need to edit your `librechat.yaml` file.
    - You only need the following environment variables to get started: `OCR_API_KEY` and `OCR_BASEURL`.
- OCR functionality allows the application to extract text from images, which can then be processed by AI models.
- The default strategy is `mistral_ocr`, which uses Mistral's OCR capabilities.
- You can also configure a custom OCR service by setting the strategy to `custom_ocr`.
- Azure-deployed Mistral OCR models can be used by setting the strategy to `azure_mistral_ocr`.
- Google Vertex AI-deployed Mistral OCR models can be used by setting the strategy to `vertexai_mistral_ocr`.
  - Requires the `GOOGLE_SERVICE_KEY_FILE` environment variable to be set with service account credentials
  - The service key can be provided as: file path, URL, base64 encoded JSON, or raw JSON string
  - Project ID and location are automatically extracted from the service account credentials
- Local text extraction is available via `document_parser`, which extracts text from PDF, DOCX, XLS/XLSX, and OpenDocument files without any external API.
  - Uses `pdfjs-dist`, `mammoth`, and `SheetJS` locally — no API key or base URL needed
  - Only the `strategy` field is required; `apiKey`, `baseURL`, and `mistralModel` are ignored
- If using the default Mistral OCR, you may optionally specify a specific Mistral model to use.
- Environment variable parsing is supported for `apiKey`, `baseURL`, and `mistralModel` parameters.
- A `user_provided` strategy option is planned for future releases but is not yet implemented.

## Automatic Document Parsing (No Configuration Required)

The built-in `document_parser` runs automatically for agent file uploads **even when no `ocr` block is configured** in your `librechat.yaml`. This means PDF, DOCX, XLS/XLSX, and ODS files are parsed out of the box without any setup.

The resolution logic works as follows:

1. **No `ocr` config exists** — When an agent context file is uploaded and its MIME type matches a supported document type (PDF, DOCX, Excel, ODS), the `document_parser` is used directly. No OCR capability check is required for the agent.

2. **`ocr` config exists** — The configured strategy (e.g., `mistral_ocr`) is tried first. If the configured strategy **fails at runtime**, the `document_parser` is used as a fallback so text extraction still succeeds for supported document types.

3. **Neither succeeds** — If both the configured strategy and the document parser fail (e.g., the file is an image-only PDF with no embedded text), an error is returned suggesting that an OCR service is needed.

<Callout type="info">
The `document_parser` handles text-based documents only. For image-based PDFs or scanned documents, you still need a configured OCR strategy (such as `mistral_ocr`) to extract text from the images within those files.
</Callout>

## Example

```yaml filename="ocr"
ocr:
  mistralModel: "mistral-ocr-latest"
  apiKey: "your-mistral-api-key"
  strategy: "mistral_ocr"
```

Example with custom OCR:

```yaml filename="ocr with custom OCR"
ocr:
  apiKey: "your-custom-ocr-api-key"
  baseURL: "https://your-custom-ocr-service.com/api"
  strategy: "custom_ocr"
```

Example with Azure Mistral OCR:

```yaml filename="ocr with Azure Mistral OCR"
ocr:
  mistralModel: "deployed-mistral-ocr-2503" # should match deployment name on Azure
  apiKey: "${AZURE_MISTRAL_OCR_API_KEY}" # arbitrary .env var reference
  baseURL: "https://your-deployed-endpoint.models.ai.azure.com/v1" # hardcoded, can also be .env var reference
  strategy: "azure_mistral_ocr"
```

Example with Google Vertex AI Mistral OCR:

```yaml filename="ocr with Google Vertex AI Mistral OCR"
ocr:
  mistralModel: "mistral-ocr-2505" # model name as deployed in Vertex AI
  strategy: "vertexai_mistral_ocr"
```

Example with local document parser (no external API needed):

```yaml filename="ocr with document parser"
ocr:
  strategy: "document_parser"
```

## mistralModel

<OptionTable
  options={[
    ['mistralModel', 'String', 'The Mistral model to use for OCR processing. For Azure deployments, this should match your deployment name. For Google Vertex AI, this should match the model name in your deployment.', 'Optional. Specifies which Mistral model should be used when the strategy is set to mistral_ocr, azure_mistral_ocr, or vertexai_mistral_ocr.'],
  ]}
/>

```yaml filename="ocr / mistralModel"
ocr:
  mistralModel: "mistral-ocr-latest"
```

For Azure deployments:

```yaml filename="ocr / mistralModel (Azure)"
ocr:
  mistralModel: "deployed-mistral-ocr-2503" # Your Azure deployment name
```

For Google Vertex AI deployments:

```yaml filename="ocr / mistralModel (Google Vertex AI)"
ocr:
  mistralModel: "mistral-ocr-2505" # Your Vertex AI model name
```

## apiKey

<OptionTable
  options={[
    ['apiKey', 'String', 'The API key for the OCR service. Not used for Google Vertex AI (uses service account authentication via GOOGLE_SERVICE_KEY_FILE).', 'Optional. Defaults to the environment variable OCR_API_KEY if not specified.'],
  ]}
/>

```yaml filename="ocr / apiKey"
ocr:
  apiKey: "your-ocr-api-key"
```

## baseURL

<OptionTable
  options={[
    ['baseURL', 'String', 'The base URL for the OCR service API. For Google Vertex AI, this is automatically constructed from the service account credentials.', 'Optional. Defaults to the environment variable OCR_BASEURL if not specified.'],
  ]}
/>

```yaml filename="ocr / baseURL"
ocr:
  baseURL: "https://your-ocr-service.com/api"
```


## strategy

<OptionTable
  options={[
    ['strategy', 'String', 'The OCR strategy to use.', 'Determines which OCR service to use. Options are "mistral_ocr", "azure_mistral_ocr", "vertexai_mistral_ocr", "document_parser", or "custom_ocr". Defaults to "mistral_ocr".'],
  ]}
/>

```yaml filename="ocr / strategy"
ocr:
  strategy: "custom_ocr"
```

**Available Strategies:**

- `mistral_ocr`: Uses Mistral's OCR capabilities via the standard [Mistral API](/docs/features/ocr#1-mistral-ocr-default).
- `azure_mistral_ocr`: Uses Mistral OCR models deployed on [Azure AI Foundry](/docs/features/ocr#2-azure-mistral-ocr).
- `vertexai_mistral_ocr`: Uses Mistral OCR models deployed on [Google Cloud Vertex AI](/docs/features/ocr#3-google-vertex-ai-mistral-ocr).
- `document_parser`: Uses local text extraction for PDF, DOCX, XLS/XLSX, and OpenDocument files. No external API needed. Also runs automatically for agent file uploads when no `ocr` config is present, and as a fallback when a configured OCR strategy fails.
- `custom_ocr`: Uses a custom OCR service specified by the `baseURL` [(not yet implemented)](/docs/features/ocr#future-enhancements).
