# Web Search Configuration (/docs/configuration/librechat_yaml/object_structure/web_search)

The `webSearch` configuration allows you to customize the web search functionality within LibreChat, including search providers, content scrapers, and result rerankers.

## Overview

The web search feature consists of three main components:

1. **Search Providers**: Services that perform the initial web search
2. **Scrapers**: Services that extract content from web pages
3. **Rerankers**: Services that reorder search results for better relevance

## Example

```yaml filename="webSearch"
webSearch:
  # Search Provider Configuration
  serperApiKey: "${SERPER_API_KEY}"
  searxngInstanceUrl: "${SEARXNG_INSTANCE_URL}"
  searxngApiKey: "${SEARXNG_API_KEY}"
  searchProvider: "serper" # Options: "serper", "searxng"

  # Scraper Configuration
  firecrawlApiKey: "${FIRECRAWL_API_KEY}"
  firecrawlApiUrl: "${FIRECRAWL_API_URL}"
  firecrawlVersion: "${FIRECRAWL_VERSION}"
  scraperProvider: "firecrawl" # Options: "firecrawl", "serper"

  # Reranker Configuration
  jinaApiKey: "${JINA_API_KEY}"
  jinaApiUrl: "${JINA_API_URL}"
  cohereApiKey: "${COHERE_API_KEY}"
  rerankerType: "jina" # Options: "jina", "cohere"

  # General Settings
  scraperTimeout: 7500 # Timeout in milliseconds for scraper requests (default: 7500)
  safeSearch: 1 # Options: 0 (OFF), 1 (MODERATE - default), 2 (STRICT)
```

## Search Providers

### searchProvider

<OptionTable
  options={[
    ['searchProvider', 'String', 'Specifies which search provider to use.', 'Options: "serper", "searxng"'],
  ]}
/>

### serperApiKey

<OptionTable
  options={[
    ['serperApiKey', 'String', 'Environment variable name for the Serper API key. If not set in .env, users will be prompted to provide it via UI.', '${SERPER_API_KEY}'],
  ]}
/>

**Note:** Get your API key from [Serper.dev](https://serper.dev/api-keys)


### searxngInstanceUrl

<OptionTable
  options={[
    ['searxngInstanceUrl', 'String', 'Environment variable name for the SearXNG instance URL. If not set in .env, users will be prompted to provide it via UI.', '${SEARXNG_INSTANCE_URL}'],
  ]}
/>

### searxngApiKey

<OptionTable
  options={[
    ['searxngApiKey', 'String', 'Environment variable name for the SearXNG API key. If not set in .env, users will be prompted to provide it via UI.', '${SEARXNG_API_KEY}'],
  ]}
/>

**Note:** This is optional and only needed if your SearXNG instance requires authentication.


## Scrapers

### firecrawlApiKey

<OptionTable
  options={[
    ['firecrawlApiKey', 'String', 'Environment variable name for the Firecrawl API key. If not set in .env, users will be prompted to provide it via UI.', '${FIRECRAWL_API_KEY}'],
  ]}
/>

**Note:** Get your API key from [Firecrawl.dev](https://docs.firecrawl.dev/introduction#api-key)

### firecrawlApiUrl

<OptionTable
  options={[
    ['firecrawlApiUrl', 'String', 'Environment variable name for the Firecrawl API URL. If not set in .env, users will be prompted to provide it via UI.', '${FIRECRAWL_API_URL}'],
  ]}
/>

**Note:** This is optional and only needed if you're using a custom Firecrawl instance.

### firecrawlVersion

<OptionTable
  options={[
    ['firecrawlVersion', 'String', 'Environment variable name for the Firecrawl API version (v0 or v1).', '${FIRECRAWL_VERSION}'],
  ]}
/>

### scraperProviderider

<OptionTable
  options={[
    ['scraperProvider', 'String', 'Specifies which scraper service to use.', 'Options: "firecrawl", "serper"'],
  ]}
/>

### firecrawlOptions

<OptionTable
  options={[
    ['firecrawlOptions', 'Object', 'Advanced configuration options for Firecrawl scraper.', ''],
  ]}
/>

**Subkeys:**

#### formats

<OptionTable
  options={[
    ['formats', 'Array of Strings', 'Formats to include in the output.', ''],
  ]}
/>

#### includeTags

<OptionTable
  options={[
    ['includeTags', 'Array of Strings', 'Tags to include in the output.', ''],
  ]}
/>

#### excludeTags

<OptionTable
  options={[
    ['excludeTags', 'Array of Strings', 'Tags to exclude from the output.', ''],
  ]}
/>

#### headers

<OptionTable
  options={[
    ['headers', 'Object', 'Headers to send with the request. Can be used to send cookies, user-agent, etc.', ''],
  ]}
/>

#### waitFor

<OptionTable
  options={[
    ['waitFor', 'Number', 'Specify a delay in milliseconds before fetching the content, allowing the page sufficient time to load.', ''],
  ]}
/>

#### timeout

<OptionTable
  options={[
    ['timeout', 'Number', 'Timeout in milliseconds for the scraping request.', 'Default: 7500'],
  ]}
/>

#### maxAge

<OptionTable
  options={[
    ['maxAge', 'Number', 'Returns a cached version of the page if it is younger than this age in milliseconds. If a cached version of the page is older than this value, the page will be scraped.', ''],
  ]}
/>

**Note:** If you do not need extremely fresh data, enabling this can speed up your scrapes by 500%.

#### mobile

<OptionTable
  options={[
    ['mobile', 'Boolean', 'Emulate scraping from a mobile device.', ''],
  ]}
/>

#### skipTlsVerification

<OptionTable
  options={[
    ['skipTlsVerification', 'Boolean', 'Skip TLS certificate verification when making requests.', ''],
  ]}
/>

#### blockAds

<OptionTable
  options={[
    ['blockAds', 'Boolean', 'Enables ad-blocking and cookie popup blocking.', ''],
  ]}
/>

#### removeBase64Images

<OptionTable
  options={[
    ['removeBase64Images', 'Boolean', 'Removes all base 64 images from the output, which may be overwhelmingly long. The image\'s alt text remains in the output, but the URL is replaced with a placeholder.', ''],
  ]}
/>

#### parsePDF

<OptionTable
  options={[
    ['parsePDF', 'Boolean', 'Controls how PDF files are processed during scraping.', ''],
  ]}
/>

#### storeInCache

<OptionTable
  options={[
    ['storeInCache', 'Boolean', 'If true, the page will be stored in the Firecrawl index and cache. Setting this to false is useful if your scraping activity may have data protection concerns. Using some parameters associated with sensitive scraping (headers) will force this parameter to be false.', ''],
  ]}
/>

#### zeroDataRetention

<OptionTable
  options={[
    ['zeroDataRetention', 'Boolean', 'If true, this will enable zero data retention for this scrape (requires prior setup on Firecrawl).', ''],
  ]}
/>

#### location

<OptionTable
  options={[
    ['location', 'Object', 'Geographic location and language settings for scraping.', ''],
  ]}
/>

#### onlyMainContent

<OptionTable
  options={[
    ['onlyMainContent', 'Boolean', 'Only return the main content of the page excluding headers, navs, footers, etc.', ''],
  ]}
/>

#### changeTrackingOptions

<OptionTable
  options={[
    ['changeTrackingOptions', 'Object', 'Configuration for tracking changes in scraped content.', ''],
  ]}
/>

**Example:**
```yaml filename="webSearch"
webSearch:
  firecrawlApiKey: "${FIRECRAWL_API_KEY}"
  firecrawlOptions:
    formats: ["markdown", "rawHtml"]
    includeTags: ["main", "article", ".content"]
    excludeTags: ["nav", "footer", ".ads"]
    waitFor: 2000
    timeout: 10000
    mobile: false
    blockAds: true
    onlyMainContent: true
    location:
      country: "US"
      languages: ["en"]
```

**Note:** For detailed information about Firecrawl scraper options and defaults, see the [Firecrawl API Documentation](https://docs.firecrawl.dev/api-reference/endpoint/scrape).

## Rerankers

### jinaApiKey

<OptionTable
  options={[
    ['jinaApiKey', 'String', 'Environment variable name for the Jina API key. If not set in .env, users will be prompted to provide it via UI.', '${JINA_API_KEY}'],
  ]}
/>

**Note:** Get your API key from [Jina.ai](https://jina.ai/api-dashboard/)

### jinaApiUrl

<OptionTable
  options={[
    ['jinaApiUrl', 'String', 'Environment variable name for the Jina API URL. If not set in .env, users will be prompted to provide it via UI.', '${JINA_API_URL}'],
  ]}
/>

**Note:** This is optional and only needed if you're using a custom Jina instance.

### cohereApiKey

<OptionTable
  options={[
    ['cohereApiKey', 'String', 'Environment variable name for the Cohere API key. If not set in .env, users will be prompted to provide it via UI.', '${COHERE_API_KEY}'],
  ]}
/>

**Note:** Get your API key from [Cohere Dashboard](https://dashboard.cohere.com/welcome/login)

### rerankerType

<OptionTable
  options={[
    ['rerankerType', 'String', 'Specifies which reranker service to use.', 'Options: "jina", "cohere"'],
  ]}
/>

## General Settings

### scraperTimeout

<OptionTable
  options={[
    ['scraperTimeout', 'Number', 'Timeout in milliseconds for scraper requests.', 'Default: 7500'],
  ]}
/>

### safeSearch

<OptionTable
  options={[
    ['safeSearch', 'Number', 'Safe search filtering level. 0 = OFF (no filtering), 1 = MODERATE (default), 2 = STRICT (maximum filtering).', 'Default: 1 (MODERATE)'],
  ]}
/>

**Note:** Safe search levels align with standard search API conventions. MODERATE filtering is enabled by default to provide reasonable content filtering while maintaining search effectiveness.

## Notes

- API keys can be configured in two ways:
  1. Set the environment variables specified in the YAML configuration
  2. If environment variables are not set, users will be prompted to provide the API keys via the UI
- The configuration supports multiple services for each component (providers, scrapers, rerankers)
- If a specific service type is not specified, the system will try all available services in that category
- Safe search provides three levels of content filtering: OFF (0), MODERATE (1), and STRICT (2)
- Never put actual API keys in the YAML configuration - only use environment variable names 

## Setting Up SearXNG

SearXNG is a privacy-focused meta search engine that you can self-host. 
For more information, see the [official SearXNG documentation](https://docs.searxng.org/). 

Here are the steps to set up your own SearXNG instance for use with LibreChat:

### Using Docker Desktop

1. **Search for the official SearXNG image**
   - Open Docker Desktop
   - Search for `searxng/searxng` in the **Images** tab
   - Click **Run** on the official image to automatically pull down and run a container of the image

2. **Running the container**
   - Expand the **Optional Settings** dropdown in the proceeding panel that appears once the download completes
   - Set your desired configuration details (port number, container name, etc.)
   - Click **Run** to start the container

3. **Configure SearXNG for LibreChat**
   - Navigate to the `Files` tab in Docker Desktop
   - Go to `/etc/searxng/settings.yaml`
   - Open the file editor
   - Navigate to the `formats` section
   - Add `json` as an acceptable format so that LibreChat can communicate with your instance
   - Save the file

4. **Restart the container**
   - Restart the container for the changes to take effect

#### Video Guide:

Here's a video to guide you through the process from start to finish in about a minute:

     <video
       muted
       controls
     >
       <source src="/videos/min_searxng_docker_setup_hq.mp4" />
     </video>

**Note:** In this example, the instance URL is http://localhost:55011 (port number found under the container name in the top left at the end of the video)

### Configuring LibreChat to use SearXNG

You can configure SearXNG in LibreChat within the UI or through `librechat.yaml`.

#### UI Configuration

1. **Open the tools dropdown in the chat input bar**
![Tools configuration button](/images/web-search/tools_highlight.png)

2. **Click on the gear icon next to Web Search**
![Tools configuration section](/images/web-search/tools_config_highlight.png)

3. **Select SearXNG from the Search Provider dropdown**
![SearXNG dropdown selection](/images/web-search/searxng_dropdown_highlight.png)

4. **Enter your configuration details (e.g. instance URL, scraper type, etc.) and click save**
![Save web search configuration](/images/web-search/web_search_save.png)

5. **Click on the Web Search option in the tools dropdown**
![Web search badge in chat interface](/images/web-search/websearch_highlight.png)

6. **The Web Search badge should now be enabled, meaning your queries can now utilize the web search functionality**
![Web search badge confirmation](/images/web-search/search_badge_confirm.png)

