Upload as Text

Upload as Text lets you upload documents and have their full content included directly in your conversation with the AI. The feature works out of the box using text parsing, and optional OCR improves extraction quality.

Overview

No OCR required - Uses text parsing with fallback methods by default
Enhanced by OCR - If OCR is configured, extraction quality improves for images and scanned documents
Full document content - Entire file content available to the model in the conversation
Works with all models - No special tool capabilities needed
Token limit control - Configurable via fileTokenLimit to manage context usage

The `context` Capability

Upload as Text is controlled by the context capability in your LibreChat configuration.

# librechat.yaml
endpoints:
  agents:
    capabilities:
      - "context"  # Enables "Upload as Text"

Default: The context capability is enabled. You only need to add it explicitly if you’ve customized the capabilities list.

How It Works

When you upload a file using “Upload as Text”:

LibreChat checks the file MIME type against fileConfig patterns
Processing method determined by precedence: OCR > STT > text parsing
If file matches fileConfig.ocr.supportedMimeTypes AND OCR is configured: Use OCR
If file matches fileConfig.stt.supportedMimeTypes AND STT is configured: Use STT
If file matches fileConfig.text.supportedMimeTypes: Use text parsing
Otherwise: Fallback to text parsing
Text is truncated to fileConfig.fileTokenLimit before prompt construction
The extracted text, up to the token limit, is included in the conversation context
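
The precedence rules above can be sketched in a few lines of Python. This is only an illustration of the decision order, not LibreChat's actual implementation: the `FILE_CONFIG` dictionary is a hypothetical, trimmed-down copy of the patterns shown on this page, and `matches` and `choose_method` are made-up helper names.

```python
import re

# Hypothetical, shortened patterns modeled on the fileConfig example in this page.
FILE_CONFIG = {
    "ocr": [r"^image/(jpeg|gif|png|webp|heic|heif)$", r"^application/pdf$"],
    "stt": [r"^audio/(mp3|mpeg|wav|ogg|flac|webm)$"],
    "text": [r"^text/(plain|markdown|csv|json|xml|html|x-python)$"],
}


def matches(mime_type, patterns):
    """Return True if the MIME type matches any of the regex patterns."""
    return any(re.search(pattern, mime_type) for pattern in patterns)


def choose_method(mime_type, ocr_configured, stt_configured):
    """Pick a processing method using the precedence OCR > STT > text parsing."""
    if ocr_configured and matches(mime_type, FILE_CONFIG["ocr"]):
        return "ocr"
    if stt_configured and matches(mime_type, FILE_CONFIG["stt"]):
        return "stt"
    # An explicit text match and the fallback both end in text parsing.
    return "text"


print(choose_method("application/pdf", ocr_configured=True, stt_configured=False))
print(choose_method("application/pdf", ocr_configured=False, stt_configured=False))
```

Under these assumptions, the same PDF is routed to OCR when an OCR service is configured and to text parsing when it is not.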

Text Processing Methods

Text Parsing (Default):

Uses a robust parsing library (same as the RAG API)
Handles PDFs, Word docs, text files, code files, and more
No external service required
Works immediately without configuration
Fallback method if no other match

OCR Enhancement (Optional):

Improves extraction from images, scanned documents, and complex PDFs
Requires OCR service configuration
Automatically used for files matching fileConfig.ocr.supportedMimeTypes when available
See OCR Configuration

STT Processing (Optional):

Converts audio files to text
Requires STT service configuration
See Speech-to-Text Configuration

Usage

Click the attachment icon in the chat input
Select “Upload as Text” from the menu
Choose your file
File content is extracted and included in your message

Note: If you don’t see “Upload as Text”, ensure the context capability is enabled in your endpoints.agents.capabilities configuration.

Configuration

Basic Configuration

The context capability is enabled by default. No additional configuration is required for basic text parsing functionality.

File Handling Configuration

Control text processing behavior with fileConfig:

fileConfig:
  # Maximum tokens from text files before truncation
  fileTokenLimit: 100000
  
  # Files matching these patterns use OCR (if configured)
  ocr:
    supportedMimeTypes:
      - "^image/(jpeg|gif|png|webp|heic|heif)$"
      - "^application/pdf$"
      - "^application/vnd\\.openxmlformats-officedocument\\.(wordprocessingml\\.document|presentationml\\.presentation|spreadsheetml\\.sheet)$"
      - "^application/vnd\\.ms-(word|powerpoint|excel)$"
      - "^application/epub\\+zip$"
  
  # Files matching these patterns use text parsing
  text:
    supportedMimeTypes:
      - "^text/(plain|markdown|csv|json|xml|html|css|javascript|typescript|x-python|x-java|x-csharp|x-php|x-ruby|x-go|x-rust|x-kotlin|x-swift|x-scala|x-perl|x-lua|x-shell|x-sql|x-yaml|x-toml)$"
  
  # Files matching these patterns use STT (if configured)
  stt:
    supportedMimeTypes:
      - "^audio/(mp3|mpeg|mpeg3|wav|wave|x-wav|ogg|vorbis|mp4|x-m4a|flac|x-flac|webm)$"

Processing Priority: OCR > STT > text parsing. Files that match no pattern also fall back to text parsing.

For more details, see File Config Object Structure.
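
Before adding or changing a pattern, it can help to check it against a real MIME type. The sketch below uses Python's `re` module as a stand-in for whatever regex engine LibreChat uses, which should behave the same for simple anchored patterns like these. The `is_supported` helper is hypothetical. Note that inside a double-quoted YAML string `\\.` is read as `\.`, so the Python raw strings use a single backslash.

```python
import re

# The Office pattern from the ocr list, as it reads after YAML unescapes "\\." to "\.".
OFFICE_PATTERN = (
    r"^application/vnd\.openxmlformats-officedocument\."
    r"(wordprocessingml\.document|presentationml\.presentation|spreadsheetml\.sheet)$"
)
EPUB_PATTERN = r"^application/epub\+zip$"


def is_supported(mime_type, patterns):
    """Return True if the MIME type matches at least one pattern."""
    return any(re.search(pattern, mime_type) is not None for pattern in patterns)


DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

for mime in (DOCX, "application/epub+zip", "application/zip"):
    print(mime, is_supported(mime, [OFFICE_PATTERN, EPUB_PATTERN]))
```

A MIME type that matches none of the ocr, stt, or text patterns is still handled by the text parsing fallback described above.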

Optional: Configure OCR for Enhanced Extraction

OCR is not required but enhances extraction quality when configured:

# librechat.yaml
ocr:
  strategy: "mistral_ocr"
  apiKey: "${OCR_API_KEY}"
  baseURL: "https://api.mistral.ai/v1"
  mistralModel: "mistral-ocr-latest"

See OCR Configuration for full details.

When to Use Each Upload Option

LibreChat offers three different ways to upload files, each suited for different use cases:

Use “Upload as Text” when:

✅ You want the AI to read the complete document content
✅ Working with smaller files that fit in context
✅ You need “chat with files” functionality
✅ Using models without tool capabilities
✅ You want direct content access without semantic search

Use “Upload for File Search” when:

✅ Working with large documents or multiple files
✅ You want to optimize token usage
✅ You need semantic search for relevant sections
✅ Building knowledge bases
✅ The file_search capability is enabled and toggled ON

Use standard “Upload Files” when:

✅ Using vision models to analyze images
✅ Using code interpreter to execute code
✅ Files don’t need text extraction
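
As a rough aid for choosing between “Upload as Text” and “Upload for File Search”, the sketch below estimates whether a document fits within the token limit. It rests on two assumptions that are not part of LibreChat: a rule of thumb of about four characters per token for English text, and the use of fileTokenLimit as the cut-off. The function names are hypothetical, and real token counts depend on the model's tokenizer.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Very rough token estimate; assumes about 4 characters per token."""
    return int(len(text) / chars_per_token)


def suggest_upload_option(text, file_token_limit=100_000):
    """Suggest an upload option based on the estimated token count."""
    if estimate_tokens(text) <= file_token_limit:
        # The whole document is likely to fit without truncation.
        return "Upload as Text"
    # The document would probably be truncated; semantic search may fit better.
    return "Upload for File Search"


short_report = "Quarterly summary. " * 100
long_manual = "Detailed reference text. " * 50_000

print(suggest_upload_option(short_report))
print(suggest_upload_option(long_manual))
```

Treat the result as a hint only: the other criteria in the lists above, such as whether the model supports tools, still apply.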

Supported File Types

Text Files (text parsing)

Plain text, Markdown, CSV, JSON, XML, HTML
Programming languages (Python, JavaScript, Java, C++, etc.)
Configuration files (YAML, TOML, INI, etc.)
Shell scripts, SQL files

Documents (text parsing or OCR)

PDF documents
Word documents (.docx, .doc)
PowerPoint presentations (.pptx, .ppt)
Excel spreadsheets (.xlsx, .xls)
EPUB books

Images (OCR if configured)

JPEG, PNG, GIF, WebP
HEIC, HEIF (Apple formats)
Screenshots, photos of documents, scanned images

Audio (STT if configured)

MP3, WAV, OGG, FLAC
M4A, WebM
Voice recordings, podcasts

File Processing Priority

LibreChat processes files based on MIME type matching with the following priority order:

OCR - If file matches ocr.supportedMimeTypes AND OCR is configured
STT - If file matches stt.supportedMimeTypes AND STT is configured
Text Parsing - If file matches text.supportedMimeTypes
Fallback - Text parsing as last resort

Processing Examples

PDF file with OCR configured:

Matches ocr.supportedMimeTypes
Uses OCR to extract text
Better quality for scanned PDFs

PDF file without OCR configured:

OCR is not configured, so the file falls back to text parsing
Uses text parsing library
Works well for digital PDFs

Python file:

Matches text.supportedMimeTypes
Uses text parsing (no OCR needed)
Direct text extraction

Audio file with STT configured:

Matches stt.supportedMimeTypes
Uses STT to transcribe

Token Limits

Files are truncated to fileTokenLimit tokens to manage context window usage:

fileConfig:
  fileTokenLimit: 100000  # Default: 100,000 tokens

Truncation happens at runtime before prompt construction
Helps prevent exceeding model context limits
Configurable based on your needs and model capabilities
Larger limits allow more content but use more tokens
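
To make the truncation step concrete, here is a minimal sketch. It assumes whitespace-separated words as a stand-in for real tokens. LibreChat's actual token counting is not described on this page and will give different counts, so `truncate_to_token_limit` is a hypothetical helper that only illustrates the idea of cutting text at a limit before it reaches the prompt.

```python
import re


def truncate_to_token_limit(text, file_token_limit):
    """Keep at most file_token_limit whitespace-delimited tokens.

    Original spacing and line breaks between the kept tokens are preserved.
    """
    end_of_last_kept = 0
    for count, match in enumerate(re.finditer(r"\S+", text), start=1):
        if count > file_token_limit:
            # Cut right after the last token that still fits.
            return text[:end_of_last_kept]
        end_of_last_kept = match.end()
    # The text has no more tokens than the limit, so it is returned unchanged.
    return text


document = "line one\nline two\nline three"
print(truncate_to_token_limit(document, 4))
print(truncate_to_token_limit(document, 100))
```

With a limit of 4 the sketch keeps the first two lines and drops the third. With a limit of 100 the text passes through unchanged.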

Comparison with Other File Features

Feature	Capability	Requires Service	Persistence	Best For
Upload as Text	`context`	No (enhanced by OCR)	Single conversation	Temporary document questions
Agent File Context	`context`	No (enhanced by OCR)	Agent system instructions	Specialized agent knowledge
File Search	`file_search`	Yes (vector DB)	Stored in vector store	Large documents, semantic search

Upload as Text vs Agent File Context

Upload as Text (context):

Available in any chat conversation
Content included in current conversation only
No OCR service required (text parsing by default)
Best for one-off document questions

Agent File Context (context):

Only available in Agent Builder
Content stored in agent’s system instructions
No OCR service required (text parsing by default)
Best for creating specialized agents with persistent knowledge
See OCR for Documents

Upload as Text vs File Search

Upload as Text (context):

Full document content in conversation context
Direct access to all text
Token usage: entire file (up to limit)
Works without RAG API configuration

File Search (file_search):

Semantic search over documents
Returns relevant chunks via tool use
Token usage: only relevant sections
Requires RAG API and vector store configuration
See RAG API

Example Use Cases

Document Analysis: Upload contracts, reports, or articles for analysis
Code Review: Upload source files for review and suggestions
Data Extraction: Extract information from structured documents
Translation: Translate document contents
Summarization: Summarize articles, papers, or reports
Research: Discuss academic papers or technical documentation
Troubleshooting: Share log files for analysis
Content Editing: Review and edit written content
Data Processing: Work with CSV or JSON data files

Troubleshooting

“Upload as Text” option not appearing

Solution: Ensure the context capability is enabled:

endpoints:
  agents:
    capabilities:
      - "context"  # Add this if missing

File content not extracted properly

Solutions:

Check if file type is supported (matches fileConfig patterns)
For images/scanned documents: Configure OCR for better extraction
For audio files: Configure STT service
Verify file is not corrupted

Content seems truncated

Solution: Increase the token limit:

fileConfig:
  fileTokenLimit: 150000  # Increase as needed

Poor extraction quality from images

Solution: Configure OCR to enhance extraction:

ocr:
  strategy: "mistral_ocr"
  apiKey: "${OCR_API_KEY}"

See OCR Configuration for details.

Related Features

File Context - Files used as Agent Context
OCR for Documents - Learn about and configure OCR services
File Configuration - Configure file handling

Upload as Text provides a simple, powerful way to work with documents in LibreChat without requiring complex configuration or external services.
