# Resumable Streams (/docs/features/resumable_streams)

LibreChat features a resilient streaming architecture that ensures you never lose AI-generated content. Whether your connection drops, you switch tabs, or you pick up on another device, your responses are always preserved and synchronized.

## Why It Matters

Traditional chat applications lose all streaming content when your connection drops. With resumable streams, LibreChat:

- **Preserves every response** — Network hiccups, browser refreshes, or server restarts won't cause data loss
- **Keeps multiple tabs in sync** — Open the same conversation in two browser tabs and watch them update together in real-time
- **Enables seamless device switching** — Start a conversation on your desktop and continue on your phone
- **Lets you multitask freely** — Start a generation, browse other tabs, and come back to a complete response

## How It Works

When you send a message to an AI model, LibreChat creates a generation job that tracks all streamed content. The magic happens when something interrupts your connection:

1. **Automatic detection** — The client detects the disconnection instantly
2. **State reconstruction** — Upon reconnecting, the server rebuilds all previously streamed content
3. **Seamless sync** — Missing content is delivered via a sync event
4. **Transparent continuation** — Streaming resumes from the current position

This all happens automatically—no user action required.
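The steps above amount to a chunk log with a cursor: the server records every chunk it emits, and a reconnecting client asks for everything after the last index it saw. The following is an illustrative sketch; `GenerationJob` and `syncFrom` are invented names, not LibreChat's actual internals:

```typescript
// Hypothetical sketch of the resume protocol (not LibreChat's real API).
type Chunk = { index: number; text: string };

class GenerationJob {
  private chunks: Chunk[] = [];

  // Every streamed chunk is recorded with a monotonically increasing index.
  append(text: string): void {
    this.chunks.push({ index: this.chunks.length, text });
  }

  // On reconnect, replay everything after the client's last-seen index.
  syncFrom(lastSeenIndex: number): Chunk[] {
    return this.chunks.filter((c) => c.index > lastSeenIndex);
  }
}

const job = new GenerationJob();
job.append("Hello");
job.append(", ");
// ...client disconnects after receiving index 1...
job.append("world!");

// The client reconnects and requests everything after index 1.
const missed = job.syncFrom(1);
console.log(missed.map((c) => c.text).join("")); // "world!"
```

Because the cursor is just an index, the same mechanism serves both a briefly disconnected tab and a device that joins mid-generation (which simply syncs from `-1`).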

### Multi-Tab & Multi-Device Experience

One of the most powerful aspects of resumable streams is **real-time synchronization**:

- **Same chat, multiple windows** — Open a conversation in two browser tabs and both receive updates simultaneously
- **Cross-device continuity** — Start a long generation on your laptop, then check the result on your phone
- **Team collaboration** — In shared conversations, all viewers see content appear in real-time

## Deployment Modes

LibreChat supports two deployment configurations:

### Single-Instance Mode (Default)

Uses in-memory storage with Node.js `EventEmitter` for pub/sub. Perfect for:
- Local development
- Single-server deployments
- Docker Compose setups

**No configuration required** — Works out of the box.
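As a rough sketch of how `EventEmitter`-based pub/sub fans chunks out to every subscriber in a single process (the channel naming below is invented for the example):

```typescript
// Minimal in-memory pub/sub sketch using Node's built-in EventEmitter.
// Channel names like "stream:abc123" are illustrative, not LibreChat's.
import { EventEmitter } from "node:events";

const bus = new EventEmitter();
const received: string[] = [];

// Two "tabs" subscribe to the same stream channel.
bus.on("stream:abc123", (chunk: string) => received.push(`tab1:${chunk}`));
bus.on("stream:abc123", (chunk: string) => received.push(`tab2:${chunk}`));

// The generation job publishes each chunk once; every subscriber gets it.
bus.emit("stream:abc123", "Hello");

// received is now ["tab1:Hello", "tab2:Hello"]
```

This is why multi-tab sync works with zero configuration on a single server: every tab's connection is just another listener on the same in-process channel.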

### Redis Mode (Production)

Uses Redis Streams and Pub/Sub for cross-instance communication. Essential for:
- Horizontally scaled deployments
- Load-balanced production environments
- High-availability setups
- Kubernetes clusters

With Redis mode, a user can start a generation on one server instance and seamlessly resume on another—perfect for rolling deployments and auto-scaling.

**Note:** If you only run a single LibreChat instance, Redis for resumable streams is typically unnecessary—the in-memory mode handles everything. Redis becomes valuable when you have multiple LibreChat instances behind a load balancer. That said, Redis is still useful for other features like caching and session storage even in single-instance deployments.

## Configuration

### Enabling Redis Streams

Resumable streams use Redis automatically whenever Redis is enabled (`USE_REDIS=true`). You can also control this behavior explicitly:

```bash filename=".env"
USE_REDIS=true
REDIS_URI=redis://localhost:6379
# Resumable streams will use Redis automatically when USE_REDIS=true
# To explicitly control it:
USE_REDIS_STREAMS=true
```

### Redis Cluster Support

For Redis Cluster deployments:

```bash filename=".env"
USE_REDIS_STREAMS=true
USE_REDIS_CLUSTER=true
REDIS_URI=redis://node1:7001,redis://node2:7002,redis://node3:7003
```

LibreChat automatically uses hash-tagged keys to ensure multi-key operations stay within the same cluster slot.
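To illustrate why hash tags matter: Redis Cluster assigns each key to one of 16,384 slots via `CRC16(key) mod 16384`, but when a key contains a `{...}` section, only the substring inside the first non-empty braces is hashed. The sketch below implements that slot calculation; the key names are hypothetical, not LibreChat's actual key schema:

```typescript
// CRC-16/XMODEM, the checksum Redis Cluster uses for key slots
// (polynomial 0x1021, no reflection, zero initial value).
function crc16(data: string): number {
  let crc = 0;
  for (let i = 0; i < data.length; i++) {
    crc ^= data.charCodeAt(i) << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

function clusterSlot(key: string): number {
  // Redis hashes only the substring inside the first {...}, if non-empty.
  const open = key.indexOf("{");
  if (open !== -1) {
    const close = key.indexOf("}", open + 1);
    if (close > open + 1) {
      return crc16(key.slice(open + 1, close)) % 16384;
    }
  }
  return crc16(key) % 16384;
}

// Keys sharing a hash tag always land in the same slot, so multi-key
// operations on one stream's data never cross cluster nodes.
console.log(
  clusterSlot("{stream:abc}:chunks") === clusterSlot("{stream:abc}:meta")
); // true
```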

## Use Cases

### Unstable Networks
On spotty WiFi or cellular connections, responses automatically resume when connectivity returns. No need to re-send your prompt.

### Mobile Users
Switch from WiFi to cellular (or vice versa) without losing your response. The stream picks up exactly where it left off.

### Long-Running Generations
For complex prompts that generate lengthy responses, feel free to check other tabs or apps. Your response will be waiting when you return.

### Multi-Device Workflows
Start a conversation on your work computer, commute home, and check the result on your phone—the full response is there.

### Production Deployments
Scale horizontally across multiple server instances while maintaining stream continuity. Rolling deployments won't interrupt active generations.

## Technical Details

### Content Reconstruction

The system aggregates all streamed delta events to rebuild:
- Message content (text, tool calls, citations)
- Agent run steps and intermediate reasoning
- Metadata and state information
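Conceptually, reconstruction is a fold over the recorded delta events: text deltas concatenate, while discrete events (tool calls, citations) accumulate. The event shape below is invented for illustration and does not mirror LibreChat's internal format:

```typescript
// Illustrative delta aggregation; the DeltaEvent shape is hypothetical.
type DeltaEvent =
  | { type: "text"; delta: string }
  | { type: "tool_call"; name: string };

function reconstruct(events: DeltaEvent[]): { text: string; toolCalls: string[] } {
  const state = { text: "", toolCalls: [] as string[] };
  for (const event of events) {
    if (event.type === "text") {
      state.text += event.delta; // text deltas concatenate in order
    } else {
      state.toolCalls.push(event.name); // discrete events accumulate
    }
  }
  return state;
}

const events: DeltaEvent[] = [
  { type: "text", delta: "The answer " },
  { type: "tool_call", name: "calculator" },
  { type: "text", delta: "is 42." },
];
console.log(reconstruct(events).text); // "The answer is 42."
```

Because the aggregation is deterministic, replaying the same event log always yields the same message state, regardless of when or where a client reconnects.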

### Performance Optimizations

**Memory-first approach**: When reconnecting to the same server instance, LibreChat uses local cache for zero-latency content recovery, avoiding unnecessary Redis round trips.

**Automatic cleanup**: Stale job entries are removed during queries to prevent memory leaks. Completed streams expire automatically.

**Efficient storage**: In-memory mode uses `WeakRef` for graph storage, enabling automatic garbage collection when conversations end.

### Data Flow

| Component | Storage Mechanism |
|-----------|-------------------|
| Chunks | Redis Streams (`XADD`/`XRANGE`) |
| Job metadata | Redis Hash structures |
| Real-time events | Redis Pub/Sub channels |
| Expiration | Automatic TTL after stream completion |

## Testing Resumable Streams

You can verify that the feature is working:

1. Start a streaming conversation with any AI model
2. **Tab test**: Open the same chat in a new browser tab—both should sync
3. **Disconnect test**: Turn off your network briefly, then reconnect
4. **Navigation test**: Navigate away mid-stream, then return

In all cases, you should see the complete response with no data loss.

## Troubleshooting

### Streams not resuming?

**Check Redis connectivity:**
```bash
docker exec -it librechat-redis redis-cli ping
# Should return: PONG
```

**Verify environment variables:**
```bash
# Ensure USE_REDIS_STREAMS is set
echo $USE_REDIS_STREAMS
```

### Content appears duplicated?

This typically indicates a client version mismatch. Ensure you're running the latest version of LibreChat.

### High memory usage in single-instance mode?

Completed streams are automatically garbage collected. If you're seeing high memory usage, check for:
- Very long-running streams that haven't completed
- Streams that errored without proper cleanup

## Related Documentation

- [Redis Configuration](/docs/configuration/redis) — Setting up Redis for caching and horizontal scaling
- [Agents](/docs/features/agents) — AI agents with tool use capabilities
- [Docker Deployment](/docs/local/docker) — Container-based deployment guide

---

For implementation details, see [PR #10926](https://github.com/danny-avila/LibreChat/pull/10926).
