Summarization Configuration
Overview
The summarization configuration provides centralized control over conversation summarization and context pruning. This replaces the per-endpoint `summarize` and `summaryModel` fields that were previously available on custom and Azure OpenAI endpoints.
When a conversation exceeds the model's context window, the summarization system automatically compresses older messages into a concise checkpoint summary. This allows conversations to continue indefinitely without losing important context. The system also includes context pruning, which progressively degrades large tool results in older messages to reclaim token space before summarization is needed.
Example
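A minimal configuration might look like the following. The values are illustrative, and the top-level `summarization` key name is assumed from the section title:

```yaml
summarization:
  provider: "openAI"
  model: "gpt-4o-mini"
  parameters:
    temperature: 0.3
  maxSummaryTokens: 4096
  reserveRatio: 0.05
  trigger:
    type: "token_ratio"
    value: 0.8
```

Each key is described in detail below.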
provider
| Key | Type | Description | Example |
|---|---|---|---|
| provider | String | The LLM provider to use for summarization calls. If omitted, uses the agent's own provider. | provider: "openAI" |
Default: Agent's own provider
model
| Key | Type | Description | Example |
|---|---|---|---|
| model | String | The model to use for summarization calls. If omitted, uses the agent's own model. | model: "gpt-4o-mini" |
Default: Agent's own model
parameters
| Key | Type | Description | Example |
|---|---|---|---|
| parameters | Object | Additional LLM parameters for summarization requests (e.g., temperature, top_p). | parameters: { temperature: 0.3 } |
prompt
| Key | Type | Description | Example |
|---|---|---|---|
| prompt | String | Custom prompt for initial summarization. Replaces the built-in checkpoint prompt. | |
Default: A structured checkpoint prompt that produces sections for Goal, Constraints & Preferences, Progress, Key Decisions, Next Steps, and Critical Context.
updatePrompt
| Key | Type | Description | Example |
|---|---|---|---|
| updatePrompt | String | Custom prompt for re-compaction when a prior summary already exists. Used when the summary needs to be updated with new conversation content. | |
Default: A built-in prompt that merges new messages into the existing checkpoint, compresses older details, and gives recent actions more detail.
maxSummaryTokens
| Key | Type | Description | Example |
|---|---|---|---|
| maxSummaryTokens | Number | Maximum number of output tokens for the summarization model response. | maxSummaryTokens: 4096 |
reserveRatio
| Key | Type | Description | Example |
|---|---|---|---|
| reserveRatio | Number | Fraction of the token budget reserved as headroom (0–1). Prevents the context from being filled to absolute capacity. | reserveRatio: 0.05 |
Default: 0.05 (5% headroom)
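As a worked example of the headroom math (the 128,000-token window is illustrative, not from this guide):

```yaml
summarization:
  reserveRatio: 0.05
# With an illustrative 128,000-token context window:
#   reserved headroom = 128,000 * 0.05 = 6,400 tokens
#   usable budget     = 128,000 - 6,400 = 121,600 tokens
```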
trigger
| Key | Type | Description | Example |
|---|---|---|---|
| trigger | Object | Defines when summarization is activated. If omitted, summarization fires whenever message pruning drops any messages. | |
trigger Sub-keys
| Key | Type | Description | Example |
|---|---|---|---|
| type | String | The trigger strategy. Options: `"token_ratio"`, `"remaining_tokens"`, `"messages_to_refine"`. | type: "token_ratio" |
| value | Number | The threshold value for the chosen trigger type. | value: 0.8 |
Trigger Types
| Type | Value | Fires When |
|---|---|---|
| `token_ratio` | 0.0–1.0 | The fraction of context tokens used reaches or exceeds the value |
| `remaining_tokens` | Number | The number of remaining context tokens drops to or below the value |
| `messages_to_refine` | Number | The count of messages eligible for summarization reaches or exceeds the value |
| (not set) | — | Pruning drops any messages (default behavior) |
Example:
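A hedged sketch of a trigger configuration, matching the `token_ratio` example values from the tables above:

```yaml
summarization:
  trigger:
    type: "token_ratio"
    value: 0.8   # summarize once 80% of the context window is used
```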
contextPruning
| Key | Type | Description | Example |
|---|---|---|---|
| contextPruning | Object | Configures position-based tool result degradation. Large tool results in older messages are progressively trimmed or cleared to reclaim token space. | |
Context pruning is an opt-in feature that operates independently of summarization. It targets large tool call results in older messages, applying two progressive stages:
- Soft trim — Truncates tool results to keep only the head and tail portions, with an ellipsis in between
- Hard clear — Replaces the entire tool result with a short placeholder
Both stages are position-based: messages closer to the beginning of the conversation (older) are pruned first.
contextPruning Sub-keys
| Key | Type | Description | Example |
|---|---|---|---|
| enabled | Boolean | Enables position-based tool result degradation. | enabled: true |
| keepLastAssistants | Number | Number of recent assistant turns to protect from any pruning. | keepLastAssistants: 3 |
| softTrimRatio | Number | Age ratio (0–1) at which soft-trim activates. Messages older than this ratio of the conversation are candidates for soft-trimming. | softTrimRatio: 0.3 |
| hardClearRatio | Number | Age ratio (0–1) at which hard-clear activates. Messages older than this ratio are candidates for full replacement. | hardClearRatio: 0.5 |
| minPrunableToolChars | Number | Minimum character count of a tool result before pruning applies. Smaller results are left untouched. | minPrunableToolChars: 50000 |
| softTrim | Object | Configuration for the soft-trim stage. | |
| hardClear | Object | Configuration for the hard-clear stage. | |
Defaults:
| Field | Default |
|---|---|
| enabled | false |
| keepLastAssistants | 3 |
| softTrimRatio | 0.3 |
| hardClearRatio | 0.5 |
| minPrunableToolChars | 50000 |
softTrim Sub-keys
| Key | Type | Description | Example |
|---|---|---|---|
| maxChars | Number | Maximum total characters after soft-trimming a tool result. | maxChars: 4000 |
| headChars | Number | Number of characters to preserve from the beginning of the tool result. | headChars: 1500 |
| tailChars | Number | Number of characters to preserve from the end of the tool result. | tailChars: 1500 |
Defaults: maxChars: 4000, headChars: 1500, tailChars: 1500
hardClear Sub-keys
| Key | Type | Description | Example |
|---|---|---|---|
| enabled | Boolean | Whether the hard-clear stage is active. When disabled, only soft-trim is applied. | enabled: true |
| placeholder | String | Placeholder text that replaces the full tool result content when hard-cleared. | placeholder: "[Old tool result content cleared]" |
Defaults: enabled: true, placeholder: "[Old tool result content cleared]"
Example:
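A context pruning configuration combining the sub-keys above, using the documented default values as illustrative settings:

```yaml
summarization:
  contextPruning:
    enabled: true
    keepLastAssistants: 3
    softTrimRatio: 0.3
    hardClearRatio: 0.5
    minPrunableToolChars: 50000
    softTrim:
      maxChars: 4000
      headChars: 1500
      tailChars: 1500
    hardClear:
      enabled: true
      placeholder: "[Old tool result content cleared]"
```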
Complete Configuration Example
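Assuming the top-level key is `summarization` (as the section title suggests), a full configuration combining every option described above might look like:

```yaml
summarization:
  provider: "openAI"
  model: "gpt-4o-mini"
  parameters:
    temperature: 0.3
  maxSummaryTokens: 4096
  reserveRatio: 0.05
  trigger:
    type: "token_ratio"
    value: 0.8
  contextPruning:
    enabled: true
    keepLastAssistants: 3
    softTrimRatio: 0.3
    hardClearRatio: 0.5
    minPrunableToolChars: 50000
    softTrim:
      maxChars: 4000
      headChars: 1500
      tailChars: 1500
    hardClear:
      enabled: true
      placeholder: "[Old tool result content cleared]"
```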
Migration from Per-Endpoint Settings
If you previously used `summarize` and `summaryModel` on custom or Azure OpenAI endpoints:
These fields have been removed. Use the top-level `summarization` configuration instead:
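A sketch of the migration. The endpoint name and model values are illustrative, and the `summarization` key name is assumed from this guide:

```yaml
# Before (no longer supported)
endpoints:
  custom:
    - name: "my-endpoint"
      summarize: true
      summaryModel: "gpt-4o-mini"
```

```yaml
# After
summarization:
  provider: "openAI"
  model: "gpt-4o-mini"
```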
Notes
- Summarization is configured globally rather than per-endpoint
- The `summarize` and `summaryModel` fields on custom endpoints and Azure OpenAI endpoints are no longer supported
- When `provider` and `model` are omitted, the agent's own provider and model are used for summarization
- Context pruning is disabled by default and must be explicitly enabled with `contextPruning.enabled: true`
- Context pruning only affects tool call results that exceed `minPrunableToolChars` — smaller results are never pruned
- The `keepLastAssistants` setting protects recent turns from pruning regardless of the trim/clear ratios
- Custom `prompt` and `updatePrompt` values fully replace the built-in prompts — use with care
- Set `AGENT_DEBUG_LOGGING=true` in your `.env` file to enable verbose logging of token counts and context pruning diagnostics