Commit graph

55 commits

Author SHA1 Message Date
Bennet Bo Fenner
3a742b5e0d
language_models: Remove unused cache_configuration API (#56884)
Release Notes:

- N/A
2026-05-15 16:27:11 +00:00
Richard Feldman
800a795545
bedrock: Add system-prompt cache anchor on caching-capable models (#56474)
The Bedrock Converse API supports placing `CachePoint` blocks inside the
`system` field, but we were sending the system prompt as a single
`SystemContentBlock::Text`, which leaves the system tokens dependent on
whatever message-level breakpoint happens to fall within the 20-block
lookback window.

This widens `bedrock::Request.system` from `Option<String>` to
`Vec<BedrockSystemContentBlock>` and has `into_bedrock` emit
`[Text(system), CachePoint(Default)]` whenever the model supports prompt
caching. The system prompt now anchors its own cache prefix, on top of
the existing tool-list anchor and per-message breakpoint, so a stable
system prompt keeps producing cache hits even when earlier conversation
turns change.

Bedrock does not support automatic caching or the 1-hour TTL, so the
default 5-minute ephemeral cache is the only option for this provider.
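
As a rough illustration of the shape described above, here is a minimal sketch of how the `system` field could be assembled. The enum definitions and the `system_blocks` helper are hypothetical stand-ins, not the crate's actual code; only the `Text` / `CachePoint(Default)` ordering comes from the description.

```rust
// Hypothetical stand-ins for the crate's types. Only the block order
// `[Text(system), CachePoint(Default)]` is taken from the commit message.
#[derive(Debug, Clone, PartialEq)]
enum CachePointType {
    Default,
}

#[derive(Debug, Clone, PartialEq)]
enum BedrockSystemContentBlock {
    Text(String),
    CachePoint(CachePointType),
}

/// Builds the `system` blocks: the prompt text, followed by a cache point
/// only when the model supports prompt caching. No prompt yields no blocks.
fn system_blocks(system: Option<&str>, supports_caching: bool) -> Vec<BedrockSystemContentBlock> {
    let Some(text) = system else {
        return Vec::new();
    };
    let mut blocks = vec![BedrockSystemContentBlock::Text(text.to_string())];
    if supports_caching {
        blocks.push(BedrockSystemContentBlock::CachePoint(CachePointType::Default));
    }
    blocks
}

fn main() {
    let blocks = system_blocks(Some("You are a helpful assistant."), true);
    println!("{blocks:?}");
}
```

Placing the cache point directly after the text block means the cached prefix ends at the system prompt, so it does not depend on where message-level breakpoints fall.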

Release Notes:

- Improved Bedrock prompt cache utilization by anchoring the system
prompt as its own cache prefix
2026-05-12 22:46:29 +00:00
Gunner Kwon
5d3f275cbf
bedrock: Add Guardrail configuration support (#50084)
## Background: Amazon Bedrock Guardrails

AWS Bedrock Guardrails enables configurable safety and compliance
controls for generative AI applications:
- Evaluates both user inputs and model responses against policies.
- Can block or filter content based on harmful categories, denied
topics, PII, or hallucination criteria.
- Guardrails are applied during inference API calls by specifying
`guardrailIdentifier` and `guardrailVersion` in the request.

Relevant AWS documentation:
- User Guide:
https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html
- How Guardrails Works:
https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-how.html
- API Reference (GuardrailConfiguration):
https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GuardrailConfiguration.html


Some AWS environments enforce IAM policies that require a guardrail to
be specified on every Bedrock API call (via a `StringEquals` condition
on `bedrock:GuardrailIdentifier`). Without this, Zed returns
`AccessDenied` and Bedrock models are completely unusable in those
environments.

This adds two optional settings, `guardrail_identifier` and
`guardrail_version`, to the Bedrock provider config. When set, a
`GuardrailStreamConfiguration` is attached to every `converse_stream`
request. When unset, behaviour is identical to before.

```json
{
  "language_models": {
    "bedrock": {
      "guardrail_identifier": "arn:aws:bedrock:us-east-1:123456789012:guardrail/abc123",
      "guardrail_version": "DRAFT"
    }
  }
}
```

`guardrail_version` defaults to `"DRAFT"` if omitted.
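
The defaulting rule can be sketched as follows. The struct and function names here are hypothetical, and treating a missing identifier as "no guardrail at all" is an assumption for illustration rather than something stated in the PR.

```rust
/// Hypothetical mirror of the two optional settings.
#[derive(Debug, Default, Clone)]
struct BedrockGuardrailSettings {
    guardrail_identifier: Option<String>,
    guardrail_version: Option<String>,
}

/// Resolved values that would be attached to a request.
#[derive(Debug, PartialEq)]
struct GuardrailConfig {
    identifier: String,
    version: String,
}

/// Returns `None` when no identifier is configured (assumed behaviour);
/// otherwise falls back to "DRAFT" when the version is omitted.
fn resolve_guardrail(settings: &BedrockGuardrailSettings) -> Option<GuardrailConfig> {
    let identifier = settings.guardrail_identifier.clone()?;
    let version = settings
        .guardrail_version
        .clone()
        .unwrap_or_else(|| "DRAFT".to_string());
    Some(GuardrailConfig { identifier, version })
}

fn main() {
    let settings = BedrockGuardrailSettings {
        guardrail_identifier: Some("abc123".to_string()),
        guardrail_version: None,
    };
    println!("{:?}", resolve_guardrail(&settings));
}
```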

Release Notes:

- agent: Added `guardrail_identifier` and `guardrail_version` settings
for AWS Bedrock, enabling use in environments where IAM policies require
a guardrail on all model requests

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2026-05-11 17:43:54 +00:00
Bennet Bo Fenner
b3c65f9410
bedrock: Always use 1M context window for anthropic models (#56195)
Closes #49617

Release Notes:

- bedrock: Always use 1M context windows for Anthropic models
2026-05-08 15:43:52 +00:00
Finn Evers
9b40411c6a
Fix bad GitHub merge queue merge (#54721)
No, sadly, the title is not a typo. See
https://www.githubstatus.com/incidents/zsg1lk7w13cf for the context.
I'll read with joy and popcorn through that root cause analysis.

It makes literally zero sense what happened here, but for some completely
bonkers reason GitHub completely messed up the merge queue with
https://github.com/zed-industries/zed/pull/54632.

I have no idea how it happened. It makes literally zero sense. A PR
going into the merge queue should have the same LoC when getting out of
it. GitHub obviously does not check this. GitHub causes extra work with
a feature that is supposed to save time.

Thanks, I guess.

Release Notes:

- N/A

---------

Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
2026-04-23 23:47:30 +00:00
Danilo Leal
0ab64d6414
branch_picker: Add button to filter remote branches (#54632)
This PR brings back the button to filter remote branches when accessing
the title bar's branch picker with the mouse. It was unintentionally
removed when we introduced the new worktree picker.

Release Notes:

- N/A
2026-04-23 18:26:44 +00:00
Eric Holk
4a5fbf6a3f
Request summarized thinking for Claude Opus 4.7 (#54217)
Starting with Claude Opus 4.7, Anthropic omits thinking content from
responses by default; callers must pass `display: "summarized"` to keep
seeing thinking summaries. Without opting in, the agent UI shows a long
pause with no visible thinking, and users get no progress indication
during extended reasoning.

This extends the adaptive-thinking wire type with an optional `display`
field and requests `Summarized` from every call site that builds an
adaptive thinking request (direct Anthropic, Copilot Chat proxy, Zed
Cloud, and Bedrock).
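
To make the wire-type change concrete, here is a small sketch of an adaptive-thinking config with an optional `display` field. The type names and the hand-written `to_json` are hypothetical, and the `"type": "adaptive"` key is an assumption about the request shape; only `display: "summarized"` is taken from the description above.

```rust
/// Hypothetical display modes; only `Summarized` is mentioned in the PR.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ThinkingDisplay {
    Summarized,
}

/// Hypothetical adaptive-thinking config with the new optional field.
#[derive(Debug, Clone, Copy, PartialEq)]
struct AdaptiveThinking {
    display: Option<ThinkingDisplay>,
}

impl AdaptiveThinking {
    /// Serializes by hand to keep the sketch dependency-free. The field is
    /// omitted entirely when `display` is `None`.
    fn to_json(&self) -> String {
        match self.display {
            Some(ThinkingDisplay::Summarized) => {
                r#"{"type":"adaptive","display":"summarized"}"#.to_string()
            }
            None => r#"{"type":"adaptive"}"#.to_string(),
        }
    }
}

fn main() {
    let thinking = AdaptiveThinking { display: Some(ThinkingDisplay::Summarized) };
    println!("{}", thinking.to_json());
}
```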

## Notes

- Applied at the adaptive-thinking layer rather than special-casing Opus
4.7. The `display` parameter is accepted by every
adaptive-thinking-capable model, and the previous behavior (visible
summaries) is what users already see on Opus 4.6 / Sonnet 4.6, so there
is no behavior change for those models.

Release Notes:

- Restored thinking summaries for Claude Opus 4.7.
2026-04-23 15:43:02 +00:00
Bennet Bo Fenner
8ff9302f04
bedrock: Fix wrong model ID for Opus 4.7 (#54554)
Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [ ] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #54547 

Release Notes:

- bedrock: Fixed an issue where Opus 4.7 would not work because of an
invalid model ID
2026-04-22 22:18:46 +00:00
Richard Feldman
87b47a4b08
Add Claude Opus 4.7 BYOK (#54077)

Add Claude Opus 4.7 (`claude-opus-4-7`) to the anthropic, bedrock, and
opencode provider crates.

Key specs:
- 1M token context window
- 128k max output tokens
- Adaptive thinking support
- AWS Bedrock cross-region inference (global, US, EU, AU)

Release Notes:

- Added Claude Opus 4.7 as an available language model
2026-04-17 10:51:32 -04:00
Shardul Vaidya
9c731640c7
bedrock: Add new Bedrock models (NVIDIA, Z.AI, Mistral, MiniMax) (#53043)
Add 9 new models across 3 new providers (NVIDIA, Z.AI) and expanded
coverage for existing providers (Mistral, MiniMax):

- NVIDIA Nemotron Super 3 120B, Nemotron Nano 3 30B
- Mistral Devstral 2 123B, Ministral 14B
- MiniMax M2.1, M2.5
- Z.AI GLM 5, GLM 4.7, GLM 4.7 Flash

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #ISSUE

Release Notes:

- bedrock: Added 9 new models across 3 new providers (NVIDIA, Z.AI) and
expanded coverage for existing providers (Mistral, MiniMax)
2026-04-07 11:59:12 +02:00
Shardul Vaidya
945f642478
bedrock: Make thinking toggle toggle thinking (#50673)
Release Notes:

- Support for Native Thinking toggle instead of model variants

---------

Co-authored-by: Ona <no-reply@ona.com>
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2026-03-19 15:32:04 +00:00
Shardul Vaidya
03416097a8
bedrock: Add Claude Sonnet 4.6 (#49439)
Release Notes:

- Added support for Anthropic Claude Sonnet 4.6

Co-authored-by: Ona <no-reply@ona.com>
2026-02-19 07:23:27 +01:00
Richard Feldman
5670e66fa7
Add support for Claude Sonnet 4.6 (#49386)

Add support for the new Claude Sonnet 4.6 model across the anthropic,
bedrock, and language_models crates. Includes base, thinking, and 1M
context variants.

Closes AI-39

Release Notes:

- Added BYOK support for Claude Sonnet 4.6
2026-02-17 18:45:12 +00:00
Shardul Vaidya
6f8023530c
bedrock: Model streamlining and cleanup (#49287)
Release Notes:

- Improved Bedrock error messages: region-locked models ask the user to
try a different region, rate limits and access errors are reported
cleanly instead of as raw API responses
- Streamlined Bedrock model list to 39 curated models
- Fixed API errors when using non-tool models in agent threads

---------

Co-authored-by: Ona <no-reply@ona.com>
2026-02-17 09:22:25 +00:00
gitarth
13a9386a29
language_models: Add image support for Bedrock (#47673)
Closes #N/A (no existing issue - implemented to enable image input for
Bedrock models)

This PR enables the "@" image mention feature for Bedrock models that
support vision capabilities.

**Changes:**
- Added `supports_images()` method to Bedrock `Model` enum
- Wired up image support in the Bedrock language model provider
- Added `MessageContent::Image` handling to convert base64 images to
Bedrock's expected format
- Added tool result image support

**Supported models:** Claude 3/3.5/4 family, Amazon Nova Pro/Lite, Meta
Llama 3.2 Vision, Mistral Pixtral

Release Notes:

- Added image input support for Amazon Bedrock models with vision
capabilities
2026-02-13 12:41:14 +01:00
Shardul Vaidya
5026280131
bedrock: Enable 1M context window (#48542)
Release Notes:

- Added `allow_extended_context` to the Bedrock settings which enables
1M context windows on models that support it

---------

Co-authored-by: Ona <no-reply@ona.com>
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2026-02-12 14:32:24 +00:00
Shardul Vaidya
1137b3c0f7
bedrock: Add Claude Opus 4.6 (#48525)
Release Notes:

- Added Claude Opus 4.6 and 4.6 Thinking with Cross region inference for
US, EU, and Global endpoints.

---------

Co-authored-by: Ona <no-reply@ona.com>
2026-02-09 09:19:51 +00:00
Shardul Vaidya
edf21a38c1
bedrock: Add Bedrock API key authentication support (#41393)
2025-12-17 12:54:57 +01:00
Shardul Vaidya
0f0017dc8e
bedrock: Support global endpoints and new regional endpoints (#44103)
Closes #43598

Release Notes:

- bedrock: Added opt-in `allow_global` which enables global endpoints
- bedrock: Updated cross-region-inference endpoint and model list
- bedrock: Fixed Opus 4.5 access on Bedrock, now only accessible through the `allow_global` setting
2025-12-04 12:14:31 +01:00
Mikayla Maki
bd2c1027fa
Add support for Opus 4.5 (#43425)
Adds support for Opus 4.5
- [x] BYOK
- [x] Amazon Bedrock

Release Notes:

- Added support for Opus 4.5

Co-authored-by: Richard Feldman <oss@rtfeldman.com>
2025-11-24 20:01:43 +00:00
Shardul Vaidya
207a202477
bedrock: Add support for Claude Haiku 4.5 model (#41045)
Release Notes:

- bedrock: Added support for Claude Haiku 4.5

---------

Co-authored-by: Ona <no-reply@ona.com>
2025-10-29 16:41:43 +01:00
Julia Ryan
ef5b8c6fed
Remove workspace-hack (#40216)
We've been considering removing workspace-hack for a couple reasons:
- Lukas ran into a situation where its build script seemed to be causing
spurious rebuilds. This seems more likely to be a cargo bug than an
issue with workspace-hack itself (given that it has an empty build
script), but we don't necessarily want to take the time to hunt that
down right now.
- Marshall mentioned hakari interacts poorly with automated crate
updates (in our case provided by Renovate) because you'd need to run
`cargo hakari generate && cargo hakari manage-deps` after their changes
and we prefer to not have actions that make commits.

Currently removing workspace-hack causes our workspace to grow from
~1700 to ~2000 crates being built (depending on platform), which is
mainly a problem when you're building the whole workspace or running
tests across the normal and remote binaries (which is where
feature-unification nets us the most sharing). It doesn't impact
incremental times noticeably when you're just iterating on `-p zed`, and
we'll hopefully get these savings back in the future when
rust-lang/cargo#14774 (which re-implements the functionality of hakari)
is finished.

Release Notes:

- N/A
2025-10-17 18:58:14 +00:00
Richard Feldman
3ae65153db
Default to Sonnet 4.5 in BYOK (#39132)

Release Notes:

- Claude Sonnet 4.5 and 4.5 Thinking are now the recommended Anthropic
models
2025-09-29 18:56:03 +00:00
Richard Feldman
4fc4707cfc
Add Sonnet 4.5 support (#39127)
Release Notes:

- Added support for Claude Sonnet 4.5 for Bring-Your-Own-Key (BYOK)
2025-09-29 14:21:58 -04:00
Shardul Vaidya
a70cf3f1d4
bedrock: Inference Config updates (#35808)
Fixes #36866

- Updated internal naming for Claude 4 models to be consistent.
- Corrected max output tokens for Anthropic Bedrock models to match docs

Shoutout to @tlehn for noticing the bug, and finding the resolution.

Release Notes:

- bedrock: Fixed inference config errors causing Opus 4 Thinking and
Opus 4.1 Thinking to fail (thanks [@tlehn](https://github.com/tlehn) and
[@5herlocked](https://github.com/5herlocked))
- bedrock: Fixed an issue which prevented Rules / System prompts from
functioning with Bedrock models (thanks
[@tlehn](https://github.com/tlehn) and
[@5herlocked](https://github.com/5herlocked))
2025-08-29 18:13:06 -04:00
Piotr Osiewicz
05fc0c432c
Fix a bunch of other low-hanging style lints (#36498)
- **Fix a bunch of low hanging style lints like unnecessary-return**
- **Fix single worktree violation**
- **And the rest**

Release Notes:

- N/A
2025-08-19 21:26:17 +02:00
Richard Feldman
0b5592d788
Add Claude Opus 4.1 (#35653)

Release Notes:

- Added support for Claude Opus 4.1

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-08-05 18:16:47 +00:00
Shardul Vaidya
0d809c21ba
bedrock: Fix bedrock not streaming (#28281)
Closes #26030 

Release Notes:

- Fixed Bedrock bug causing streaming responses to return as one big
chunk

---------

Co-authored-by: Peter Tripp <peter@zed.dev>
2025-07-01 12:51:09 +03:00
Vladimir Kuznichenkov
0905255fd1
bedrock: Add prompt caching support (#33194)
Closes https://github.com/zed-industries/zed/issues/33221

Bedrock has a caching API similar to Anthropic's: to cache messages up
to a certain point, we add a special cache point block to that message.

Additionally, we can cache the tool definitions by adding a cache point
block after the tools spec.

See: [Bedrock User Guide: Prompt
Caching](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html#prompt-caching-models)

Release Notes:

- bedrock: Added prompt caching support

---------

Co-authored-by: Oleksiy Syvokon <oleksiy@zed.dev>
2025-06-25 17:15:13 +03:00
Willem
6b4c607331
bedrock: Support Claude 3.7 in APAC (#33068)
In ap-northeast-1 we have access to 3.7 and 4.0

Release Notes:

- N/A

---------

Co-authored-by: Peter Tripp <peter@zed.dev>
2025-06-22 20:08:50 +00:00
Peter Tripp
595f61f0d6
bedrock: Use Claude 3.0 Haiku where Haiku 3.5 is not available (#33214)
Closes: https://github.com/zed-industries/zed/issues/33183

@kuzaxak Can you confirm this works for you?

Release Notes:

- bedrock: Use Anthropic Haiku 3.0 in AWS regions where Haiku 3.5 is
unavailable
2025-06-22 15:15:20 -04:00
Vladimir Kuznichenkov
1047d8adec
bedrock: Add Sonnet 4 to cross-region model list (eu/apac) (#33192)
Closes #31946

Sonnet 4 is [now
available](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html)
via Bedrock in EU aws regions.

Release Notes:

- bedrock: Add cross-region usage of Sonnet 4 in EU/APAC AWS regions
2025-06-22 15:15:05 -04:00
Richard Feldman
5405c2c2d3
Standardize on u64 for token counts (#32869)
Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens:
usize, max_output_tokens: Option<u32>` in the same `struct`.

Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`,
token counts should be consistent across targets (e.g. the same model
doesn't suddenly get a smaller context window if you're compiling for
wasm32), and these token counts could end up getting serialized using a
binary protocol, so `usize` is not the right choice for token counts.

I chose to standardize on `u64` over `u32` because we don't store many
of them (so the extra size should be insignificant) and future models
may exceed `u32::MAX` tokens.
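
A minimal sketch of what the standardized fields could look like. `ModelLimits` and `remaining_tokens` are hypothetical names for illustration; the point is that `u64` has the same width on every target and can hold counts beyond `u32::MAX`.

```rust
/// Hypothetical limits struct using `u64` for every token count.
#[derive(Debug, Clone, Copy)]
struct ModelLimits {
    max_tokens: u64,
    max_output_tokens: Option<u64>,
}

/// Tokens left in the context window, clamped at zero.
fn remaining_tokens(limits: &ModelLimits, used: u64) -> u64 {
    limits.max_tokens.saturating_sub(used)
}

fn main() {
    // One more than `u32::MAX`: representable as `u64` on every target,
    // whereas `usize` would be too narrow for it on wasm32.
    let limits = ModelLimits {
        max_tokens: u64::from(u32::MAX) + 1,
        max_output_tokens: Some(128_000),
    };
    println!("{:?}, remaining: {}", limits, remaining_tokens(&limits, 1_000));
}
```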

Release Notes:

- N/A
2025-06-17 10:43:07 -04:00
Burak Varlı
16853acbb1
Enable cross-region inference for Claude 4 family models on Amazon Bedrock provider (#32235)
These models require cross-region inference, and it currently fails if
you try to use them:
```
Invocation of model ID anthropic.claude-sonnet-4-20250514-v1:0 with on-demand throughput isn’t supported. 
```
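
For context, cross-region inference profiles on Bedrock are addressed by prefixing the model ID with a geography code such as `us.` or `eu.`. The sketch below is a simplified illustration under that assumption; the function names and the region-to-prefix mapping are hypothetical and not the mapping used in this PR.

```rust
/// Simplified, hypothetical mapping from an AWS region to an
/// inference-profile prefix. Real availability varies per model.
fn region_group(region: &str) -> Option<&'static str> {
    if region.starts_with("us-gov-") {
        Some("us-gov")
    } else if region.starts_with("us-") {
        Some("us")
    } else if region.starts_with("eu-") {
        Some("eu")
    } else if region.starts_with("ap-") {
        Some("apac")
    } else {
        None
    }
}

/// Prefixes the model ID with the region group, or returns the plain
/// model ID when no group is known for the region.
fn cross_region_model_id(region: &str, model_id: &str) -> String {
    match region_group(region) {
        Some(group) => format!("{group}.{model_id}"),
        None => model_id.to_string(),
    }
}

fn main() {
    let id = cross_region_model_id("us-east-1", "anthropic.claude-sonnet-4-20250514-v1:0");
    println!("{id}");
}
```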

Release Notes:

- Enable cross-region inference for Claude 4 family models on Amazon
Bedrock provider

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
2025-06-09 23:38:39 -07:00
Umesh Yadav
071e684be4
bedrock: Fix ci failure due model enum and model name mismatch (#32049)
Release Notes:

- N/A
2025-06-04 10:41:12 +03:00
Shardul Vaidya
2280594408
bedrock: Allow users to pick Thinking vs. Non-Thinking models (#31600)
Release Notes:

- bedrock: Added ability to pick between Thinking and Non-Thinking models
2025-06-04 09:00:41 +03:00
Shardul Vaidya
09a1d51e9a
bedrock: Fix Claude 4 output token bug (#31599)
Release Notes:

- Fixed an issue preventing the use of Claude 4 Thinking models with Bedrock
2025-06-04 08:57:31 +03:00
Shardul Vaidya
e13b494c9e
bedrock: Fix cross-region inference (#30659)
Closes #30535

Release Notes:

- AWS Bedrock: Add support for Meta Llama 4 Scout and Maverick models.
- AWS Bedrock: Fixed cross-region inference for all regions.
- AWS Bedrock: Updated all models available through Cross Region
inference.

---------

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-06-03 15:46:35 +00:00
Ben Brandt
119beb210a
Update default models to newer versions (#31415)
Follow up to: https://github.com/zed-industries/zed/pull/31209
Changes default models across multiple providers:
- Zed.dev Default Models in settings: claude-3-7-sonnet-latest →
claude-4-sonnet-latest
- Bedrock Default Model: Claude 3.5 Sonnet v2 → Claude Sonnet 4
- Google AI Default Fast Model: Gemini 1.5 Flash → Gemini 2.0 Flash

Release Notes:

- N/A
2025-05-27 10:54:42 +02:00
Shardul Vaidya
e3b6fa2c30
bedrock: Support Claude 4 models (#31214)
Release Notes:

- AWS Bedrock: Added support for Claude 4.
2025-05-22 21:59:23 +00:00
Kirill Bulatov
16366cf9f2
Use anyhow more idiomatically (#31052)
https://github.com/zed-industries/zed/issues/30972 brought up another
case where our context is not enough to track the actual source of the
issue: we get a general top-level error without inner error.

The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD
SHA"))?; ` on the top level.

The PR finally reworks the way we use anyhow to reduce such issues (or
at least make it simpler to bubble them up later in a fix).
On top of that, uses a few more anyhow methods for better readability.

* `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error
conversion/option reporting cases are replaced with `context` and
`with_context` calls
* in addition to that, various `anyhow!("failed to do ...")` messages are
replaced with `.context("Doing ...")` messages to remove the
parasitic `failed to` text
* `anyhow::ensure!` is used instead of `if ... { return Err(...); }`
calls
* `anyhow::bail!` is used instead of `return Err(anyhow!(...));`

Release Notes:

- N/A
2025-05-20 23:06:07 +00:00
Marshall Bowers
a1d8e50ec1
bedrock: Fix Claude 3.5 Haiku support (#30560)
This PR corrects a mistake introduced in
https://github.com/zed-industries/zed/pull/28523.

https://github.com/zed-industries/zed/pull/28523#issuecomment-2872369707

Release Notes:

- N/A
2025-05-12 12:45:35 +00:00
Shardul Vaidya
8d79226445
bedrock: Add support for Mistral - Pixtral Large (#28274)
Release Notes:

- AWS Bedrock: Added support for Pixtral Large 25.02 v1

---------

Co-authored-by: Peter Tripp <peter@zed.dev>
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-05-12 09:13:37 +00:00
Shardul Vaidya
d867897746
bedrock: Support cross-region inference for US Claude 3.5 Haiku (#28523)
Release Notes:

- Added Cross-Region inference support for US Claude 3.5 Haiku

Co-authored-by: Peter Tripp <peter@zed.dev>
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-05-12 08:41:45 +00:00
Shardul Vaidya
1f58ce80f2
bedrock: Support Amazon Nova Premier (#29720)
Release Notes:

- Bedrock: Added support for Amazon Nova Premier.


https://aws.amazon.com/blogs/aws/amazon-nova-premier-our-most-capable-model-for-complex-tasks-and-teacher-for-model-distillation/

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-05-12 08:15:18 +00:00
Shardul Vaidya
559725d8f5
bedrock: Support Writer Palmyra models (#29719)
Release Notes:

- Added support for Writer Palmyra X4, and X5


https://writer.com/engineering/long-context-palmyra-x5/

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-05-12 07:54:09 +00:00
Antonio Scandurra
9f6809a28d
Reuse conversation cache when streaming edits (#30245)
Release Notes:

- Improved latency when the agent applies edits.
2025-05-08 14:36:34 +02:00
Shardul Vaidya
fa40353fc5
bedrock: Preserve thinking blocks for Bedrock (#29602)
Fixes a regression from #29055, resolves #29290

Release Notes:

- agent: Fixed a regression that rendered Claude 3.7 Thinking unusable
on Bedrock.
2025-04-29 12:18:32 -04:00
Michael Sloan
fbf7caf93e
Default to fast model for thread summaries and titles + don't include system prompt / context / thinking segments (#29102)
* Adds a fast / cheaper model to providers and defaults thread
summarization to this model. Initial motivation for this was that
https://github.com/zed-industries/zed/pull/29099 would cause these
requests to fail when used with a thinking model. It doesn't seem
correct to use a thinking model for summarization.

* Skips system prompt, context, and thinking segments.

* If tool use is happening, allows 2 tool uses + one more agent response
before summarizing.

Downside of this is that there was potential for some prefix cache reuse
before, especially for title summarization (thread summarization omitted
tool results and so would not share a prefix for those). This seems fine
as these requests should typically be fairly small. Even for full thread
summarization, skipping all tool use / context should greatly reduce the
token use.

Release Notes:

- N/A
2025-04-19 23:26:29 +00:00
Shardul Vaidya
525755c28e
bedrock: Add support for tool use, cross-region inference, and Claude 3.7 Thinking (#28137)
Closes #27223
Merges: #27996, #26734, #27949 

Release Notes:

- AWS Bedrock: Added advanced authentication strategies with:
  - Short lived credentials with Session Tokens 
  - AWS Named Profile
  - EC2 Identity, Pod Identity, Web Identity
- AWS Bedrock: Added Claude 3.7 Thinking support.
- AWS Bedrock: Adding Cross Region Inference for all combinations of
regions and model availability.
- Agent Beta: Added support for AWS Bedrock.

---------

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-04-05 11:16:26 -04:00