---
title: Why Tool Descriptions Aren't Enough
description: "I thought better tool descriptions would solve everything. They didn't. Here's what finally made MCP sampling click for me."
image: /img/blog/tool-descriptions-banner.png
authors:
  - ebony
---

![blog banner](/img/blog/tool-descriptions-banner.png)

The first question I had when I heard about MCP sampling was:

“Can't I just write better tool descriptions and tell the tool it's an expert?”

Because honestly, that's what I was already doing.

If a tool wasn't behaving how I expected, I'd tweak the wording. Add more detail. Clarify intent. Be more explicit. And sure, that helped a little.

But something still felt off.

The tools still weren't really thinking. They were fetching data, returning text, and leaving all the heavy reasoning to my LLM. That's when I realized the issue wasn't my descriptions. It was how the system actually worked under the hood.

That's where MCP sampling came in. Not as a magic feature, but as a different way of structuring how tools and the LLM actually collaborate.

## What actually changed my understanding

Once I realized the issue wasn't my tool descriptions but how the system itself was structured, I needed a clearer way to understand the difference.

This is the distinction that helped it click for me:

- Tool descriptions influence how a tool is used
- Sampling changes how a tool participates in reasoning

That might still sound a little abstract, so I mapped it out visually below.

without sampling

Without sampling, the tool mostly acts like a messenger. It fetches data, returns content, and all the real reasoning happens at the top level in the LLM.

with sampling

With sampling, the behavior changes. The tool gathers its data, then uses the same LLM you already configured in Goose to ask a targeted question from its own context before returning anything. Instead of just passing information upward, it's now contributing to the thinking.

It's the same model and the same agent, but the behavior changes completely.
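The structural difference can be sketched in a few lines of Python. To be clear, this is a toy model, not real MCP SDK code: `fetch_data` and `ask_llm` are hypothetical stand-ins for a tool's data source and for the sampling request a server sends back to the host's LLM.

```python
# Toy model of the two flows. Neither function is MCP SDK code;
# fetch_data and ask_llm are hypothetical stand-ins.

def fetch_data() -> str:
    # Pretend this hits an API or database.
    return "raw server metrics: cpu=91%, mem=78%, errors=14/min"

def ask_llm(prompt: str) -> str:
    # Stand-in for a sampling request routed back to the host's LLM.
    return f"LLM summary of: {prompt!r}"

def tool_without_sampling() -> str:
    # Messenger: fetch and return. All reasoning is left
    # to the top-level LLM that called the tool.
    return fetch_data()

def tool_with_sampling() -> str:
    # Participant: fetch, then ask a targeted question from the
    # tool's own context before returning anything.
    data = fetch_data()
    return ask_llm(f"Is anything in these metrics alarming? {data}")

print(tool_without_sampling())  # raw data, unanalyzed
print(tool_with_sampling())     # a reasoned-over result
```

Same data, same model; the only thing that moved is where the first round of reasoning happens.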

## Where Council of Mine fits in

Seeing the flow change helped me understand sampling conceptually. Council of Mine helped me understand it viscerally.

It's not MCP sampling itself. It's an example of what becomes possible once sampling exists.

Instead of making a single request to the LLM, Council of Mine uses sampling repeatedly and intentionally. Each perspective is its own conversation with the same LLM, framed by a different point of view. Those responses are then compared, debated, and synthesized into a final answer.

The server handles the orchestration. The LLM does the reasoning. Sampling is what allows that back-and-forth to happen at all.

What made this click for me was watching one question turn into multiple independent perspectives, then seeing how those perspectives shaped the final output. It took sampling from an abstract idea to something concrete.
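In the same toy style, that orchestration might look roughly like this. Again, `ask_llm` is a hypothetical stand-in for a sampling call, and the perspective names are made up for illustration, not Council of Mine's actual personas.

```python
# Toy sketch of council-style orchestration: several sampling calls,
# each framed by a different perspective, then one synthesis call.
# ask_llm is a hypothetical stand-in for an MCP sampling request.

def ask_llm(prompt: str) -> str:
    # A real implementation would route this back to the host's LLM.
    return f"<answer to {prompt!r}>"

PERSPECTIVES = [
    "a cautious security engineer",
    "a pragmatic product manager",
    "a performance-obsessed systems programmer",
]

def council(question: str) -> str:
    # One sampling call per perspective: same LLM, different framing.
    opinions = [
        ask_llm(f"As {who}, answer: {question}") for who in PERSPECTIVES
    ]
    # A final sampling call compares and synthesizes the opinions.
    joined = "\n".join(opinions)
    return ask_llm(f"Synthesize these views into one answer:\n{joined}")

print(council("Should we cache this endpoint?"))
```

The server never reasons on its own; it just decides how many times to ask, and with what framing.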

## What I landed on

Good tool descriptions still matter. This isn't a replacement for them.

But on their own, they won't get you to truly agentic behavior. Descriptions shape behavior at the surface. Sampling changes how the reasoning itself is structured.

That distinction was the missing piece for me. And once I could actually see the flow, everything else started to make more sense.

If this helped make things click, I'd recommend trying the Council of Mine extension for yourself. It's one of the clearest ways to see MCP sampling in action.
