unsloth/studio
Daniel Han e4d1499230
fix(studio): prevent small models from stalling on tool-calling tasks (#4769)
* fix(studio): prevent small models from stalling on tool-calling tasks

Small GGUF models (< 9B params) in "Think, Search, Code" mode would
often describe what they planned to do ("Let me create this dashboard")
and then stop generating without ever calling a tool.

Three changes:

1. Simplify web_tips for small models: remove the "fetch its full content
   by calling web_search with the url parameter" guidance for models < 9B.
   This multi-step instruction causes small models to plan elaborate
   search-then-fetch-then-code sequences they cannot reliably execute.

2. Add "always call tools directly" imperative to the system prompt nudge
   so models act immediately instead of narrating their intentions.

3. Add plan-without-action re-prompt in the agentic loop: when the model
   emits planning text (matching patterns like "let me", "I'll", etc.)
   without calling any tool, inject a nudge asking it to call the tool
   and continue the loop. Capped at 2 re-prompts per request.

Benchmarked with Qwen3.5-4B-GGUF (N=5 trials per variant):
- Baseline: 40% of requests had any tool call
- Combined fix: 100% of requests had at least one tool call

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: Daniel Han <danielhanchen@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2026-04-02 02:11:07 -07:00
..
backend fix(studio): prevent small models from stalling on tool-calling tasks (#4769) 2026-04-02 02:11:07 -07:00
frontend fix(chat): correct loading text for cached models during inference (#4764) 2026-04-01 20:24:48 -07:00
__init__.py Final cleanup 2026-03-12 18:28:04 +00:00
install_llama_prebuilt.py Fix custom llama.cpp source builds and macos metal source builds (#4762) 2026-04-01 14:06:39 -05:00
install_python_stack.py studio: unify Windows installer/setup logging style, verbosity controls, and startup messaging (#4651) 2026-03-30 00:53:23 -07:00
LICENSE.AGPL-3.0 Add AGPL-3.0 license to studio folder 2026-03-09 19:36:25 +00:00
setup.bat Final cleanup 2026-03-12 18:28:04 +00:00
setup.ps1 Resolve latest usable published llama.cpp release instead of fixed pinned tag (#4741) 2026-04-01 06:06:17 -07:00
setup.sh Fix custom llama.cpp source builds and macos metal source builds (#4762) 2026-04-01 14:06:39 -05:00
Unsloth_Studio_Colab.ipynb Allow install_python_stack to run on Colab (#4633) 2026-03-27 00:29:27 +04:00