[doc]: add prerequisite note for GLM-5.1 tutorial (#1932)

Benjamin F 2026-04-14 15:07:08 +08:00 committed by GitHub
parent a9411f1d72
commit 06ee9f62f3


@@ -6,16 +6,17 @@ GLM-5.1 introduces thinking mode (enabled by default), interleaved and preserved
 ## Table of Contents
-- [Table of Contents](#table-of-contents)
-- [Prerequisites](#prerequisites)
-- [Step 1: Download Model Weights](#step-1-download-model-weights)
-- [Step 2: Launch SGLang Server](#step-2-launch-sglang-server)
-- [Step 3: Send Inference Requests](#step-3-send-inference-requests)
-  - [Option A: Interactive Chat with KT CLI](#option-a-interactive-chat-with-kt-cli)
-  - [Option B: OpenAI-Compatible API](#option-b-openai-compatible-api)
-- [Thinking Mode](#thinking-mode)
-- [Recommended Parameters](#recommended-parameters)
-- [Additional Resources](#additional-resources)
+- [Running GLM-5.1 with SGLang and KT-Kernel](#running-glm-51-with-sglang-and-kt-kernel)
+  - [Table of Contents](#table-of-contents)
+  - [Prerequisites](#prerequisites)
+  - [Step 1: Download Model Weights](#step-1-download-model-weights)
+  - [Step 2: Launch SGLang Server](#step-2-launch-sglang-server)
+  - [Step 3: Send Inference Requests](#step-3-send-inference-requests)
+    - [Option A: Interactive Chat with KT CLI](#option-a-interactive-chat-with-kt-cli)
+    - [Option B: OpenAI-Compatible API](#option-b-openai-compatible-api)
+  - [Thinking Mode](#thinking-mode)
+  - [Recommended Parameters](#recommended-parameters)
+  - [Additional Resources](#additional-resources)
 ## Prerequisites
@@ -41,8 +42,14 @@ Before starting, ensure you have:
    cd kt-kernel && ./install.sh
    ```
-3. **CUDA toolkit** - CUDA 12.0+ recommended (12.8+ for best FP8 support)
-4. **Hugging Face CLI** - For downloading models:
+3. **Transformers 5.3.0** - GLM-5 and GLM-5.1 require exactly `transformers==5.3.0` (the default pip install gives 4.x, which will not work):
+   ```bash
+   pip install transformers==5.3.0
+   ```
+   > **Note:** `transformers==5.3.0` is **not** compatible with some older models (e.g., DeepSeek). If you need to run those models, switch back to a 4.x release. Consider using a separate virtual environment for GLM-5/5.1 to avoid conflicts.
+4. **CUDA toolkit** - CUDA 12.0+ recommended (12.8+ for best FP8 support)
+5. **Hugging Face CLI** - For downloading models:
    ```bash
    pip install -U huggingface-hub
    ```
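
The exact pin and the separate-environment advice in the note above can be enforced with a small startup guard. The following is a minimal sketch, not part of the tutorial itself: the function names and the idea of checking at launch are illustrative assumptions, with only the `5.3.0` requirement taken from the prerequisites.

```python
from importlib.metadata import PackageNotFoundError, version

# Exact pin from the prerequisites: 4.x (the default pip install) will not
# work with GLM-5/5.1, and 5.3.0 in turn breaks some older models.
REQUIRED = "5.3.0"

def is_exact_pin(installed: str, required: str = REQUIRED) -> bool:
    """Return True only for an exact version match; 4.x or any other 5.x fails."""
    return installed == required

def transformers_ok() -> bool:
    """Check the active environment; a missing package counts as a failure."""
    try:
        return is_exact_pin(version("transformers"))
    except PackageNotFoundError:
        return False
```

Running such a guard before launching the SGLang server in the GLM-5/5.1 environment catches a wrong `transformers` release early; in the 4.x environment kept for older models it would simply report a mismatch rather than fail mid-inference.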