fix local_chat and update balance-serve and SUMMARY doc

dongjw 2025-04-03 12:19:43 +08:00
parent 8acb270c90
commit ab0b0f4ea1
3 changed files with 6 additions and 1 deletion


@@ -22,6 +22,7 @@ Our vision for KTransformers is to serve as a flexible platform for experimentin
<h2 id="Updates">🔥 Updates</h2>
* **Apr 2, 2025**: Support Multi-concurrency. ([Tutorial](./en/balance-serve.md)).
* **Mar 27, 2025**: Support Multi-concurrency.
* **Mar 15, 2025**: Support ROCm on AMD GPU ([Tutorial](./en/ROCm.md)).
* **Mar 5, 2025**: Support unsloth 1.58/2.51 bits weights and [IQ1_S/FP8 hybrid](./en/fp8_kernel.md) weights. Support 139K [Longer Context](./en/DeepseekR1_V3_tutorial.md#v022-longer-context) for DeepSeek-V3 and R1 in 24GB VRAM.


@@ -11,6 +11,7 @@
- [Multi-GPU Tutorial](en/multi-gpu-tutorial.md)
- [Use FP8 GPU Kernel](en/fp8_kernel.md)
- [Use AMD GPU](en/ROCm.md)
- [Use Multi-concurrency](en/balance-serve.md)
# Server
- [Server](en/api/server/server.md)
- [Website](en/api/server/website.md)


@@ -74,6 +74,9 @@ strings ~/anaconda3/envs/ktransformers/lib/libstdc++.so.6 | grep GLIBCXX
```bash
sudo apt install libtbb-dev libssl-dev libcurl4-openssl-dev libaio1 libaio-dev libfmt-dev libgflags-dev zlib1g-dev patchelf
pip3 install packaging ninja cpufeature numpy openai
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
```
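After the dependencies are in place, a quick sanity check can confirm that the cu126 PyTorch wheel actually sees the GPU. This is a minimal sketch, not part of the official install steps, and it assumes an NVIDIA driver compatible with CUDA 12.6 is already installed:

```bash
# Hypothetical sanity check: print the torch version and whether CUDA is usable.
python3 -c "import torch; print(torch.__version__); print('CUDA available:', torch.cuda.is_available())"
```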
### 3. Build ktransformers
@@ -87,7 +90,7 @@ git submodule update --init --recursive
# Install single NUMA dependencies
USE_BALANCE_SERVE=1 bash ./install.sh
# Or install dual NUMA dependencies
# For machines with two CPU sockets and ~1 TB of RAM (dual NUMA):
USE_BALANCE_SERVE=1 USE_NUMA=1 bash ./install.sh
```
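Which of the two commands applies depends on the machine's NUMA topology. As a hedged sketch (not part of the install script itself, and assuming `util-linux` and `numactl` are available), you can inspect the number of NUMA nodes before deciding whether to add `USE_NUMA=1`:

```bash
# Show the number of NUMA nodes; 2 or more suggests the dual NUMA install.
lscpu | grep -i "numa node(s)"
# Optional: show per-node CPU and memory layout.
numactl --hardware
```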