Mirror of https://github.com/kvcache-ai/ktransformers.git (synced 2025-09-09 05:54:06 +00:00)
Merge pull request #916 from kvcache-ai/patch_v0.2.3post2
📝 fix typo ktransformer->ktransformers
Commit e788248364

5 changed files with 7 additions and 7 deletions
@@ -163,9 +163,9 @@ If you are interested in our design principles and the implementation of the inj
 
 <h2 id="ack">Acknowledgment and Contributors</h2>
 
-The development of KTransformer is based on the flexible and versatile framework provided by Transformers. We also benefit from advanced kernels such as GGUF/GGML, Llamafile, Marlin, sglang and flashinfer. We are planning to contribute back to the community by upstreaming our modifications.
+The development of KTransformers is based on the flexible and versatile framework provided by Transformers. We also benefit from advanced kernels such as GGUF/GGML, Llamafile, Marlin, sglang and flashinfer. We are planning to contribute back to the community by upstreaming our modifications.
 
-KTransformer is actively maintained and developed by contributors from the <a href="https://madsys.cs.tsinghua.edu.cn/">MADSys group</a> at Tsinghua University and members from <a href="http://approaching.ai/">Approaching.AI</a>. We welcome new contributors to join us in making KTransformer faster and easier to use.
+KTransformers is actively maintained and developed by contributors from the <a href="https://madsys.cs.tsinghua.edu.cn/">MADSys group</a> at Tsinghua University and members from <a href="http://approaching.ai/">Approaching.AI</a>. We welcome new contributors to join us in making KTransformers faster and easier to use.
 
 
 <h2 id="ack">Discussion</h2>
@@ -152,9 +152,9 @@ Each rule in the YAML file has two parts: `match` and `replace`. The `match`
 
 <h2 id="ack">Acknowledgment and Contributors</h2>
 
-The development of KTransformer is based on the flexible and versatile framework provided by Transformers. We also benefit from advanced kernels such as GGUF/GGML, Llamafile, Marlin, sglang and flashinfer. We plan to give back to the community by contributing our modifications upstream.
+The development of KTransformers is based on the flexible and versatile framework provided by Transformers. We also benefit from advanced kernels such as GGUF/GGML, Llamafile, Marlin, sglang and flashinfer. We plan to give back to the community by contributing our modifications upstream.
 
-KTransformer is actively maintained and developed by members of the <a href="https://madsys.cs.tsinghua.edu.cn/">MADSys group</a> at Tsinghua University and members of <a href="http://approaching.ai/">Approaching.AI</a>. We welcome new contributors to join us in making KTransformer faster and easier to use.
+KTransformers is actively maintained and developed by members of the <a href="https://madsys.cs.tsinghua.edu.cn/">MADSys group</a> at Tsinghua University and members of <a href="http://approaching.ai/">Approaching.AI</a>. We welcome new contributors to join us in making KTransformers faster and easier to use.
 
 
 <h2 id="ack">Discussion</h2>
@@ -1,4 +1,4 @@
-# Ktransformer
+# Ktransformers
 
 [Introduction](./README.md)
 # Install
@@ -9,7 +9,7 @@ There is a Docker image available for our project, you can pull the docker image
 ```
 docker pull approachingai/ktransformers:0.2.1
 ```
-**Notice**: In this image, ktransformers is compiled for CPUs with AVX512 instructions. If your CPU does not support AVX512, it is suggested to recompile and install ktransformer in the /workspace/ktransformers directory within the container.
+**Notice**: In this image, ktransformers is compiled for CPUs with AVX512 instructions. If your CPU does not support AVX512, it is suggested to recompile and install ktransformers in the /workspace/ktransformers directory within the container.
 
 ## Building docker image locally
 - Download the Dockerfile [here](../../Dockerfile)
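The notice in the hunk above recommends recompiling inside the container when the host CPU lacks AVX512, but the diff does not show the commands. A minimal sketch follows; it assumes the image ships the source tree at /workspace/ktransformers and that a plain pip build is sufficient, so check the project's install docs for the exact steps.

```bash
# Sketch only: the build command and paths are assumptions, not taken from this PR.

# On the host, check whether the CPU advertises AVX512 (no output means unsupported).
lscpu | grep -i avx512

# Open a shell in the published image.
docker run -it approachingai/ktransformers:0.2.1 /bin/bash

# Inside the container, rebuild ktransformers from the bundled sources so the
# extensions are compiled for this machine's instruction set.
cd /workspace/ktransformers
pip install . --no-build-isolation
```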
@@ -118,7 +118,7 @@ From: https://github.com/kvcache-ai/ktransformers/issues/374
 
 1. First, download the latest source code using git.
 2. Then, modify the DeepSeek-V3-Chat-multi-gpu-4.yaml in the source code and all related yaml files, replacing all instances of KLinearMarlin with KLinearTorch.
-3. Next, you need to compile from the ktransformer source code until it successfully compiles on your local machine.
+3. Next, you need to compile from the ktransformers source code until it successfully compiles on your local machine.
 4. Then, install flash-attn. It won't be used, but not installing it will cause an error.
 5. Then, modify local_chat.py, replacing all instances of flash_attention_2 with eager.
 6. Then, run local_chat.py. Be sure to follow the official tutorial's commands and adjust according to your local machine's parameters.
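Steps 2 and 5 of the FAQ entry above amount to text substitutions, so they can be scripted. The sketch below assumes the rule file lives under ktransformers/optimize/optimize_rules/ and that local_chat.py sits inside the ktransformers package directory; verify both paths in your checkout before running it.

```bash
# Sketch only: file locations are assumed and should be checked against the repository layout.
git clone https://github.com/kvcache-ai/ktransformers.git
cd ktransformers

# Step 2: swap the Marlin linear kernel for the plain Torch implementation in the
# multi-GPU rule file (repeat for any other related YAML files you load).
sed -i 's/KLinearMarlin/KLinearTorch/g' \
    ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu-4.yaml

# Step 5: fall back from flash_attention_2 to the eager attention implementation.
sed -i 's/flash_attention_2/eager/g' ktransformers/local_chat.py
```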