6 commits

Author SHA1 Message Date
ErvinXie
eefc8cf98d
Update Kimi-K2-Thinking-Native.md (#1684)
2025-12-08 19:58:20 +08:00
Jiaqi Liao
f20e5d1da5
Revise prefill strategy and performance metrics (#1675)
Updated the prefill strategy descriptions and performance benchmarks in the documentation.
2025-12-06 15:36:04 +08:00
Jiaqi Liao
1d62ac21f7
Update Kimi-K2-Thinking-Native.md (#1673)
2025-12-05 23:08:02 +08:00
Jiaqi Liao
69fa7b1a57
Revise installation steps in Kimi-K2 documentation (#1672)
Updated installation instructions and added steps for cloning the repository.
2025-12-05 23:05:24 +08:00
Jiaqi Liao
721b6c4c94
[docs] Update Native Kimi-K2-Thinking documentation and kt-kernel parameters (#1671)
2025-12-05 22:46:16 +08:00
ErvinXie
71f683acec
Support Native Kimi K2 Thinking (#1663)
* [feat]: fix k2 prefill

* Update Kimi-K2-Thinking.md

* Create Kimi-K2-Thinking-Native.md

* Update Kimi-K2-Thinking.md

* Update Kimi-K2-Thinking.md

* Update Kimi-K2-Thinking-Native.md

* [perf] optimize K2 MoE weight loading with per-expert pointers

- Avoid expensive torch.stack().contiguous() in Python (was ~6.6s)
- Use per-expert pointer arrays (gate_projs) instead of contiguous memory
- C++ worker pool performs parallel memcpy for TP slicing
- Add LOAD_TIME_PROFILE for load_weights timing analysis
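
The per-expert pointer idea in the list above can be illustrated with a minimal sketch. It is not the code from this commit: the helper names `expert_pointers` and `copy_tp_slice` are hypothetical, Python threads stand in for the C++ worker pool, and the TP slice is assumed to be one contiguous chunk of each equally sized expert buffer. The point is only that the experts are never stacked into one contiguous tensor; each worker copies directly from an expert's own address.

```python
import ctypes
from array import array
from concurrent.futures import ThreadPoolExecutor


def expert_pointers(experts):
    """Collect (address, byte_length) per expert buffer; no data is copied."""
    ptrs = []
    for buf in experts:
        addr, n_items = buf.buffer_info()
        ptrs.append((addr, n_items * buf.itemsize))
    return ptrs


def copy_tp_slice(ptrs, tp_rank, tp_size, workers=4):
    """Copy chunk `tp_rank` of every expert into one destination buffer.

    Assumption: all experts have the same byte length, and that length
    is divisible by tp_size.
    """
    slice_bytes = ptrs[0][1] // tp_size
    dst = (ctypes.c_char * (slice_bytes * len(ptrs)))()
    dst_addr = ctypes.addressof(dst)

    def copy_one(i):
        src_addr, _ = ptrs[i]
        # One memmove per expert, reading straight from that expert's pointer.
        ctypes.memmove(dst_addr + i * slice_bytes,
                       src_addr + tp_rank * slice_bytes,
                       slice_bytes)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(copy_one, range(len(ptrs))))
    return dst


if __name__ == "__main__":
    # Three toy "experts" of eight floats each; they must stay alive during the copy.
    experts = [array("f", [e * 10 + j for j in range(8)]) for e in range(3)]
    dst = copy_tp_slice(expert_pointers(experts), tp_rank=1, tp_size=2)
    out = array("f")
    out.frombytes(dst.raw)
    print(out.tolist())
```

Each copy writes to a disjoint region of the destination, so the workers need no locking.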

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: ouqingliang <1692110604@qq.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-12-05 21:53:05 +08:00