ErvinXie
|
eefc8cf98d
|
Update Kimi-K2-Thinking-Native.md (#1684)
|
2025-12-08 19:58:20 +08:00 |
|
Jiaqi Liao
|
f20e5d1da5
|
Revise prefill strategy and performance metrics (#1675)
Updated the prefill strategy descriptions and performance benchmarks in the documentation.
|
2025-12-06 15:36:04 +08:00 |
|
Jiaqi Liao
|
1d62ac21f7
|
Update Kimi-K2-Thinking-Native.md (#1673)
|
2025-12-05 23:08:02 +08:00 |
|
Jiaqi Liao
|
69fa7b1a57
|
Revise installation steps in Kimi-K2 documentation (#1672)
Updated installation instructions and added steps for cloning the repository.
|
2025-12-05 23:05:24 +08:00 |
|
Jiaqi Liao
|
721b6c4c94
|
[docs] Update Native Kimi-K2-Thinking documentation and kt-kernel parameters (#1671)
|
2025-12-05 22:46:16 +08:00 |
|
ErvinXie
|
71f683acec
|
Support Native Kimi K2 Thinking (#1663)
* [feat]: fix k2 prefill
* Update Kimi-K2-Thinking.md
* Create Kimi-K2-Thinking-Native.md
* Update Kimi-K2-Thinking.md
* Update Kimi-K2-Thinking.md
* Update Kimi-K2-Thinking-Native.md
* [perf] optimize K2 MoE weight loading with per-expert pointers
- Avoid expensive torch.stack().contiguous() in Python (was ~6.6s)
- Use per-expert pointer arrays (gate_projs) instead of contiguous memory
- C++ worker pool performs parallel memcpy for TP slicing
- Add LOAD_TIME_PROFILE for load_weights timing analysis
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: ouqingliang <1692110604@qq.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2025-12-05 21:53:05 +08:00 |
|