mirror of https://github.com/kvcache-ai/ktransformers.git
synced 2026-04-30 04:39:51 +00:00
align sft branch with main: revert worker_pool, strip sft_timer, fix inference defaults
- Revert worker_pool.cpp/.h to main (remove RDTSC timer, Chrome Trace, sft_timer namespace, ITT API, extended do_work_stealing_job API)
- Strip all sft_timer instrumentation from sft-only files (sft_moe.hpp, moe-sft-tp.hpp, avx_kernels.hpp)
- Restore pin_memory=True in KExpertsCPUBuffer (inference path)
- Restore fused tensor transpose logic in convert_cpu_weights.py (main layout)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 168e10f254
commit a789729923
7 changed files with 159 additions and 766 deletions
@@ -90,7 +90,7 @@ class KExpertsCPUBuffer:
         hidden_size = hidden_states.shape[-1]
         batch_size = hidden_states.shape[0]
 
-        pin_memory = False
+        pin_memory = True
 
         if batch_size in cls.capture_buffers:
             return cls.capture_buffers[batch_size]