Commit graph

  • 63b3f193f6 Merge c53f63accb into 9f34ef46e6 Aliez Ren 2026-04-27 22:55:44 +08:00
  • 9f34ef46e6 [fix](Qwen3 series): fix gibberish output by correcting RoPE write-back (#31) (#1959) main Benjamin F 2026-04-27 22:04:29 +08:00
  • 1cb73c7100 [fix](Qwen3 series): fix gibberish output by correcting RoPE write-back (#31) bump-sglang-pr31 yyj 2026-04-27 21:59:38 +08:00
  • 6fea6b7d99 deploy: 0656e01ac1 gh-pages JimmyPeilinLi 2026-04-26 16:46:01 +00:00
  • 0656e01ac1 [docs]: refresh KT install commands (#1958) Peilin Li 2026-04-27 00:45:43 +08:00
  • d93ea7e21e [docs]: refresh KT install commands docs-v061-refresh JimmyPeilinLi 2026-04-26 16:29:20 +00:00
  • a7a575d41e [perf](kt-kernel): MXFP4 MoE add mat-mat 4×4 tile, refine mat-vec reduce (#1957) fp4-moe-amx Benjamin F 2026-04-26 17:34:08 +08:00
  • 7a0d8695c6 [perf](kt-kernel): MXFP4 MoE add mat-mat 4×4 tile, refine mat-vec reduce yyj 2026-04-26 15:10:42 +08:00
  • 07e274467a [build]: flatten ktransformers package shim (#1955) Peilin Li 2026-04-25 22:08:52 +08:00
  • d143cf3209 [build]: flatten ktransformers package shim flatten-ktransformers-shim JimmyPeilinLi 2026-04-25 14:01:20 +00:00
  • bfbd0e9352 [chore]: archive kt-sft package (#1954) Peilin Li 2026-04-25 21:49:21 +08:00
  • bdf01a24b2 [chore]: archive kt-sft package archive-kt-sft JimmyPeilinLi 2026-04-25 13:42:50 +00:00
  • 85f1ab530b [ci]: use hosted runner for sglang-kt release Peilin Li 2026-04-25 21:05:18 +08:00
  • 248a1b5408 [ci]: use hosted runner for sglang-kt release JimmyPeilinLi 2026-04-25 13:04:09 +00:00
  • bc7afff13b [chore]: sync sglang-kt packaging fix Peilin Li 2026-04-25 21:02:25 +08:00
  • 54b1d24303 [chore]: sync sglang-kt packaging fix JimmyPeilinLi 2026-04-25 12:59:12 +00:00
  • 8484ef8b16 [feat](kt-kernel): adapt MXFP4 MoE backend for DeepSeek-V4-Flash (#1950) Benjamin F 2026-04-25 18:11:53 +08:00
  • 1af8055cb2 [feat](kt-kernel): adapt MXFP4 MoE backend for DeepSeek-V4-Flash yyj 2026-04-25 17:17:30 +08:00
  • c53f63accb Merge branch 'main' into upstream-pr/rawint4-moe Aliez Ren 2026-04-25 00:49:58 +09:00
  • 4cc0f93307 Merge dfe1cf34fb into eeaeb7bfd7 Valentin Lobstein 2026-04-24 23:45:26 +08:00
  • eeaeb7bfd7 [build]: align kt-kernel torch support with v0.6.1 release (#1948) Peilin Li 2026-04-24 23:45:15 +08:00
  • 0e60e94b10 [build]: align kt-kernel torch support with v0.6.1 release kt-kernel-torch-range-v061 JimmyPeilinLi 2026-04-24 15:42:52 +00:00
  • 85308615b9 [build] prepare v0.6.1 SFT wheel packaging on main (#1945) v0.6.1 Peilin Li 2026-04-24 12:08:38 +08:00
  • c7bf1be712 [build]: finalize py311+ wheel packaging defaults sft-whl JimmyPeilinLi 2026-04-24 04:01:51 +00:00
  • dfe1cf34fb Fix: Address code review - RestrictedUnpickler for C++ types, fix HMAC consistency, fix secret sharing Chocapikk 2026-04-23 23:15:27 +02:00
  • 96f86bf74f Fix: Replace pickle deserialization with HMAC auth + safetensors + JSON (CVE-2026-26210) Chocapikk 2026-04-23 23:09:38 +02:00
  • 161547cbe5 [build]: prepare 0.6.1 SFT wheel packaging on main JimmyPeilinLi 2026-04-23 09:22:57 +00:00
  • 1a409f69a4 [feat](kt-kernel): add AVX2/AVX-VNNI RAWINT4 MoE backend Aliez Ren 2026-04-23 15:57:51 +09:00
  • 747cb1cc01 [fix](kt-kernel): keep upstream ext_bindings changes Aliez Ren 2026-04-23 14:57:51 +09:00
  • f4bfee3820 [feat](kt-kernel): add AVX2/AVX-VNNI RAWINT4 MoE backend Aliez Ren 2026-04-23 14:49:05 +09:00
  • 4cd8cded34 [docs]: align install guides with explicit package flow sft JimmyPeilinLi 2026-04-22 09:44:01 +00:00
  • da53870bcb [chore]: remove integration patch package from top-level release JimmyPeilinLi 2026-04-22 09:33:35 +00:00
  • c41553a595 Revert "clean up dev artifacts: remove SFT design docs, debug examples, bench scripts" JimmyPeilinLi 2026-04-22 09:21:54 +00:00
  • ddfe92a07d [fix]: restore build wiring and track integration package JimmyPeilinLi 2026-04-22 09:21:49 +00:00
  • c9264e155c [build]: release v0.6.1 JimmyPeilinLi 2026-04-22 06:41:18 +00:00
  • 9544a8960d feat(sft): AMX MoE SFT backend with LoRA support (#1936) mrhaoxx 2026-04-22 11:27:01 +08:00
  • 948c75e76a remove dev version stamps from ext_bindings, sft_moe, moe-sft-tp mrhaoxx 2026-04-21 22:56:36 +08:00
  • a9bcee509c clean up dev artifacts: remove SFT design docs, debug examples, bench scripts mrhaoxx 2026-04-21 22:53:44 +08:00
  • 250e4fe52e merge: integrate origin/main into sft branch mrhaoxx 2026-04-21 22:40:07 +08:00
  • c4e88fb5af revert CMakeLists.txt to main: remove debug flags and cpptrace dep mrhaoxx 2026-04-21 20:56:02 +08:00
  • a789729923 align sft branch with main: revert worker_pool, strip sft_timer, fix inference defaults mrhaoxx 2026-04-21 17:39:56 +08:00
  • 6e45d02ebe deploy: 22e9915ec9 yyj6666667 2026-04-21 07:52:11 +00:00
  • 22e9915ec9 docs: add GOSIM 2026 announcement and update roadmap link to Q2 (#1937) ErvinXie 2026-04-21 15:51:08 +08:00
  • 5c5d7d48c0 [feat](kt-kernel): add MXFP4 MoE operator with E2M1 weights × BF16 activations ouqingliang 2026-04-21 02:53:04 +00:00
  • 168e10f254 [fix](sft): align Python API with C++ backend after v5 refactor JimmyPeilinLi 2026-04-20 16:44:09 +00:00
  • 00f9f8c0ef docs: add GOSIM 2026 announcement and update roadmap link to Q2 docs/gosim-2026-and-roadmap-link xwy 2026-04-20 22:25:02 +08:00
  • dd1da65d90 feat(sft): add Qwen3.5 MoE support + fused checkpoint loading mrhaoxx 2026-04-20 17:19:15 +08:00
  • 58d7eabb9b feat(sft): support transformers v5 fused expert format mrhaoxx 2026-04-20 13:21:29 +08:00
  • c3fa14f9c5 deploy: e327db58be ovowei 2026-04-18 13:30:33 +00:00
  • e327db58be Update README.md (#1935) Jianwei Dong 2026-04-18 21:30:13 +08:00
  • 17d9e49dd0 Update README.md ovowei-patch-2 Jianwei Dong 2026-04-18 21:28:46 +08:00
  • 92874ce177 deploy: a9f28d495b ovowei 2026-04-18 13:10:47 +00:00
  • a9f28d495b Update README.md (#1934) Jianwei Dong 2026-04-18 21:10:25 +08:00
  • b284e58f41 Update README.md ovowei-patch-1 Jianwei Dong 2026-04-18 21:10:07 +08:00
  • 06ee9f62f3 [doc]: add prerequisite note for GLM-5.1 tutorial (#1932) Benjamin F 2026-04-14 15:07:08 +08:00
  • d320e04765 [doc]: add prerequisite note for GLM-5.1 tutorial yyj 2026-04-14 15:01:25 +08:00
  • a9411f1d72 Supports vnni-256 for GPTQ INT4 (#1926) callmegaga 2026-04-13 17:59:59 +08:00
  • f42e94a527 [fix](cli): handle edge cases with empty NUMA nodes (#1929) Andy18650 2026-04-13 16:45:41 +08:00
  • 612b34b446 [fix](cli): handle edge cases with empty NUMA nodes Andy18650 2026-04-12 12:36:20 +08:00
  • b6a7692597 add flake Aliez Ren 2026-04-10 22:46:42 +09:00
  • 228094a94e add flake Aliez Ren 2026-04-10 22:46:26 +09:00
  • cb1dcdd157 add flake Aliez Ren 2026-04-10 22:22:26 +09:00
  • ad0d363df5 add flake Aliez Ren 2026-04-10 22:21:52 +09:00
  • 6d4632b8c7 fix: add missing gpu_experts_mask=None to KTMoEWrapper call in SFT wrapper mrhaoxx 2026-04-10 02:18:40 +08:00
  • 44d9df9e62 [refactor](kt-kernel): Optimize the issues raised in the review callmegaga 2026-04-09 22:07:57 +08:00
  • 8d9b6ba239 Merge branch 'kvcache-ai:main' into main callmegaga 2026-04-09 21:10:58 +08:00
  • 5bfcb5f784 refactor(sft): share_backward_bb default True, share_cache_pool auto-derived mrhaoxx 2026-04-09 20:10:38 +08:00
  • 279c920a69 Revert "kt-kernel: enable CPUInfer stream bridge for ROCm (#1918)" (#1925) ErvinXie 2026-04-09 18:43:03 +08:00
  • 3d184a248e Revert "kt-kernel: enable CPUInfer stream bridge for ROCm (#1918)" revert-1918-fix/rocm-cpuinfer-stream-bridge ErvinXie 2026-04-09 18:42:46 +08:00
  • 2877b3a353 Merge branch 'kvcache-ai:main' into main callmegaga 2026-04-09 15:17:19 +08:00
  • 020eb929f7 refactor(sft): unify KTConfig field names with kt_ prefix, add share_cache_pool, remove dead code mrhaoxx 2026-04-09 14:17:50 +08:00
  • 1dd0a78899 kt-kernel: enable CPUInfer stream bridge for ROCm (#1918) guanjiawei 2026-04-09 12:20:04 +08:00
  • 9b2d3b687b fix: remove broken symlink in archive/ktransformers/ (#1906) acture 2026-04-09 11:42:19 +08:00
  • ad19a3e653 Chore/kt layerwise prefill main (#1920) Oql 2026-04-09 11:28:37 +08:00
  • 7fd1b9dfe8 [chore]: use sglang main with KT layerwise prefill logs chore/kt-layerwise-prefill-main ouqingliang 2026-04-09 03:27:06 +00:00
  • dd59c7ebde [chore]: sync sglang submodule with KT layerwise prefill log rename ouqingliang 2026-04-09 03:25:44 +00:00
  • 38e95e3581 [chore]: update sglang submodule for KT layerwise prefill logs chore/kt-layerwise-prefill-label ouqingliang 2026-04-09 03:19:00 +00:00
  • a98d544833 merge: integrate origin/main into sft branch mrhaoxx 2026-04-08 23:19:28 +08:00
  • f36699affd feat(sft): AMX MoE SFT backend with LoRA support mrhaoxx 2026-04-08 23:11:00 +08:00
  • 07fd9328fa refactor(sft): move SFT logic into kt_kernel.sft submodule sft-rel mrhaoxx 2026-04-08 23:07:41 +08:00
  • 5011b25694 kt-kernel: enable CPUInfer stream bridge for ROCm guanjiawei 2026-04-08 19:20:04 +08:00
  • 891c5c0a13 Support glm5.1 (#1916) Jianwei Dong 2026-04-07 11:29:32 +08:00
  • 4f43de3169 fix support-glm51-djw ovowei 2026-04-07 11:26:26 +08:00
  • 0357113610 support glm5.1 ovowei 2026-04-07 11:18:34 +08:00
  • 4606bf19fd [feat](kt-kernel): support avx-vnni-256 for gptq int4 callmegaga 2026-04-05 14:27:49 +08:00
  • 8a427c9321 [feat]: add AVX512F+BW fallback for FP8 and BF16 under AMX backend (#1908) Jim James 2026-04-03 00:46:22 -04:00
  • db9326302b chore: bump version to 0.5.3 (#1909) v0.5.3 Jianwei Dong 2026-04-01 18:58:48 +08:00
  • b21c848922 chore: bump version to 0.5.3 release/v0.5.3 ovowei 2026-04-01 18:57:11 +08:00
  • 5a5a7e93da [feat]: add AVX512F+BW fallback for FP8 and BF16 under AMX backend Jim James 2026-03-31 20:15:10 -04:00
  • 665e2ce825 fix: remove broken symlink in archive/ktransformers/ Acture 2026-03-31 19:11:40 +08:00
  • 9e6484a538 [fix]: fix --numa-nodes handling (#1904) Oql 2026-03-31 17:50:22 +08:00
  • 5cf573307e [chore]: unify kt-run numa handling chore/kt-run-numa-nodes-unified ouqingliang 2026-03-31 09:47:57 +00:00
  • bfc4c3cb04 [feat]: unify --numa-nodes handling and include submodule pointer ouqingliang 2026-03-31 09:32:22 +00:00
  • 76332c3e68 Delete doc/en/kt-kernel/GPT-OSS-120B-Kernel-Work.md Benjamin F 2026-03-31 16:23:48 +08:00
  • cdc867c864 [chore]: sync sglang submodule (#1903) Oql 2026-03-31 11:21:09 +08:00
  • 8be5bd365e [chore]: sync sglang submodule mtp ouqingliang 2026-03-31 02:43:58 +00:00
  • 24cd4fc055 feat(kt-kernel): Add utility script to merge loose layer weights to safetensors (#1886) Doctor Shotgun 2026-03-30 19:41:07 -07:00
  • 9c18b60556 feat: CPU weight conversion for GLM-5 and MiniMax-M2.5 (#1853) alin899992 2026-03-31 10:39:48 +08:00
  • 3903c9afcc (kt-kernel): add numa_nodes parameter for explicit NUMA node mapping (#1891) ErvinXie 2026-03-31 10:27:50 +08:00
  • bdf4bb76c5 Fix worker pool idle CPU usage (#1902) Doctor Shotgun 2026-03-30 05:29:17 -07:00