convert : fix RuntimeError when stripping FP8 KV-cache scales (#22818)

* convert : fix RuntimeError when stripping FP8 KV-cache scales In ModelBase._generate_nvfp4_tensors the final cleanup loop iterates self.model_tensors.keys() and calls del on the same dict, which raises RuntimeError: dictionary changed size during iteration when a ModelOpt NVFP4 model also has FP8 KV-cache scales (e.g. mmangkad/Qwen3.6-35B-A3B-NVFP4 and any modelopt config with kv_cache_quant_algo: FP8). Wrap the keys view in list() so the deletions happen on a snapshot. * re-add another accidentally removed list --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-05-19 08:00:25 +00:00 · 2026-05-08 05:55:48 +02:00 · 2026-05-08 05:55:48 +02:00 · 1d72d87349
commit 1d72d87349
parent 6a2a2513dc
1 changed files with 2 additions and 2 deletions
--- a/convert_hf_to_gguf.py
+++ b/convert_hf_to_gguf.py
@ -710,7 +710,7 @@ class ModelBase:
                self._repack_nvfp4(name, weight, scale, scale2, input_scale)

        # Flush any remaining experts (fallback if n_experts was unknown)
-        for bid, proj_type in expert_blocks.keys():
+        for bid, proj_type in list(expert_blocks.keys()):
            self._flush_nvfp4_experts((bid, proj_type), expert_blocks, expert_scales, expert_input_scales, expert_shapes, bid, proj_type)

        # Remove consumed tensors so get_tensors/modify_tensors won't see them
@ -718,7 +718,7 @@ class ModelBase:
            self.model_tensors.pop(name, None)

        # Remove any remaining unused auxiliary tensors
-        for name in self.model_tensors.keys():
+        for name in list(self.model_tensors.keys()):
            if name.endswith((".k_scale", ".v_scale")):
                del self.model_tensors[name]