convert : fix RuntimeError when stripping FP8 KV-cache scales (#22818)

* convert : fix RuntimeError when stripping FP8 KV-cache scales

In ModelBase._generate_nvfp4_tensors, the final cleanup loop iterates
self.model_tensors.keys() and calls del on the same dict. This raises
"RuntimeError: dictionary changed size during iteration" when a ModelOpt
NVFP4 model also has FP8 KV-cache scales (e.g. mmangkad/Qwen3.6-35B-A3B-NVFP4,
or any ModelOpt config with kv_cache_quant_algo: FP8).

Wrap the keys view in list() so the loop iterates over a snapshot of the
keys while the deletions modify the dict itself.

* re-add another accidentally removed list

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Michał Piszczek 2026-05-08 05:55:48 +02:00 committed by GitHub
parent 6a2a2513dc
commit 1d72d87349

@@ -710,7 +710,7 @@ class ModelBase:
             self._repack_nvfp4(name, weight, scale, scale2, input_scale)
         # Flush any remaining experts (fallback if n_experts was unknown)
-        for bid, proj_type in expert_blocks.keys():
+        for bid, proj_type in list(expert_blocks.keys()):
             self._flush_nvfp4_experts((bid, proj_type), expert_blocks, expert_scales, expert_input_scales, expert_shapes, bid, proj_type)
         # Remove consumed tensors so get_tensors/modify_tensors won't see them
@@ -718,7 +718,7 @@ class ModelBase:
             self.model_tensors.pop(name, None)
         # Remove any remaining unused auxiliary tensors
-        for name in self.model_tensors.keys():
+        for name in list(self.model_tensors.keys()):
             if name.endswith((".k_scale", ".v_scale")):
                 del self.model_tensors[name]
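
The failure mode described in the commit message can be reproduced in isolation. The sketch below is not the converter's code: the helper names (`strip_kv_scales_buggy`, `strip_kv_scales_fixed`, `make_tensors`) and the tensor names are hypothetical stand-ins for `self.model_tensors`, used only to show why iterating a live keys view while deleting fails, and why iterating a `list()` snapshot does not.

```python
# Hypothetical stand-ins for self.model_tensors; names and values are illustrative only.

def make_tensors() -> dict:
    return {
        "model.layers.0.self_attn.k_proj.weight": 0,
        "model.layers.0.self_attn.k_proj.k_scale": 1,
        "model.layers.0.self_attn.v_proj.v_scale": 2,
    }


def strip_kv_scales_buggy(tensors: dict) -> None:
    # Iterates the live keys view, so the first del invalidates the iterator.
    for name in tensors.keys():
        if name.endswith((".k_scale", ".v_scale")):
            del tensors[name]


def strip_kv_scales_fixed(tensors: dict) -> None:
    # list() copies the keys up front; deletions no longer affect the iteration.
    for name in list(tensors.keys()):
        if name.endswith((".k_scale", ".v_scale")):
            del tensors[name]


if __name__ == "__main__":
    try:
        strip_kv_scales_buggy(make_tensors())
    except RuntimeError as exc:
        print(f"buggy version raised: {exc}")

    tensors = make_tensors()
    strip_kv_scales_fixed(tensors)
    print(f"fixed version kept: {list(tensors)}")
```

The same reasoning would apply to the `expert_blocks` loop if the flush helper removes entries from `expert_blocks` as it goes, which is presumably why the second `list()` was restored.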