INT8 weights get re-quantized as plain tensorwise on LoRA offload (convrot, per-channel params dropped)

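Quick summary: this issue typically appears when ComfyUI loads a LoRA on top of an INT8-quantized model, especially with dynamic VRAM management enabled (that is, without --disable-dynamic-vram) or at large resolutions. The first thing to check is whether the re-quant calls carry over the original convrot and per_channel parameters.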

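Problem scenario

A user loads Ideogram 4, or another INT8-quantized model that uses *_convrot_simple, in ComfyUI together with an ordinary LoRA. With the default dynamic VRAM management, generating a large image (for example 2048×2048) shows a clear drop in image quality. The problem does not occur with a non-INT8 model or with the LoRA removed, and small resolutions (for example 512×512) behave normally.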

Original report text

INT8 weights get re-quantized as plain tensorwise on LoRA offload (convrot, per-channel params dropped)

In the corresponding code, QuantizedTensor.from_float(weight, "TensorWiseINT8Layout", scale="recalculate", ...) is called without passing the convrot, per_channel, or convrot_groupsize parameters.

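Root cause analysis

In comfy/ops.py, two re-quant call sites (in set_weight and in resolve_cast_module_with_vbar/post_cast) construct the new QuantizedTensor with default arguments. When TensorWiseINT8Layout.quantize receives no explicit arguments, it defaults convrot to False, per_channel to False, and convrot_groupsize to 256. The rotation (convrot) and per-channel scaling (per_channel) recorded in the original quantization parameters are therefore lost: the per-channel scales degrade to a single scalar scale, and the weight falls back from the per-channel/convrot format to plain tensor-wise INT8.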
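
To make the cost of that fallback concrete, here is a minimal pure-Python sketch of symmetric INT8 round-tripping with one shared scale versus one scale per output channel. It is a toy model, not ComfyUI's quantization code: the helper names and the sample weight values are made up for illustration, and convrot is not modelled at all.

```python
# Toy illustration only: plain Python lists stand in for a weight matrix.
# This is not ComfyUI's quantization code, and convrot is not modelled.

def quantize_dequantize(row, scale):
    """Symmetric INT8 round trip: quantize with `scale`, then dequantize."""
    out = []
    for x in row:
        q = max(-127, min(127, round(x / scale)))  # clamp to the INT8 range
        out.append(q * scale)
    return out


def max_abs_error(weight, scales):
    """Largest absolute round-trip error, using one scale per row."""
    worst = 0.0
    for row, scale in zip(weight, scales):
        for x, y in zip(row, quantize_dequantize(row, scale)):
            worst = max(worst, abs(x - y))
    return worst


# Two output channels with very different magnitudes (made-up values).
weight = [
    [0.010, -0.020, 0.015, 0.005],   # small-magnitude channel
    [8.000, -6.500, 7.250, -3.000],  # large-magnitude channel
]

# Tensor-wise: a single scale derived from the global maximum.
global_scale = max(abs(x) for row in weight for x in row) / 127
tensorwise_err_small = max_abs_error(weight[:1], [global_scale])

# Per-channel: one scale per row, so the small channel keeps its resolution.
row_scales = [max(abs(x) for x in row) / 127 for row in weight]
per_channel_err_small = max_abs_error(weight[:1], row_scales[:1])

print("small-channel error, tensor-wise:", tensorwise_err_small)
print("small-channel error, per-channel:", per_channel_err_small)
```

With the shared scale, every value in the small channel rounds to zero, so that channel's information is lost entirely. With per-row scales the error stays within half a quantization step of that row.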

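Environment check

ComfyUI version: based on commit 7cb784e0 (a 2025-era build)
Custom nodes: comfy-kitchen 0.2.12 (if it is installed and you suspect it is involved, disable it first and retest)
PyTorch version: 2.11.0+cu130
CUDA version: matches cu130 (that is, CUDA 13.0)
VRAM: in testing, 512×512 was fine and 2048×2048 failed, which suggests the offload path is triggered under higher VRAM pressure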
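
For comparing your own setup against the list above, the following sketch reads installed package versions with the standard library. The distribution names passed in ("torch", "comfy-kitchen") are assumptions and may differ in your environment. Run it with the same Python interpreter that ComfyUI uses.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names are assumptions; adjust them to match your environment.
for name, version in installed_versions(["torch", "comfy-kitchen"]).items():
    print(f"{name}: {version or 'not installed'}")
```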

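Resolution steps

Confirm whether dynamic VRAM offload triggers the problem. Start ComfyUI from the command line with the --disable-dynamic-vram flag and reload the INT8 model plus LoRA. If the problem disappears (or changes into a different kind of degradation), that points to the re-quant path as the root cause.
Apply a temporary monkey-patch. Download the __init__.py attached in the issue comments (written by Claude) and use it to override, or inject into, the relevant ComfyUI module. At set_weight and post_cast, the patch inherits _params from the original QuantizedTensor (self.weight or orig), for example: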
```
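# Reuse the quantization parameters of the original weight instead of the defaults.
p = self.weight._params
QuantizedTensor.from_float(
    weight, self.layout_type,
    per_channel=(p.scale.numel() > 1),  # more than one scale value implies per-channel
    convrot=getattr(p, "convrot", False),
    convrot_groupsize=getattr(p, "convrot_groupsize", 256),
)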
```
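Try a non-quantized model (for example fp16/bf16) with the same LoRA. Check whether image quality is normal. If it is, that further confirms the INT8 re-quant problem.
Watch for the official fix PR. The fix may be merged in Comfy-Org/ComfyUI#14650 or a follow-up PR. Upgrading to the latest commit and retesting is recommended.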
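
The parameter inheritance in the monkey-patch step can be tried out in isolation. The sketch below is hypothetical: FakeScale and requant_kwargs are illustration-only names, SimpleNamespace objects stand in for the real _params, and the values 3072 and 128 are invented. It only shows how the three keyword arguments are derived, and what they fall back to when the original parameters carry no convrot information.

```python
from types import SimpleNamespace


class FakeScale:
    """Hypothetical stand-in for a scale tensor; only numel() is modelled."""

    def __init__(self, count):
        self._count = count

    def numel(self):
        return self._count


def requant_kwargs(params):
    """Derive re-quantization keyword arguments from the original params."""
    return {
        # More than one scale value is taken to mean per-channel scaling.
        "per_channel": params.scale.numel() > 1,
        # Fall back to the plain tensor-wise defaults when a field is absent.
        "convrot": getattr(params, "convrot", False),
        "convrot_groupsize": getattr(params, "convrot_groupsize", 256),
    }


# Invented example values: a per-channel convrot weight and a plain one.
convrot_params = SimpleNamespace(scale=FakeScale(3072), convrot=True,
                                 convrot_groupsize=128)
plain_params = SimpleNamespace(scale=FakeScale(1))

print(requant_kwargs(convrot_params))
print(requant_kwargs(plain_params))
```

Deriving per_channel from the number of scale values, rather than reading a stored flag, mirrors the patch snippet above and works even if the params object has no explicit per_channel attribute.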

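Verification

Using the same INT8 model plus LoRA, generate a small image (512×512) and a large image (2048×2048). If the two are of similar quality and the large image no longer shows obvious degradation (such as blur or color distortion), the problem is fixed. For a more precise check, compare convrot and per_channel in the QuantizedTensor's _params before and after the patch to see whether they stay consistent.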
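
The parameter comparison can be reduced to a small field-by-field diff. This is a sketch under assumptions: changed_fields is a hypothetical helper, the field names are taken from this write-up, and SimpleNamespace objects stand in for real _params snapshots. A field that is missing from a snapshot is read as None.

```python
from types import SimpleNamespace

# Field names follow this write-up; they are assumptions about the real object.
DEFAULT_FIELDS = ("convrot", "per_channel", "convrot_groupsize")


def changed_fields(before, after, fields=DEFAULT_FIELDS):
    """Return {field: (before_value, after_value)} for every field that differs."""
    diff = {}
    for name in fields:
        old = getattr(before, name, None)  # absent fields are read as None
        new = getattr(after, name, None)
        if old != new:
            diff[name] = (old, new)
    return diff


# Invented snapshots: the original params, then a buggy and a fixed re-quant.
original = SimpleNamespace(convrot=True, per_channel=True, convrot_groupsize=128)
after_bug = SimpleNamespace(convrot=False, per_channel=False, convrot_groupsize=256)
after_fix = SimpleNamespace(convrot=True, per_channel=True, convrot_groupsize=128)

print("unpatched:", changed_fields(original, after_bug))
print("patched:  ", changed_fields(original, after_fix))
```

An empty result means the re-quantized tensor kept the original settings. Any entry in the result names a parameter that was reset along the way.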