issue: reasoning_content is stripped from assistant tool call messages, breaking multi-turn tool calling with reasoning models (Kimi K2.5, e

快速结论：该报错发生在 Open WebUI 使用推理模型（如 Kimi K2.5、DeepSeek、MiMo）进行多轮工具调用（multi-turn tool calling）时。优先排查 reasoning_content 是否在 assistant 消息的 tool_calls 中被剥离，并考虑应用 Filter 插件或升级到 dev 分支修复版本。

问题场景

用户在 Open WebUI v0.8.12（Docker 部署，Debian 13）中配置了推理模型（如 kimi-k2.5 通过 Moonshot API 或 OpencodeGO 提供商），启用了本地函数调用（function_calling: native）和至少一个工具（如网络搜索、代码解释器）。在发起需要多轮工具调用的对话时（例如 “Search for the current weather in New York and Tokyo, then calculate the temperature difference”），模型第一次推理 + 工具调用成功，但第二次推理时上游 API 返回 400 错误。

报错原文

HTTP/1.1 400 Bad Request
{
  "is_bifrost_error": false,
  "status_code": 400,
  "error": {
    "message": "thinking is enabled but reasoning_content is missing in assistant tool call message at index 2",
    "type": "invalid_request_error"
  }
}

原因分析

根本原因是 misc.py 中的 convert_output_to_messages() 函数从未在 assistant 消息字典中设置 reasoning_content 字段。推理文本仅被折叠进 content 字段作为标记文本（如 ...），但在重建对话历史时，没有保留 reasoning_content 字段。

具体来说，middleware.convert_output_to_messages 接受一个 reasoning_format 参数，该参数控制前序推理内容是否存活在重建的 assistant 消息中。get_reasoning_format(model) 仅当 model['provider'] == 'llama.cpp' 时才返回 'reasoning_content'，其他情况返回 None，导致推理内容在下一轮请求前被丢弃。

此问题影响多轮调用中的两种场景：回合内（同一回合中多次工具调用循环）和跨回合（后续用户回合中需要保持上下文）。Moonshot 官方文档明确要求：“在多步工具调用中，必须在上下文中保留当前回合助手消息中的 reasoning_content，否则将抛出错误”。

环境排查

Open WebUI 版本：当前为 v0.8.12（Issue 已关闭，可能存在 dev 分支修复，或需要后续稳定版回传）
部署方式：Docker（Debian 13）
模型与提供商：推理模型（如 Kimi K2.5/Moonshot、DeepSeek、MiMo 等），使用 OpenAI 兼容 API
函数调用设置：启用 function_calling: native，至少配置一个工具（如 web_search）
上游 API 要求：必须保留 reasoning_content 字段（参考 Moonshot 文档）、MiMo 文档（passing-back-reasoning_content）也有同等要求

解决步骤

确认问题范围：如果是 dev 分支（已标记 Likely addressed in dev），可暂不处理；若使用稳定版，参考以下方法。
（可优先尝试）应用 Userspace Filter 插件（Function）：
- 在 Open WebUI 管理面板 → 管理 → Functions 中创建新 Function 插件
- 插件核心逻辑：monkey-patch get_reasoning_format，使对所有非 ollama 模型返回 'reasoning_content' 而不是 None。注意：patch 只针对 provider != 'llama.cpp'，ollama 模型（使用 'think_tags'）保留原行为
- 使用 excluded_model_ids 值（valve）列表指定不应用此 patch 的模型 ID（如 Gemma 4 禁止在历史中保留推理内容）
- 注意：禁用该 Filter 不会撤销 patch，需要重启容器
升级到 dev 分支修复版本：
- Issue 修复 PR #23742 已合入 dev 分支，但后续因与部分提供商不兼容被回退（提交记录显示已被 revert）
- 当前 dev 分支认为该修复应由外部处理
- 步骤 2 的 Filter 是目前可用的外部处理方案
查看具体修复代码（仅参考）：
- 原 PR #23742 修改了 convert_output_to_messages()，添加 pending_reasoning 累加器，在发出带有 tool_calls 的 assistant 消息时保留 reasoning_content
- 建议不要直接修改核心文件，而是使用 Filter 机制

验证方法

重新执行多轮工具调用任务，例如发起提词：“Search for the current weather in New York and Tokyo, then calculate the temperature difference.” 观察是否：

第一轮推理 → 工具调用 → 工具结果成功返回
第二轮推理正常发生，不再出现 400 错误
如果使用 Filter，可进一步验证跨回合记忆：在后续回合中要求模型重复之前推理出的数值或完整推理文本，确认模型能读取到 reasoning_content 字段中的内容

参考来源

open-webui/open-webui #23175

想把多个 AI 模型放在一个入口？

GamsGo AI 集成 ChatGPT、DeepSeek、Gemini、Claude、Midjourney、Veo 等常用模型，适合写作、绘图、视频和日常 AI 工作流。

了解 GamsGo AI

推广链接：通过此链接购买，我可能获得佣金，不影响你的价格。

issue: reasoning_content is stripped from assistant tool call messages, breaking multi-turn tool calling with reasoning models (Kimi K2.5, e