[Bug]: from_openai_message misses vLLM-served Qwen3 reasoning field (uses ‘reasoning’ instead of ‘reasoning_content’)

快速结论：当 LlamaIndex 通过 OpenAI 兼容接口调用 vLLM 0.20.x 服务的 Qwen3 推理模型时，模型生成的思考链（Chain-of-Thought）被静默丢弃。优先排查 from_openai_message 函数是否同时读取了 reasoning 与 reasoning_content 两个字段。

问题场景

用户使用 LlamaIndex 的 OpenAI 兼容 LLM 客户端（llama-index-llms-openai）连接 vLLM 服务器（>=0.20.x），服务模型为 Qwen3/Qwen3.5/Qwen3.6 系列（如 Qwen/Qwen3.6-35B-A3B，并启用 --reasoning-parser qwen3）。模型成功输出了 reasoning 字段，但 LlamaIndex 未能将其转换为 ThinkingBlock，导致下游的 workflow、agent、evaluator 等组件无法取得推理内容。

报错原文

# 非报错，而是数据丢失现象：模型返回的 reasoning 字段被 from_openai_message 忽略
# 实际报错/现象：ThinkingBlock 从未被附加到 assistant message 中
# 原始 API 响应中包含 message.reasoning 而非 message.reasoning_content

原因分析

在 llama-index-integrations/llms/llama-index-llms-openai/llama_index/llms/openai/utils.py 的 from_openai_message 函数中，仅检查了 reasoning_content 字段：

reasoning_content = getattr(openai_message, "reasoning_content", None)
if isinstance(reasoning_content, str) and reasoning_content:
    blocks.append(ThinkingBlock(content=reasoning_content))

这一设计符合 OpenAI/DeepSeek 的惯例，但 vLLM 0.20.x 的 OpenAI 兼容接口在服务 Qwen3 推理模型时，将 reasoning 字段暴露为 message.reasoning，而非 message.reasoning_content。因此 reasoning_content 为空时函数直接跳过，推理内容被静默丢弃。

环境排查

LlamaIndex 版本：llama-index-llms-openai latest / llama-index-core latest
vLLM 版本：>= 0.20.x（如 0.20.1+cu129）
模型：Qwen3、Qwen3.5、Qwen3.6 系列，需启用 --reasoning-parser qwen3
确认 API 原始响应中 message 对象包含的是 reasoning 还是 reasoning_content 字段

解决步骤

确认问题：查看 vLLM 返回的 API 响应结构，确认 response.choices[0].message 中实际包含 reasoning 而非 reasoning_content。
临时 workaround（推荐优先尝试）：在调用前通过 monkey-patch 为 from_openai_message 增加 fallback，或自行 subclass OpenAILike 覆盖转换逻辑。虽然脆弱，但可立即验证修复方向。
官方修复（等待合并）：社区已有多个 PR 尝试修复此问题，包括 #21076 与 #21220。核心改动是在 from_openai_message 中增加 fallback：

reasoning_content = getattr(openai_message, "reasoning_content", None)
if not reasoning_content:
    reasoning_content = getattr(openai_message, "reasoning", None)
if isinstance(reasoning_content, str) and reasoning_content:
    blocks.append(ThinkingBlock(content=reasoning_content))

注意：同一 fallback 也需应用于 stream 路径（stream_chat / astream_chat）中的 delta 对象。

验证方法

在应用修复后，调用模型并检查 AssistantMessage 中是否包含 ThinkingBlock。可以在 workflow 或 agent 中打印消息块，或直接检查 response 对象的 additional_kwargs / blocks 属性。若能看到 reasoning 内容被正确解析，则问题解决。