[Bug]: stream_chunk_size on bedrock/invoke models leaks into the request body and Bedrock rejects it with ValidationException

快速结论：该报错发生在 LiteLLM 代理（Proxy）配置了 `bedrock/invoke/…` 模型且客户端请求中携带 `stream_chunk_size` 参数时。优先排查是否在客户端请求中显式传入了 `stream_chunk_size`，或者 LiteLLM 配置中是否意外设置了该参数。

问题场景

用户使用 LiteLLM Proxy 路由到 Bedrock 的 invoke 模型（如 Claude Sonnet 4），在流式推理请求中设置 `stream_chunk_size` 参数（{"stream":true,"stream_chunk_size":2048}），导致 Bedrock 返回 ValidationException，请求被拒绝。

报错原文

litellm.BadRequestError: BedrockException - {"message":"stream_chunk_size: Extra inputs are not permitted"}

# 直接通过 boto3 invoke_model_with_response_stream 验证时：
ValidationException: An error occurred (ValidationException) when calling the InvokeModelWithResponseStream operation: stream_chunk_size: Extra inputs are not permitted

原因分析

stream_chunk_size 是 LiteLLM 内部用于控制客户端侧 HTTP 响应流重新分块的参数，不应当被发送到 Bedrock 的 invoke 请求体中。但在 AmazonInvokeConfig.transform_request 方法中（位于 litellm/llms/bedrock/chat/invoke_transformations/base_invoke_transformation.py），代码仅从 optional_params 中弹出（pop）了 stream 参数，然后将其余所有参数直接序列化到发送给 Bedrock 的 JSON 请求体中。因此 stream_chunk_size （以及其他 LiteLLM 内部参数）被泄漏到了 Bedrock 的 InvokeModel 请求体，Bedrock 不识别该字段并拒绝请求。

注意：即使没有引发 400 错误，stream_chunk_size 在 invoke 路由上也并未生效——invoke 的流式封装器（get_sync_custom_stream_wrapper / get_async_custom_stream_wrapper）在调用 make_call / make_sync_call 时未传递 stream_chunk_size，因此只有默认值被使用。

环境排查

确认 LiteLLM 版本（Issue 中为 main 分支）。
确认模型配置为 bedrock/invoke/... 路由（而非 converse 路由）。
确认客户端请求（如 curl）中是否显式传入了 stream_chunk_size 参数。

解决步骤

修复已在 PR #30240 中实现：在 invoke 分发器（dispatcher）和 Claude messages 格式的请求构建器中删除 stream_chunk_size 参数，使其不会进入 Bedrock 请求体。请更新到包含此修复的 LiteLLM 版本。
临时回避方法：确保客户端请求中不包含 stream_chunk_size 参数。如果必须使用该参数，请考虑将模型配置为 converse 路由（bedrock/... 而非 bedrock/invoke/...），因为 converse 路由已构建了自定义字段集合，不会直接透传所有参数。
注意更广泛的寄生参数泄露问题：Issue 指出，任何使用 splat 方式透传推理参数的 invoke 提供商分支（包括 Anthropic、Cohere、Mistral、Titan、Meta/Llama、AI21）都可能受影响。后续审计 Issue #30301 已被创建，用于系统性地检查此类问题。作为临时措施，可关注 litellm_core_utils/core_helpers.py 中的 filter_internal_params 机制（目前仅用于 MCP handler 键），并考虑在 invoke 变换中调用该过滤函数。

验证方法

安装修复版本后，发送带有 stream_chunk_size 参数的流式请求（如 curl -X POST ... -d '{"model":"claude-invoke","stream":true,"stream_chunk_size":2048,...}'），确认不再返回 ValidationException。不带该参数的流式请求此前已正常工作无变化。