Misc. bug: Performance from sycl with server-intel on Qwen 3.6 dropped from 32t/s to 25t/s after update server-intel-b9159

The user deploys the ghcr.io/ggml-org/llama.cpp:server-intel image via Docker Compose and runs the Qwen 3.6 35B model (Q4_K_XL_MTP quantization) on an Intel Arc Pro B50 GPU using the SYCL backend. The image was updated from server-intel




