[BugFix][KVCache] Fix v0 cache manager incorrectly started when ENABLE_V1_KVCACHE_MANAGER is enabled #7528
…E_V1_KVCACHE_MANAGER is enabled

## Motivation

When `ENABLE_V1_KVCACHE_MANAGER=1`, the v0 cache manager should not be started. However, there were two code paths where v0 cache manager processes could still be launched:

1. `EngineService.start_cache_service()` - missing early return guard
2. `ExpertService` - `start_cache_service` called without checking `ENABLE_V1_KVCACHE_MANAGER`

## Modifications

- `fastdeploy/engine/common_engine.py`: Add early return in `start_cache_service()` when `ENABLE_V1_KVCACHE_MANAGER` is enabled
- `fastdeploy/engine/expert_service.py`: Wrap `start_cache_service` call with `not envs.ENABLE_V1_KVCACHE_MANAGER` guard to prevent v0 cache manager from starting

## Usage or Command

```bash
# Enable Cache Manager V1 to verify v0 cache manager is not started
export ENABLE_V1_KVCACHE_MANAGER=1
python -m fastdeploy.entrypoints.openai.api_server \
    --model <model_path> \
    --tensor-parallel-size 1
```
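The two guards listed under Modifications can be sketched roughly as follows. This is a minimal stand-alone illustration under stated assumptions, not the FastDeploy code: `Envs`, `EngineServiceSketch`, `_launch_v0_cache_manager`, and `expert_service_start` are hypothetical stand-ins. Only the method name `start_cache_service`, its parameters, the `ENABLE_V1_KVCACHE_MANAGER` flag, and the `return []` behaviour are taken from this PR.

```python
from dataclasses import dataclass, field


@dataclass
class Envs:
    """Hypothetical stand-in for the `envs` module referenced in the PR."""
    ENABLE_V1_KVCACHE_MANAGER: bool = False


@dataclass
class EngineServiceSketch:
    """Hypothetical, heavily simplified stand-in for EngineService."""
    envs: Envs
    launched: list = field(default_factory=list)

    def _launch_v0_cache_manager(self, device_ids, ipc_signal_suffix):
        # Placeholder for the real process launch (assumption): it only
        # records one fake handle per device so the effect is observable.
        handles = [f"v0-cache-{d}-{ipc_signal_suffix}" for d in device_ids]
        self.launched.extend(handles)
        return handles

    def start_cache_service(self, device_ids, ipc_signal_suffix):
        # Method-level guard: in V1 mode nothing is launched and an empty
        # list is returned, so every caller is covered.
        if self.envs.ENABLE_V1_KVCACHE_MANAGER:
            return []
        return self._launch_v0_cache_manager(device_ids, ipc_signal_suffix)


def expert_service_start(engine, device_ids, ipc_signal_suffix):
    """Hypothetical caller mirroring the guard added on the ExpertService side."""
    # Caller-side guard: skip the call entirely when V1 is enabled.
    if not engine.envs.ENABLE_V1_KVCACHE_MANAGER:
        return engine.start_cache_service(device_ids, ipc_signal_suffix)
    return []
```

With the flag set, both entry points return an empty list and nothing is recorded in `launched`. With the flag unset, the sketch records one handle per device.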
Thanks for your contribution!
PaddlePaddle-bot left a comment
🤖 AI Code Review | 2026-04-21 15:39:16
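📋 Review Summary
PR overview: Fixes a bug where the v0 cache manager was still started when ENABLE_V1_KVCACHE_MANAGER=1
Scope of changes: engine/common_engine.py, engine/expert_service.py
Impact tags: KVCache Engine
Issues
| Severity | File | Summary |
|---|---|---|
| 🟡 Suggestion | common_engine.py:2178 | A debug log at the early return would improve observability |

Overall assessment
The fix is clear and correct. It adds an early-return guard for V1 mode inside the start_cache_service() method and also adds a condition at the ExpertService.start() call site, giving two layers of protection. All start_cache_service call sites in the codebase were checked (2 in common_engine.py, 3 in engine.py, 1 in expert_service.py). The call path at engine.py:155 does not get a caller-side guard in this PR, but it is covered by the early return inside the method, so it does not cause a bug. The change is minimal and effective, and no blocking issues were found.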
        threading.Thread(target=decode_loop, daemon=True).start()

    def start_cache_service(self, device_ids, ipc_signal_suffix):
        if envs.ENABLE_V1_KVCACHE_MANAGER:
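🟡 Suggestion: The method-level guard plus the caller-side guard give two layers of protection, and the logic is correct. Consider adding a debug log line at the early return, though, so that it is easier to tell whether this path is entered unexpectedly in V1 mode. This would improve observability.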
    if envs.ENABLE_V1_KVCACHE_MANAGER:
        console_logger.debug("Skip v0 cache manager launch: V1 KVCache manager is enabled.")
        return []
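One way to check that the suggested log line actually fires on the skip path is sketched below. It uses only the standard-library `logging` module. The logger name `kvcache_sketch`, the free function `start_cache_service(v1_enabled, device_ids)`, and the helpers `_ListHandler` and `run_with_capture` are hypothetical and exist only for illustration; the project's `console_logger` may be configured differently.

```python
import logging

logger = logging.getLogger("kvcache_sketch")  # hypothetical logger name


def start_cache_service(v1_enabled, device_ids):
    """Hypothetical free-function version of the guarded method."""
    if v1_enabled:
        # Message text taken from the review suggestion.
        logger.debug("Skip v0 cache manager launch: V1 KVCache manager is enabled.")
        return []
    # Placeholder for the real v0 launch (assumption): fake handles only.
    return [f"v0-cache-{d}" for d in device_ids]


class _ListHandler(logging.Handler):
    """Collects log messages in a list so a test can assert on them."""

    def __init__(self):
        super().__init__(level=logging.DEBUG)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def run_with_capture(v1_enabled, device_ids):
    """Run the guarded function and return (result, captured debug messages)."""
    handler = _ListHandler()
    previous_level = logger.level
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    try:
        result = start_cache_service(v1_enabled, device_ids)
    finally:
        # Restore the logger so the capture does not leak into other code.
        logger.setLevel(previous_level)
        logger.removeHandler(handler)
    return result, handler.messages
```

Calling `run_with_capture(True, [0])` returns an empty result together with the single skip message, while `run_with_capture(False, [0])` returns the fake handle and no messages.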
Codecov Report
❌ Patch coverage is

Additional details and impacted files

Coverage Diff

| Metric | develop | #7528 |
|---|---|---|
| Coverage | ? | 71.50% |
| Files | ? | 419 |
| Lines | ? | 57477 |
| Branches | ? | 9003 |
| Hits | ? | 41097 |
| Misses | ? | 13609 |
| Partials | ? | 2771 |
## Motivation

When `ENABLE_V1_KVCACHE_MANAGER=1`, the v0 cache manager should not be started. However, there were two code paths where v0 cache manager processes could still be launched:

1. `EngineService.start_cache_service()`: missing early return guard for V1 mode
2. `ExpertService`: `start_cache_service` was called without checking the `ENABLE_V1_KVCACHE_MANAGER` flag

This caused v0 cache manager processes to be incorrectly spawned alongside v1, leading to conflicts and unexpected behavior.

## Modifications

- `fastdeploy/engine/common_engine.py`: Add early return `return []` in `start_cache_service()` when `ENABLE_V1_KVCACHE_MANAGER` is enabled
- `fastdeploy/engine/expert_service.py`: Wrap `start_cache_service` call with `not envs.ENABLE_V1_KVCACHE_MANAGER` condition guard to prevent v0 cache manager from starting when v1 is active

## Usage or Command

## Checklist

`pre-commit` before commit.