
[BugFix][KVCache] Fix v0 cache manager incorrectly started when ENABLE_V1_KVCACHE_MANAGER is enabled#7528

Open
kevincheng2 wants to merge 1 commit into PaddlePaddle:develop from kevincheng2:fix/cache_manager_v1_bug_pr

Conversation

@kevincheng2
Collaborator

Motivation

When ENABLE_V1_KVCACHE_MANAGER=1, the v0 cache manager should not be started. However, there were two code paths where v0 cache manager processes could still be launched:

  1. EngineService.start_cache_service() — missing early return guard for V1 mode
  2. ExpertService — start_cache_service() was called without checking the ENABLE_V1_KVCACHE_MANAGER flag

This caused v0 cache manager processes to be incorrectly spawned alongside v1, leading to conflicts and unexpected behavior.

Modifications

  • fastdeploy/engine/common_engine.py: Add an early return [] in start_cache_service() when ENABLE_V1_KVCACHE_MANAGER is enabled
  • fastdeploy/engine/expert_service.py: Guard the start_cache_service call with a not envs.ENABLE_V1_KVCACHE_MANAGER check to prevent the v0 cache manager from starting when v1 is active
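In outline, the two guards work together like this (a minimal sketch: the `_Envs` stub, the process-name strings, and everything beyond the method names the PR mentions are illustrative stand-ins, not the real FastDeploy code):

```python
class _Envs:
    # Stand-in for fastdeploy.envs; the real flag is read from the
    # ENABLE_V1_KVCACHE_MANAGER environment variable.
    ENABLE_V1_KVCACHE_MANAGER = True

envs = _Envs()

class EngineService:
    def start_cache_service(self, device_ids, ipc_signal_suffix):
        # Guard 1 (common_engine.py): early return so the v0 cache
        # manager is never launched when the V1 KVCache manager is on.
        if envs.ENABLE_V1_KVCACHE_MANAGER:
            return []
        # Placeholder for the real v0 launch logic.
        return [f"cache_manager_{d}_{ipc_signal_suffix}" for d in device_ids]

class ExpertService:
    def __init__(self, engine):
        self.engine = engine
        self.cache_processes = []

    def start(self, device_ids, ipc_signal_suffix):
        # Guard 2 (expert_service.py): caller-side check, so this call
        # path also skips the v0 launch in V1 mode.
        if not envs.ENABLE_V1_KVCACHE_MANAGER:
            self.cache_processes = self.engine.start_cache_service(
                device_ids, ipc_signal_suffix
            )
```

Either guard alone suffices; together they protect call sites that reach start_cache_service() directly as well as through ExpertService.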

Usage or Command

# Enable Cache Manager V1 — verify v0 cache manager is NOT started
export ENABLE_V1_KVCACHE_MANAGER=1
python -m fastdeploy.entrypoints.openai.api_server \
  --model <model_path> \
  --tensor-parallel-size 1

Checklist

  • Add at least a tag in the PR title.
  • Format your code, run pre-commit before commit.
  • Add unit tests. The fix is a guard condition to prevent incorrect code path execution; existing tests cover the affected functions.
  • Provide accuracy results. N/A — this is a bug fix that prevents incorrect process spawning, not a model computation change.

@paddle-bot

paddle-bot Bot commented Apr 21, 2026

Thanks for your contribution!


@PaddlePaddle-bot PaddlePaddle-bot left a comment


🤖 AI Code Review | 2026-04-21 15:39:16

📋 Review Summary

PR overview: fixes a bug where the v0 cache manager was still incorrectly started when ENABLE_V1_KVCACHE_MANAGER=1.
Scope of change: engine/common_engine.py, engine/expert_service.py
Impact tags: KVCache, Engine

Issues

| Level | File | Summary |
| --- | --- | --- |
| 🟡 Suggestion | common_engine.py:2178 | A debug log at the early return would improve observability |

Overall assessment

The fix is clear and correct. An early-return guard for V1 mode is added inside start_cache_service(), and the ExpertService.start() caller adds its own conditional check, giving two layers of protection. A review of all start_cache_service call sites in the codebase (2 in common_engine.py, 3 in engine.py, 1 in expert_service.py) shows that the call at engine.py:155, although not given a caller-side guard in this PR, is still covered by the method-internal early return and will not trigger the bug. The change is minimal and effective; no blocking issues were found.

```python
        threading.Thread(target=decode_loop, daemon=True).start()

    def start_cache_service(self, device_ids, ipc_signal_suffix):
        if envs.ENABLE_V1_KVCACHE_MANAGER:
```

🟡 Suggestion: the method-level guard plus the caller-side guard form a double layer of protection, and the logic is correct. Consider adding one debug log line at the early return so it is easy to see whether this path is unexpectedly reached in V1 mode, improving observability.

```python
if envs.ENABLE_V1_KVCACHE_MANAGER:
    console_logger.debug("Skip v0 cache manager launch: V1 KVCache manager is enabled.")
    return []
```

@codecov-commenter

Codecov Report

❌ Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@4c6364e). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| fastdeploy/engine/common_engine.py | 0.00% | 1 Missing and 1 partial ⚠️ |
| fastdeploy/engine/expert_service.py | 0.00% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7528   +/-   ##
==========================================
  Coverage           ?   71.50%           
==========================================
  Files              ?      419           
  Lines              ?    57477           
  Branches           ?     9003           
==========================================
  Hits               ?    41097           
  Misses             ?    13609           
  Partials           ?     2771           
| Flag | Coverage Δ |
| --- | --- |
| GPU | 71.50% <0.00%> (?) |

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.
