fix(core): merge IDE context into user prompt#3980
Conversation
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
Dogfooding / Manual IDE Smoke TestI used the following manual flow to dogfood the IDE-mode behavior against this PR branch:
Expected result: IDE context is attached to the current user request as a Observed result: manual IDE dogfooding with the PR bundle behaved as expected. The lower-level request-shape cases, including pending tool-call skipping, arena cancellation, and Test result |
There was a problem hiding this comment.
Pull request overview
This PR updates core request construction so IDE/editor context is wrapped in a <system-reminder> envelope and merged into the current user request, instead of being injected as a separate user-history entry via chat.addHistory(). This keeps API history turn shapes stable in IDE mode and adds test coverage for the revised assembly and guard behavior.
Changes:
- Add helpers to wrap IDE context and prepend it into the first text part of the outgoing request.
- Delay IDE-context state updates until after the pre-turn arena cancellation check to avoid marking unsent context as sent.
- Update/sendMessageStream unit tests to assert merged-context behavior, closing-tag escaping, pending-tool-call skipping, and arena cancellation behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| packages/core/src/core/client.ts | Merges IDE context into the outgoing request parts (wrapped in <system-reminder>), and delays IDE context state updates until after arena cancellation checks. |
| packages/core/src/core/client.test.ts | Updates and expands tests to validate merged IDE context in the request, correct escaping, and arena cancellation behavior without addHistory() injection. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
wenshao
left a comment
There was a problem hiding this comment.
[Critical] In packages/core/src/tools/{monitor,send-message,task-stop,web-fetch}.ts, the 5 trailing super() arguments were dropped during the merge of origin/main (commit f980f57). This removes shouldDefer: true (reverts to default false), searchHint keywords, and explicit isOutputMarkdown/canUpdateOutput/alwaysLoad values. These 4 tools will now appear in every API request instead of being discovered on-demand via ToolSearch, increasing per-request token consumption. origin/main still has these args — merging this PR as-is would revert them.
Please restore the 5 arguments in all 4 tools:
true, false, true, false, 'monitor watch tail log stream background'(monitor.ts)true, false, true, false, 'send message task communicate notify'(send-message.ts)true, false, true, false, 'task stop cancel kill background'(task-stop.ts)true, false, true, false, 'web fetch url http download content'(web-fetch.ts)
— DeepSeek/deepseek-v4-pro via Qwen Code /review
IDE mode now wraps editor context in a <system-reminder> block and prepends it to the current user request instead of inserting a separate user history entry via addHistory(). This preserves the API history turn shape and avoids extra user turns in IDE mode. Key changes: - IDE context merged into user request via prependToFirstTextPart() - State update deferred until after arena cancellation check - escapeClosingSystemReminderTags() hardens against tag injection including zero-width/control character variants - forceFullIdeContext reset on stream errors for correct resend - Context prompt updated to encourage active use of editor context Refs #3712
884ba78 to
a41dc8d
Compare
resetChat only cleared GeminiClient's new perModelGeneratorCache but dropped the BaseLlmClient.clearPerModelGeneratorCache() call that was present before the refactoring. sideQuery.ts and sessionTitle.ts still route through BaseLlmClient with the fast model, so stale generators survived session reset.
| /** | ||
| * Cache of per-model ContentGenerators keyed by model ID. | ||
| * Avoids rebuilding the generator (SDK instantiation, config resolution) | ||
| * on every side query (recap, title, tool summary). | ||
| * Cleared on session reset (resetChat) to pick up config changes. |
| @@ -1437,13 +1512,23 @@ export class GeminiClient { | |||
| // side queries for session recap / title / summary), resolve the target | |||
| // model's own ContentGeneratorConfig so that per-model settings like | |||
| // extra_body, samplingParams, and reasoning are not inherited from the | |||
| // main model's config. The retry authType is resolved alongside so that | |||
| // provider-specific checks (e.g. QWEN_OAUTH quota detection) reference | |||
| // the target model's provider. | |||
| const { contentGenerator, retryAuthType } = await this.config | |||
| .getBaseLlmClient() | |||
| .resolveForModel(model); | |||
|
|
|||
| // main model's config. | |||
|
The 4 tool files ( |
…iClient The per-model ContentGenerator resolution logic (resolveModelAcrossAuthTypes, createRetryAuthTypeForModel, createContentGeneratorForModel, perModelGeneratorCache) was inadvertently duplicated from BaseLlmClient into GeminiClient. This restores the original one-line delegation to BaseLlmClient.resolveForModel() and removes ~130 lines of redundant code to keep the PR focused on IDE context merging only.
wenshao
left a comment
There was a problem hiding this comment.
Critical coreToolScheduler.ts (~line 2143) still uses the old inline regex .replace(/<\/system-reminder>/gi, '<\\/system-reminder>') for closing-tag scrub instead of the new escapeClosingSystemReminderTags. The new function handles whitespace/zero-width/control-char variants that the old literal regex misses, leaving the rule/skill activation <system-reminder> envelope vulnerable to obfuscated closing-tag injection.
Suggestion packages/core/src/utils/xml.ts has no direct unit test file. The new escapeClosingSystemReminderTags function and its helpers are only indirectly covered via client.test.ts. Consider creating xml.test.ts with dedicated tests for tag variants, ignorable char edge cases, and boundary conditions.
— deepseek-v4-pro via Qwen Code /review
| */ | ||
| export function escapeClosingSystemReminderTags(text: string): string { | ||
| return text.replace(XML_TAG_CANDIDATE_RE, (tag) => | ||
| normalizeSystemReminderCandidateTag(tag) === '</system-reminder>' |
There was a problem hiding this comment.
[Critical] escapeClosingSystemReminderTags only escapes </system-reminder> closing tags. Opening <system-reminder> and self-closing <system-reminder/> tags pass through untouched. An attacker can inject fake system-reminder blocks via IDE selected text or file paths, which the model may interpret as trusted metadata for prompt injection.
| normalizeSystemReminderCandidateTag(tag) === '</system-reminder>' | |
| const normalized = normalizeSystemReminderCandidateTag(tag); | |
| if ( | |
| normalized === '</system-reminder>' || | |
| normalized === '<system-reminder>' || | |
| normalized === '<system-reminder/>' || | |
| normalized === '</system-reminder/>' | |
| ) { | |
| return '<\\/system-reminder>'; | |
| } |
— deepseek-v4-pro via Qwen Code /review
|
|
||
| function prependToFirstTextPart( | ||
| parts: PartUnion[], | ||
| textToPrepend: string, |
There was a problem hiding this comment.
[Suggestion] prependToFirstTextPart has 5 branches but only the object-with-text-property branch is exercised in tests. The empty textToPrepend early-return, empty parts array, no-text-part-found, and raw-string-part branches are all untested, creating regression risk for the core IDE-context-merge feature.
| textToPrepend: string, | |
| // Consider extracting to partUtils.ts and adding unit tests for all branches: | |
| // 1. !textToPrepend → early return | |
| // 2. parts.length === 0 → [{ text: textToPrepend }] | |
| // 3. textPartIndex === -1 → prepend new text part | |
| // 4. typeof textPart === 'string' → string concatenation | |
| // 5. object with text property → object spread (current coverage) |
— deepseek-v4-pro via Qwen Code /review
| * data-layer solution. Follow-up PR. | ||
| */ | ||
|
|
||
| private resolveModelAcrossAuthTypes( |
There was a problem hiding this comment.
[Suggestion] resolveModelAcrossAuthTypes, createContentGeneratorForModel, and perModelGeneratorCache in GeminiClient duplicate the same logic already present in BaseLlmClient.resolveForModel(). Two parallel caches (perModelGeneratorCache vs BaseLlmClient's internal cache) serve the same purpose. Consolidate into one location (as the existing TODO comment acknowledges) to avoid maintenance divergence.
— deepseek-v4-pro via Qwen Code /review
| model: string, | ||
| ): ResolvedModelConfig | undefined { | ||
| const modelsConfig = this.config.getModelsConfig(); | ||
| const allAuthTypes: AuthType[] = [ |
There was a problem hiding this comment.
[Suggestion] resolveModelAcrossAuthTypes in client.ts drops the optional chain and null guard that BaseLlmClient's version has. If getModelsConfig() returns a falsy value, this will throw TypeError while the original safely returns undefined.
| const allAuthTypes: AuthType[] = [ | |
| const modelsConfig = this.config.getModelsConfig?.(); | |
| if (!modelsConfig) return undefined; |
— deepseek-v4-pro via Qwen Code /review
| const allAuthTypes: AuthType[] = [ | ||
| AuthType.QWEN_OAUTH, | ||
| AuthType.USE_OPENAI, | ||
| AuthType.USE_VERTEX_AI, |
There was a problem hiding this comment.
[Suggestion] The allAuthTypes array is hardcoded. Adding a new AuthType enum value will not automatically include it, causing models registered under the new auth type to be silently skipped during cross-auth-type resolution. Consider deriving from Object.values(AuthType) or adding a unit test that verifies all enum members are covered.
| AuthType.USE_VERTEX_AI, | |
| const allAuthTypes: AuthType[] = Object.values(AuthType) as AuthType[]; |
— deepseek-v4-pro via Qwen Code /review
wenshao
left a comment
There was a problem hiding this comment.
[Suggestion] escapeClosingSystemReminderTags (new in packages/core/src/utils/xml.ts) lacks a standalone unit test file (xml.test.ts). The function is only indirectly covered through client.test.ts integration tests. As an exported utility that may gain additional callers, it should have dedicated unit tests covering: no-tag inputs, opening-tag-only inputs, edge code-point ranges in isSystemReminderTagIgnorable, and performance with large HTML/JSX content.
— DeepSeek/deepseek-v4-pro via Qwen Code /review
| * readable while preventing untrusted content, including visually hidden | ||
| * format/control characters inside the tag, from ending the reminder envelope. | ||
| */ | ||
| export function escapeClosingSystemReminderTags(text: string): string { |
There was a problem hiding this comment.
[Critical] escapeClosingSystemReminderTags only escapes closing </system-reminder> tag variants. Opening <system-reminder> and self-closing <system-reminder/> tags pass through untouched. An attacker can inject fake system-reminder blocks by including <system-reminder>malicious instructions</system-reminder> in a file opened in the IDE. Since the model is instructed to treat <system-reminder> content as trusted metadata, injected opening tags create ambiguous nested envelopes that the model may interpret as system-level directives.
| export function escapeClosingSystemReminderTags(text: string): string { | |
| // Also escape opening and self-closing system-reminder tag variants. | |
| // Options: (a) extend escapeClosingSystemReminderTags to cover `<system-reminder` and `<system-reminder/>`, | |
| // or (b) use full escapeXml() on the context body for defense-in-depth. |
— DeepSeek/deepseek-v4-pro via Qwen Code /review
| return `<system-reminder>\n${safeContextText}\n</system-reminder>`; | ||
| } | ||
|
|
||
| function prependToFirstTextPart( |
There was a problem hiding this comment.
[Suggestion] prependToFirstTextPart has 5 behavioral branches but only the object-with-text-property branch is directly exercised in tests. The empty textToPrepend early-return, empty parts array, no-text-part-found, and raw-string-part branches are not covered. Consider adding dedicated unit tests for these paths to prevent regressions if request construction logic changes.
— DeepSeek/deepseek-v4-pro via Qwen Code /review




Summary
<system-reminder>block and prepends it to the current user request instead of inserting a separate user history entry viaaddHistory().</system-reminder>inside IDE-provided text.Validation
</system-reminder>.chat.addHistory()user turn is added, unsafe closing tags from IDE text cannot break the reminder envelope, and a pre-turn arena cancellation does not mark IDE context as sent.client.test.tspassed all 102 tests; lint, typecheck, build, diff whitespace checks, and manual IDE smoke passed locally.cd packages/core && npx vitest run src/core/client.test.tsand inspect thesendMessageStreamIDE context tests.Scope / Risk
/rewindin IDE mode and does not change scheduler/tool-result<system-reminder>escaping; those are follow-up scopes.Testing Matrix
Testing matrix notes:
Linked Issues / Bugs
Closes #3712
This PR handles the core IDE context merge only. Re-enabling
/rewindin IDE mode and any related CLI UI cleanup should be handled in a separate follow-up PR.