refactor(pricing): align parenthesized reasoning tiers with CLIProxyAPI #480

Open
makoMakoGo wants to merge 3 commits into junhoyeo:main from makoMakoGo:fix/cliproxyapi-tier-alignment

Conversation

@makoMakoGo makoMakoGo commented Apr 28, 2026

Summary

Follow-up to #478. Generalize the model(level) parenthesized reasoning-tier handling so it matches the upstream contract documented at CLIProxyAPI · Thinking Budgets via Model Name Parentheses. #478 landed a narrow GPT-only / four-level subset; in practice CLIProxyAPI strips the parentheses before routing for any model family (Gemini / Claude / OpenAI / Codex / OpenRouter) and accepts a wider level vocabulary, so pricing resolution should mirror that.

Behavior

  • Accept the full CLIProxyAPI level set (case-insensitive): minimal, low, medium, high, xhigh, auto, none. Numeric thinking budgets remain out of scope.
  • Drop the GPT-family restriction; Claude, Gemini, and any other family routed through CLIProxyAPI now resolve (level) suffixes the same way (e.g. claude-sonnet-4-5(none) → claude-sonnet-4-5, gemini-3-pro(auto) → openrouter/google/gemini-3-pro-preview).
  • Fold the lookup pipeline: the entry point strips a recognized parenthesized tier once and falls through the existing dash-suffix / prefix-strip paths. Routing prefixes compose cleanly: myproxy-gpt-5.2(xhigh) → gpt-5.2, antigravity-claude-sonnet-4-5(high) → claude-sonnet-4-5.
  • Values outside the level set are not stripped: gpt-5.2(weirdgarbage), gpt-5.2(1024), and gpt-5.2() all return None instead of silently misresolving.
  • Remove the dedicated try_strip_prefixed_parenthesized_reasoning_tier helper and the unreachable parenthesis branches inside try_strip_unknown_prefix that the previous structure left behind. Net -92 +80.
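The stripping rule in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `strip_level_suffix` and the constant `TIER_LEVELS` are hypothetical, and the real helper in `lookup.rs` may differ in signature and edge-case handling.

```rust
/// Levels accepted inside the parentheses, taken from the Behavior list.
/// (Hypothetical constant name, for illustration only.)
const TIER_LEVELS: [&str; 7] = ["minimal", "low", "medium", "high", "xhigh", "auto", "none"];

/// Returns the base model id when `model_id` ends with a recognized
/// `(level)` suffix, and `None` in every other case (no parentheses,
/// empty parentheses, numeric budgets, or unknown words).
fn strip_level_suffix(model_id: &str) -> Option<&str> {
    // The id must end with ')' and contain a matching '('.
    let inner = model_id.strip_suffix(')')?;
    let (base, level) = inner.rsplit_once('(')?;
    // Case-insensitive comparison against the fixed level vocabulary.
    let recognized = TIER_LEVELS
        .iter()
        .any(|known| known.eq_ignore_ascii_case(level));
    // An empty base such as "(high)" is not treated as a model id.
    (recognized && !base.is_empty()).then_some(base)
}
```

Under these assumptions, `gpt-5.2(xhigh)` yields `gpt-5.2`, while `gpt-5.2(1024)` and `gpt-5.2()` yield `None`, which the caller can treat as "do not strip".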

Caveat / Future Work

normalize_model_for_grouping is now asymmetric between the two equivalent CLIProxyAPI suffix forms:

| input | grouping output | folded? |
| --- | --- | --- |
| gpt-5.2(xhigh) | gpt-5.2 | yes |
| gpt-5.2-xhigh | gpt-5.2-xhigh | no |
| claude-opus-4-5-high | claude-opus-4-5-high | no |

Pricing lookup folds both forms to the same base (and same cost), so reports built on normalize_model_for_grouping will fragment cost-equivalent rows when the dash form appears. Aligning the dash path requires introducing a CLIProxyAPI-level whitelist into normalize_model_for_grouping (so -codex / -codex-max / -nano — real model variants — are not folded). That is a larger behavior change touching existing grouping assertions and is intentionally out of scope for this PR; flagging it here as a follow-up refactor target.
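For discussion, the whitelist idea could look something like the sketch below. It is not part of this PR. The name `fold_dash_level` and the constant `DASH_LEVELS` are hypothetical, and it assumes the dash form accepts the same level vocabulary as the parenthesized form, which would need to be confirmed against CLIProxyAPI.

```rust
/// Assumed level whitelist for the dash form (same set as the
/// parenthesized form; this equivalence is an assumption).
const DASH_LEVELS: [&str; 7] = ["minimal", "low", "medium", "high", "xhigh", "auto", "none"];

/// Folds a trailing `-level` segment only when it is a whitelisted
/// reasoning level; any other trailing segment is left untouched so
/// real variants such as `-codex` or `-nano` keep their own group.
fn fold_dash_level(model_id: &str) -> &str {
    match model_id.rsplit_once('-') {
        Some((base, last))
            if !base.is_empty()
                && DASH_LEVELS.iter().any(|l| l.eq_ignore_ascii_case(last)) =>
        {
            base
        }
        _ => model_id,
    }
}
```

With this shape, `gpt-5.2-xhigh` and `claude-opus-4-5-high` would group with their base models, while `gpt-5.2-codex`, `gpt-5.2-codex-max`, and `gpt-5-nano` would stay distinct.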

Test plan

  • cargo test -p tokscale-core — 599 pass (one pre-existing flaky parallel-test interaction in test_parse_all_messages_fireworks_provider_kept_under_synthetic_only_filter; passes single-threaded, unrelated to this change)
  • cargo test --workspace — 1127 pass
  • CLI smoke:
    • tokscale pricing 'gpt-5.2(xhigh)' --json → matches gpt-5.2
    • tokscale pricing 'claude-sonnet-4-5(none)' --json → matches claude-sonnet-4-5
    • tokscale pricing 'myproxy-gpt-5.2(auto)' --json → matches gpt-5.2

Summary by cubic

Align pricing lookup with CLIProxyAPI by stripping (level) reasoning suffixes across all model families and rejecting unrecognized (...) to avoid misrouting. Prefixed routes and dash variants now resolve cleanly to the base model.

  • New Features

    • Accept minimal, low, medium, high, xhigh, auto, none (case-insensitive); numeric budgets remain out of scope.
    • Strip (level) for all families and prefixed IDs (e.g., myproxy-gpt-5.2(xhigh) → gpt-5.2, antigravity-claude-sonnet-4-5(high) → claude-sonnet-4-5).
    • Reject unknown/empty tiers, including when paired with other suffixes, with no fallback to generic stripping (e.g., gpt-5.2(weirdgarbage), gpt-5.2(), gpt-5.2-codex(invalid)).
  • Refactors

    • Strip recognized (level) once at entry, then reuse existing suffix/prefix logic.
    • Remove try_strip_prefixed_parenthesized_reasoning_tier, has_parenthesized_suffix, and dead parenthesis branches in prefix stripping.
    • Rename strip_gpt_parenthesized_reasoning_tier to strip_parenthesized_reasoning_tier and use it in normalize_model_for_grouping.

Written for commit afbf220. Summary will update on new commits.

Generalize the parenthesized `(level)` suffix handling to match the upstream
CLIProxyAPI contract (https://help.router-for.me/configuration/thinking) so
pricing resolution stays consistent with how the proxy itself routes models:

- Drop the GPT-only restriction; Claude, Gemini, and any other family routed
  through CLIProxyAPI now resolve `(level)` suffixes the same way.
- Expand the accepted level set from {low, medium, high, xhigh} to the full
  CLIProxyAPI level vocabulary: {minimal, low, medium, high, xhigh, auto, none}.
  Numeric thinking budgets remain out of scope.
- Fold the lookup pipeline: the entry point strips a recognized parenthesized
  tier once and falls through the existing dash-suffix / prefix-strip paths,
  so prefixed routes (e.g. `myproxy-gpt-5.2(xhigh)`,
  `antigravity-claude-sonnet-4-5(high)`) resolve via the same mechanism that
  already handles dash forms.
- Remove the dedicated `try_strip_prefixed_parenthesized_reasoning_tier`
  helper and the unreachable parenthesis branches inside
  `try_strip_unknown_prefix` that the previous structure left behind.
@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 2 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="crates/tokscale-core/src/pricing/lookup.rs">

<violation number="1" location="crates/tokscale-core/src/pricing/lookup.rs:253">
P2: Unknown parenthesized tiers now fall through into generic `-suffix` stripping, allowing silent misresolution to shorter model ids (e.g., `...-0125(invalid)` → `...-turbo`) instead of returning `None`.</violation>
</file>

@makoMakoGo

This is a large-scope change — if you think it has issues, just close it. The GPT-related changes in #478 are already sufficient to cover most of the scenarios that weren’t considered before.

Cubic flagged that the simplified `lookup_with_source_and_provider` path
would fall through to `try_strip_unknown_suffix` for inputs like
`gpt-5.2-codex(invalid)`, which then split on `-` and silently matched a
shorter, unrelated model id (`gpt-5.2`). Restore the short-circuit
semantics from the original PR: when the input ends with `(...)` but the
contents are not a recognized CLIProxyAPI level, return `None` instead of
peeling the parenthesized fragment off as if it were a generic suffix.
The fall-through still applies for recognized levels, so prefix routing
(`myproxy-gpt-5.2(xhigh)` → `gpt-5.2`) keeps working.
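
The short-circuit described in this commit message can be illustrated with a toy resolver. Everything here is an assumption for illustration: `resolve_model`, `KNOWN_MODELS`, and `ROUTE_LEVELS` are hypothetical names, the catalog is a three-entry stand-in for the real pricing data, and the prefix fallback is a simplified substitute for `try_strip_unknown_prefix`.

```rust
/// Toy catalog standing in for the real pricing table (hypothetical).
const KNOWN_MODELS: [&str; 3] = ["gpt-5.2", "gpt-5.2-codex", "claude-sonnet-4-5"];

/// Recognized CLIProxyAPI levels, as listed in the PR description.
const ROUTE_LEVELS: [&str; 7] = ["minimal", "low", "medium", "high", "xhigh", "auto", "none"];

fn resolve_model(model_id: &str) -> Option<&'static str> {
    // Step 1: handle a trailing "(...)" exactly once, at the entry point.
    let candidate = match model_id.strip_suffix(')') {
        Some(inner) => {
            let (base, level) = inner.rsplit_once('(')?;
            if !ROUTE_LEVELS.iter().any(|l| l.eq_ignore_ascii_case(level)) {
                // Unrecognized contents: stop here instead of letting
                // generic suffix stripping match a shorter model id.
                return None;
            }
            base
        }
        None => model_id,
    };

    // Step 2: simplified fallback that drops leading dash-separated
    // segments (a routing prefix) until a known model id matches.
    let mut rest = candidate;
    loop {
        if let Some(hit) = KNOWN_MODELS.iter().copied().find(|m| *m == rest) {
            return Some(hit);
        }
        let (_, tail) = rest.split_once('-')?;
        rest = tail;
    }
}
```

In this sketch `gpt-5.2-codex(invalid)` returns `None` before any fallback runs, while `myproxy-gpt-5.2(xhigh)` still reaches the prefix fallback and resolves to `gpt-5.2`.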