
Fix saturation detection and harden load generator #360

Open
Bslabe123 wants to merge 1 commit into kubernetes-sigs:main from Bslabe123:loadgen-preprocess-fix-v2


@Bslabe123
Contributor

@Bslabe123 Bslabe123 commented Mar 2, 2026

Addresses: #300

Refactors LoadGenerator to introduce a stepped load strategy. The new approach continually monitors throughput, automatically detects saturation, and dynamically readjusts the testing range.

Motivation

Previous static load generation frequently failed on cold starts (0 successful requests) and required repetitive manual tuning to capture inflection points.

Changes

  • Stepped Preprocessing: Ramps up load while monitoring throughput; detects saturation when the measured rate drops below the 90% throughput tolerance.
  • Automatically shifts the test window if the initial rates are too high to capture the true saturation point.
  • Replaces the arbitrary start-at-1 logic with proportional scaling for better resolution at lower rates.
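
The three changes above can be illustrated with a minimal sketch. This is not the code in this PR: `find_saturation`, `proportional_rates`, `measure`, `growth`, and `min_rate` are hypothetical names chosen for illustration, the doubling/halving step policy is an assumption, and the only value taken from the PR is the 0.90 tolerance. It assumes `measure(rate)` runs one short step at the requested rate and returns the achieved throughput.

```python
from typing import Callable, List, Tuple

# 0.90 is the tolerance hardcoded in the PR; everything else is hypothetical.
THROUGHPUT_TOLERANCE = 0.90


def find_saturation(
    measure: Callable[[float], float],
    start_rate: float,
    max_rate: float,
    growth: float = 2.0,
    tolerance: float = THROUGHPUT_TOLERANCE,
    min_rate: float = 0.1,
) -> Tuple[float, List[Tuple[float, float]]]:
    """Step the requested rate until achieved throughput lags behind it.

    Returns (estimated saturation rate, [(requested, achieved), ...]).
    """
    history: List[Tuple[float, float]] = []
    rate = start_rate
    kept_up_once = False
    while min_rate <= rate <= max_rate:
        achieved = measure(rate)
        history.append((rate, achieved))
        saturated = achieved < tolerance * rate
        if saturated and kept_up_once:
            break  # inflection found on the way up
        if saturated:
            rate /= growth  # window started too high: shift it down
        else:
            kept_up_once = True
            rate *= growth  # server kept up: step the load up
    # Best throughput seen across all steps; 0.0 if no step ran.
    best = max((achieved for _, achieved in history), default=0.0)
    return best, history


def proportional_rates(saturation: float, num_stages: int) -> List[float]:
    """Spread stage rates evenly from saturation/num_stages up to saturation."""
    return [saturation * i / num_stages for i in range(1, num_stages + 1)]
```

With a toy server that caps at 10 requests per second (`measure = lambda r: min(r, 10.0)`), starting at 1 steps through 1, 2, 4, 8, 16 and stops at 16, while starting at 64 first shifts down through 32, 16, 8 before stepping back up. Both runs estimate saturation at 10, and `proportional_rates(10.0, 5)` then yields stages at 2, 4, 6, 8, and 10 rather than a range anchored at 1.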

Verification

  • Added tests/loadgen/test_saturation_robustness.py covering hard limits, early saturation, and tolerances.
  • Updated tests/loadgen/test_preprocess.py to reflect the new rate calculation.
  • Validated manually using reproduce_rates.py for mathematical correctness.
  • Added a loadgen preprocessing sanity check to the e2e test suite.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 2, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Bslabe123
Once this PR has been reviewed and has the lgtm label, please assign sergeykanzhelev for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 2, 2026
@Bslabe123 Bslabe123 force-pushed the loadgen-preprocess-fix-v2 branch 4 times, most recently from 2ea8c0e to 50b891b Compare March 2, 2026 22:35
@jjk-g
Collaborator

jjk-g commented Mar 5, 2026

Can you show an example of how the rates and duration are selected as you go through finding saturation?

@Bslabe123 Bslabe123 force-pushed the loadgen-preprocess-fix-v2 branch 2 times, most recently from 16a3c77 to cf583da Compare March 9, 2026 16:53
@Bslabe123 Bslabe123 changed the title from "[WIP] Fix saturation detection and harden load generator" to "Fix saturation detection and harden load generator" Mar 12, 2026
@Bslabe123
Contributor Author

Data from last round of testing across various models, model server versions, hardware, and dataset configs:

[Five screenshots, images not included, captured 2026-03-12 at 10-53-57, 10-54-17, 10-54-42, 10-54-55, and 10-57-24]

@Bslabe123 Bslabe123 marked this pull request as ready for review March 12, 2026 18:58
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 12, 2026
Comment thread inference_perf/loadgen/load_generator.py
if self.sweep_config.stage_duration > 10:
    step_duration = self.sweep_config.stage_duration

throughput_tolerance = 0.90
Collaborator


Any hardcoded parameters should either be exposed in the sweep config with a reasonable default or, where there is a good reason to keep them fixed (a MAX constant rather than a configurable value), be moved outside the scope of the function.
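
A minimal sketch of the two options in this comment, under stated assumptions: `SweepSettings`, `MAX_RAMP_STEPS`, and `is_saturated` are hypothetical names that do not exist in the project, and the value 20 is invented for illustration. Only the 0.90 tolerance comes from the snippet above.

```python
from dataclasses import dataclass

# Option A: a fixed limit that users are not expected to tune, kept as a
# named module-level constant. The name and value are hypothetical.
MAX_RAMP_STEPS = 20


@dataclass(frozen=True)
class SweepSettings:
    """Hypothetical stand-in for the sweep config, not the project's class."""

    # Option B: a tunable exposed with a default. 0.90 is the value
    # hardcoded in the snippet under review.
    throughput_tolerance: float = 0.90

    def __post_init__(self) -> None:
        # Reject values that cannot be a fraction of the requested rate.
        if not 0.0 < self.throughput_tolerance <= 1.0:
            raise ValueError("throughput_tolerance must be in (0, 1]")


def is_saturated(requested: float, achieved: float, settings: SweepSettings) -> bool:
    """True when achieved throughput falls below the tolerated fraction."""
    return achieved < settings.throughput_tolerance * requested
```

Either way the literal leaves the function body. The difference is whether the value becomes part of the user-facing config surface or stays an internal, documented constant.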

Contributor Author


Maybe we should consider an alternate approach before exposing any new parameters in the sweep config, unless it's guaranteed that the generated stages are a function of the new parameters? I'm worried about exposing implementation details that might turn into breaking API changes if we ever decide to change and/or harden the preprocessing algorithm.

@Bslabe123 Bslabe123 force-pushed the loadgen-preprocess-fix-v2 branch 2 times, most recently from 81bb696 to e9a5416 Compare March 20, 2026 21:05
@Bslabe123 Bslabe123 force-pushed the loadgen-preprocess-fix-v2 branch from e9a5416 to b204285 Compare April 1, 2026 16:12
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 1, 2026
@Bslabe123 Bslabe123 force-pushed the loadgen-preprocess-fix-v2 branch 5 times, most recently from 7eb2bbf to aa3d93c Compare April 2, 2026 16:54
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 2, 2026
@Bslabe123 Bslabe123 force-pushed the loadgen-preprocess-fix-v2 branch 3 times, most recently from f3e2b3c to 1147ea2 Compare April 3, 2026 15:50
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 7, 2026
@Bslabe123 Bslabe123 force-pushed the loadgen-preprocess-fix-v2 branch from 93ec03a to b294a37 Compare April 10, 2026 14:26
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 10, 2026
@Bslabe123 Bslabe123 force-pushed the loadgen-preprocess-fix-v2 branch 14 times, most recently from 6f113fb to 9d8c8f2 Compare April 13, 2026 14:22
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 16, 2026
@Bslabe123 Bslabe123 force-pushed the loadgen-preprocess-fix-v2 branch 3 times, most recently from b149a8b to c1f444d Compare April 17, 2026 14:30
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 17, 2026
Addresses: kubernetes-sigs#379

Minor cleanup: makes sure all Python files have a license at the top.
@Bslabe123 Bslabe123 force-pushed the loadgen-preprocess-fix-v2 branch from c1f444d to c337e5c Compare April 17, 2026 14:31

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.


3 participants