fix: improve flaky CI test resilience for channel type limits#238
fix: improve flaky CI test resilience for channel type limits#238
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 53 minutes and 14 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughTest infrastructure updated to improve transient error detection for stream API exceptions and introduce a decorator utility for cleaning up stale channel types before test execution. Changes
Sequence DiagramsequenceDiagram
participant Test as Test Function
participant Decorator as cleanup_channel_types
participant Client as Client Fixture
participant API as Chat API
Test->>Decorator: Execute with decorator
Decorator->>Decorator: Resolve client from kwargs/signature
Decorator->>Client: Request list_channel_types()
Client->>API: GET /channel_types
API-->>Client: Return channel types list
Client-->>Decorator: Channel types data
Decorator->>Decorator: Filter non-builtin types older than 2min
Decorator->>API: DELETE stale channel types
API-->>Decorator: Deletion complete (suppress errors)
Decorator->>Decorator: Sleep 2 seconds
Decorator->>Test: Invoke original test function
Test-->>Decorator: Test execution complete
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/base.py (1)
123-129: Scope stale cleanup to test-owned channel types.This loop deletes every non-builtin type older than two minutes. That is broader than the resources these tests create (
testtype*/testdeltype*) and can remove legitimate custom types from the shared app. Prefer filtering on an explicit test prefix or another ownership marker before deleting.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/base.py` around lines 123 - 129, The cleanup loop iterates resp.data.channel_types and deletes any non-builtin type older than _STALE_THRESHOLD, which can remove shared custom types; restrict deletion to only test-owned types by checking the channel type name for the test prefixes used in this test suite (e.g., names starting with "testtype" or "testdeltype") before calling client.chat.delete_channel_type(name=name), leaving the existing _BUILTIN_CHANNEL_TYPES and timestamp check intact and retaining the try/except around the delete call.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_chat_misc.py`:
- Around line 393-394: The decorators on the channel-type CRUD tests are in the
wrong order: swap the decorator order so `@retry_on_transient_error`() wraps the
test and `@cleanup_channel_types` is inside it (i.e., place
`@retry_on_transient_error`() above `@cleanup_channel_types`) so cleanup runs on
every retry attempt; make the same swap for the other channel-type CRUD tests
that use these two decorators (ensure tests referencing cleanup_channel_types
and retry_on_transient_error use the retry decorator outermost).
---
Nitpick comments:
In `@tests/base.py`:
- Around line 123-129: The cleanup loop iterates resp.data.channel_types and
deletes any non-builtin type older than _STALE_THRESHOLD, which can remove
shared custom types; restrict deletion to only test-owned types by checking the
channel type name for the test prefixes used in this test suite (e.g., names
starting with "testtype" or "testdeltype") before calling
client.chat.delete_channel_type(name=name), leaving the existing
_BUILTIN_CHANNEL_TYPES and timestamp check intact and retaining the try/except
around the delete call.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1abd8b6c-af72-44f0-b27b-8c3d0a312894
📒 Files selected for processing (2)
tests/base.pytests/test_chat_misc.py
- Broaden _is_transient_error to catch 429, rate limits, and resource limit errors (e.g. "maximum number of custom channel types") - Use str(exc) instead of http_response.text for reliable phrase matching - Add @cleanup_channel_types decorator that deletes stale (>2min) non-builtin channel types before tests to free slots in shared CI env - Apply @retry_on_transient_error and @cleanup_channel_types to channel type CRUD tests and event_hooks test
Why
CI runs 4 parallel runners (Python 3.10-3.13) that all share one Stream app with a hard limit of 50 custom channel types. Each runner executes 4 tests that create a channel type and delete it in
finally. In practice, types leak: tests fail beforefinally, ordelete_channel_typeitself fails due to eventual consistency. Over time the app accumulates 50 stale types and everycreate_channel_typecall returns 400:Retry alone does not help because the limit is permanently saturated, not temporarily.
Changes
@cleanup_channel_typesdecorator (tests/base.py), runs before each channel type test:messaging,livestream,team,gaming,commerce)The 2-minute threshold avoids deleting types that a parallel runner just created in the current run.
_is_transient_errorimprovements (tests/base.py):str(exc)instead ofhttp_response.textfor reliable phrase matchingNew decorator applications (
tests/test_chat_misc.py):@cleanup_channel_types+@retry_on_transient_error()on all 4 channel type CRUD tests@retry_on_transient_error()ontest_event_hooks_sqs_snsSummary by CodeRabbit
Tests
Bug Fixes