Skip to content

fix: add gateway-level default-deny AuthPolicy to local-deploy#866

Open
yossiovadia wants to merge 2 commits into
opendatahub-io:mainfrom
yossiovadia:fix/local-deploy-default-deny
Open

fix: add gateway-level default-deny AuthPolicy to local-deploy#866
yossiovadia wants to merge 2 commits into
opendatahub-io:mainfrom
yossiovadia:fix/local-deploy-default-deny

Conversation

@yossiovadia
Copy link
Copy Markdown
Collaborator

@yossiovadia yossiovadia commented May 6, 2026

Summary

  • Add gateway-default-auth AuthPolicy to local-deploy.sh — blocks all model inference routes that don't have a per-route AuthPolicy

Why

The production MaaS deployment already has this policy (deployment/base/maas-controller/policies/gateway-default-auth.yaml), but local-deploy was missing it. Without it, models deployed via the new inference.opendatahub.io CRDs are completely unprotected — requests bypass Authorino entirely, and any request (no auth, invalid auth) returns HTTP 200.

Found during testing of ExternalProvider/ExternalModel CRDs (ai-gateway-payload-processing#183). The default-deny is the same pattern used in production — routes with MaaSAuthPolicy override the defaults and allow authenticated access.

What changed

test/e2e/scripts/local-deploy.sh — added a block after subscription/auth policy deployment that applies the gateway-default-auth AuthPolicy (idempotent, skips if already exists).

Test plan

  • Verified on Kind cluster (maas-experimental):
    • Without default-deny: new CRD routes return HTTP 200 with no auth
    • With default-deny: new CRD routes return HTTP 401/403 (blocked)
    • Old MaaS routes (with per-route AuthPolicy) continue to work normally
    • Valid API key on new CRD routes returns HTTP 200 (when MaaSModelRef + AuthPolicy are configured)

Summary by CodeRabbit

  • Tests
    • Added a default-deny authorization policy for model inference routes in the test deployment.
    • Added a wait/check to ensure the policy is accepted and enforced, with timeout handling.
    • Updated smoke tests to validate responses allowing 401/403 or 200 depending on per-route authorization.

Adds a gateway-default-auth AuthPolicy that blocks all model inference
routes without a per-route AuthPolicy override. Routes with
MaaSAuthPolicy (created by MaaS controller) override this default
and allow authenticated access.

Without this, models deployed via the new inference.opendatahub.io
CRDs are completely unprotected — requests bypass Authorino entirely.

The production MaaS deployment already has this policy
(deployment/base/maas-controller/policies/gateway-default-auth.yaml)
but local-deploy was missing it.
@yossiovadia
Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented May 6, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yossiovadia

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved label May 6, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 6, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 6, 2026

Caution

Review failed

Failed to post review comments

📝 Walkthrough

Walkthrough

A default-deny AuthPolicy deployment block was added to test/e2e/scripts/local-deploy.sh. The script now creates an AuthPolicy named gateway-default-auth in the gateway namespace targeting the maas-default-gateway resource, applies it via kubectl, waits for Kuadrant to accept the policy, and logs status. The gateway smoke-test was updated to accept 401/403 responses (or 200 with a warning) to reflect the default-deny behavior.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Security & Code Quality Issues

  • Missing embedded YAML in diff — cannot validate manifest semantics. Confirm the AuthPolicy manifest enforces deny-by-default without implicit allow rules. (Action: include the applied YAML in the PR or link to the manifest.)

    • Relevant: CWE-276 (Incorrect Default Permissions).
  • Scope selector risk — ensure the policy target selector precisely matches only the intended gateway resource to avoid accidental wider impact. (Action: verify namespace/name/labels in the selector.)

  • Acceptance wait handling — the script logs a warning on timeout but may continue; failing fast is safer for test deployments. (Action: make timeout handling configurable and fail the script on non-acceptance unless explicitly intended.)

  • Consistent error handling — ensure kubectl apply and wait operations have robust error checks (use set -e or explicit exit codes) so failures aren’t masked. (Action: add explicit checks and exit paths.)

  • Test assertion looseness — accepting 200 with only a warning can mask enforcement regressions. (Action: make acceptance of 200 conditional or escalate to a test failure when enforcement is expected.)

Only actionable issues listed.

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: adding a gateway-level default-deny AuthPolicy to the local-deploy script.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/e2e/scripts/local-deploy.sh (1)

1545-1552: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Smoke test now masks the regression this PR fixes (CWE-754).

With gateway-default-auth in place, an unauthenticated GET /v1/models MUST return 401. The current smoke test still treats 200 as success (line 1548-1549), so if the AuthPolicy fails to attach (CRD version drift, Kuadrant operator not ready, gateway reconcile race, etc.) the script will print "Gateway routing working (200 OK)" and exit 0 — silently shipping a wide-open gateway. That's exactly the failure mode you're trying to prevent.

Tighten the assertion to fail on anything other than 401 (or 403).

🔒 Proposed fix
 # Test gateway routing (should get 401 — auth is working)
 HTTP_CODE=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 http://localhost:19090/v1/models 2>/dev/null || echo "000")
-if [[ "$HTTP_CODE" == "401" ]]; then
-  ok "Gateway routing + auth working (401 Unauthorized — expected without API key)"
-elif [[ "$HTTP_CODE" == "200" ]]; then
-  ok "Gateway routing working (200 OK)"
-else
-  warn "Gateway returned HTTP ${HTTP_CODE} (may need a moment to stabilize)"
-fi
+if [[ "$HTTP_CODE" == "401" || "$HTTP_CODE" == "403" ]]; then
+  ok "Gateway default-deny enforced (HTTP ${HTTP_CODE} without API key)"
+else
+  fail "Gateway default-deny NOT enforced: got HTTP ${HTTP_CODE} (expected 401/403). gateway-default-auth may not be attached."
+  exit 1
+fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/scripts/local-deploy.sh` around lines 1545 - 1552, The smoke test's
HTTP check (assigning HTTP_CODE from the curl call) currently treats 200 as
success; update the conditional logic that inspects HTTP_CODE (the block using
the ok/warn helpers) so that only 401 or 403 are treated as success (call ok
with a message like "Gateway routing + auth working (401/403 expected without
API key)"); for any other code (including 200) fail the script (e.g., call the
existing fail helper or exit non-zero) and emit a clear error message indicating
unexpected open gateway. Ensure you change the branches that reference HTTP_CODE
and the ok/warn behavior so 200 no longer counts as success.
🧹 Nitpick comments (3)
test/e2e/scripts/local-deploy.sh (3)

1485-1514: ⚡ Quick win

Existence-check skips drift correction.

if kubectl get authpolicy gateway-default-auth ... ; then ok means any subsequent edit to the policy spec in this script (or any local mutation) won't be re-applied on rerun. kubectl apply is already declarative/idempotent — drop the guard so re-running the script converges to the desired spec. Same anti-pattern is present elsewhere in this file, but at least don't propagate it for a security-critical resource.

♻️ Proposed fix
-if kubectl get authpolicy gateway-default-auth -n "$GATEWAY_NAMESPACE" &>/dev/null; then
-  ok "Gateway default-deny AuthPolicy already deployed"
-else
-  echo "  Deploying gateway default-deny AuthPolicy..."
-  kubectl apply -f - <<EOF
+echo "  Applying gateway default-deny AuthPolicy..."
+kubectl apply -f - <<EOF
 apiVersion: kuadrant.io/v1
 kind: AuthPolicy
 metadata:
   name: gateway-default-auth
   namespace: ${GATEWAY_NAMESPACE}
 spec:
   targetRef:
     group: gateway.networking.k8s.io
     kind: Gateway
     name: maas-default-gateway
   defaults:
     rules:
       authentication: {}
       authorization:
         deny-unconfigured-models:
           metrics: false
           priority: 0
           patternMatching:
             patterns:
             - operator: eq
               selector: "context.request.http.method"
               value: "__deny_unconfigured_models__"
 EOF
-  ok "Gateway default-deny AuthPolicy deployed"
-fi
+ok "Gateway default-deny AuthPolicy applied"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/scripts/local-deploy.sh` around lines 1485 - 1514, The current
existence-check around kubectl get for the AuthPolicy named gateway-default-auth
prevents drift correction; remove the if/else guard and always run the kubectl
apply -f - <<EOF block so the AuthPolicy (gateway-default-auth) in
GATEWAY_NAMESPACE is declaratively reconciled on every script run; keep the same
YAML payload and ok "Gateway default-deny AuthPolicy deployed" message after
apply to preserve feedback.

1500-1511: 💤 Low value

Replace fragile magic-string deny rule with documented Kuadrant default-deny idiom.

The request.http.method == "__deny_unconfigured_models__" pattern is brittle; if Kuadrant normalizes selector semantics or someone removes the placeholder, the rule silently breaks. Kuadrant 1.3.1 (your deployed version) documents a standard idiom: define spec.defaults.rules.authorization.deny-all.opa.rego: "allow = false" in a gateway-level AuthPolicy. This is explicit, intention-revealing, and maintained by Kuadrant upstream.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/scripts/local-deploy.sh` around lines 1500 - 1511, Replace the
brittle magic-string rule under
defaults.rules.authorization.deny-unconfigured-models (the patternMatching
selector "context.request.http.method" == "__deny_unconfigured_models__") with
Kuadrant's documented deny-all idiom: remove the deny-unconfigured-models rule
and instead add a gateway-level AuthPolicy default entry
spec.defaults.rules.authorization.deny-all.opa.rego with the body "allow =
false"; update the defaults/auth policy where the symbols defaults, rules,
authentication, authorization, and deny-unconfigured-models appear so the policy
uses deny-all OPA rego rather than the fragile selector-based pattern.

1512-1514: ⚡ Quick win

Add explicit wait for AuthPolicy Accepted and Enforced conditions instead of generic sleep.

The script deploys the Gateway default-deny AuthPolicy but immediately proceeds with a generic 20-second sleep before smoke tests, risking false positives if Kuadrant's async policy enforcement hasn't completed. Add explicit condition waits:

Proposed fix
   ok "Gateway default-deny AuthPolicy deployed"
 fi
+
+# Wait for Kuadrant to accept and enforce the policy before traffic tests.
+kubectl wait --for=condition=Accepted authpolicy/gateway-default-auth \
+  -n "$GATEWAY_NAMESPACE" --timeout=60s || \
+  warn "gateway-default-auth not Accepted within 60s"
+kubectl wait --for=condition=Enforced authpolicy/gateway-default-auth \
+  -n "$GATEWAY_NAMESPACE" --timeout=60s || \
+  warn "gateway-default-auth not Enforced within 60s"

Kuadrant's AuthPolicy status conditions (Accepted and Enforced) provide deterministic readiness signals as defined in RFC 0004. Replace the generic sleep with these specific checks to fail fast if policy enforcement fails.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/scripts/local-deploy.sh` around lines 1512 - 1514, Replace the
generic sleep after deploying the Gateway default-deny AuthPolicy with explicit
readiness checks that wait for the AuthPolicy's status.conditions "Accepted" and
"Enforced" to become True; locate the AuthPolicy apply/creation step in the
script (the resource name or variable used to create the Gateway default-deny
AuthPolicy) and replace the "sleep 20" with deterministic checks (e.g., kubectl
wait or a short polling loop that uses kubectl get <AuthPolicyName> -o json and
inspects status.conditions for types Accepted and Enforced == True), timeout
quickly (e.g., 60s) and exit non‑zero if either condition is not satisfied so
the smoke tests run only after the policy is accepted and enforced.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@test/e2e/scripts/local-deploy.sh`:
- Around line 1545-1552: The smoke test's HTTP check (assigning HTTP_CODE from
the curl call) currently treats 200 as success; update the conditional logic
that inspects HTTP_CODE (the block using the ok/warn helpers) so that only 401
or 403 are treated as success (call ok with a message like "Gateway routing +
auth working (401/403 expected without API key)"); for any other code (including
200) fail the script (e.g., call the existing fail helper or exit non-zero) and
emit a clear error message indicating unexpected open gateway. Ensure you change
the branches that reference HTTP_CODE and the ok/warn behavior so 200 no longer
counts as success.

---

Nitpick comments:
In `@test/e2e/scripts/local-deploy.sh`:
- Around line 1485-1514: The current existence-check around kubectl get for the
AuthPolicy named gateway-default-auth prevents drift correction; remove the
if/else guard and always run the kubectl apply -f - <<EOF block so the
AuthPolicy (gateway-default-auth) in GATEWAY_NAMESPACE is declaratively
reconciled on every script run; keep the same YAML payload and ok "Gateway
default-deny AuthPolicy deployed" message after apply to preserve feedback.
- Around line 1500-1511: Replace the brittle magic-string rule under
defaults.rules.authorization.deny-unconfigured-models (the patternMatching
selector "context.request.http.method" == "__deny_unconfigured_models__") with
Kuadrant's documented deny-all idiom: remove the deny-unconfigured-models rule
and instead add a gateway-level AuthPolicy default entry
spec.defaults.rules.authorization.deny-all.opa.rego with the body "allow =
false"; update the defaults/auth policy where the symbols defaults, rules,
authentication, authorization, and deny-unconfigured-models appear so the policy
uses deny-all OPA rego rather than the fragile selector-based pattern.
- Around line 1512-1514: Replace the generic sleep after deploying the Gateway
default-deny AuthPolicy with explicit readiness checks that wait for the
AuthPolicy's status.conditions "Accepted" and "Enforced" to become True; locate
the AuthPolicy apply/creation step in the script (the resource name or variable
used to create the Gateway default-deny AuthPolicy) and replace the "sleep 20"
with deterministic checks (e.g., kubectl wait or a short polling loop that uses
kubectl get <AuthPolicyName> -o json and inspects status.conditions for types
Accepted and Enforced == True), timeout quickly (e.g., 60s) and exit non‑zero if
either condition is not satisfied so the smoke tests run only after the policy
is accepted and enforced.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 35495ae9-a5c2-4e39-9876-1bc05572520e

📥 Commits

Reviewing files that changed from the base of the PR and between 4ea1a85 and a43573e.

📒 Files selected for processing (1)
  • test/e2e/scripts/local-deploy.sh

…tighten smoke test

- Drop existence guard: always kubectl apply (idempotent, ensures drift correction)
- Wait for AuthPolicy Accepted condition before proceeding (replaces blind sleep)
- Smoke test: 200 without auth is now a warning, not success
@rhods-ci-bot
Copy link
Copy Markdown

@yossiovadia: The following test has Failed:

OCI Artifact Browser URL

View in Artifact Browser

Inspecting Test Artifacts Manually

To inspect your test artifacts manually, follow these steps:

  1. Install ORAS (see the ORAS installation guide).
  2. Download artifacts with the following commands:
mkdir -p oras-artifacts
cd oras-artifacts
oras pull quay.io/opendatahub/odh-ci-artifacts:maas-group-test-2fwxp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants