feat: add ai-lakera-guard plugin for prompt injection and PII detection #13291

@nic-6443

Description


Add an ai-lakera-guard plugin that integrates with the Lakera Guard API for LLM request/response security scanning — covering prompt injection, jailbreak detection, PII leakage, and content moderation.

Background:

APISIX currently has ai-prompt-guard (regex-based allow/deny), ai-aws-content-moderation (AWS Comprehend), and ai-aliyun-content-moderation (Aliyun SaaS). However, there is no integration with Lakera Guard, one of the most widely adopted commercial prompt security services. Kong already ships an ai-lakera-guard plugin; LiteLLM also supports Lakera as a guardrail backend.

Lakera Guard provides ML-based detection that goes beyond regex patterns — it catches semantic prompt injection, jailbreak attempts, PII entities, profanity, and unknown/malicious links through a single API call.
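For a rough feel of what "a single API call" means here, the sketch below builds a Guard request body and extracts flagged categories from a response, with no network call. The response shape (a `results` list with `detector_type` and `flagged` fields) is an assumption for illustration and would need to be checked against Lakera's API documentation.

```python
import json

# Hypothetical sketch: build a Lakera Guard v2 request body for a chat prompt
# and pull the flagged detector types out of a response. The response shape
# used here is an assumption, not confirmed against Lakera's docs.

def build_guard_payload(messages):
    """Wrap chat messages in the body POSTed to https://api.lakera.ai/v2/guard."""
    return json.dumps({"messages": messages})

def flagged_detectors(response_body):
    """Return the set of detector types the Guard response flagged."""
    data = json.loads(response_body)
    return {r["detector_type"] for r in data.get("results", []) if r.get("flagged")}

# Example with a mocked response (no network call):
mock_response = json.dumps({
    "results": [
        {"detector_type": "prompt_injection", "flagged": True},
        {"detector_type": "pii", "flagged": False},
    ]
})
print(flagged_detectors(mock_response))  # {'prompt_injection'}
```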

Proposed design (for reference, open to adjustment during implementation):

plugins:
  ai-lakera-guard:
    direction: both              # input / output / both
    action: block                # block / alert (log only)

    endpoint:
      url: "https://api.lakera.ai/v2/guard"
      api_key_ref: "$secret://vault/kv/lakera/..."
      timeout_ms: 1000

    categories:
      - prompt_injection
      - jailbreak
      - pii
      - profanity
      - unknown_links

    reveal_failure_categories: true   # include matched categories in error response (for debugging)

    fail_open: false             # reject requests if Lakera API is unavailable (safety-first default)

    on_block:
      status: 400
      body: "Request blocked by security guard"
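To make the intended action / fail_open semantics concrete, here is a minimal decision-flow sketch. This is plugin-agnostic Python, not the actual Lua implementation; the function, field names, and the 503 fallback status are illustrative choices, open to adjustment like the rest of the design.

```python
# Illustrative decision logic for the proposed action / fail_open semantics.
# scan_result is None when the Lakera API was unreachable or timed out;
# otherwise it is the set of matched categories (empty set = clean).

def decide(scan_result, action="block", fail_open=False,
           reveal_failure_categories=True, status=400,
           body="Request blocked by security guard"):
    """Return (http_status, response_body) to short-circuit with, or None to pass."""
    if scan_result is None:                       # Guard API unavailable
        if fail_open:
            return None                           # availability-first: let it through
        return 503, "Security guard unavailable"  # safety-first default: reject
    if not scan_result:                           # nothing matched
        return None
    if action == "alert":                         # log-only mode: never block
        return None
    if reveal_failure_categories:                 # debugging aid; off in prod
        return status, body + " (" + ", ".join(sorted(scan_result)) + ")"
    return status, body

print(decide({"prompt_injection"}))
# (400, 'Request blocked by security guard (prompt_injection)')
```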

Key design points:

  1. Follows APISIX's existing pattern of one plugin per external service (same as ai-aws-content-moderation, ai-aliyun-content-moderation)
  2. Supports bidirectional scanning (both input prompts and LLM responses)
  3. Users bring their own Lakera account and API key — the plugin is just the connector
  4. fail_open: false as the safety-first default; users can override for availability-first scenarios
  5. reveal_failure_categories helps debugging in dev; can be disabled in production
  6. Category selection lets users enable only the checks they need
  7. Integrates with APISIX secret management ($secret://) for API key storage
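Point 2 (bidirectional scanning) maps naturally onto scanning in APISIX's request and response phases. A hypothetical gating helper, with phase names purely illustrative:

```python
# Hypothetical mapping of the `direction` option onto proxy phases:
# input scanning runs while handling the request, output scanning while
# handling the LLM response. Phase names here are illustrative only.

def should_scan(direction, phase):
    """Decide whether this phase needs a Lakera Guard call."""
    if direction == "both":
        return True
    if direction == "input":
        return phase == "request"
    if direction == "output":
        return phase == "response"
    raise ValueError("direction must be input, output, or both")

print(should_scan("both", "response"))   # True
print(should_scan("input", "response"))  # False
```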

This plugin is part of a broader effort to build out APISIX's AI security plugin ecosystem. Other plugins in the family (like a regex-based PII sanitizer, a Presidio integration, or a Llama Guard integration) can be proposed separately. This issue focuses specifically on the Lakera Guard integration.

Happy to submit a PR if this direction makes sense.
