Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
16 commits
Select commit Hold shift + click to select a range
a9b2bef
feat: add churn-guard hook and evidence-gate workflow
harsh-batheja Apr 5, 2026
958d917
fix(evidence-gate): prevent script injection and template verdict bypass
harsh-batheja Apr 5, 2026
7f8d13b
fix(pr-template): use HTML comment for verdict placeholder to prevent…
harsh-batheja Apr 5, 2026
0790edc
fix(evidence-gate): match --> anywhere in line for multi-line comment…
harsh-batheja Apr 5, 2026
5aaf838
feat: add duplicate code check with jscpd
harsh-batheja Apr 5, 2026
e11f6b4
fix(pr-template): move terminal test output example into HTML comment
harsh-batheja Apr 5, 2026
509990d
fix: strip HTML comments in claim extraction; fix backslash regex in …
harsh-batheja Apr 5, 2026
fe78cd1
fix(evidence-gate): handle > inside HTML comments in all awk/sed patt…
harsh-batheja Apr 5, 2026
4a17e38
fix(evidence-gate): align evidence heading detection and extraction c…
harsh-batheja Apr 5, 2026
7ca3163
feat(duplicate-check): write clone report to job summary
harsh-batheja Apr 5, 2026
69d5a63
feat(core): embed churn-guard in gh PATH wrapper for all agents
harsh-batheja Apr 5, 2026
f6ad6cb
fix(core): make churn-guard non-blocking (warn instead of block)
harsh-batheja Apr 5, 2026
4d44bd1
test(agent-codex): update wrapper version and case assertions for 0.3.0
harsh-batheja Apr 6, 2026
f10dbd7
fix(evidence-gate): make all checks non-blocking (warn instead of fail)
harsh-batheja Apr 6, 2026
4e19c31
fix(evidence-gate): move PASS message into else branch to avoid contr…
harsh-batheja Apr 6, 2026
5a33fe3
fix(evidence-gate): enforce evidence gate failures consistently
harsh-batheja Apr 6, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 34 additions & 0 deletions .github/pull_request_template.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
## Summary

<!-- What changed and why? 1–3 bullet points. -->

-
-

Closes #<!-- issue number -->

## Test plan

<!-- How was this tested? What should reviewers check? -->

- [ ]

## Evidence

**Claim class**: <!-- unit | integration | feat | fix | refactor | docs | chore -->

<!-- Describe what was tested and how. For `integration` and `pipeline-e2e` claims,
include a **Terminal test output** section with a fenced code block showing test
command output (e.g. pnpm test, vitest, jest). -->

**Terminal test output**:

<!--
Paste your actual test output in a fenced code block, e.g.:
```
$ pnpm test --filter my-package
✓ all tests passed
```
Comment thread
cursor[bot] marked this conversation as resolved.
-->

**Verdict**: <!-- replace this comment with PASS or INSUFFICIENT -->
55 changes: 55 additions & 0 deletions .github/workflows/duplicate-check.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
name: Duplicate Code Check

on:
push:
branches: [main]
pull_request:
branches: [main]

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
duplicate-check:
name: Duplicate Code Check
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: pnpm/action-setup@v4
- uses: actions/setup-node@v4
with:
node-version: 20
cache: pnpm
- run: pnpm install --frozen-lockfile

- name: Run jscpd and capture output
id: jscpd
# Always run (don't stop on non-zero exit) so we can write the summary first.
# The exit code is preserved in steps.jscpd.outputs.exit_code and re-applied after.
run: |
set +e
OUTPUT=$(pnpm check:duplicates 2>&1)
EXIT_CODE=$?
set -e

echo "exit_code=$EXIT_CODE" >> "$GITHUB_OUTPUT"

# Write job summary so clone details surface on the PR checks page
{
if [ $EXIT_CODE -ne 0 ]; then
echo "## ❌ Duplicate Code Check — threshold exceeded"
else
echo "## ✅ Duplicate Code Check — within threshold"
fi
echo ""
echo '```'
echo "$OUTPUT"
echo '```'
echo ""
echo "_Threshold: 3% of lines. Clones shown above even when passing._"
echo "_To suppress a false positive, add the files to \`.jscpd.json\` ignore list._"
} >> "$GITHUB_STEP_SUMMARY"

# Re-apply the original exit code so the step fails when threshold is crossed
exit $EXIT_CODE
210 changes: 210 additions & 0 deletions .github/workflows/evidence-gate.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,210 @@
name: Evidence Gate

on:
pull_request:
types: [opened, synchronize, edited, reopened]

# Do not cancel in-progress runs: a second event (e.g. push + bot PR-body edit)
# would cancel the first job; GitHub surfaces that as a failed check on the same SHA.
concurrency:
group: evidence-gate-${{ github.event.pull_request.number }}
cancel-in-progress: false

permissions:
pull-requests: read

jobs:
evidence-gate:
name: Evidence Gate
runs-on: ubuntu-latest
# Skip entirely when PR is merged or closed — a merged PR stops receiving
# pull_request events so a stale failed check run cannot be overwritten.
# Evidence gate is a pre-merge gate; post-merge it has no function.
if: github.event.pull_request.merged == false && github.event.action != 'closed'
steps:
- uses: actions/checkout@v4.1.1

- name: Write PR body to temp file
env:
PR_BODY: ${{ github.event.pull_request.body }}
run: |
printf '%s' "$PR_BODY" > "$RUNNER_TEMP/pr_body.txt"
echo "Body fetched: ${#PR_BODY} chars"
if [ ${#PR_BODY} -eq 0 ]; then
echo "PR body is empty — treating as no evidence bundle"
echo "skip=true" >> "$GITHUB_OUTPUT"
fi
id: write_body

- name: Check for evidence bundle in PR body
id: check
run: |
if [ "${{ steps.write_body.outputs.skip }}" = "true" ]; then
echo "found=false" >> "$GITHUB_OUTPUT"
exit 0
fi

BODY=$(cat "$RUNNER_TEMP/pr_body.txt")
STRIPPED_BODY=$(printf '%s' "$BODY" | python3 -c 'import re, sys; sys.stdout.write(re.sub(r"<!--.*?-->", "", sys.stdin.read(), flags=re.S))')

if printf '%s' "$STRIPPED_BODY" | grep -qi '^[[:space:]]*## evidence'; then
echo "found=true" >> "$GITHUB_OUTPUT"
else
echo "found=false" >> "$GITHUB_OUTPUT"
fi

- name: Warn when no Evidence section found
if: steps.check.outputs.found == 'false'
run: |
echo "WARNING: No ## Evidence section found in PR body."
echo ""
echo "Consider adding an Evidence section to your PR body:"
echo ""
echo " ## Evidence"
echo " **Claim class**: unit | integration | feat | fix | refactor | docs | chore"
echo " **Verdict**: PASS | INSUFFICIENT"
echo ""
echo " Describe what was tested and how."
exit 1

Comment thread
harsh-batheja marked this conversation as resolved.
- name: Extract and normalize claim class
id: claim
if: steps.check.outputs.found == 'true'
run: |
BODY=$(cat "$RUNNER_TEMP/pr_body.txt")
STRIPPED_BODY=$(printf '%s' "$BODY" | python3 -c 'import re, sys; sys.stdout.write(re.sub(r"<!--.*?-->", "", sys.stdin.read(), flags=re.S))')

# Extract claim class — normalize to lowercase short form.
# Supports: **Claim class**: value (colon inside bold, space after colon)
# and list-bullet format "- **Claim class**: ...".
CLAIM=$(printf '%s' "$STRIPPED_BODY" \
| grep -i '\*\*Claim class' | grep -v '^- ' | head -1 \
| tr -d '*' \
| sed 's/.*Claim class: *//I' \
| sed 's/(.*//' \
| tr '[:upper:]' '[:lower:]' \
| tr ' ' '-' | tr -d '\t' \
| sed 's/^[ \t-]*//;s/[ \t-]*$//')

# Fallback: list-bullet format
if [ -z "$CLAIM" ]; then
CLAIM=$(printf '%s' "$STRIPPED_BODY" \
| grep -i '^-.*\*\*Claim class' | head -1 \
| tr -d '*' \
| sed 's/.*Claim class: *//I' \
| sed 's/(.*//' \
| tr '[:upper:]' '[:lower:]' \
| tr ' ' '-' | tr -d '\t' \
| sed 's/^[ \t-]*//;s/[ \t-]*$//')
fi
Comment thread
cursor[bot] marked this conversation as resolved.

# Normalize long forms to canonical short forms
case "$CLAIM" in
unit-test-coverage|unit-test) CLAIM="unit" ;;
integration-test) CLAIM="integration" ;;
bug-fix) CLAIM="fix" ;;
feature) CLAIM="feat" ;;
esac

echo "Claim class: $CLAIM"
echo "claim=$CLAIM" >> "$GITHUB_OUTPUT"

- name: Enforce strong artifact evidence for integration claims
if: steps.check.outputs.found == 'true'
env:
CLAIM: ${{ steps.claim.outputs.claim }}
run: |
BODY=$(cat "$RUNNER_TEMP/pr_body.txt")
STRIPPED_BODY=$(printf '%s' "$BODY" | python3 -c 'import re, sys; sys.stdout.write(re.sub(r"<!--.*?-->", "", sys.stdin.read(), flags=re.S))')

# integration + pipeline-e2e claims require rich artifacts; unit/fix/feat/docs/chore can use terminal evidence only.
if [ "$CLAIM" = "integration" ] || [ "$CLAIM" = "pipeline-e2e" ]; then
# Strip HTML comments before scanning for Evidence section.
EVIDENCE=$(printf '%s\n' "$STRIPPED_BODY" | awk '
tolower($0) ~ /^[[:space:]]*##[[:space:]]+evidence([[:space:]]|$)/ { in_section=1; next }
in_section && /^[[:space:]]*##[[:space:]]/ { exit }
in_section { print }
')

# Warn on fabricated/placeholder evidence
if printf '%s' "$EVIDENCE" | grep -qiE '\bsimulated\b'; then
echo "WARNING: Evidence contains 'simulated' — fabricated output is not valid."
fi
if printf '%s' "$EVIDENCE" | grep -qiE 'https?://(www\.)?example\.com'; then
echo "WARNING: Evidence contains example.com placeholder URL — use a real screenshot URL."
fi
if printf '%s' "$EVIDENCE" | grep -qiE '<screenshot[[:space:]]path>|<path>|<value>|\bTODO\b|\bTBD\b'; then
echo "WARNING: Evidence contains placeholder template text — fill in real values."
fi

# Terminal/test output: fenced code block with a concrete test-command keyword.
HAS_OUTPUT=false
TTO_BLOCK=$(printf '%s' "$EVIDENCE" | awk '
/\*\*Terminal test output(\*\*)?:/ { show=1 }
show && /\*\*UI media(\*\*)?:/ { exit }
show { print }
')
if printf '%s' "$TTO_BLOCK" | grep -q '```'; then
BLOCK=$(printf '%s' "$TTO_BLOCK" | sed -n '/```/,/```/p' | tail -n +2 | sed '$d')
if printf '%s' "$BLOCK" | grep -qiE '\$[[:space:]]*(pnpm|npm|pytest|vitest|jest|go[[:space:]]+test)[[:space:]]'; then
HAS_OUTPUT=true
fi
fi
if [ "$HAS_OUTPUT" = "false" ] && printf '%s' "$TTO_BLOCK" | grep -qE '\*\*Terminal test output(\*\*)?:' && printf '%s' "$TTO_BLOCK" | grep -qE 'https://[^[:space:]]+'; then
HAS_OUTPUT=true
fi

if [ "$HAS_OUTPUT" != "true" ]; then
echo "WARNING: Strong evidence standard not met for claim class '$CLAIM'."
echo "Recommended for integration/pipeline-e2e claims:"
echo " - **Terminal test output**: fenced code block with a concrete test command"
echo " (e.g. \`\`\` block containing: pnpm test, npm test, vitest, jest, etc.)"
exit 1
else
echo "Strong evidence standard PASS for $CLAIM"
fi
else
echo "Strong artifact check not required for claim class: $CLAIM"
fi

- name: Validate verdict is present and consistent
if: steps.check.outputs.found == 'true'
run: |
BODY=$(cat "$RUNNER_TEMP/pr_body.txt")
STRIPPED_BODY=$(printf '%s' "$BODY" | python3 -c 'import re, sys; sys.stdout.write(re.sub(r"<!--.*?-->", "", sys.stdin.read(), flags=re.S))')

# Extract from "## Evidence" to EOF to avoid false matches before the section,
# then strip HTML comments so template placeholders (<!-- PASS | INSUFFICIENT -->)
# cannot satisfy the verdict check on an unfilled template.
EVIDENCE_SECTION=$(printf '%s\n' "$STRIPPED_BODY" | awk '
tolower($0) ~ /^[[:space:]]*##[[:space:]]+evidence([[:space:]]|$)/ { in_section=1; next }
in_section && /^[[:space:]]*##[[:space:]]/ { exit }
in_section { print }
')

# Empty extraction means there was no real Evidence section after comment stripping.
if [ -z "$EVIDENCE_SECTION" ]; then
echo "WARNING: No ## Evidence section found in PR body after stripping HTML comments."
exit 1
fi
Comment thread
cursor[bot] marked this conversation as resolved.

# Check for PASS verdict — scoped to Evidence section
if printf '%s' "$EVIDENCE_SECTION" | grep -qi '[Vv]erdict.*:.*[Pp][Aa][Ss][Ss]'; then
Comment thread
cursor[bot] marked this conversation as resolved.
echo "Verdict: PASS — evidence gate passes"

# Check for INSUFFICIENT verdict — gate passes; used when evidence is partial
elif printf '%s' "$EVIDENCE_SECTION" | grep -qi '[Vv]erdict.*:.*[Ii][Nn][Ss][Uu][Ff][Ff][Ii][Cc][Ii][Ee][Nn][Tt]'; then
echo "Verdict: INSUFFICIENT — gate passes (marks work-in-progress evidence)"

# Check for FAIL verdict — gate passes with a warning
elif printf '%s' "$EVIDENCE_SECTION" | grep -qi '[Vv]erdict.*:.*[Ff][Aa][Ii][Ll]'; then
echo "Verdict: FAIL with present bundle — this bundle should be re-examined"

# No verdict found — warn
else
echo "WARNING: No verdict found in evidence bundle."
echo "Consider adding one of the following to your ## Evidence section:"
echo " **Verdict**: PASS"
echo " **Verdict**: INSUFFICIENT"
exit 1
fi
17 changes: 17 additions & 0 deletions .jscpd.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
{
"minLines": 8,
"minTokens": 50,
"threshold": 3,
"pattern": "**/*.{ts,tsx}",
"ignore": [
"**/dist/**",
"**/node_modules/**",
"**/__tests__/**",
"**/*.test.ts",
"**/*.test.tsx",
"**/*.spec.ts",
"**/*.spec.tsx"
],
"reporters": ["console"],
"gitignore": true
}
56 changes: 52 additions & 4 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -254,10 +254,25 @@ See [docs/DEVELOPMENT.md](docs/DEVELOPMENT.md) for the full reference. The short
chore: update vitest to v2
```

5. **Push and open a PR**. In the PR description:
- What changed and why
- How to test it
- Link to the issue it closes (e.g., `Closes #123`)
5. **Push and open a PR**. The PR template includes an **Evidence** section — fill it in:

```markdown
## Evidence

**Claim class**: unit | integration | feat | fix | refactor | docs | chore

**Terminal test output**:
```
$ pnpm test
# ... test output ...
```

**Verdict**: PASS | INSUFFICIENT
```

The `evidence-gate` CI check enforces this. PRs without a `## Evidence` section or
without a `**Verdict**` line will fail the check. For `integration` and `pipeline-e2e`
claim classes, a fenced code block showing test command output is also required.

6. **Address review comments** — update the branch and push. Reply to comments when done.

Expand All @@ -276,6 +291,39 @@ All PRs must pass:
- `pnpm test` — all tests green
- `pnpm lint` — no lint errors
- Secret scanning — no leaked credentials
- `evidence-gate` — PR body must include a `## Evidence` section with a `**Verdict**`

### Churn Guard (optional local hook)

The churn-guard hook prevents creating duplicate PRs that modify the same files as an existing open PR. To install it locally (requires Claude Code):

```bash
mkdir -p .claude/hooks
cp scripts/hooks/churn-guard.sh .claude/hooks/churn-guard.sh
```

Then add the following to `.claude/settings.json`:

```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": ".claude/hooks/churn-guard.sh",
"timeout": 60000
}
]
}
]
}
}
```

The hook intercepts `gh pr create` commands and checks for file overlap with open PRs. To bypass, add `Supersedes #<N>` to the PR body.

---

Expand Down
Loading
Loading