Automation harnesses for Specmatic labs live in lab-named folders in this repo.
Current automation scope validates CLI/runtime behavior and generated artifacts. It does not automate Specmatic Studio flows, but the comparison report can indicate whether a lab README documents a Studio component.
Prerequisites:

- Python 3.14.x
- Docker with the daemon running
- a sibling upstream checkout at `../labs`
Current labs:

- `api-coverage` (README: `api-coverage/README.md`)
- `api-resiliency-testing` (README: `api-resiliency-testing/README.md`)
- `api-security-schemes` (README: `api-security-schemes/README.md`)
- `async-event-flow` (README: `async-event-flow/README.md`)
- `backward-compatibility-testing` (README: `backward-compatibility-testing/README.md`)
- `continuous-integration` (README: `continuous-integration/README.md`)
- `data-adapters` (README: `data-adapters/README.md`)
- `dictionary` (README: `dictionary/README.md`)
- `external-examples` (README: `external-examples/README.md`)
- `filters` (README: `filters/README.md`)
- `kafka-avro` (README: `kafka-avro/README.md`)
- `kafka-sqs-retry-dlq` (README: `kafka-sqs-retry-dlq/README.md`)
- `mcp-auto-test` (README: `mcp-auto-test/README.md`)
- `order-bff` (README: `order-bff/README.md`)
- `overlays` (README: `overlays/README.md`)
- `partial-examples` (README: `partial-examples/README.md`)
- `workflow-in-same-spec` (README: `workflow-in-same-spec/README.md`)
- `quick-start-api-testing` (README: `quick-start-api-testing/README.md`)
- `quick-start-async-contract-testing` (README: `quick-start-async-contract-testing/README.md`)
- `quick-start-contract-testing` (README: `quick-start-contract-testing/README.md`)
- `quick-start-mock` (README: `quick-start-mock/README.md`)
- `schema-design` (README: `schema-design/README.md`)
- `schema-resiliency-testing` (README: `schema-resiliency-testing/README.md`)
- `response-templating` (README: `response-templating/README.md`)
Set up the sibling upstream labs checkout and Docker images from the repo root with:

```shell
python3 setup.py
```

To force `../labs` back to the latest `main` before refreshing Docker images:

```shell
python3 setup.py --refresh-labs --force
```

Run every available lab harness from the repo root and build the consolidated and comparison reports with:

```shell
python3 run_all.py
```

Rebuild the consolidated and comparison reports from the existing lab snapshots without rerunning labs:

```shell
python3 rebuild_reports.py
```

Refresh an individual lab report from previously captured artifacts without rerunning the lab:

```shell
python3 api-coverage/run.py --refresh-report
python3 api-resiliency-testing/run.py --refresh-report
python3 api-security-schemes/run.py --refresh-report
python3 async-event-flow/run.py --refresh-report
python3 backward-compatibility-testing/run.py --refresh-report
python3 continuous-integration/run.py --refresh-report
python3 data-adapters/run.py --refresh-report
python3 dictionary/run.py --refresh-report
python3 external-examples/run.py --refresh-report
python3 filters/run.py --refresh-report
python3 kafka-avro/run.py --refresh-report
python3 kafka-sqs-retry-dlq/run.py --refresh-report
python3 mcp-auto-test/run.py --refresh-report
python3 order-bff/run.py --refresh-report
python3 overlays/run.py --refresh-report
python3 workflow-in-same-spec/run.py --refresh-report
python3 partial-examples/run.py --refresh-report
python3 quick-start-api-testing/run.py --refresh-report
python3 quick-start-async-contract-testing/run.py --refresh-report
python3 quick-start-contract-testing/run.py --refresh-report
python3 quick-start-mock/run.py --refresh-report
python3 schema-resiliency-testing/run.py --refresh-report
python3 schema-design/run.py --refresh-report
python3 response-templating/run.py --refresh-report
```

Outputs are written to:

- `output/consolidated-report/consolidated-report.json`
- `output/consolidated-report/consolidated-report.html`
- `output/consolidated-report/labs-comparison.json`
- `output/consolidated-report/labs-comparison.html`
- `output/consolidated-report/setup-output.json`
- `output/labs/<lab-name>-output/` for each lab run
Each lab's `output/` directory is copied into `output/labs/<lab-name>-output/` after the run completes. The consolidated report uses those copied folders, so the links remain stable even after the live lab output is cleaned up or refreshed.
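A minimal sketch of that snapshot step; the helper name and directory layout here are assumptions for illustration, not the actual harness implementation:

```python
import shutil
from pathlib import Path


def snapshot_lab_output(lab_dir: Path, repo_root: Path) -> Path:
    """Copy <lab>/output/ into output/labs/<lab-name>-output/ after a run."""
    src = lab_dir / "output"
    dest = repo_root / "output" / "labs" / f"{lab_dir.name}-output"
    if dest.exists():
        shutil.rmtree(dest)  # replace any stale snapshot from an earlier run
    shutil.copytree(src, dest)
    return dest
```

Because the consolidated report links into the snapshot rather than the live `output/` folder, a later rerun of the lab cannot invalidate an already-generated report.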
`run_all.py` starts by clearing the generated `output/labs/` and `output/consolidated-report/` folders before regenerating reports, so stale files from earlier runs do not leak into a new report set. `rebuild_reports.py` does not clean the output tree; it only refreshes the consolidated and comparison reports from the existing lab snapshots.
Each individual lab run also clears its own `<lab>/output/` directory before a normal run starts. Refresh-only runs skip that cleanup so they can rebuild from the saved artifacts already on disk.
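That clear-unless-refreshing rule can be sketched as follows; the function name and flag are illustrative assumptions, not the real runner code:

```python
import shutil
from pathlib import Path


def prepare_output_dir(lab_dir: Path, refresh_only: bool) -> Path:
    """Clear <lab>/output/ before a normal run; keep it for refresh-only runs."""
    out = lab_dir / "output"
    if not refresh_only and out.exists():
        shutil.rmtree(out)  # drop artifacts from the previous attempt
    out.mkdir(parents=True, exist_ok=True)
    return out
```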
For Docker-based labs, a normal run also performs a best-effort runtime cleanup before the first phase starts and again after the lab finishes. This keeps stale containers, networks, or volumes from an earlier lab attempt from leaking into later results without adding heavy cleanup work between every command.
Failure messages should be explicit and actionable.
Validation focus:

- the upstream lab `README.md` is the source of truth
- the console output from the automated lab run should match the README
- the generated CTRF JSON and sibling Specmatic HTML report should match the README and console output
- when a README documents commands, it should provide command sections for Windows, macOS, and Linux
- OS-specific command sections should use appropriate fenced block languages, such as `shell`/`bash` for macOS and Linux and `powershell`/`cmd` for Windows
- every documented command section should be followed by a console output snippet
- OS-specific command sections should have matching OS-specific console output snippets
- all README console output snippets should use `terminaloutput` fenced blocks
- when README console output includes timestamps, comparison logic should ignore the datetime stamp and focus on the meaningful output content
- Studio-only phases that are not automated yet should be reported as known limitations or skipped validations, not as failures
- intentional differences that are part of the lab design should be recorded as expected differences, not counted in the failure index
- copied source snapshots such as `specmatic.yaml`, example JSON files, or service source files may still be archived for inspection, but they should not drive pass/fail assertions by themselves
- the shared README template is configured in `lablib/readme_expectations.py`:
  - `README_TEMPLATE` defines the shared H1/H2/H3 schema and section-level command/output expectations
  - `LAB_README_OVERRIDES` defines lab-specific exceptions or manual Studio allowances
  - `EXPECTED_README_H2_SEQUENCE` is derived from that template for compatibility with existing comparison logic
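The timestamp-insensitive comparison can be sketched like this; the regex is an assumption about what the datetime stamps look like, not the comparison code actually used by the harness:

```python
import re

# Assumed stamp shape: ISO-like "2024-05-01 12:34:56" or "2024-05-01T12:34:56.789"
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:\.\d+)?")


def normalize(line: str) -> str:
    """Mask datetime stamps so comparison focuses on the meaningful content."""
    return TIMESTAMP.sub("<timestamp>", line).strip()


def outputs_match(readme_snippet: str, console_output: str) -> bool:
    """Compare two console transcripts line by line, ignoring timestamps."""
    left = [normalize(l) for l in readme_snippet.splitlines() if l.strip()]
    right = [normalize(l) for l in console_output.splitlines() if l.strip()]
    return left == right
```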
README command/output conventions:
- use executable fenced blocks for commands: `shell`, `bash`, `sh`, or `zsh` for macOS and Linux; `powershell`, `ps1`, `cmd`, or `bat` for Windows
- place a `terminaloutput` fenced block immediately after each documented command block
- when commands differ by OS, include a matching `terminaloutput` block for each OS-specific command
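A rough sketch of how the command/output pairing could be checked; the fence parsing is deliberately simplified, and this function is illustrative rather than the shared validator:

```python
import re

COMMAND_LANGS = {"shell", "bash", "sh", "zsh", "powershell", "ps1", "cmd", "bat"}
# Matches only opening fences that carry a language; bare closing fences do not.
FENCE = re.compile(r"^```(\w+)\s*$", re.MULTILINE)


def unpaired_commands(markdown: str) -> int:
    """Count command fences not immediately followed by a terminaloutput fence."""
    langs = [m.group(1) for m in FENCE.finditer(markdown)]
    missing = 0
    for i, lang in enumerate(langs):
        if lang in COMMAND_LANGS:
            follower = langs[i + 1] if i + 1 < len(langs) else None
            if follower != "terminaloutput":
                missing += 1
    return missing
```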
When a command or validation fails, the message should always say:
- what failed
- what the impact is
- what action is needed to fix it
Prefer concrete paths, commands, and missing artifacts over vague summaries or raw log excerpts.
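A tiny formatter can keep failure messages in that three-part shape; the function name and output layout are illustrative assumptions, not part of lablib:

```python
def failure_message(what: str, impact: str, action: str) -> str:
    """Render a failure with the three required parts: what, impact, action."""
    return (
        f"FAILED: {what}\n"
        f"Impact: {impact}\n"
        f"Action: {action}"
    )
```

For example, a missing CTRF report might be rendered with a concrete path and a concrete rerun command rather than a raw log excerpt.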
Non-failing validation states:
- use `assert_skipped(...)` for validations that are intentionally not implemented yet, such as documented Studio-only steps that labs-tests does not automate yet
- use `assert_expected(...)` for intentional differences that should stay visible in the report but should not count as failures
- skipped and expected validations should remain visible in the HTML report, but they should not appear in the failure index or contribute to the failure count
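A stripped-down sketch of that reporting rule (the real lablib helpers take more arguments, such as `category`, `code`, and `details`; the types here are assumptions):

```python
from dataclasses import dataclass


@dataclass
class ValidationResult:
    message: str
    status: str  # "passed", "failed", "skipped", or "expected"


def assert_skipped(message: str) -> ValidationResult:
    """A validation intentionally not implemented yet; visible, never failing."""
    return ValidationResult(message, "skipped")


def assert_expected(message: str) -> ValidationResult:
    """An intentional difference; visible in the report, never failing."""
    return ValidationResult(message, "expected")


def failure_count(results: list) -> int:
    """Skipped and expected results stay visible but never count as failures."""
    return sum(1 for r in results if r.status == "failed")
```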
How to mark an intentional difference as expected in a lab runner:
Use `assert_expected(...)` inside a phase's `extra_assertions`.
Example:
```python
from lablib.scaffold import assert_expected, detail


def baseline_assertions(context):
    return [
        assert_expected(
            "This baseline mismatch is intentional and should stay visible without failing the lab.",
            category="readme",
            code="readme.intentional-baseline-difference",
            details=[
                detail("Reason", "The README documents this mismatch as the before state."),
                detail("Action", "Do not fix this in labs-tests; fix only if the upstream README changes."),
            ],
        )
    ]
```

How to ignore a shared README validation without showing anything in the rendered README:
Add an HTML comment to the upstream README. GitHub and browsers do not render it, but labs-tests will read it.
Example:
```markdown
<!-- labs-tests: ignore readme.os_commands.coverage -->
<!-- labs-tests: ignore readme.command_output.followup readme.output.terminaloutput_fence -->
```

Supported shared README ignore codes currently include:

- `readme.commands.minimum_count`
- `readme.commands.executable_fences`
- `readme.os_commands.coverage`
- `readme.os_commands.fence_languages`
- `readme.os_output.path_coverage`
- `readme.command_output.followup`
- `readme.output.terminaloutput_fence`
- `readme.os_output.command_coverage`
- `readme.tests_run_summary.matches_console`
- `readme.structure.single_h1`
- `readme.structure.required_h2_sections`
- `readme.structure.required_h2_order`
- `readme.structure.unexpected_h2_sections`

When an ignore annotation is present:

- the validation is shown as skipped
- it remains visible in the report for traceability
- it does not count as a failure
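Scanning a README for those annotations can be sketched like this; the regex and helper are assumptions for illustration, not the labs-tests parser:

```python
import re

# One comment may list several space-separated ignore codes.
IGNORE = re.compile(r"<!--\s*labs-tests:\s*ignore\s+([^>]*?)\s*-->")


def ignored_codes(readme_text: str) -> set:
    """Collect every ignore code from labs-tests HTML comments in a README."""
    codes = set()
    for match in IGNORE.finditer(readme_text):
        codes.update(match.group(1).split())
    return codes
```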
How to mark missing test-count summaries as expected behavior for a lab:
- set `expected_missing_test_counts=True` in the lab's `LabSpec`
- set `expected_missing_test_counts_reason` to explain why this lab does not emit README/console/CTRF/HTML count summaries

Example:

```python
return LabSpec(
    ...,
    expected_missing_test_counts=True,
    expected_missing_test_counts_reason="This lab validates compatibility verdicts and does not emit test-count summaries.",
)
```

The comparison report will then show those phases as Expected instead of plain Not available.
GitHub Actions workflow:
- `.github/workflows/labs-tests.yml`
- runs `python3 run_all.py --refresh-labs --force --labs-branch dynamic-labs` by default
- accepts an optional space-separated `labs` workflow input to run only selected labs
- accepts an optional `labs_branch` workflow input; until the `dynamic-labs` work is merged, the default branch is `dynamic-labs`
- accepts an optional `manage_license` workflow input; by default the workflow creates or replaces `../labs/license.txt` before the run and restores or removes it afterward
- emits a 60-second heartbeat while the suite is still running, so quiet phases remain visibly active in Actions
- uses a 40-minute timeout for the workflow job and the main lab execution step
- publishes a GitHub job summary based on `output/consolidated-report/consolidated-report.json`
- includes the consolidated report path and comparison report path in the GitHub job summary so workflow runs can be checked quickly
- uploads `output/` plus every lab-local `*/output/` folder as the `specmatic-labs-reports` artifact
License lifecycle:
- `python3 run_all.py` manages `../labs/license.txt` by default
- local runs read `status.license` from `~/.specmatic/license.json` and write it to `../labs/license.txt`
- GitHub Actions runs read `SPECMATIC_LICENSE_KEY` and write it to `../labs/license.txt`
- after the run completes, the original `../labs/license.txt` content is restored, or the file is removed if it did not exist before the run
- use `python3 run_all.py --no-manage-license ...` to opt out locally