The Production Test Framework provides automated infrastructure for deploying, validating, and testing production platform components. This framework automates cluster setup, service deployment, and execution of integration tests.
The framework enables end-to-end testing of the following areas:
- K3s Cluster Management: Automated bootstrap and validation of k3s clusters
- LGTM Stack Deployment: Automated deployment of Loki, Grafana, Tempo, and Mimir
- Integration Testing: Automated test execution with proper setup and teardown
- Infrastructure Automation: SSH tunnels, kubeconfig management, and dependency synchronization
To get tests running quickly: clone the repo, copy `env.example` to `.env`, and edit it to set the required (and any optional) environment variables:
```shell
git clone <repo-url> production-test-framework
cd production-test-framework
cp env.example .env
# Edit .env: CLUSTER, and the other fields checked by `make prereqs`
uv sync
make test
```

The sections below cover prerequisites, cluster access, the other Makefile targets, and running in Docker.
Before using the framework, ensure you have the following installed:
- `kubectl` - Kubernetes command-line tool
- `helm` - Kubernetes package manager
- `uv` - Python package manager
- `k3d` and `sudo` (only for `test-production`, `create-test-cluster`, and `destroy-test-cluster`, which create or delete a local k3d cluster)
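`make prereqs` checks for these tools before anything runs. As a rough illustration of that kind of check (a minimal sketch, not the framework's actual implementation), you can look each tool up on `PATH` from Python:

```python
import shutil

def missing_tools(tools=("kubectl", "helm", "uv")):
    """Return the required CLI tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

# Example: report anything missing before attempting a run.
for tool in missing_tools():
    print(f"missing prerequisite: {tool}")
```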
The framework requires the following environment variables to be set:
```shell
export ANSIBLE_REMOTE_USER="your-ssh-username"
export REMOTE_HOST="target-cluster-hostname-or-ip"
export CLUSTER="cluster-name"
export ANSIBLE_INVENTORY_FILE="/path/to/ansible/inventory.ini"
```

For Qase test reporting, set `QASE_TESTOPS_API_TOKEN` (optional). If it is unset, `make prereqs` reports it as missing, but tests can still run.
You can also put these variables in a `.env` file in the project root: copy `env.example` to `.env` and edit the values. The `.env` file is loaded automatically when `make` runs.
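The same prerequisite check also covers environment variables. As a hedged sketch of that idea (a hypothetical helper, not the framework's code), you can verify the required variables before invoking `make`:

```python
import os

REQUIRED_VARS = (
    "ANSIBLE_REMOTE_USER",
    "REMOTE_HOST",
    "CLUSTER",
    "ANSIBLE_INVENTORY_FILE",
)

def missing_env(required=REQUIRED_VARS):
    """Return required variables that are unset or empty in the environment."""
    return [name for name in required if not os.environ.get(name)]
```

Optional variables such as `QASE_TESTOPS_API_TOKEN` would be reported but not treated as fatal.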
Clone this repository, then create your local configuration:
```shell
git clone <repo-url> production-test-framework
cd production-test-framework
cp env.example .env
# Edit .env (and add ansible/inventory.ini if you use Ansible outside this Makefile)
uv sync
```

Verify that all prerequisites are installed and environment variables are set:
```shell
make prereqs
```

Then run the tests with one of the top-level targets:

```shell
# Full flow: deploy Helm charts -> run tests -> undeploy
make test

# No deploy/undeploy: run the main suite (marker "not teardown")
make test-run-only

# k3d lifecycle: create cluster -> deploy -> run tests -> destroy cluster
make test-production

# Same as `make run-tests` with Open Mosaic specific setup and test markers
make test-openmosaic
```

Run `make help` for the full list, or `make help-container-targets` in the Docker image.
- `prereqs` - Check for missing prerequisites and environment variables
- `deploy-helm-charts` - Deploy charts (expects a configured `kubectl` context; uses `CLUSTER` in messages)
- `undeploy-helm-charts` - Remove the `mosaic` namespace / release
- `create-test-cluster` - Create a k3d cluster named `${CI_JOB_ID}-k3s` and a `mosaic` namespace
- `destroy-test-cluster` - Delete that k3d cluster
- `run-tests` - Run pytest once with `TEST_MARKER` (no Helm undeploy, no separate teardown pass)
- `run-all-tests` - `run-tests`, then `undeploy-helm-charts`, then pytest with the `teardown` marker
- `help` / `help-container-targets` - Print help (the latter is a short list for container/CI use)
- `test` - `prereqs` → `deploy-helm-charts` → `run-all-tests` (main tests, undeploy, teardown tests)
- `test-run-only` - `prereqs` and `run-tests` with marker `not teardown` (no deploy/undeploy)
- `test-production` - `create-test-cluster` → `deploy-helm-charts` → `run-tests` → `destroy-test-cluster`
- `test-openmosaic` - Same as `run-tests` (convenience target for an already-running stack)
- `TEST_MARKER` - Pytest marker (default: `k3s or lgtm or metrics`). `test-run-only` forces `not teardown` in the Makefile; set `TEST_MARKER` for other targets as needed.
- `PYTEST_ADDOPTS` - Extra pytest options (pytest reads this environment variable).
- `QASE_TESTOPS_RUN_TITLE` - Qase automated test run title. Default: "Production test run [dirty]". Override for CI or custom runs.
- `CI_JOB_ID` - Used in the k3d cluster name (default `local`); set in CI to avoid collisions.
```shell
make test-run-only PYTEST_ADDOPTS='-x'
make test QASE_TESTOPS_RUN_TITLE="CI run 123"
```

By default, tests are expected in `./tests/lgtm/` (a child directory). Set `TESTS_DIR` if your tests live elsewhere. Tests are organized by validation area:
- `test_k3s_cluster.py` - K3s cluster health and node validation
- `test_namespaces.py` - Namespace and pod validation
- `test_services.py` - Service and storage validation
- `test_teardown.py` - Post-teardown validation
Tests use pytest markers for organization:
- `k3s` - K3s cluster validation tests
- `lgtm` - LGTM stack integration tests
- `metrics` - Metrics-related tests (when present)
- `teardown` - Teardown validation tests
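`TEST_MARKER` values are combined with pytest's `-m` boolean syntax (`and`, `or`, `not`). As a rough sketch of the selection semantics (deliberately simplified; pytest's real expression parser is more robust), the default `k3s or lgtm or metrics` selects any test carrying at least one of those markers:

```python
import re

def marker_selects(expr: str, test_markers: set) -> bool:
    """Evaluate a simplified pytest -m expression against one test's marker set.
    Each bare word becomes True if the test carries that marker."""
    translated = re.sub(
        r"\b(?!and\b|or\b|not\b)(\w+)\b",
        lambda m: str(m.group(1) in test_markers),
        expr,
    )
    return eval(translated)  # safe here: only True/False/and/or/not remain

print(marker_selects("k3s or lgtm or metrics", {"lgtm"}))  # True
print(marker_selects("not teardown", {"teardown"}))        # False
```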
From the project root:
```shell
docker build -t production-test-framework .
```

Optionally pass the git hash as a build arg:

```shell
docker build --build-arg GIT_HASH=$(git rev-parse --short HEAD) -t production-test-framework .
```
Build the image as shown above. To run the container with a minimal setup:
```shell
docker run -it --rm production-test-framework
```

The sections below describe how to set environment variables, optionally forward SSH, and mount your tests and Helm charts so you can run `make` inside the container.
The required environment variables for cluster validation are `ANSIBLE_REMOTE_USER`, `REMOTE_HOST`, `CLUSTER`, and `ANSIBLE_INVENTORY_FILE`. Optional variables include `TESTS_DIR` and `QASE_TESTOPS_API_TOKEN` (see Required Environment Variables above).
You can provide them by:
- Mounting a `.env` file into the container (e.g. `-v $(pwd)/.env:/app/framework/.env:ro`). The Makefile loads `.env` from the framework directory when you run `make`.
- Passing variables with `-e VAR=value` or `--env-file` on each run.
If you do not mount a `.env` file, the image uses built-in defaults (see the Dockerfile). Copy `env.example` to `.env` and edit it for your environment.
If you run Ansible or other tools that SSH from the container, forward your agent so keys are available:
- On the host, ensure your SSH agent has the key loaded: `ssh-add -l` (use `ssh-add` to add it).
- When running the container, pass the agent socket in: `-e SSH_AUTH_SOCK=/tmp/ssh-agent/socket -v $SSH_AUTH_SOCK:/tmp/ssh-agent/socket`
If SSH connections fail, see SSH Connection Issues in Troubleshooting.
- Tests: The Makefile uses `/app/framework/tests` (see `TESTS_DIR`). The docker entrypoint copies `/app/tests/*` into `/app/framework/tests/`, so a typical mount is `-v /path/to/your/tests:/app/tests:ro`. You can also mount straight to `/app/framework/tests` if you do not rely on that copy. Tests are Python and run with pytest.
- Helm charts: The Makefile runs Helm from `/app/framework/charts/mosaic` (see `deploy-helm-charts`). You can mount that path directly, e.g. `-v /path/to/mosaic/charts/mosaic:/app/framework/charts/mosaic:ro`. Alternatively, mount your charts under `/app/charts`; `scripts/docker-entrypoint.sh` copies `/app/charts/*` into `/app/framework/charts/` at container start.
A full example that combines .env, tests, mosaic, and SSH agent forwarding is shown in the code block in the next section; see also scripts/launch_framework.sh.
By default, the container starts an interactive shell in `/app/framework`. The entrypoint prints a banner and runs `make help-container-targets` so you can see available targets. Run tests with `make`:
```shell
make test
make test-run-only
make test-production
make test-openmosaic
```

The Makefile loads `.env` from the framework directory, so a mounted `.env` is used automatically. For a full `docker run` example with env, tests, Helm charts, and optional SSH forwarding:
```shell
docker run -it --rm \
  -v $(pwd)/.env:/app/framework/.env:ro \
  -v /path/to/your/tests:/app/tests:ro \
  -v /path/to/helm/charts/mosaic:/app/framework/charts/mosaic:ro \
  -e SSH_AUTH_SOCK=/tmp/ssh-agent/socket \
  -v $SSH_AUTH_SOCK:/tmp/ssh-agent/socket \
  production-test-framework
```

Then run `make test` (or another target) inside the container. See Makefile Targets for all test targets.
If you set `RUN_MAKE_TARGET`, the entrypoint runs that make target and exits; no interactive shell or banner is shown. Use this for CI or one-off non-interactive runs:
```shell
docker run -it --rm \
  -e RUN_MAKE_TARGET=test-run-only \
  -v $(pwd)/.env:/app/framework/.env:ro \
  -v /path/to/your/tests:/app/tests:ro \
  -v /path/to/helm/charts/mosaic:/app/framework/charts/mosaic:ro \
  -e SSH_AUTH_SOCK=/tmp/ssh-agent/socket \
  -v $SSH_AUTH_SOCK:/tmp/ssh-agent/socket \
  production-test-framework
```

Pre-built images are published to Docker Hub via GitHub Actions (see `.github/workflows/` for CI; configure `DOCKERHUB_USERNAME` and `DOCKERHUB_TOKEN` secrets for pushes).
If `make prereqs` shows missing tools:

```shell
# Install missing prerequisites
# For macOS:
brew install kubectl helm

# For Python/uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Ansible is provided by the project's Python dependencies; run `uv sync` from the project root so that `ansible` is available on your PATH via `uv run` when you need it.
If SSH connections fail:
- Ensure your SSH agent has the key loaded: `ssh-add -l` (use `ssh-add` to add it).
- Test the SSH connection manually: `ssh $ANSIBLE_REMOTE_USER@$REMOTE_HOST`
- Check that `ANSIBLE_REMOTE_USER` and `REMOTE_HOST` are set correctly.
If tests fail:
- Check cluster status: `kubectl get nodes`
- Verify pods are running: `kubectl get pods -A`
- Check the Helm chart deployment: `helm list -n mosaic`
- Review test output for specific error messages.
For more information:
- Run `make help` from the project root to see all available targets.
- Review the test documentation in `../tests/lgtm/README.md` if you use the default tests layout.