Check compatibility of evolving JSON schemas.
jsoncompat currently supports JSON Schema Draft 2020-12 only. If a schema declares `$schema`, it must be `https://json-schema.org/draft/2020-12/schema`, with an optional trailing `#`.
> **Warning:** This is alpha software. Not all incompatible changes are detected, and there may be false positives. Contributions are welcome!

Docs and examples at jsoncompat.com.
Install the CLI with Cargo:

```shell
cargo install jsoncompat
```

Check whether a schema change is compatible for a serializer:

```shell
jsoncompat compat old-schema.json new-schema.json --role serializer
```

Check whether a schema change is compatible in both serializer and deserializer directions, with fuzzing enabled to look for a concrete counterexample:

```shell
jsoncompat compat old-schema.json new-schema.json --role both --fuzz 1000 --depth 8
```

Generate example JSON values accepted by a schema:

```shell
jsoncompat generate schema.json --count 5 --pretty
```

Compare two golden schema files in CI:

```shell
jsoncompat ci old-golden.json new-golden.json --display table
```

Run the guided CLI smoke test:

```shell
jsoncompat demo --noninteractive
```

Imagine you have an API that returns some JSON data, or JSON that you're storing in a database or file. You need to ensure that new code can read old data and that old code can read new data.
It's difficult to version JSON schemas in a traditional sense, because they can break in two directions:
- If a schema is used by the party generating the data, or "serializer", then a change that can break clients using an older version of the schema should be considered breaking. For example, removing a required property from a schema with the serializer role is a breaking change.

  More formally, consider a serializer schema $S_A$ which is changed to $S_B$. This change should be considered breaking if there exists some JSON value that is valid against $S_B$ but invalid against $S_A$.

  As a concrete example, if you're a webserver that returns JSON data with the following schema:

  ```json
  {
    "type": "object",
    "properties": {
      "id": { "type": "integer" },
      "name": { "type": "string" }
    },
    "required": ["id", "name"]
  }
  ```

  and you make `name` optional:

  ```json
  {
    "type": "object",
    "properties": {
      "id": { "type": "integer" },
      "name": { "type": "string" }
    },
    "required": ["id"]
  }
  ```

  then you've made a breaking change for any client that is using the old schema.

  We assume that the serializer will not write additional properties that are not in the schema, even if `additionalProperties` is true. This allows us to treat a schema change that adds an optional property of some type as non-breaking.
- If a schema is used by a party receiving the data, or "deserializer", then a change that might fail to deserialize existing data should be considered breaking. For example, adding a required property to a deserializer schema is a breaking change.

  More formally, consider a deserializer schema $S_A$ which is changed to $S_B$. This change should be considered breaking if there exists some JSON value that is valid against $S_A$ but invalid against $S_B$.

  As a concrete example, imagine that you've been writing code that saves JSON data to a database with the following schema:

  ```json
  {
    "type": "object",
    "properties": {
      "id": { "type": "integer" },
      "name": { "type": "string" }
    },
    "required": ["id"]
  }
  ```

  and you make `name` required, attempting to load that data back into memory by deserializing it with the following schema:

  ```json
  {
    "type": "object",
    "properties": {
      "id": { "type": "integer" },
      "name": { "type": "string" }
    },
    "required": ["id", "name"]
  }
  ```

  You'll be unable to deserialize any stored data that lacks a `name` property, which is a breaking change for the deserializer role.

If a schema is used by both a serializer and a deserializer, then a change that can break either should be considered breaking.
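The two directions can be sketched with a toy required-property validator. This is a hedged Python illustration, not the jsoncompat implementation: the real checker proves schema-level subset relations, while this sketch merely tests one concrete value against both schema versions.

```python
import json

def is_valid(value, schema):
    """Toy validator: only checks object `required` keys (illustration only)."""
    if schema.get("type") == "object" and isinstance(value, dict):
        return all(key in value for key in schema.get("required", []))
    return False

old = {"type": "object", "required": ["id", "name"]}
new = {"type": "object", "required": ["id"]}  # `name` made optional

value = json.loads('{"id": 1}')  # a value the new serializer may now emit

# Serializer direction: breaking if some value is valid under NEW but not OLD.
serializer_breaking = is_valid(value, new) and not is_valid(value, old)

# Deserializer direction: breaking if some value is valid under OLD but not NEW.
# (This particular change only widens the schema, so this direction is fine.)
deserializer_breaking = is_valid(value, old) and not is_valid(value, new)

print(serializer_breaking, deserializer_breaking)  # True False
```

Finding such a concrete counterexample value is exactly what the `--fuzz` option automates.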
The jsoncompat crate has a small document-level API:
```rust
use jsoncompat::{Role, SchemaDocument, check_compat};
use serde_json::json;

let old = SchemaDocument::from_json(&json!({ "type": "string" })).unwrap();
let new = SchemaDocument::from_json(&json!({ "type": ["string", "null"] })).unwrap();
let compatible = check_compat(&old, &new, Role::Deserializer).unwrap();
```

- `SchemaDocument::from_json(&Value)` builds and canonicalizes a Draft 2020-12 schema document.
- `check_compat(&old, &new, Role::Serializer | Role::Deserializer | Role::Both)` returns whether the schema change is compatible for that role.
- `CompatibilityError` reports schema construction failures and compatibility features that are intentionally rejected instead of approximated.
- `json_schema_fuzz::ValueGenerator::generate(&schema, GenerationConfig::new(depth), rng)` is the separate Rust value-generation API.
Legend:
- ✅: the backcompat checker has an explicit subset rule, or the fuzzer has direct schema-guided candidate construction for that keyword.
- 🟡: the feature is parsed/evaluated, but the checker is conservative or the fuzzer is heuristic/retry-based for that keyword.
- ⚪: the keyword is accepted or stripped, but it is not represented in the resolved IR and does not contribute checker/fuzzer semantics.
- ⛔: schema construction fails with a typed error before the checker/fuzzer runs.
Important global caveats:

- `jsoncompat::check_compat` is a document-level structural subset checker over the resolved AST. It can return false negatives for conservative subset proofs and false positives for string-language constraints that are currently ignored when both sides are open string schemas.
- Compatibility checks reject non-integral `number.multipleOf` constraints with a typed `CompatibilityError` instead of approximating fractional divisor inclusion with `f64`.
- Runtime validation (`SchemaDocument::is_valid`) always delegates to the `jsonschema` backend compiled from the raw schema document, so non-`f64` numeric schemas/values follow the backend's support limits rather than the compatibility checker's exact numeric model.
- For serializer compatibility, the checker assumes serializers do not emit undeclared extra properties, even when `additionalProperties: true`.
- `json_schema_fuzz::ValueGenerator::generate` walks the resolved AST heuristically, then validates every candidate with the raw `jsonschema` backend. If the resolved schema is known to have no valid instances, it returns `GenerateError::Unsatisfiable` deterministically; otherwise, if no valid candidate is found within the retry budget, it returns `GenerateError::ExhaustedAttempts`.
- Boolean schema documents `true` and `false` are supported directly. `false` has no valid instances, so generation returns `Unsatisfiable`.
- Unknown extension keywords are preserved only when needed to keep local ref targets addressable; otherwise they are ignored by both the checker and the fuzzer.
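The generation contract described in the caveats (validate every candidate, fail closed with a typed error) can be sketched as a generic retry loop. This is a hedged Python illustration: the exception names mirror the Rust `GenerateError` variants, but the candidate and validator functions here are toy placeholders.

```python
import random

class Unsatisfiable(Exception):
    """Mirror of GenerateError::Unsatisfiable (illustration only)."""

class ExhaustedAttempts(Exception):
    """Mirror of GenerateError::ExhaustedAttempts (illustration only)."""

def generate(candidates, accepts, known_empty=False, attempts=1000):
    """Every returned value must pass the real validator; otherwise fail closed."""
    if known_empty:                 # e.g. the boolean schema `false`
        raise Unsatisfiable()
    for _ in range(attempts):
        value = candidates()        # heuristic, schema-guided candidate
        if accepts(value):          # final check against the raw validator
            return value
    raise ExhaustedAttempts()

rng = random.Random(0)
# Toy stand-in schema: integers that are multiples of 7.
value = generate(lambda: rng.randrange(100), lambda v: v % 7 == 0)
assert value % 7 == 0
```

The point of the final `accepts` check is that heuristic candidate construction never needs to be provably correct: an invalid candidate just costs one retry.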
| Keyword | Compat | Fuzz | Notes |
|---|---|---|---|
| `type` | ✅ | ✅ | Supports `null`, `boolean`, `object`, `array`, `string`, `number`, `integer`, and unions of those primitive types. |
| `enum` | ✅ | ✅ | Numeric equality follows JSON Schema semantics, so `1` and `1.0` are treated as equal. |
| `const` | ✅ | ✅ | Same semantic numeric equality as `enum`. |
| `minimum` | ✅ | ✅ | Integer bounds are normalized exactly where possible; number bounds use `f64`. |
| `maximum` | ✅ | ✅ | Integer bounds are normalized exactly where possible; number bounds use `f64`. |
| `exclusiveMinimum` | ✅ | 🟡 | Integer exclusives are canonicalized into inclusive integer bounds; number generation may rely on validator retries around open lower bounds. |
| `exclusiveMaximum` | ✅ | 🟡 | Integer exclusives are canonicalized into inclusive integer bounds; number generation may rely on validator retries around open upper bounds. |
| `multipleOf` | 🟡 | ✅ | Exact integer divisibility is supported for integer schemas and integer-valued `number.multipleOf`. Compatibility checks reject non-integral `number.multipleOf` constraints instead of approximating fractional divisor inclusion; runtime validation and generation still accept those schemas. |
| `minLength` | ✅ | 🟡 | The generic string generator respects this bound, but the format and pattern paths may ignore it and rely on validator retries. |
| `maxLength` | ✅ | 🟡 | The generic string generator respects this bound, but the format and pattern paths may ignore it and rely on validator retries. |
| `pattern` | 🟡 | 🟡 | `SchemaNode::accepts_value()` enforces regexes for finite-value checks using a cached matcher per pattern, but `check_compat` does not prove regex-language inclusion between open string schemas. Unsupported ECMAScript regex constructs are preserved as source text but treated as non-matching by the internal evaluator. Regex generation is best-effort and does not guarantee coverage for complex constructs. |
| `format` | 🟡 | 🟡 | The checker does not prove format-language inclusion between open string schemas. The fuzzer only synthesizes `date`, `date-time`, `time`, `email`, `idn-email`, `uri`, `iri`, `uri-reference`, `iri-reference`, `uuid`, `ipv4`, `ipv6`, `hostname`, and `idn-hostname`; unknown formats fall back to generic strings. |
| `contentEncoding` | ⚪ | ⚪ | Canonicalized for syntax, then stripped from the semantic IR. |
| `contentMediaType` | ⚪ | ⚪ | Canonicalized for syntax, then stripped from the semantic IR. |
| `contentSchema` | ⚪ | ⚪ | Nested schemas are canonicalized and ref-checked, but content decoding/validation is not modeled in the semantic IR. |
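The exclusive-bound canonicalization noted for `exclusiveMinimum` / `exclusiveMaximum` can be sketched as follows. This is a hedged, stdlib-only illustration and not the real rewrite in the canonicalizer; it assumes integral exclusive bounds on an integer schema, which is the only case the table marks as exact.

```python
def canonicalize_integer_bounds(schema):
    """Rewrite integer exclusiveMinimum/exclusiveMaximum into inclusive bounds.

    Assumes integral exclusive bounds (illustration only)."""
    out = dict(schema)
    if out.get("type") == "integer":
        if "exclusiveMinimum" in out:
            # x > 5 over the integers is the same constraint as x >= 6.
            out["minimum"] = max(out.pop("exclusiveMinimum") + 1,
                                 out.get("minimum", float("-inf")))
        if "exclusiveMaximum" in out:
            out["maximum"] = min(out.pop("exclusiveMaximum") - 1,
                                 out.get("maximum", float("inf")))
    return out

canon = canonicalize_integer_bounds({"type": "integer", "exclusiveMinimum": 5})
assert canon == {"type": "integer", "minimum": 6}
```

Normalizing to inclusive integer bounds is what lets the checker compare bounds exactly instead of reasoning about open intervals.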
| Keyword | Compat | Fuzz | Notes |
|---|---|---|---|
| `properties` | ✅ | ✅ | Property schemas are compared recursively and generated directly. |
| `required` | ✅ | ✅ | Missing required properties are synthesized before generation returns a candidate. |
| `additionalProperties` | ✅ | ✅ | Checker support is subject to the serializer assumption above; the fuzzer generates additional keys only when this schema is not `false`. |
| `patternProperties` | 🟡 | 🟡 | Exact same regex keys are compared structurally, and property values are checked against every matching pattern schema. If the superset has no `patternProperties`, subset patterns fall back to `additionalProperties`; otherwise the checker does not prove regex-language inclusion between different patterns. The fuzzer does not synthesize property names from regexes. |
| `propertyNames` | 🟡 | 🟡 | Checker support depends on subset reasoning for the property-name schema itself, so regex/format caveats still apply. The fuzzer can generate keys from string/enum schemas and otherwise falls back to random names plus acceptance checks. |
| `minProperties` | ✅ | ✅ | Canonicalization raises `minProperties` to at least the number of required keys. |
| `maxProperties` | ✅ | ✅ | Checked structurally and used as a hard upper bound during generation. |
| `dependentRequired` | ✅ | 🟡 | The checker accounts for trigger keys admitted by `properties`, `patternProperties`, or `additionalProperties`. The fuzzer does not proactively synthesize dependencies, but invalid candidates are rejected by the final validator pass. |
| `dependentSchemas` | ⚪ | ⚪ | Nested schemas are canonicalized for dialect/ref validation, but `SchemaNodeKind::Object` does not store `dependentSchemas`, so checker/fuzzer semantics are absent. |
| `unevaluatedProperties` | ⚪ | ⚪ | Canonicalized recursively but not represented with evaluated-location bookkeeping in the resolved IR. |
| `dependencies` | ⚪ | ⚪ | Legacy keyword accepted by the parser but not represented in the resolved IR; use `dependentRequired` / `dependentSchemas` for Draft 2020-12 semantics. |
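The `minProperties` normalization noted in the table amounts to the following. This is a hedged sketch of the rule, not the canonicalizer's code.

```python
def raise_min_properties(schema):
    """minProperties can never be satisfied below the required-key count,
    so canonicalization lifts it to at least len(required)."""
    out = dict(schema)
    required = out.get("required", [])
    out["minProperties"] = max(out.get("minProperties", 0), len(required))
    return out

canon = raise_min_properties({
    "type": "object",
    "required": ["id", "name"],
    "minProperties": 1,
})
assert canon["minProperties"] == 2
```

Making the implicit floor explicit means the checker can compare `minProperties` values directly without re-deriving it from `required` on both sides.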
| Keyword | Compat | Fuzz | Notes |
|---|---|---|---|
| `items` | ✅ | ✅ | Schema-form `items` is compared recursively and used for tail-item generation. Tuple arrays must use Draft 2020-12 `prefixItems`; legacy tuple-form `items: [...]` is rejected. |
| `prefixItems` | ✅ | ✅ | Compared positionally and generated positionally. |
| `additionalItems` | ⚪ | ⚪ | Legacy keyword not represented in the resolved array node, so it does not affect compatibility or generation. |
| `minItems` | ✅ | ✅ | Defaults to `minContains` (or 1) when `contains` is present, otherwise 0. |
| `maxItems` | ✅ | ✅ | Checked structurally and used as a hard upper bound during generation. |
| `contains` | ✅ | 🟡 | Checker reasoning handles item-match count constraints structurally. The fuzzer tries to force or avoid matching items, but this is heuristic and may rely on retries. |
| `minContains` | ✅ | 🟡 | Participates in structural count reasoning for `contains`; the fuzzer uses best-effort witness construction plus validator retries. |
| `maxContains` | ✅ | 🟡 | Participates in structural count reasoning for `contains`; the fuzzer uses best-effort witness avoidance plus validator retries. |
| `uniqueItems` | ✅ | 🟡 | The checker handles the common sound cases; the fuzzer retries and then uses a synthetic fallback object for unconstrained item schemas if duplicates persist. |
| `unevaluatedItems` | ⚪ | ⚪ | Canonicalized recursively but not represented with evaluated-location bookkeeping in the resolved IR. |
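The `minItems` defaulting rule from the table — `minContains` (or 1) when `contains` is present, otherwise 0 — can be sketched as a small helper. Hedged illustration only; the real canonicalizer applies this while building the resolved array node.

```python
def implicit_min_items(schema):
    """An array with `contains` must hold at least one matching item, so the
    implicit floor is minContains (default 1) rather than 0."""
    if "minItems" in schema:
        return schema["minItems"]
    if "contains" in schema:
        return schema.get("minContains", 1)
    return 0

assert implicit_min_items({"contains": {"type": "integer"}}) == 1
assert implicit_min_items({"contains": {"type": "integer"}, "minContains": 3}) == 3
assert implicit_min_items({"items": {"type": "string"}}) == 0
```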
| Keyword | Compat | Fuzz | Notes |
|---|---|---|---|
| `allOf` | 🟡 | 🟡 | Canonicalization simplifies trivial boolean branches. The checker is branch-local and can miss proofs such as `allOf[A, B] <= A` when only one branch is inspected. The fuzzer merges object branches heuristically and otherwise picks one non-trivial branch. |
| `anyOf` | 🟡 | 🟡 | The checker requires every sub branch to fit sup, and for sup `anyOf` it looks for one branch containing all of sub; that is conservative and can miss valid disjunctive proofs. The fuzzer samples one branch and retries. |
| `oneOf` | 🟡 | 🟡 | Canonicalization rewrites provably disjoint `oneOf` branches to `anyOf`. Outside that case, checker reasoning is conservative and does not symbolically prove exact-one exclusivity, and the fuzzer uses retry-based witness counting. |
| `not` | 🟡 | 🟡 | The checker only handles shallow boolean/Any cases symbolically; other `not` proofs fall back to conservative false. The fuzzer generates type-mismatch and fixed candidate witnesses, then retries against the full schema. |
| `if` | 🟡 | 🟡 | `SchemaNode::accepts_value()` evaluates the condition and the fuzzer samples branches, but the subset checker has no dedicated symbolic rule for conditional implication beyond exact-node equality and finite const/enum probes. |
| `then` | 🟡 | 🟡 | Stored and evaluated only as part of `if` / `then` / `else`; checker reasoning inherits the same conditional caveat as `if`. |
| `else` | 🟡 | 🟡 | Stored and evaluated only as part of `if` / `then` / `else`; checker reasoning inherits the same conditional caveat as `if`. |
| Keyword | Compat | Fuzz | Notes |
|---|---|---|---|
| `$ref` | ✅ | 🟡 | Same-document refs to `"#"` and `"#/..."` are supported, including recursive graphs. Pure alias cycles and non-local refs are rejected with typed resolver errors. Generation is depth-limited. |
| `$anchor` | ⛔ | ⛔ | Plain-name fragment anchors are rejected; only JSON Pointer refs are supported today. |
| `$dynamicRef` | ⛔ | ⛔ | Dynamic refs are rejected during schema resolution. |
| `$dynamicAnchor` | ⛔ | ⛔ | Dynamic anchors are rejected during schema resolution. |
| `$id` | ⛔ | ⛔ | Resource-scope identifiers are rejected because reference resolution is currently same-document only. |
| `$defs` | ✅ | 🟡 | Acts as a local ref target container; nested schemas are canonicalized and resolved, but `$defs` is not a semantic node by itself. |
| `definitions` | ✅ | 🟡 | Legacy ref target container with the same caveat as `$defs`. |
| `title` | ⚪ | ⚪ | Preserved as metadata in canonical JSON, but not used by checker/fuzzer semantics. |
| `$comment` | ⚪ | ⚪ | Stripped or preserved only as metadata; no compatibility or generation semantics. |
| `description` | ⚪ | ⚪ | Stripped or preserved only as metadata; no compatibility or generation semantics. |
| `default` | ⚪ | ⚪ | Stripped or preserved only as metadata; default values are not synthesized or used for compatibility. |
| `examples` | ⚪ | ⚪ | Stripped or preserved only as metadata; examples do not guide fuzzing. |
| `deprecated` | ⚪ | ⚪ | Annotation only; ignored by checker and fuzzer semantics. |
| `readOnly` | ⚪ | ⚪ | Annotation only; ignored by checker and fuzzer semantics. |
| `writeOnly` | ⚪ | ⚪ | Annotation only; ignored by checker and fuzzer semantics. |
| `$vocabulary` | ⚪ | ⚪ | Annotation only; no dialect negotiation semantics are modeled. |
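Because only `"#"` and `"#/..."` refs are accepted, local resolution reduces to a JSON Pointer walk over the document itself. The sketch below is a hedged stdlib illustration of that reduction; the real resolver additionally tracks node identity for cycles and reports typed errors.

```python
def resolve_local_ref(document, ref):
    """Resolve '#' or '#/a/b' JSON Pointer refs against the document root."""
    if not ref.startswith("#"):
        raise ValueError("non-local $ref is rejected")
    fragment = ref[1:]
    if fragment == "":
        return document          # "#" refers to the whole document
    if not fragment.startswith("/"):
        raise ValueError("plain-name $anchor fragment is rejected")
    node = document
    for token in fragment[1:].split("/"):
        # RFC 6901 unescaping: ~1 before ~0.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node

doc = {"$defs": {"name": {"type": "string"}}}
assert resolve_local_ref(doc, "#/$defs/name") == {"type": "string"}
assert resolve_local_ref(doc, "#") == doc
```

Rejecting everything that is not a pointer (anchors, `$id` resource scopes, external URIs) is what keeps resolution same-document and deterministic.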
The Rust code is split into five crates and one CLI binary. The website under web/ is a separate frontend and is not part of the compatibility engine.
| Path | Package | Responsibility |
|---|---|---|
| `schema/` | `json_schema_ast` | Draft 2020-12 dialect checks, schema canonicalization, AST construction, local `$ref` resolution, and direct validator compilation via `jsonschema` |
| `src/` | `jsoncompat` | Backward-compatibility checking over resolved ASTs, plus the jsoncompat CLI in `src/bin/jsoncompat.rs` |
| `fuzz/` | `json_schema_fuzz` | Schema-guided JSON instance generation for fuzz tests and example values |
| `python/` | `jsoncompat_py` | PyO3 bindings that expose `check_compat`, `generate_value`, and role constants |
| `wasm/` | `jsoncompat_wasm` | wasm-bindgen bindings that expose `check_compat` and `generate_value` to JavaScript |
The primary schema APIs are:
- `json_schema_ast::compile(&raw_schema)`, which compiles the original schema document directly with the `jsonschema` validator backend. Before compilation, this crate rejects any `$schema` declaration other than Draft 2020-12 (`https://json-schema.org/draft/2020-12/schema`, with an optional trailing `#`).
- `json_schema_ast::SchemaDocument::from_json(&raw_schema)`, which eagerly stores the original raw schema JSON, eagerly canonicalizes and validates the schema document once, and then lazily materializes compiled validators on first use:
  - `SchemaDocument::is_valid(&value) -> Result<bool, SchemaBuildError>` lazily compiles a validator from the original raw schema document.
  - `SchemaDocument::canonical_schema_json() -> Result<&Value, SchemaBuildError>` returns the cached canonicalized schema document as JSON so humans can inspect the rewrite directly.
- `json_schema_fuzz::ValueGenerator::generate(&SchemaDocument, GenerationConfig, rng) -> Result<Value, GenerateError>`, which is the Rust generation API.
The resolved IR extension API is public because `jsoncompat` and `json_schema_fuzz` are separate crates. `SchemaDocument::root()` lazily resolves local refs from the canonicalized JSON and exposes the immutable graph used by compatibility analysis and generation. Each `SchemaNode` has an opaque `NodeId` for cycle-safe identity. `SchemaNodeKind` only contains post-resolution semantic variants; parser-only `$ref` and declaration metadata variants are private implementation details. Scalar/object/array constraints are represented with normalized typed domains (`IntegerBounds`, `NumberBounds`, `CountRange`, `ContainsConstraint`, `PatternConstraint`) instead of loose keyword pairs, and regex support is explicit via `PatternSupport`.
At a high level, the runtime flow is:
```mermaid
flowchart TD
    raw["Raw JSON Schema document"]
    compile_raw["json_schema_ast::compile(raw)\nlazily compile raw validator"]
    validate_raw["SchemaDocument::is_valid(value)\nvalidate against raw schema"]
    canonicalize["schema/src/canonicalize.rs\neagerly canonicalize syntax, validate keyword shapes,\nand fill implicit constraints"]
    canonical_json["SchemaDocument::canonical_schema_json()\ndebug canonical JSON output"]
    compile_canonical["compile(canonicalized)\nparity/debug validator"]
    validate_canonical["compiled canonical validator\nparity/debug validation"]
    parse["schema/src/ast.rs\nparse canonical JSON into a private arena-backed graph"]
    resolve["schema/src/ast.rs\nresolve local # / #/... refs by node ID and freeze to SchemaNodeKind"]
    compat["src/subset.rs + src/subset/*.rs\njsoncompat::check_compat(old_doc, new_doc, role)"]
    fuzz["fuzz/src/lib.rs\njson_schema_fuzz::ValueGenerator::generate(&SchemaDocument, ...)"]
    raw --> compile_raw --> validate_raw
    raw --> canonicalize --> parse --> resolve
    canonicalize --> canonical_json
    canonicalize --> compile_canonical --> validate_canonical
    resolve --> compat
    resolve --> fuzz
```
- `json_schema_ast` is the schema frontend and resolved IR crate. `schema/src/ast.rs` stores the raw input schema immediately and canonicalizes it once inside `SchemaDocument::from_json()` so keyword-shape validation happens at schema construction without maintaining a second validator path. The resolved graph is still built lazily by parsing that cached canonical JSON into a private arena-backed graph, resolving local recursive references, normalizing applicators, and freezing it into `SchemaNodeKind`. Unsupported or non-productive refs fail with typed resolver errors. `SchemaDocument::is_valid()` is intentionally backed by the validator compiled from the original raw schema. `SchemaNode::accepts_value()` is the low-level evaluator for canonicalized AST subgraphs used by compatibility/fuzzing heuristics.
- `jsoncompat` is the static compatibility checker. `src/lib.rs` defines `Role` and document-level `check_compat`, and `src/subset.rs` plus `src/subset/{scalar,object,array}.rs` implement the actual inclusion relation (sub ⊆ sup) over `SchemaNode`. The checker uses `SchemaNode::accepts_value()` for finite-value membership checks and keeps a cycle guard for recursive subset proofs.
- `json_schema_fuzz` is the value-generation engine. Its public value-generation API is `ValueGenerator::generate(&SchemaDocument, GenerationConfig, rng)` and it only returns values accepted by `SchemaDocument::is_valid()`; if the resolved schema is known to be empty, generation returns a typed `GenerateError::Unsatisfiable`, and if the internal candidate generator cannot find a value within its retry budget, generation returns a typed `GenerateError::ExhaustedAttempts`. Internally it walks the canonicalized `SchemaNode` graph and uses `SchemaNode::accepts_value()` only as a pruning heuristic for recursive subgraphs.
- `jsoncompat_py` and `jsoncompat_wasm` are thin adapters. They parse JSON strings, call the Rust core crates, and map Rust errors/results into Python or JavaScript types.
- `tests/backcompat.rs` checks expected serializer/deserializer compatibility outcomes for hand-authored old/new schema pairs, then fuzzes each direction to look for concrete counterexamples.
- `tests/fuzz.rs` runs the JSON-Schema-Test-Suite fixture corpus through `SchemaDocument::from_json`, asks `json_schema_fuzz::ValueGenerator` for raw-valid examples, compiles `SchemaDocument::canonical_schema_json()` to assert canonicalization parity, and additionally checks `SchemaNode::accepts_value()` against the canonicalized validator on schemas that stay within the internal evaluator's supported subset. Fixtures that rely on unsupported reference-resource features are skipped when schema build returns a typed resolver error, and known generation gaps are tracked by explicit `GenerateError::ExhaustedAttempts` whitelist entries.
- `schema/src/canonicalize/integration_tests.rs` and `schema/src/roundtrip_tests.rs` test canonicalization and AST round-tripping directly.
To inspect the canonicalized schema document that backs compatibility checks and generation, compare the raw schema you passed to `SchemaDocument::from_json()` with `SchemaDocument::canonical_schema_json()`, and compile the canonical JSON with `json_schema_ast::compile()` if you need validator-level parity checks on representative instances. Canonicalization is an internal library facility, not a jsoncompat CLI subcommand.
Requirements:
Run `bootstrap.sh` to install the necessary dependencies.
Run tests:

```shell
just check
```

Run the end-to-end CLI demo/smoke test:

```shell
cargo run --bin jsoncompat -- demo
```

By default the demo pauses before each step. For CI/non-interactive runs, pass:

```shell
cargo run --bin jsoncompat -- demo --noninteractive
```

Run the performance benchmark harnesses:

```shell
just bench
```

The schema operation benchmarks use a fixed handpicked corpus under `benches/fixtures` so broad fuzz fixture changes do not move the benchmark baseline.

For a fast smoke check of the benchmark binary:

```shell
just bench-check
```

See the `Justfile` for more commands.
`just release` will dry-run the release process for a patch release.

Right now, releases to PyPI and npm are done in CI via manual dispatch of the CI workflow on a tag. Releases to crates.io are done manually for now.

Merging to `main` triggers a release of the website.
