feat: Enhance data regression fixture with JSON support and utilities by MitchellAcoustics · Pull Request #243 · ESSS/pytest-regressions

MitchellAcoustics · 2026-04-24T10:48:27Z

Closes #242

This pull request adds support for using JSON as an output format for the data_regression fixture, in addition to YAML. It introduces a recursive dictionary sorting utility to ensure consistent key ordering, updates the regression check logic to handle the new format, and adds comprehensive tests for the new functionality.

New output format support:

The data_regression.check method now accepts an extension parameter (defaulting to ".yml"), allowing regression data to be written as either YAML or JSON. If ".json" is specified, data is serialized using json.dumps with sorted keys and written as UTF-8 text. If an unsupported extension is given, a clear error is raised. [1] [2] [3]

Data normalization:

A new utility function sort_dict_by_keys recursively sorts dictionary keys (including nested dicts within lists), ensuring consistent output for regression checks and JSON serialization.
The data_dict is now normalized with sort_dict_by_keys before being dumped, ensuring stable ordering across runs and formats.
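For illustration, the normalization described above could look roughly like the following. This is a minimal sketch and not the PR's `sort_dict_by_keys` implementation: the name `sort_nested_keys`, the choice to sort by the string form of each key, and the decision to leave list order untouched are all assumptions.

```python
from typing import Any


def sort_nested_keys(data: Any) -> Any:
    """Return a copy of ``data`` with dict keys sorted at every nesting level.

    Hypothetical helper for illustration; not the PR's ``sort_dict_by_keys``.
    """
    if isinstance(data, dict):
        # Sort by the string form of each key so mixed key types do not raise TypeError.
        return {key: sort_nested_keys(data[key]) for key in sorted(data, key=str)}
    if isinstance(data, list):
        # List order is meaningful, so only the dicts inside the list are normalized.
        return [sort_nested_keys(item) for item in data]
    return data


example = {"b": 1, "a": [{"z": 0, "y": {"d": 4, "c": 3}}]}
normalized = sort_nested_keys(example)
```

The sketch returns new containers rather than mutating the input, so the caller's original ordering stays available if it is needed later.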

Testing improvements:

Test suite enhancements include parameterized tests to verify both YAML and JSON outputs, a test for the new dictionary sorting function, and a test to ensure unsupported extensions raise the correct error. [1] [2] [3] [4]
Adds new .json files to the test data to validate JSON output. [1] [2]

Imports and refactoring:

Imports for json and the new utility are added where needed. [1] [2]

These changes make the regression framework more flexible and robust, especially for users who prefer or require JSON output.


When using the JSON path, the file is opened and json.dump writes directly to disk. If serialization fails (TypeError for non-serializable objects), this can leave an empty/partial expected/obtained file behind, unlike the YAML path, which renders to bytes before opening the file. Consider serializing to a string first (e.g., json.dumps) and only writing after successful serialization. Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
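The failure mode in this comment can be shown with a small standalone sketch. The helper name `write_json_safely` is hypothetical and not part of pytest-regressions; it only demonstrates serializing in memory before touching the file.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def write_json_safely(filename: Path, data: Any, indent: int = 2) -> None:
    """Hypothetical helper: serialize fully in memory, then write the result."""
    # json.dumps raises TypeError here, before any file is opened,
    # if ``data`` contains an object that is not JSON serializable.
    dumped = json.dumps(data, indent=indent, ensure_ascii=False)
    filename.write_text(dumped, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "expected.json"

    # A plain ``object()`` is not JSON serializable, so this call fails early.
    try:
        write_json_safely(target, {"value": object()})
    except TypeError:
        pass
    failed_write_left_file = target.exists()

    # A serializable payload is written normally.
    write_json_safely(target, {"value": 1})
    round_tripped = json.loads(target.read_text(encoding="utf-8"))
```

Because the file is only opened after `json.dumps` returns, a failed serialization leaves no empty or partial file behind.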

The docstring says this function recursively sorts dicts, but the implementation only recurses into mapping values and won’t sort dicts nested inside sequences (e.g., [{...}, {...}]). Consider extending it to walk MutableSequence values too, or tightening the docstring/name to match the actual behavior. Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>


nicoddemus · 2026-04-24T12:39:29Z

+        :param extension: Extension of the file. Defaults to ".yml".
+            If equal to ".json", expects `data_dict` to be JSON serializable
+            and dumps it using standard `json.dump`.


As I commented in the issue, let's use an enum instead. 👍

nicoddemus · 2026-04-24T12:41:27Z

    return f"'{libname}' library is an optional dependency and must be installed explicitly when the fixture 'check' is used"


+def sort_dict_by_keys(data: MutableMapping[Any, Any]) -> MutableMapping[Any, Any]:


Why should we always sort the dict keys? Dicts are order-preserving; in fact, I can think of a few situations where users might want to regress against the original order, without sorting.

Let's remove key sorting altogether from this change. Users can sort the dict themselves if they require that, I think.
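The difference at stake can be seen with the standard library alone. The sample data below is made up for illustration.

```python
import json

# Insertion order: "zebra" first, then "apple"; the nested dict has "y" before "x".
data = {"zebra": 1, "apple": {"y": 2, "x": 3}}

# Default behaviour: keys are emitted in insertion order.
as_inserted = json.dumps(data)

# sort_keys=True reorders keys at every nesting level.
as_sorted = json.dumps(data, sort_keys=True)
```

With the default settings the regression file mirrors the order in which the caller built the dict, so a caller who wants sorted output can sort before calling `check`, while a caller who cares about the original order keeps it.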

nicoddemus · 2026-04-24T12:42:46Z

+            if extension.lower() in [".yml", ".yaml"]:
+                dumped_str = yaml.dump_all(
+                    [data_dict],
+                    Dumper=RegressionYamlDumper,
+                    default_flow_style=False,
+                    allow_unicode=True,
+                    indent=2,
+                    encoding="utf-8",
+                )
+                with filename.open("wb") as f:
+                    f.write(dumped_str)
+            elif extension.lower() == ".json":
+                dumped_str = json.dumps(
+                    data_dict, indent=2, sort_keys=True, ensure_ascii=False
+                )
+                with filename.open("w", encoding="utf-8") as f:
+                    f.write(dumped_str)
+            else:
+                raise NotImplementedError(
+                    f"file extension `{extension}` is not supported by data_regression; "
+                    "supported extensions are '.yml', '.yaml', '.json'"
+                )


Let's use a match against the enum, with an assert_never check for the default case.


MitchellAcoustics and others added 12 commits February 12, 2026 14:23

Adds JSON support to data regression fixture

aabb940

Adds expected JSON regression files

fa7a1f4

Add recursive dictionary sorting utility and integrate it into data r…

4555f36

…egression fixture

Update JSON output formatting in data regression fixture to use 4-spa…

1658c31

…ce indentation

Merge remote-tracking branch 'origin/master' into enable-json-io

e62d1c5

Refactor JSON dumping in DataRegressionFixture to use 2-space indenta…

ed8e9b5

…tion

Add JSON sorting test to validate sort_dict_by_keys functionality

93e5cd4

Co-authored-by: Copilot <copilot@github.com>

Remove stale and unused baseline test data files

b44d3db

Agent-Logs-Url: https://github.com/MitchellAcoustics/pytest-regressions/sessions/ecff2466-7c11-4a25-87cb-0ebd388921c3 Co-authored-by: MitchellAcoustics <22335636+MitchellAcoustics@users.noreply.github.com>

Merge pull request #1 from MitchellAcoustics/enable-json-io

3220ee0

Enhance data regression fixture with JSON support and formatting

[pre-commit.ci] auto fixes from pre-commit.com hooks

87e433a

for more information, see https://pre-commit.ci

nicoddemus mentioned this pull request Apr 24, 2026

Feature request: JSON as an alternative serialization format for data_regression #242

Open

nicoddemus requested changes Apr 24, 2026


MitchellAcoustics and others added 2 commits April 24, 2026 13:49

Make indent configurable in data regression functions

2a7ea5e

[pre-commit.ci] auto fixes from pre-commit.com hooks

918f3ab

for more information, see https://pre-commit.ci

MitchellAcoustics wants to merge 14 commits into ESSS:master from MitchellAcoustics:master


3 participants
