Speed up function `flatten_grouping` by 330% #3303
Merged
T4rk1n merged 7 commits into plotly:dev on May 30, 2025
Here is an optimized version of the provided code, focusing on reducing function call and memory overhead, inlining and shortcutting where safe, and avoiding repetitive work.

**Key optimizations:**

- **Avoid unnecessary list comprehensions** and intermediate lists where possible by favoring local variables and iterative approaches in `flatten_grouping`.
- **Move schema validation** out of the recursive calls by doing it only at the top level of `flatten_grouping` where possible, to avoid re-validating substructures.
- **Reduce attribute/tuple lookups** and repeated `isinstance` checks.
- **Micro-optimize recursion:** tailor the recursive structure to minimize temporary list creation.
- **Minimize tuple concatenation** in `validate_grouping` by reusing a growing list for paths.
- **Avoid set/schema conversions on every recursive call in dicts.**

**Summary of changes and performance justifications:**

- `flatten_grouping` is now iterative and uses an explicit stack, reducing Python call stack depth and temporary list creation.
- Elements are collected in a `result` list in reverse order for speed, then reversed once at the end for correctness.
- Dict and tuple/list types are checked using `type() is ...` rather than `isinstance()` for speed, since the structure is known from the schema.
- `validate_grouping` uses index-based iteration to avoid tuple unpacking and traverses dict keys directly.
- All original logic and error handling are preserved for 1:1 behavior.

This approach should lower CPU time through fewer recursive calls and less repeated computation, especially for large and deeply nested structures.
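The stack-based approach described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the name `flatten_iterative` is hypothetical, schema validation is left out entirely, and the traversal order is an assumption based on the summary (collect leaves in reverse, then reverse once).

```python
def flatten_iterative(grouping, schema=None):
    """Hypothetical sketch of an iterative flatten; validation is omitted."""
    if schema is None:
        schema = grouping

    result = []
    # Each stack entry pairs a piece of the grouping with its schema piece.
    stack = [(grouping, schema)]

    while stack:
        value, sch = stack.pop()
        kind = type(sch)  # exact type check, as the summary describes
        if kind is tuple or kind is list:
            # Children are pushed in order, so they are popped in reverse.
            stack.extend(zip(value, sch))
        elif kind is dict:
            stack.extend((value[key], sch[key]) for key in sch)
        else:
            # The schema marks this position as a leaf; keep the value as-is.
            result.append(value)

    # Leaves were collected right-to-left; restore the original order once.
    result.reverse()
    return result


nested = [1, (2, 3), {"a": 4, "b": [5, 6]}]
flat = flatten_iterative(nested)
```

Because the schema, not the value, decides where recursion stops, a list stored at a scalar schema position is returned as a single leaf, which matches the role the schema argument plays in the description above.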
Contributor
@T4rk1n do you believe the failing tests will be fixed if your latest CI changes are merged into this branch?
T4rk1n
reviewed
May 29, 2025
Contributor
T4rk1n
left a comment
Looks good, just one unused variable: `pushed_validate`.
    return [g for k in schema for g in flatten_grouping(grouping[k], schema[k])]

    return [grouping]
    pushed_validate = True  # Just for clarity; not strictly necessary
Contributor
I don't think this is used. 🔪
misrasaurabh1
commented
May 29, 2025
T4rk1n
approved these changes
May 29, 2025
Contributor
T4rk1n
left a comment
💃 Looks good, I think this one is called on every callback so should be a nice improvement.
📄 330% (3.30x) speedup for `flatten_grouping` in `dash/_grouping.py`
⏱️ Runtime: 8.92 milliseconds → 2.07 milliseconds (best of 51 runs)
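As a quick check on how the headline figure relates to the reported runtimes (8.92 ms before, 2.07 ms after), the arithmetic can be worked through as below. The formula is an assumption about how the percentage was derived, since the report does not state it; the variable names are illustrative only.

```python
old_ms = 8.92  # reported runtime before the change
new_ms = 2.07  # reported runtime after the change

# Assumed definition: time saved, relative to the new runtime.
speedup_pct = (old_ms - new_ms) / new_ms * 100

# Plain ratio of old runtime to new runtime, for comparison.
ratio = old_ms / new_ms

print(f"speedup: {speedup_pct:.1f}%  runtime ratio: {ratio:.2f}x")
```

Under this assumption the speedup comes out just under 331%, consistent with the reported 330% if that figure is truncated rather than rounded. Note that the old-to-new runtime ratio is about 4.31, so the "3.30x" label appears to be the percentage divided by 100 rather than the raw ratio.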
✅ Correctness verification report:
🌀 Generated Regression Tests Details
To edit these changes, `git checkout codeflash/optimize-flatten_grouping-max6hy2z` and push.
Contributor Checklist
optionals
CHANGELOG.md