perf: use Map for staticChildren to avoid megamorphic lookups #421
Open
joshuaisaact wants to merge 1 commit into
Replace the staticChildren plain object with a Map keyed by character code. Plain object property access with dynamic string keys causes V8 to fall into megamorphic KeyedLoadIC when nodes have different sets of child keys. Map.get() is a single monomorphic call regardless of stored keys, and Maps preserve insertion order for iteration.
AI disclosure: This change was developed with Claude Code (Opus) using auto-claude, an automated experiment loop that edits code, benchmarks, and keeps or reverts changes. The profiling insight and initial fix (a sparse charCode-indexed array with a secondary list to preserve iteration order) emerged from 14 experiments — 8 kept, 6 reverted. I simplified it to a Map during review, which gives the same performance win with a much cleaner diff.
All code reviewed and understood by a human. Happy to close if this doesn't meet the bar.
Why
Profiling `find()` with `--prof` shows `KeyedLoadIC_Megamorphic` consuming ~14% of total ticks on deep tree lookups (e.g. long static + parametric routes). This happens because `staticChildren` is a plain object with dynamic single-character string keys, and every node has a different set of keys. V8 can't build a stable inline cache for `obj[path.charAt(i)]` when the object shapes vary across nodes, so it falls back to a slow hash-table lookup every time.
What
Replace the `staticChildren` plain object with a `Map` keyed by character code. `Map.get()` is a single monomorphic call regardless of stored keys, which eliminates the megamorphic IC entirely. Maps also preserve insertion order, so `prettyPrint` iteration works without any extra bookkeeping.
The diff is +11/-10 across three files with no API or behavioral changes.
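As a rough illustration of the change described above, here is a minimal sketch of a trie node whose static children live in a `Map` keyed by character code. This is not the code from the diff: the `StaticNode` class and the `createStaticChild` helper are hypothetical, the body of `findStaticMatchingChild` is an assumption that only borrows the name mentioned in this PR, and prefix splitting is left out entirely.

```javascript
// Hypothetical, simplified trie node; not the library's actual class.
class StaticNode {
  constructor(prefix) {
    this.prefix = prefix;
    // Before (per the PR description): a plain object keyed by single-character strings.
    // After: a Map keyed by the first character code of each child's prefix.
    this.staticChildren = new Map();
  }

  // Return the child keyed by the first character code of `path`, creating it if needed.
  // (A real radix tree would also split shared prefixes; omitted in this sketch.)
  createStaticChild(path) {
    const code = path.charCodeAt(0);
    let child = this.staticChildren.get(code);
    if (child === undefined) {
      child = new StaticNode(path);
      this.staticChildren.set(code, child);
    }
    return child;
  }

  // Hot-path lookup: one Map.get() call, whatever set of keys this node holds.
  findStaticMatchingChild(path, pathIndex) {
    const child = this.staticChildren.get(path.charCodeAt(pathIndex));
    if (child === undefined) return null;
    return path.startsWith(child.prefix, pathIndex) ? child : null;
  }
}

const root = new StaticNode('/');
root.createStaticChild('users');
root.createStaticChild('posts');
root.createStaticChild('about');

// Maps iterate in insertion order, so a pretty-printer can walk values() directly.
const childPrefixes = [...root.staticChildren.values()].map((node) => node.prefix);
```

In this sketch, `root.findStaticMatchingChild('/posts/1', 1)` returns the `posts` node, and a path whose character at that index has no entry in the `Map` returns `null`.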
Benchmarks
Focused run on the 5 slowest cases (500 min samples):
+13% on the slowest case, +17% on common prefix. Other cases within noise.
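For readers who want to poke at the object-versus-`Map` lookup pattern in isolation, the following is a hedged micro-benchmark sketch. It is not the benchmark suite used for the numbers above: the `measure` helper, its batch size, and the synthetic 26-node setup are all assumptions, and reading "500 min samples" as a minimum sample count is an interpretation. Micro-benchmarks like this are sensitive to engine version and warm-up, so no particular outcome is implied.

```javascript
// Hypothetical helper: time `fn` in batches until at least `minSamples` samples exist.
function measure(fn, minSamples = 500, batchSize = 1000) {
  const samples = [];
  let sink = 0; // accumulate results so the calls are not trivially dead code
  while (samples.length < minSamples) {
    const start = performance.now();
    for (let i = 0; i < batchSize; i++) sink += fn(i);
    samples.push((performance.now() - start) / batchSize);
  }
  samples.sort((a, b) => a - b);
  return { count: samples.length, medianMs: samples[samples.length >> 1], sink };
}

// Build 26 nodes that each hold a different set of three single-character keys,
// once as plain objects (string keys) and once as Maps (character-code keys).
const alphabet = 'abcdefghijklmnopqrstuvwxyz';
const objectNodes = [];
const mapNodes = [];
for (let n = 0; n < alphabet.length; n++) {
  const obj = {};
  const map = new Map();
  for (let k = 0; k < 3; k++) {
    const ch = alphabet[(n + k) % alphabet.length];
    obj[ch] = k + 1;
    map.set(ch.charCodeAt(0), k + 1);
  }
  objectNodes.push(obj);
  mapNodes.push(map);
}

// Look up the first key of node `i % 26` with each strategy.
const viaObject = (i) => objectNodes[i % 26][alphabet.charAt(i % 26)];
const viaMap = (i) => mapNodes[i % 26].get(alphabet.charCodeAt(i % 26));

const objectResult = measure(viaObject);
const mapResult = measure(viaMap);
console.log('object median ms/op:', objectResult.medianMs);
console.log('Map median ms/op:', mapResult.medianMs);
```

Running a script like this under `node --prof` and post-processing the log with `node --prof-process` is one way to check whether the keyed-load stub shows up in the tick profile.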
Full experiment log (14 experiments)
- `.length=0`: regression ~30% across all cases
- `findStaticMatchingChild` into `getNextNode`: +6.2% on slowest
- `indexOf`: multi-param regressed 8%
- `matchPrefix` replacing compiled `new Function`: long static regressed 25%
- `getNextNode` for pure-static nodes: +5.4% common prefix
- `safeDecodeURI` result object `find()` into `safeDecodeURI`: +3.1% on slowest
- `indexOf('%')` fast path in `safeDecodeURI`: all cases regressed 5-9%
- `pureStaticNode` flag: +3.7% common prefix
- `find()` result object: all cases regressed 5-7% (V8 escape analysis was already eliminating it)
- `safeDecodeURI` into `find()`: function too large for TurboFan optimization budget

Only experiment 2 (the core insight) is included in this PR. The rest were explored but either had tradeoffs or were additive optimizations on top.
References