
Commit c2e1210

feat(minor): enable cross-module specialisation of iterators + add Histogram.add(_:) (#30)
Part 1: @inlinable iterator hot paths

Downstream consumers that import `Histogram` from a different SwiftPM target previously could not specialise the iterator calls because the iterator method bodies were not visible to the SIL optimiser. Building with `@_assemblyVision` in a release consumer emitted:

    Unable to specialize generic function "Histogram.Histogram.recordedValues()" since definition is not visible
    Unable to specialize generic function "Histogram.Histogram.RecordedValues.next()" since definition is not visible

and equivalents for every iterator family. In a profiled 3.4 GB workload this showed up as `_swift_getGenericMetadata` / `MetadataCacheKey::operator==` dominating the iterator hot path (roughly 6% of main-thread wall time).

Mark the iterator hot paths `@inlinable` and their supporting types / stored properties / transitive helpers `@usableFromInline`, so the specialiser can inline concrete-`Count` iteration through the module boundary:

- `IteratorImpl` and all its stored properties and methods.
- `Percentiles`, `LinearBucketValues`, `LogarithmicBucketValues`, `RecordedValues`, `AllValues` (stored properties, init, next(), and the private helpers they reach).
- Public accessors `recordedValues()`, `allValues()`, `percentiles(...)`, `linearBucketValues(...)`, `logarithmicBucketValues(...)`.
- Transitive helpers: `highestEquivalentForValue`, `lowestEquivalentForValue`, `nextNonEquivalentForValue`, `sizeOfEquivalentRangeForValue`, `sizeOfEquivalentRangeFor`, `valueFromIndex`, `valueFrom`, `subBucketIndexForValue`, `bucketIndexForValue`, `countsIndexForValue`, `countsIndexFor`.
- Supporting stored fields on `Histogram`: `bucketCount`, `subBucketMask`, `subBucketHalfCountMagnitude`, `leadingZeroCountBase`, `unitMagnitude`.
- Computed `subBucketCount` and `subBucketHalfCount`.
- Explicit `@usableFromInline` memberwise initialiser on `IterationValue` (the synthesised one was internal), so `makeIterationValueAndUpdatePrev` can construct it from an `@inlinable` context.

Part 2: new `Histogram.add(_:)` merge API

The workload that prompted Part 1 was actually trying to merge two histograms by iterating `recordedValues()` on one and calling `record(_:count:)` on the other. That approach is O(recorded-values) with a log-bucket index computation per value, and it also suffered the specialisation miss. Merging two histograms of matching layout is really an element-wise sum of their backing count arrays, which is what this method does:

    @inlinable public mutating func add(_ other: Self)

Semantics, designed to match a replay of `other.recordedValues()` into `self` via `record(_:count:)`:

- `numberOfSignificantValueDigits` and `lowestDiscernibleValue` must match on both sides, because those determine the bucket layout.
- Layouts match, so `self` and `other` compute the same index for a given value. `other`'s nonzero buckets all live at indices ≤ `countsIndexForValue(other.maxValue)`. The merge touches `self`'s public range state only when that index would not fit in `self`'s current `counts`:
  * If it fits (including the empty `other.maxValue == 0` case and the case where `other`'s backing array is longer but its recorded values all fit in `self`), only the common prefix is summed. `self.counts.count` and `self.highestTrackableValue` are not touched.
  * Otherwise, if `self.autoResize == true`, `self` grows just enough to index `other.maxValue` (not `other.highestTrackableValue`, which can be much larger if `other` was resized then reset).
  * Otherwise, the merge is a precondition failure, because silently accepting it would drop counts.

This mirrors `record(_:)`, which accepts values above the nominal `highestTrackableValue` as long as their computed index fits in `counts` (subBucketCount is rounded up to a power of two, so there is headroom past the nominal ceiling).
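
The merge rules above can be sketched with a toy stand-in for the backing storage. This is only an illustration under stated assumptions: `ToyCounts`, `highestUsedIndex`, and `record(index:count:)` are hypothetical names, indices are passed directly instead of being derived from values, and the layout check and min/max bookkeeping of the real `Histogram` are omitted.

```swift
// Toy stand-in for a histogram's backing count array (hypothetical type,
// not the package's Histogram). Slots are addressed by index directly.
struct ToyCounts {
    var counts: [UInt64]
    var totalCount: UInt64 = 0
    var autoResize: Bool

    init(slots: Int, autoResize: Bool = false) {
        counts = [UInt64](repeating: 0, count: slots)
        self.autoResize = autoResize
    }

    mutating func record(index: Int, count: UInt64 = 1) {
        counts[index] &+= count
        totalCount &+= count
    }

    /// Index of the highest nonzero slot, or nil when nothing is recorded.
    var highestUsedIndex: Int? {
        counts.lastIndex(where: { $0 != 0 })
    }

    mutating func add(_ other: ToyCounts) {
        // Empty `other`: nothing to sum and no reason to grow.
        guard let needed = other.highestUsedIndex else { return }

        if needed >= counts.count {
            // Growing is only allowed for auto-resizing receivers;
            // otherwise counts would be silently dropped.
            precondition(autoResize, "other's recorded values do not fit in self")
            counts.append(contentsOf: repeatElement(0, count: needed + 1 - counts.count))
        }

        // Sum only slots 0...needed; a longer tail in either array is left alone.
        let n = needed + 1
        counts.withUnsafeMutableBufferPointer { dst in
            other.counts.withUnsafeBufferPointer { src in
                for i in 0..<n {
                    dst[i] &+= src[i]
                }
            }
        }
        totalCount &+= other.totalCount
    }
}
```

In this sketch a receiver with 4 slots that merges an 8-slot `other` whose highest used index is 3 keeps its 4 slots, while an auto-resizing 2-slot receiver merging an `other` with a count at index 5 grows to exactly 6 slots.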

Implementation uses `withUnsafeMutableBufferPointer` / `withUnsafeBufferPointer` with `&+=` for a tight loop. `_totalCount`, `maxValue`, and `minNonZeroValue` are updated accordingly.

Tests

- New `Tests/HistogramTests/HistogramMergeTests.swift` covering:
  * round-trip merge (with bucket-by-bucket equality and percentile agreement),
  * empty-other,
  * self-add,
  * receiver auto-resizes when `other.maxValue` doesn't fit,
  * merging a resized-and-reset `other` into a fresh receiver is a strict no-op (asserts unchanged `highestTrackableValue` and `counts.count`, not just `==` which ignores those),
  * merging a larger `other` that only recorded small values does not grow the auto-resizing receiver (mirroring a replay of `recordedValues()`),
  * merging a shorter `other` into a longer `self` leaves self's tail untouched,
  * merging a longer-but-in-range `other` into a fixed-size receiver succeeds,
  * merging values above the nominal `highestTrackableValue` that still fit in the backing array succeeds,
  * a documentation test pinning the inputs that would trap the precondition (catching the trap itself requires signal-handling infra the package does not depend on).
- Un-skipped the pre-existing `testAutoSizingAdd` in `Tests/HistogramTests/HistogramAutosizingTests.swift` (was `throw XCTSkip("Histogram.add() is not implemented yet")`).
- Full suite: 60/60 pass.

Verifying specialisation

Minimal downstream SwiftPM target imports `Histogram` and annotates its iterator-using functions with `@_assemblyVision`:

    swift build -c release 2>&1 | tee /tmp/av.txt
    grep "Unable to specialize" /tmp/av.txt

Before: 10 unique "Unable to specialize" remarks (80 emissions across five iterator families), covering every public iterator accessor and its `next()`. After: 0. `grep "Specialized function"` now shows concrete specialisations like `Histogram.Histogram.RecordedValues.next()` with `(@inout Histogram<UInt64>.RecordedValues) -> @out Optional<Histogram<UInt64>.IterationValue>` and `Histogram.Histogram.add(_:)`.
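
The attribute pattern that this check exercises can be sketched on a small generic type. Everything here is hypothetical (`Tally`, `NonZero`, `nonZero()` are illustrative names, not part of the package); it only shows how `@inlinable` bodies, `@usableFromInline` storage, and an explicit `@usableFromInline` initialiser fit together so that a client module could see the definitions.

```swift
// Hypothetical library type; names are illustrative, not from the Histogram package.
public struct Tally<Count: FixedWidthInteger> {
    // Storage read from @inlinable code must be @usableFromInline (or public).
    @usableFromInline var slots: [Count]

    public init(slots: [Count]) { self.slots = slots }

    // The body is exposed to client modules, so the optimiser can
    // specialise it for a concrete Count there.
    @inlinable
    public func nonZero() -> NonZero { NonZero(slots: slots) }

    public struct NonZero: Sequence, IteratorProtocol {
        @usableFromInline let slots: [Count]
        @usableFromInline var index = 0

        // A synthesised memberwise initialiser would be internal and not
        // callable from the @inlinable accessor, hence this explicit one.
        @usableFromInline init(slots: [Count]) { self.slots = slots }

        @inlinable
        public mutating func next() -> (index: Int, count: Count)? {
            while index < slots.count {
                defer { index += 1 }  // advance after each inspected slot
                if slots[index] != 0 {
                    return (index: index, count: slots[index])
                }
            }
            return nil
        }
    }
}
```

A consumer would iterate `Tally<UInt64>(slots: ...).nonZero()` like any other sequence; whether a given call site actually gets specialised still depends on the optimiser and build settings.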

Scope

Additive and ABI-compatible. No public type or method signature changed. Storage layout of `Histogram` and `IteratorImpl` preserved; only attributes were added, plus the one new `add(_:)` method.
1 parent de0b9b8 commit c2e1210

3 files changed

Lines changed: 565 additions & 69 deletions


