Commit c2e1210
feat(minor): enable cross-module specialisation of iterators + add Histogram.add(_:) (#30)
Part 1 — @inlinable iterator hot paths
Downstream consumers that import `Histogram` from a different SwiftPM
target previously could not specialise the iterator calls because the
iterator method bodies were not visible to the SIL optimiser. Building
with `@_assemblyVision` in a release consumer emitted:
    Unable to specialize generic function
    "Histogram.Histogram.recordedValues()" since definition is not visible
    Unable to specialize generic function
    "Histogram.Histogram.RecordedValues.next()" since definition is not visible
and equivalents for every iterator family. In a profiled 3.4 GB workload
this showed up as `_swift_getGenericMetadata` /
`MetadataCacheKey::operator==` dominating the iterator hot path
(roughly 6% of main-thread wall time).
Mark the iterator hot paths `@inlinable` and their supporting types /
stored properties / transitive helpers `@usableFromInline`, so the
specialiser can inline concrete-`Count` iteration through the module
boundary:
- `IteratorImpl` and all its stored properties and methods.
- `Percentiles`, `LinearBucketValues`, `LogarithmicBucketValues`,
`RecordedValues`, `AllValues` (stored properties, init, next(), and
the private helpers they reach).
- Public accessors `recordedValues()`, `allValues()`, `percentiles(...)`,
`linearBucketValues(...)`, `logarithmicBucketValues(...)`.
- Transitive helpers: `highestEquivalentForValue`,
`lowestEquivalentForValue`, `nextNonEquivalentForValue`,
`sizeOfEquivalentRangeForValue`, `sizeOfEquivalentRangeFor`,
`valueFromIndex`, `valueFrom`, `subBucketIndexForValue`,
`bucketIndexForValue`, `countsIndexForValue`, `countsIndexFor`.
- Supporting stored fields on `Histogram`: `bucketCount`,
`subBucketMask`, `subBucketHalfCountMagnitude`,
`leadingZeroCountBase`, `unitMagnitude`.
- Computed `subBucketCount` and `subBucketHalfCount`.
- Explicit `@usableFromInline` memberwise initialiser on
`IterationValue` (the synthesised one was internal), so
`makeIterationValueAndUpdatePrev` can construct it from an
`@inlinable` context.
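The attribute pattern in the list above can be sketched on a toy type. `Counter`, `Entry`, and `Values` are hypothetical stand-ins, not the package's types; the sketch only shows which declarations need `@inlinable` and which need `@usableFromInline` so that a client module could specialise `next()` for a concrete `Count`.

```swift
// Hypothetical stand-in for a generic container with an iterator hot path.
public struct Counter<Count: FixedWidthInteger> {
    // Internal storage must be @usableFromInline before an @inlinable
    // body is allowed to reference it.
    @usableFromInline
    var counts: [Count]

    public init(counts: [Count]) {
        self.counts = counts
    }

    public struct Entry {
        public let index: Int
        public let count: Count

        // The synthesised memberwise init is internal and cannot be
        // called from @inlinable code, so it is spelled out explicitly.
        @usableFromInline
        init(index: Int, count: Count) {
            self.index = index
            self.count = count
        }
    }

    public struct Values: Sequence, IteratorProtocol {
        @usableFromInline
        let counts: [Count]
        @usableFromInline
        var position = 0

        @usableFromInline
        init(counts: [Count]) {
            self.counts = counts
        }

        // Body is exported to clients, so they can specialise it.
        @inlinable
        public mutating func next() -> Entry? {
            while position < counts.count {
                let index = position
                position += 1
                if counts[index] != 0 {
                    return Entry(index: index, count: counts[index])
                }
            }
            return nil
        }
    }

    // The public accessor must be @inlinable too, or the call chain is
    // opaque at its first step.
    @inlinable
    public func recordedValues() -> Values {
        Values(counts: counts)
    }
}

let counter = Counter<UInt64>(counts: [0, 3, 0, 5])
let nonZeroIndices = counter.recordedValues().map { $0.index }
```

Dropping `@usableFromInline` from any declaration that an `@inlinable` body touches is a compile error, which is why the real change has to annotate the transitive helpers and stored fields as well.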
Part 2 — new `Histogram.add(_:)` merge API
The workload that prompted Part 1 was actually trying to merge two
histograms by iterating `recordedValues()` on one and calling
`record(_:count:)` on the other. That is O(recorded-values) with a
log-bucket index computed per value, and it suffered the specialisation
miss. Merging two histograms of matching layout is really an
element-wise sum of their backing count arrays, which is what this
method does:
    @inlinable public mutating func add(_ other: Self)
Semantics, designed to match a replay of `other.recordedValues()` into
`self` via `record(_:count:)`:
- `numberOfSignificantValueDigits` and `lowestDiscernibleValue` must
match on both sides — those determine the bucket layout.
- Layouts match, so `self` and `other` compute the same index for a
given value. `other`'s nonzero buckets all live at indices ≤
`countsIndexForValue(other.maxValue)`. The merge touches `self`'s
public range state only when that index would not fit in `self`'s
current `counts`:
* If it fits (including the empty `other.maxValue == 0` case and
the case where `other`'s backing array is longer but its
recorded values all fit in `self`), only the common prefix is
summed. `self.counts.count` and `self.highestTrackableValue` are
not touched.
* Otherwise, if `self.autoResize == true`, `self` grows just
enough to index `other.maxValue` (not
`other.highestTrackableValue`, which can be much larger if
`other` was resized then reset).
* Otherwise, the merge is a precondition failure, because silently
accepting it would drop counts.
Deciding by index rather than by nominal range mirrors `record(_:)`,
which accepts values above the nominal `highestTrackableValue` as long
as their computed index fits in `counts` (`subBucketCount` is
rounded up to a power of two, so there is headroom past the
nominal ceiling).
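The three-way decision above can be written as a small pure function. This is a sketch only: `MergePlan` and `planMerge` are hypothetical names, `requiredIndex` stands for `countsIndexForValue(other.maxValue)`, `selfLength` stands for `self.counts.count`, and growing to exactly `requiredIndex + 1` is a simplification (the real resize may well round up to whole buckets).

```swift
// Hypothetical outcome type; the commit does not name these cases.
enum MergePlan: Equatable {
    case sumCommonPrefix            // index fits: receiver's range state untouched
    case growThenSum(toLength: Int) // auto-resizing receiver grows just far enough
    case trap                       // fixed-size receiver would silently drop counts
}

// Decide by index, not by nominal highestTrackableValue.
func planMerge(requiredIndex: Int, selfLength: Int, autoResize: Bool) -> MergePlan {
    if requiredIndex < selfLength {
        return .sumCommonPrefix
    }
    if autoResize {
        // Simplified: smallest length that can hold requiredIndex.
        return .growThenSum(toLength: requiredIndex + 1)
    }
    return .trap
}
```

Because the input is the index of `other.maxValue` rather than the length of `other`'s backing array, a resized-then-reset `other` (long array, nothing recorded) takes the first branch and leaves the receiver unchanged.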
The implementation uses `withUnsafeMutableBufferPointer` /
`withUnsafeBufferPointer` with the wrapping `&+=` operator for a tight
loop. `_totalCount`, `maxValue`, and `minNonZeroValue` are updated
accordingly.
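A minimal sketch of such a loop, assuming both arrays already share a bucket layout. `addCounts(into:from:)` is a hypothetical free function, not the package's code, and returning the summed amount is just one assumed way a caller might keep a running total in step.

```swift
// Hypothetical helper: element-wise sum over the common prefix of two
// count arrays. Returns the total that was added.
func addCounts<Count: FixedWidthInteger>(
    into destination: inout [Count],
    from source: [Count]
) -> Count {
    let shared = min(destination.count, source.count)
    var added: Count = 0
    destination.withUnsafeMutableBufferPointer { dst in
        source.withUnsafeBufferPointer { src in
            for i in 0..<shared {
                // &+= wraps on overflow instead of trapping, which keeps
                // the loop body free of overflow checks.
                dst[i] &+= src[i]
                added &+= src[i]
            }
        }
    }
    return added
}

var receiver: [UInt64] = [1, 2, 3, 4]
let total = addCounts(into: &receiver, from: [10, 0, 5])
```

Only the first three slots are touched here; the receiver's tail past the shorter array keeps its old count.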
Tests
- New `Tests/HistogramTests/HistogramMergeTests.swift` covering:
* round-trip merge (with bucket-by-bucket equality and percentile
agreement);
* empty-other;
* self-add;
* receiver auto-resizes when `other.maxValue` doesn't fit;
* merging a resized-and-reset `other` into a fresh receiver is a
strict no-op (asserts unchanged `highestTrackableValue` and
`counts.count`, not just `==`, which ignores those);
* merging a larger `other` that only recorded small values does not
grow the auto-resizing receiver (mirroring a replay of
`recordedValues()`);
* merging a shorter `other` into a longer `self` leaves self's tail
untouched;
* merging a longer-but-in-range `other` into a fixed-size receiver
succeeds;
* merging values above the nominal `highestTrackableValue` that still
fit in the backing array succeeds;
* a documentation test pinning the inputs that would trap the
precondition (catching the trap itself requires signal-handling
infra the package does not depend on).
- Un-skipped the pre-existing `testAutoSizingAdd` in
`Tests/HistogramTests/HistogramAutosizingTests.swift` (was
`throw XCTSkip("Histogram.add() is not implemented yet")`).
- Full suite: 60/60 pass.
Verifying specialisation
A minimal downstream SwiftPM target imports `Histogram` and annotates
its iterator-using functions with `@_assemblyVision`:
    swift build -c release 2>&1 | tee /tmp/av.txt
    grep "Unable to specialize" /tmp/av.txt
Before: 10 unique "Unable to specialize" remarks (80 emissions across
five iterator families) — every public iterator accessor and its
`next()`. After: 0. `grep "Specialized function"` now shows concrete
specialisations like `Histogram.Histogram.RecordedValues.next()` with
`(@inout Histogram<UInt64>.RecordedValues) -> @out
Optional<Histogram<UInt64>.IterationValue>` and
`Histogram.Histogram.add(_:)`.
Scope
Additive and ABI-compatible. No public type or method signature
changed. Storage layout of `Histogram` and `IteratorImpl` preserved —
only attributes added, plus the one new `add(_:)` method.
1 parent de0b9b8, commit c2e1210
3 files changed
Lines changed: 565 additions & 69 deletions