ByteString: extract shared private helpers to reduce code duplication by pjfanning · Pull Request #2859 · apache/pekko

pjfanning · 2026-04-13T20:48:31Z

ByteString1C and ByteString1 contained near-identical SWAR search loops and unrolled byte-scan helpers duplicated verbatim across both classes. ByteStrings similarly had identical find inner functions in its generic and Byte-typed indexOf/lastIndexOf overloads.

New shared helpers in `object ByteString`

unrolledFirstIndexOf(bytes, fromIndex, byteCount, value) / unrolledLastIndexOf(...) — unrolled 1–7 byte scans, previously duplicated in both ByteString1C and ByteString1
swarFirstIndexOf(bytes, fromOffset, searchLength, elem) — SWAR forward-search loop; eliminates duplication between indexOf(Byte, Int) and indexOf(Byte, Int, Int) in both classes; returns absolute index
swarLastIndexOf(bytes, baseOffset, searchLength, elem) — SWAR reverse-search loop; same rationale; returns absolute index (consistent with swarFirstIndexOf)

Per-class changes

ByteString1 gains a private absToLogical(absIdx) helper to eliminate the repeated if (absResult == -1) -1 else absResult - startIndex pattern in all four index-search methods
ByteString1C / ByteString1 / ByteStrings: indexOf[B >: Byte] now pattern-matches on elem and dispatches to the Byte-specific SWAR overload for Byte inputs — removing the duplicate find inner function in ByteStrings and improving performance for Byte inputs (avoids boxing)
ByteStrings: lastIndexOf[B >: Byte] follows the same dispatch pattern

All public APIs are unchanged. Net: −80 lines. All helpers are small and static, keeping HotSpot inlining characteristics equivalent to the original inlined code.

… reuse - Add `unrolledFirstIndexOf(bytes, fromIndex, byteCount, value)` and `unrolledLastIndexOf(bytes, fromIndex, byteCount, value)` shared helpers to `object ByteString`, replacing identical private methods that were duplicated in `ByteString1C` and `ByteString1`. - Add `swarFirstIndexOf(bytes, fromOffset, searchLength, elem)` helper that contains the SWAR search loop, eliminating duplication between `indexOf(Byte, Int)` and `indexOf(Byte, Int, Int)` in both `ByteString1C` and `ByteString1`. - Add `swarLastIndexOf(bytes, baseOffset, searchLength, elem)` helper that contains the reverse SWAR search loop, eliminating duplication between the `lastIndexOf(Byte, Int)` implementations in `ByteString1C` and `ByteString1`. - Make `indexOf[B >: Byte]` in `ByteString1C`, `ByteString1`, and `ByteStrings` dispatch to the Byte-specific SWAR overload for Byte inputs (pattern match on elem), eliminating the near-identical `find` inner function that was duplicated between `indexOf[B]` and `indexOf(Byte, Int)` in `ByteStrings`, and improving performance for Byte searches in all three classes. - Make `lastIndexOf[B >: Byte]` in `ByteStrings` likewise dispatch to `lastIndexOf(Byte, Int)` for Byte inputs, removing the duplicate `find` function. - Net result: -89 lines of code with no behaviour change; all public APIs unchanged; HotSpot-friendly (small, static, easily-inlineable helpers). Agent-Logs-Url: https://github.com/pjfanning/incubator-pekko/sessions/3ba37c44-b65b-4094-9f11-7541417d7c37 Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

Agent-Logs-Url: https://github.com/pjfanning/incubator-pekko/sessions/3ba37c44-b65b-4094-9f11-7541417d7c37 Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

…l helper in ByteString1 Agent-Logs-Url: https://github.com/pjfanning/incubator-pekko/sessions/3ba37c44-b65b-4094-9f11-7541417d7c37 Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

…consistency Agent-Logs-Url: https://github.com/pjfanning/incubator-pekko/sessions/3ba37c44-b65b-4094-9f11-7541417d7c37 Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

pjfanning · 2026-04-14T18:23:01Z

My computer is getting a bit old and is not great at providing consistent conditions during JMH runs.
I got these results and with the changes shows some variability than I'm willing to put down to the machine having some other stuff running during the early benchmarks. Ultimately, I think these results are close enough to say that it is unlikely that there is a slowdown caused by this PR's changes.

With Changes
[info] Benchmark                                                           Mode  Cnt          Score          Error  Units
[info] ByteString_indexOf_Benchmark.bs1_indexOf_from                      thrpt    3  150359434.050 ±  59507989.846  ops/s
[info] ByteString_indexOf_Benchmark.bs1_indexOf_from_byte                 thrpt    3  174120693.736 ± 64645037.735  ops/s
[info] ByteString_indexOf_Benchmark.bss_indexOf_from_best_case            thrpt    3  217852446.284 ± 28875894.270  ops/s
[info] ByteString_indexOf_Benchmark.bss_indexOf_from_far_index_case       thrpt    3   18300391.307 ±  4958872.389  ops/s
[info] ByteString_indexOf_Benchmark.bss_indexOf_from_far_index_case_byte  thrpt    3   16102869.070 ±  9775298.895  ops/s
[info] ByteString_indexOf_Benchmark.bss_indexOf_from_worst_case           thrpt    3    5574890.131 ±   573776.254  ops/s

Without Changes
[info] Benchmark                                                           Mode  Cnt          Score          Error  Units
[info] ByteString_indexOf_Benchmark.bs1_indexOf_from                      thrpt    3  153101714.175 ± 29261792.393  ops/s
[info] ByteString_indexOf_Benchmark.bs1_indexOf_from_byte                 thrpt    3  186251713.877 ± 88143603.248  ops/s
[info] ByteString_indexOf_Benchmark.bss_indexOf_from_best_case            thrpt    3  218559185.909 ± 36443255.263  ops/s
[info] ByteString_indexOf_Benchmark.bss_indexOf_from_far_index_case       thrpt    3   17241268.500 ±  3343535.671  ops/s
[info] ByteString_indexOf_Benchmark.bss_indexOf_from_far_index_case_byte  thrpt    3   16227186.055 ±  2458904.387  ops/s
[info] ByteString_indexOf_Benchmark.bss_indexOf_from_worst_case           thrpt    3    5570419.443 ±  1216122.560  ops/s

He-Pin · 2026-04-14T18:25:10Z

I can verify it later

He-Pin · 2026-04-15T06:49:15Z

@@ -1141,6 +975,92 @@ object ByteString {
    def SerializationIdentity: Byte


💡 Consider @inline for swarFirstIndexOf / swarLastIndexOf / absToLogical

These helpers are on the hot path for every indexOf/lastIndexOf call. After extraction from in-class methods to companion-object private methods, the JVM JIT may still inline them eventually, but:

absToLogical is tiny (if (absIdx == -1) -1 else absIdx - startIndex) — a perfect candidate for @inline to avoid the call overhead in tight loops

swarFirstIndexOf and swarLastIndexOf are called from three different ByteString subclasses — without @inline, each call site adds a frame

The original code had these methods inline within each class (effectively monomorphic), so the JIT had an easier time. Extracting to shared helpers introduces polymorphic call sites.

Suggestion:

@inline private def absToLogical(absIdx: Int): Int = ... @inline private def swarFirstIndexOf(...): Int = ... @inline private def swarLastIndexOf(...): Int = ...

This ensures zero overhead from the deduplication — best of both worlds.

pjfanning · 2026-04-15T08:51:19Z

Fair point about the extra stack frame.
Scala 2 and Scala 3 declare inlining differently (@inline def vs inline def) and I don't want to go back to having separate Scala 2 and 3 code so I think it is better to keep the code duplication.

Copilot AI and others added 5 commits April 13, 2026 20:34

Clarify documentation for swarFirstIndexOf and swarLastIndexOf helpers

b729187

Agent-Logs-Url: https://github.com/pjfanning/incubator-pekko/sessions/3ba37c44-b65b-4094-9f11-7541417d7c37 Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

Standardize swarLastIndexOf to return absolute index; add absToLogica…

15a6f30

…l helper in ByteString1 Agent-Logs-Url: https://github.com/pjfanning/incubator-pekko/sessions/3ba37c44-b65b-4094-9f11-7541417d7c37 Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

Fix comment style: remove backticks from private helper comments for …

cff7c66

…consistency Agent-Logs-Url: https://github.com/pjfanning/incubator-pekko/sessions/3ba37c44-b65b-4094-9f11-7541417d7c37 Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>

scalafmt

72bfea5

pjfanning marked this pull request as draft April 13, 2026 20:50

pjfanning requested review from He-Pin, Philippus, nvollmar and raboof April 14, 2026 18:24

He-Pin reviewed Apr 15, 2026

View reviewed changes

pjfanning closed this Apr 15, 2026

pjfanning mentioned this pull request Apr 15, 2026

small optimisation on indexOf #2863

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ByteString: extract shared private helpers to reduce code duplication#2859

ByteString: extract shared private helpers to reduce code duplication#2859
pjfanning wants to merge 5 commits intoapache:mainfrom
pjfanning:copilot/optimize-code-reuse-private-functions

pjfanning commented Apr 13, 2026 •

edited

Loading

Uh oh!

pjfanning commented Apr 14, 2026 •

edited

Loading

Uh oh!

He-Pin commented Apr 14, 2026

Uh oh!

He-Pin Apr 15, 2026

Uh oh!

pjfanning commented Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		@@ -1141,6 +975,92 @@ object ByteString {
		def SerializationIdentity: Byte

Conversation

pjfanning commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

New shared helpers in object ByteString

Per-class changes

Uh oh!

pjfanning commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

He-Pin commented Apr 14, 2026

Uh oh!

He-Pin Apr 15, 2026

Choose a reason for hiding this comment

💡 Consider @inline for swarFirstIndexOf / swarLastIndexOf / absToLogical

Uh oh!

pjfanning commented Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

pjfanning commented Apr 13, 2026 •

edited

Loading

New shared helpers in `object ByteString`

pjfanning commented Apr 14, 2026 •

edited

Loading

💡 Consider `@inline` for `swarFirstIndexOf` / `swarLastIndexOf` / `absToLogical`