Describe the enhancement requested
The current `base64_decode` implementation processes input byte-by-byte using scalar operations.
This came up while working on a recent change to improve validation performance (PR #49660): replacing `find()` with a lookup table sped up character validation, but it also made clear that the decoding loop itself is still sequential.
Since base64 decoding follows a regular pattern (4 chars → 3 bytes), it seems like it could benefit from SIMD/vectorized approaches (e.g., AVX2), especially for larger inputs.
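To make the 4 chars → 3 bytes pattern concrete, here is a minimal scalar sketch. It is not Arrow's actual `base64_decode`; `MakeDecodeTable` and `DecodeQuads` are hypothetical names, and for brevity it assumes the standard alphabet and rejects padded input. Each group of four 6-bit values is packed into 24 bits and split into three bytes, and that fixed-shape step is what a SIMD path would apply to many groups at once.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not Arrow code): maps each byte to its 6-bit value,
// or 0xFF for bytes outside the standard base64 alphabet.
inline std::array<uint8_t, 256> MakeDecodeTable() {
  constexpr std::string_view kAlphabet =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::array<uint8_t, 256> table;
  table.fill(0xFF);
  for (size_t i = 0; i < kAlphabet.size(); ++i) {
    table[static_cast<uint8_t>(kAlphabet[i])] = static_cast<uint8_t>(i);
  }
  return table;
}

// Decodes input made only of complete, unpadded 4-character groups.
// Returns std::nullopt for any other length or for an invalid character
// (including '=', since padding is deliberately not handled in this sketch).
inline std::optional<std::string> DecodeQuads(std::string_view input) {
  static const std::array<uint8_t, 256> kTable = MakeDecodeTable();
  if (input.size() % 4 != 0) return std::nullopt;

  std::string out;
  out.reserve(input.size() / 4 * 3);
  for (size_t i = 0; i < input.size(); i += 4) {
    const uint8_t a = kTable[static_cast<uint8_t>(input[i])];
    const uint8_t b = kTable[static_cast<uint8_t>(input[i + 1])];
    const uint8_t c = kTable[static_cast<uint8_t>(input[i + 2])];
    const uint8_t d = kTable[static_cast<uint8_t>(input[i + 3])];
    // Valid values are 0..63, so either of the top two bits being set
    // means at least one lookup returned the 0xFF "invalid" marker.
    if ((a | b | c | d) & 0xC0) return std::nullopt;

    // Pack four 6-bit values into 24 bits, then emit three bytes.
    const uint32_t triple = (static_cast<uint32_t>(a) << 18) |
                            (static_cast<uint32_t>(b) << 12) |
                            (static_cast<uint32_t>(c) << 6) |
                            static_cast<uint32_t>(d);
    out.push_back(static_cast<char>((triple >> 16) & 0xFF));
    out.push_back(static_cast<char>((triple >> 8) & 0xFF));
    out.push_back(static_cast<char>(triple & 0xFF));
  }
  return out;
}
```

Each loop iteration is independent of the previous one, which is why the pattern looks amenable to vectorization. Padding and the tail of the input would presumably still go through a scalar path.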
I wanted to check:
- Is exploring a SIMD-based decoding path something that would be in scope for Arrow?
- Have there been any prior attempts or discussions around this?
- Would a CPU-dispatched approach (SIMD + scalar fallback) be acceptable here?
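To illustrate what I mean by a CPU-dispatched approach, here is a rough sketch of the shape, with several assumptions: all names (`DecodeScalar`, `DecodeAvx2`, `Base64DecodeDispatched`, and so on) are hypothetical, the AVX2 kernel is only a placeholder that forwards to the scalar one, and feature detection uses the GCC/Clang builtin `__builtin_cpu_supports`. An actual patch would presumably rely on the project's existing CPU detection instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical kernel signature: returns false on invalid input.
using DecodeFn = bool (*)(std::string_view in, std::string* out);

// Maps one base64 character to its 6-bit value, or -1 if invalid.
inline int Sextet(unsigned char c) {
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

// Scalar fallback: handles complete, unpadded 4-character groups only.
inline bool DecodeScalar(std::string_view in, std::string* out) {
  out->clear();
  if (in.size() % 4 != 0) return false;
  for (size_t i = 0; i < in.size(); i += 4) {
    uint32_t triple = 0;
    for (size_t j = 0; j < 4; ++j) {
      const int v = Sextet(static_cast<unsigned char>(in[i + j]));
      if (v < 0) return false;
      triple = (triple << 6) | static_cast<uint32_t>(v);
    }
    out->push_back(static_cast<char>((triple >> 16) & 0xFF));
    out->push_back(static_cast<char>((triple >> 8) & 0xFF));
    out->push_back(static_cast<char>(triple & 0xFF));
  }
  return true;
}

// Placeholder for a vectorized kernel. A real implementation would use AVX2
// intrinsics for the bulk of the input; this stub just forwards to scalar.
inline bool DecodeAvx2(std::string_view in, std::string* out) {
  return DecodeScalar(in, out);
}

// Runtime feature check; reports "no AVX2" on compilers or targets where
// the builtin is unavailable.
inline bool HasAvx2() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("avx2") != 0;
#else
  return false;
#endif
}

inline DecodeFn ResolveDecoder() {
  return HasAvx2() ? &DecodeAvx2 : &DecodeScalar;
}

// Public entry point: the kernel is resolved once, on first call.
inline bool Base64DecodeDispatched(std::string_view in, std::string* out) {
  static const DecodeFn fn = ResolveDecoder();
  return fn(in, out);
}
```

The point of this shape is that callers see a single entry point, and both kernels must produce identical output, so the scalar path can double as the reference in tests and benchmarks.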
I haven’t explored SIMD in this area yet, but I’m happy to prototype something or run benchmark comparisons if this aligns with the project’s direction.
Component(s)
C++