Irregular Segmented Look-Back Scans in Accelerate
Summary
Scan operations are a common building block in various parallel algorithms. Scan operations are usually significantly faster on GPUs when dealing with large datasets, due to their parallelism. In 2016, D. Merrill and M. Garland released a single-pass
decoupled look-back scan, which was significantly faster then the state-of-the art at that time.
We extend this single-pass decoupled look-back scan to support irregular segmented scans, where we scan over a ragged array using a separate array with flags. This is then implemented into the Accelerate framework, a framework for parallel array
computations in Haskell. The performance of the algorithm was compared to the original irregular segmented scan implementation in Accelerate. The changed single-pass decoupled look-back performed significantly better than the original when dealing with large datasets.