Fusion of Expand and Permute in Accelerate
Summary
Data-parallel array languages, such as Accelerate, provide data-parallel operations as high-level functions that require no expertise in low-level programming. Permutation and flattening by expansion are two such operations, both useful for handling irregular nested data. Accelerate transforms the high-level code written by the user into low-level code.
Because many programs are inherently memory-bound, it is beneficial to minimise both the number of repeated memory loads and the number of temporary intermediate arrays that this transformation generates.
In this thesis, we study whether the number of temporary arrays can be reduced when a permutation is performed after an expansion. We also study the performance benefits of such a reduction.