Coarsening functional parallelism using intelligent search algorithms
Summary
The increase in parallelism in modern-day computer architectures requires programs capable of exploiting that parallelism. With the goal of automating the process of creating parallel implementations, we want to extract implicit parallelism from a program. This can be done easily when dealing with functionally pure languages but doing so may generate a parallel implementation that spends more time on communication than it saves on computation. To this end, we present a method capable of iteratively reducing the amount of parallelism until a proper balance between computation and communication is reached. Our method may be able to find more useful implicit parallelism than simply selecting tasks based on their size, as has been done in previous work.