— Jun 09, 2016
In his book, Programming Elixir, Dave Thomas referrs to the parallel map function as the “hello world of Erlang”. It is a fun problem to solve! The premise is relatively simple, the operation performed on each element is done in separate processes. This allows the work to be done in parallel on the collection. If you’re interested in an exmaple, check out my friend Nathan Long’s post
So why would you want to map over a collection in parallel? The answer is usually “it’s FASSSST!” Yes, but when? The bottleneck that parallel map addresses is the serial nature of the traditional map operation. Each element is evaluated in serial and the operation is applied. If the operation is slow, then the map operation can be significantly sped up by running them in parallel. However, as with all optimizations there is a tax. In the case of parallel map there are two taxes:
Use parallel if:
sum(cost of operations) > cost of iteration + cost of processes
Otherwise, use regular map.
Put more succinctly, parallel map is a good fit if the operation being performed on each element is slow.
Here’s a benchmark in Elixir.
And the results:
● master ~/Code/OSS/parallel » mix run samples/maps.exs Benchmarking quick pmap... Benchmarking quick map... Name ips average deviation median quick map 23093.24 43.30μs (±19.05%) 41.00μs quick pmap 140.88 7098.46μs (±16.30%) 6926.00μs Comparison: quick map 23093.24 quick pmap 140.88 - 163.93x slower Benchmarking slow pmap... Benchmarking slow map... Name ips average deviation median slow pmap 78.93 12668.78μs (±9.53%) 12592.50μs slow map 0.50 2000475.00μs (±0.02%) 2000475.00μs Comparison: slow pmap 78.93 slow map 0.50 - 157.91x slower
Slow operation, parallel map wins. Fast operation, regular map wins.