DASH Example bench.12.for_each

Synopsis:

Benchmark comparing the performance of different for_each implementations. Each implementation calculates the sum of the elements passed to it.

Implementations

In all cases, the fastest implementation that fits the interface is used for the given task.

  • std::for_each l.: std::for_each(lbegin, lend, f)
  • dash::for_each: dash::for_each(begin, end, f)
  • dash::stl_for_each: dash::stl_for_each(begin, end, f)
  • dash::for_each_with_index: dash::for_each_with_index(begin, end, f)
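As an illustration of the task each variant performs, the following is a minimal sketch of summing a range with std::for_each. It uses only the C++ standard library; the function name sum_with_for_each and the std::vector<int> element type are assumptions made for this example, not taken from the benchmark source. The DASH variants listed above are not reproduced here.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: sums all elements of a local range with std::for_each.
// The lambda captures the accumulator by reference, so every visited
// element is added to the same running total.
std::int64_t sum_with_for_each(const std::vector<int>& local) {
    std::int64_t sum = 0;
    std::for_each(local.begin(), local.end(),
                  [&sum](int value) { sum += value; });
    return sum;
}
```

Calling sum_with_for_each on a vector holding 1, 2, 3, 4 returns 10. The accumulator is wider than the element type so that large ranges do not overflow.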

Usage (DART-MPI)

$ DASH_MAX_UNIT_THREADS=<T> mpirun -n <P> ./bin/bench.12.for_each

Options

Parameter   Description                                                               Default
sb          initial size of one dimension of the square NArray                        1000
tmax        end the benchmark if a single round takes longer than this many seconds   20

Sample Output

Projekt03 Haswell: 20 units, 1 thread / unit

$ mpirun -n 20 ./bin/bench.12.for_each
-- Runtime arguments
--   -sb           1000                                        initial matrix size
--   -tmax         1000                                max time in s per iteration
----------------------------------------------------------------------------------
units, mpi.impl,  l. size mb,  l. elems.s ,                    impl., time fill.s,time foreach.s,time total.s
   20, intelmpi,        0.00,   235849.06k,         std::for_each l.,        0.00,          0.00,        0.00
   20, intelmpi,        0.00,     2053.30k,           dash::for_each,        0.00,          0.02,        0.03
   20, intelmpi,        0.00,     2133.38k,       dash::stl_for_each,        0.00,          0.02,        0.02
   20, intelmpi,        0.00,     2085.24k,dash::for_each_with_index,        0.00,          0.02,        0.02
   20, intelmpi,        0.00,   223964.17k,         std::for_each l.,        0.00,          0.00,        0.00
   20, intelmpi,        0.00,     2059.94k,           dash::for_each,        0.00,          0.10,        0.10
   20, intelmpi,        0.00,     2131.26k,       dash::stl_for_each,        0.00,          0.09,        0.10
   20, intelmpi,        0.00,     2087.18k,dash::for_each_with_index,        0.00,          0.10,        0.10
   20, intelmpi,        3.00,   225161.84k,         std::for_each l.,        0.01,          0.00,        0.02
   20, intelmpi,        3.00,     2058.80k,           dash::for_each,        0.01,          0.39,        0.40
   20, intelmpi,        3.00,     2133.98k,       dash::stl_for_each,        0.01,          0.37,        0.39
   20, intelmpi,        3.00,     2089.24k,dash::for_each_with_index,        0.01,          0.38,        0.40
   20, intelmpi,       12.00,   229835.52k,         std::for_each l.,        0.05,          0.01,        0.07
   20, intelmpi,       12.00,     2059.23k,           dash::for_each,        0.05,          1.55,        1.61
   20, intelmpi,       12.00,     2134.36k,       dash::stl_for_each,        0.06,          1.50,        1.56
   20, intelmpi,       12.00,     2089.51k,dash::for_each_with_index,        0.06,          1.53,        1.59
   20, intelmpi,       48.00,   219595.46k,         std::for_each l.,        0.22,          0.06,        0.27
   20, intelmpi,       48.00,     2059.85k,           dash::for_each,        0.22,          6.21,        6.43
   20, intelmpi,       48.00,     2134.21k,       dash::stl_for_each,        0.22,          6.00,        6.21
   20, intelmpi,       48.00,     2089.49k,dash::for_each_with_index,        0.24,          6.13,        6.37
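The l. elems.s column appears to be the number of local elements divided by the for_each time, in thousands of elements per second. This reading is inferred from the numbers in the sample, not confirmed by the benchmark source. It assumes that the side length doubles each round (1000, 2000, 4000, 8000, 16000) and that the sb x sb elements are split evenly across the units. The sketch below reproduces that arithmetic; both function names are hypothetical.

```cpp
#include <cstdint>

// Elements owned by one unit when an sb x sb array is split evenly.
// Assumption: even distribution across all units.
std::int64_t local_elements(std::int64_t sb, std::int64_t units) {
    return (sb * sb) / units;
}

// Throughput in thousands of elements per second.
double kilo_elements_per_second(std::int64_t elements, double seconds) {
    return static_cast<double>(elements) / seconds / 1000.0;
}
```

For the last dash::for_each row, local_elements(16000, 20) gives 12,800,000 elements, and dividing by the reported 6.21 s yields roughly 2061k elements per second. That is close to the listed 2059.85k, with the small gap plausibly due to the time being rounded to two decimals in the output.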