ChaNGa Benchmarks
Below are tables of tests and benchmarks that have been performed with ChaNGa on a variety of architectures, grouped by the charm++ version used.
Tests are available in subdirectories of the ChaNGa distribution. Input and parameter files for the benchmarks can be obtained from our Google Drive site.
The benchmark input files are as follows.
lambs (lambs.param and lambs.00200) is a 3 million particle representation of the final state of a cosmological simulation of a volume 70 Mpc in size.
lambb (lambb.param and lambb.00500) is an 80 million particle representation of that same volume. These simulations were originally used in Reed et al. (2003) to calculate the mass function of dark matter halos in a dark-energy-dominated universe down to the scale of dwarf galaxies.
dwf1 (dwf1.2048.param and dwf1.2048.00384) is a 5 million particle zoom-in simulation. It is cosmological, but the particle sampling focuses on a single halo of roughly 1e11 solar masses. This is a dark-matter-only version of the DWF1 model studied in Governato et al. (2007) (see their table 3), which demonstrates how disk galaxies can form in a cosmological context.
dwf1.6144 (dwf1.6144.param and dwf1.6144.01472) is a 50 million particle representation of that same halo.
The dwf1.ms (dwf1.2048.ms.param) benchmark uses the same particle set as dwf1, but benchmarks the multistepping capabilities: 64 substeps are taken in the time reported.
The benchmarks were run with load balancing, using roughly 1000 particles per treepiece. The OrbLB load balancer was used for the single stepping runs, and MultistepLB was used for the dwf1.ms run. Each reported time is the median, over all steps, of the total wall clock time taken per step.
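
As a rough illustration of the two conventions above, the sketch below estimates a treepiece count from a particle count and takes the median of a list of per-step times. The function names and the example step times are hypothetical; they are not part of ChaNGa and not measured data.

```python
from statistics import median


def treepiece_count(n_particles, particles_per_piece=1000):
    """Approximate number of treepieces for a target particles/treepiece."""
    return max(1, round(n_particles / particles_per_piece))


def reported_time(step_times):
    """Median of the per-step wall clock times, in seconds."""
    return median(step_times)


# Hypothetical per-step wall clock times in seconds (illustrative only).
example_steps = [11.9, 11.2, 11.3, 14.8, 11.3]

print(treepiece_count(3_000_000))    # lambs: roughly 3000 treepieces
print(reported_time(example_steps))  # one slow step does not shift the median
```

Using the median rather than the mean keeps an occasional slow step, such as one that includes load balancing or output, from dominating the reported number.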
Run with charm++ version 6.8.0.
Machine | Date | Compiler | Charm build | Test: step | Test: cosmo | Test: collapse | lambs (ncores:time) | lambb (ncores:time) | dwf1 (ncores:time) | dwf1.6144 (ncores:time) | dwf1.ms (ncores:time) |
---|---|---|---|---|---|---|---|---|---|---|---|
Pleiades (Ivybridge) | 8/2017 | gcc 6.2 --enable-sse2 | verbs-linux-x86_64 smp | + | + | + | 20:11.3s 160:1.75s | 160:46.5s 1280:6.44s | 20:17.3s 160:2.60s | 160:27.7s 1280:4.43s | 20: 219s 160:55.8s |
Bluewaters | 9/2017 | gcc 4.9.3 --enable-sse2 | gni-crayxe hugepages smp | + | + | + | 32:16.3s 256:2.4s | 256:72s 2048:10.0s | 32:24.s 256:3.5s | 256:38.s 2048:5.9s | 32:320.s 256:82s |
Comet | 9/2017 | gcc 4.9.3 --enable-sse2 | mpi-linux-x86_64 smp | + | + | + |
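
Each benchmark cell lists `ncores:time` pairs, so a strong-scaling efficiency can be read off by comparing the speedup to the increase in core count. The following is a minimal sketch of that arithmetic, assuming a cell holds exactly two pairs; the helper names are hypothetical, and a garbled entry would need to be corrected by hand before parsing.

```python
def parse_cell(cell):
    """Parse a cell such as '20:11.3s 160:1.75s' into [(ncores, seconds), ...]."""
    pairs = []
    # Some cells have a space after the colon ('20: 219s'); normalize first.
    for token in cell.replace(": ", ":").split():
        cores, _, time = token.partition(":")
        pairs.append((int(cores), float(time.rstrip("s"))))
    return pairs


def scaling_efficiency(cell):
    """Speedup between the two runs divided by the ratio of core counts."""
    (n0, t0), (n1, t1) = parse_cell(cell)
    speedup = t0 / t1
    return speedup / (n1 / n0)


# lambs on Pleiades (Ivybridge), 8/2017: 20 cores vs 160 cores.
eff = scaling_efficiency("20:11.3s 160:1.75s")
print(f"{eff:.2f}")  # prints 0.81
```

Here the run on 8 times as many cores is about 6.5 times faster, which corresponds to an efficiency of roughly 81%.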
Run with charm++ version 6.7.1.
Machine | Date | Compiler | Charm build | Test: step | Test: cosmo | Test: collapse | lambs (ncores:time) | lambb (ncores:time) | dwf1 (ncores:time) | dwf1.6144 (ncores:time) | dwf1.ms (ncores:time) |
---|---|---|---|---|---|---|---|---|---|---|---|
bluewaters | 1/2016 | GCC | gemini_gni-crayxe-hugepages-smp | + | + | + | 32:16.0s 256:2.3s | 256:68.9s 2048:9.9s | 32:25.8s 256:3.5s | 256:41.9s 2048:6.2s | 32:343s 256:86s |
stampede | 2/2016 | GCC | verbs-linux-x86_64-smp | + | + | + | 16:18.5s 128:2.7s | 128:78.5s 1024:11.2s | 16:28.9s 128:4.4s | 128:47.7s 1024:7.4s | 16:366s 128:79s |
comet | 4/2016 | Intel | mpi-linux-x86_64 | + | + | + | 24:9.55s 144:1.77s | 144:60.2s 1152:9.25s | 24:15.1s 144:2.94s | 144:43.1s 1152:6.39s | 24:203.6s 144:61.4s |
Pleiades (Ivybridge) | 4/2016 | gcc | verbs-linux-x86_64 smp | + | + | + | 20:14.0s 160:2.55s | 160:64.2s 1280:11.0s | 20:22.0s 160:4.35s | 160:41.2s 1280:7.71s | 20:313.s 160:80.3s |
Run with charm++ version 6.5.
Machine | Date | Compiler | Charm build | Test: step | Test: cosmo | Test: collapse | lambs (ncores:time) | lambb (ncores:time) | dwf1 (ncores:time) | dwf1.6144 (ncores:time) | dwf1.ms (ncores:time) |
---|---|---|---|---|---|---|---|---|---|---|---|
gordon | 5/2013 | GCC | net-linux-x86_64-ibverbs | + | + | + | 16:18.2s 128:2.8s | 128:79.9s 1024:12.3s | 16:28.4s 128:4.2s | 128:45.9s 1024:6.9s | 16:1105s 128:300s |
gordon | 5/2013 | Intel | net-linux-x86_64-ibverbs | | | | 16:18.3s 128:2.7s | 128:80.5s 1024:12.3s | 16:29.4s 128:4.2s | 128:47.6s 1024:7.0s | 16:1247s 128:323s |
gordon | 5/2013 | GCC | mpi-linux-x86_64 | | | | 16:18.6s 128:3.0s | 128:87.1s 512:24.8s | 16:29.5s 128:4.5s | 128:51.1s 256:27.1s | |
bluewaters | 5/2013 | GCC | gemini_gni-crayxe-hugepages-smp | | | | 32:17.3s 256:2.5s | 256:72.9s 2048:10.2s | 32:26.0s 256:3.3.7s | 256:41.3s 2048:6.2s | 32:401s 256:91.3s |
stampede | 6/2013 | GCC | net-linux-x86_64-ibverbs | | | | 16:15.6s 128:2.8s | 128:66.7s 1024:10.1s | 16:25.1s 128:4.2s | 128:39.4s 1024:6.5s | 16:392s 128:98.2s |
kraken | 6/2013 | GCC | mpi-crayxt | | | | 12:32.4s 96:5.1s | 96:140s 768:20.7s | 12:49.8s 96:7.5s | 96:78.4s 768:14.3s | 12:753s 96:186s |
Run with charm++ version 6.2.
Machine | Date | Compiler | Charm build | Test: step | Test: cosmo | Test: collapse | lambs (ncores:time) | lambb (ncores:time) | dwf1 (ncores:time) | dwf1.6144 (ncores:time) | dwf1.ms (ncores:time) |
---|---|---|---|---|---|---|---|---|---|---|---|
ranger | 2/2010 | Intel | mpi-linux-x86_64-mpicxx | + | + | + | 16:34.0s | | 16:53.8s | | 16:1400s |
ranger | 1/2010 | | | | | | 128:5.04s | 3072:38s | 128:7.90s | 1024:22.3s | 128:467s |
kraken | 2/2010 | GCC | mpi-crayxt | + | + | + | 12:33.7s | | 12:53.1s | | 12:1003s |
kraken | 1/2010 | | | | | | 96:4.8s | 2304:31s | 96:7.3s | 768:21.1s | 96:834s |
frost | 3/2010 | gcc | mpi-bluegenel | + | + | + | 16:265s | | 16:396s | | 32:4476s |
frost | 2/2010 | | | | | | 128:36.2s | | 128:54.2s | | 256:1161s |
frost | 3/2010 | xlc | mpi-bluegenel-xlc | + | + | + | 16:189s | 384:287s | 16:284s | 256:249s | 32:4920s |
frost | 3/2010 | xlc | | | | | 128:27.6s | 3072:89.7s | 128:40.3s | | 256:1360s |
blueprint | 2/2010 | xlc | lapi | | | | 16:71.5s | 432:111.6s | 16:108.0s | 168:145.5s | 16:1774s |
blueprint | 2/2010 | | | | | | 128:9.9s | 1728:29.0s | 128:14.8s | 1344:24.5s | 128:351s |