2020.10.07 Meeting Notes
- Individual/group updates
- AthenaPK scaling using cached MeshBlockPacks and single kernel buffer send/set functions
- Discuss Regression Test Failures (Joshua and Andrew)
- Dense on Block - @gshipman
- Interface design (Container versus MeshBlockPacks) -- @pgrete
- Discuss large-scale testing for AMR at smaller mesh block sizes -- @gshipman
- Public headers vs. private headers
- Review non-WIP PRs
Mostly pull request reviews. @Joshua has been trying to wrap up the public headers change.
Sriram is still working on restart.
Introduced an abstraction for parallel reduce, par_reduce.
Ben is getting close to having the particle framework changes merged in. On a large, static, uniform mesh, performance looks good. Still working on GPU performance.
Intel compiler errors cropping up.
Scaling of mesh block packs. The overhead of going from 256^3 down to 16^3 blocks is now 6x on GPU (compared to a baseline of 4x on CPU). This is a factor of 100 improvement over 3 months ago, measured on a full 2nd-order hydro problem.
Forrest noticed there were a lot of limitations in Jim's code due to register usage. It's an experimental code.
Feedback from Jim:
- Multi-value reduction - basically do a reduction on N variables and produce N results
- Possibly provide a custom reduction operation or something
- Output "tabs" files
- Discussion on perhaps doing something generic so downstream apps can support their own output formats
https://github.com/lanl/parthenon/issues/312
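The multi-value reduction from Jim's feedback (reduce N variables, produce N results, with a possibly custom operation) could look roughly like the sketch below. It is plain serial C++ for illustration only: `MultiReduce` is a hypothetical name, not the par_reduce abstraction or any existing Parthenon API, and a real version would run inside a parallel kernel.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Parthenon): reduce N variables in one pass.
// data[v] holds the cell values of variable v; all variables have equal length.
// op is the caller-supplied reduction operation and init is its identity value.
template <std::size_t N, typename Op>
std::array<double, N> MultiReduce(const std::array<std::vector<double>, N> &data,
                                  double init, Op op) {
  static_assert(N > 0, "need at least one variable");
  std::array<double, N> result;
  result.fill(init);
  const std::size_t ncells = data[0].size();
  for (std::size_t v = 0; v < N; ++v) {
    assert(data[v].size() == ncells);
  }
  // Single sweep over the cells; each cell updates all N accumulators.
  for (std::size_t i = 0; i < ncells; ++i) {
    for (std::size_t v = 0; v < N; ++v) {
      result[v] = op(result[v], data[v][i]);
    }
  }
  return result;
}
```

Passing the operation as a callable covers the "custom reduction operation" request: the same function yields sums, maxima, or minima depending on `op` and `init`.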
We'd be interested in doing dense-on-block variables to claw back some memory usage from switching to block-based from cell-based.
Dense-on-block basically means that you only allocate variables on blocks where they're used - implicit 0 everywhere else.
Implementation complexity arises with AMR, specifically in the boundary functions.
Dense-on-processor was considered, but the thought is that it is too coarse and would give only marginal benefits.
More discussion is needed. @jdolence will schedule a meeting.
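As a starting point for that discussion, the dense-on-block idea described above (allocate a variable only on blocks where it is used, implicit 0 elsewhere) might be sketched as follows. The `BlockVariable` type and `AllocatedCells` function are hypothetical illustrations, not Parthenon types, and the sketch ignores the AMR boundary handling where the real complexity lies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-block variable (not a Parthenon type).
// Storage exists only on blocks where the variable has been allocated.
struct BlockVariable {
  bool allocated = false;
  std::vector<double> data;

  // Allocate zero-initialized storage for ncells cells, if not already present.
  void Allocate(std::size_t ncells) {
    if (!allocated) {
      data.assign(ncells, 0.0);
      allocated = true;
    }
  }

  // Reads on an unallocated block return the implicit value 0.
  double Get(std::size_t i) const {
    if (!allocated) return 0.0;
    assert(i < data.size());
    return data[i];
  }

  // Writes require storage; the caller must call Allocate first.
  void Set(std::size_t i, double value) {
    assert(allocated && i < data.size());
    data[i] = value;
  }
};

// Count the cells that actually consume memory across all blocks.
std::size_t AllocatedCells(const std::vector<BlockVariable> &blocks) {
  std::size_t total = 0;
  for (const BlockVariable &b : blocks) {
    if (b.allocated) total += b.data.size();
  }
  return total;
}
```

With four blocks and the variable allocated on only one of them, memory use is a quarter of the fully dense case, while reads on the other three blocks still return 0.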
Integration of mesh block packing revealed issues with everything being based on Containers - that's what tasks operate on. Containers are low-granularity - only look at a set of variables per-mesh block.
Longer discussion warranted. @andrewgaspar will schedule a meeting.
Last piece of this puzzle: https://github.com/lanl/parthenon/pull/302/files
Galen wants to do a weak scaling study with 16^3 blocks, maybe also trying 8^3. We need to look at comparisons between CPU and GPU as well as raw numbers. It is important to note that running on GPUs enables certain algorithms that are much more performant on GPU than on CPU.