2426 hpc metrics
Florian Deconinck edited this page Nov 27, 2024 · 1 revision
In collaboration with the NCCS, we are listing below a range of metrics that should be evaluated for each benchmark.
- Time to solution
  - Although we are not aiming for production-ready code by the end of the project, we will still keep an eye on "job-level" turnaround and document improvements and potential slowdowns outside the numerics caused by the technology swap.
- Energy
  - Light software sampling to document the amplitude of the chip's power draw: imprecise, but can easily be run with little overhead.
  - Hardware monitoring on selected runs for precise measurement: precise, but requires close cooperation with NCCS system administrators and IT.
- Node-to-node
  - Compare CPU nodes with GPU nodes
  - Minimize the hardware generation difference for a valid comparison
- Node usage
  - Chip usage: measure in % of theoretical throughput rather than FLOPs
  - Chip idle time: important for hybrid work
- Minimal hardware requirements
  - For development
  - For scientific runs
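
As an illustration of how the energy and node-usage metrics above could be reduced to numbers, here is a minimal Python sketch. It is not the project's tooling: the function names, the `(time, power)` sample format, and the busy-interval format are all assumptions made for this example. It assumes power samples come from whatever software sampler is available, that achieved and peak throughput are expressed in the same unit, and that busy intervals do not overlap.

```python
from typing import Sequence, Tuple


def energy_joules(samples: Sequence[Tuple[float, float]]) -> float:
    """Estimate energy from (time_s, power_w) samples (hypothetical format).

    Uses the trapezoidal rule between consecutive samples, so accuracy
    depends on the sampling rate of the software sampler.
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


def percent_of_peak(achieved: float, theoretical_peak: float) -> float:
    """Chip usage as a percentage of theoretical throughput.

    Both arguments must use the same unit (for example GB/s or FLOP/s).
    """
    if theoretical_peak <= 0:
        raise ValueError("theoretical_peak must be positive")
    return 100.0 * achieved / theoretical_peak


def idle_fraction(busy_intervals: Sequence[Tuple[float, float]],
                  wall_time_s: float) -> float:
    """Fraction of wall time the chip was idle.

    busy_intervals holds (start_s, end_s) pairs, assumed non-overlapping.
    """
    if wall_time_s <= 0:
        raise ValueError("wall_time_s must be positive")
    busy = sum(end - start for start, end in busy_intervals)
    return 1.0 - busy / wall_time_s


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    power = [(0.0, 100.0), (1.0, 200.0), (2.0, 200.0)]
    print(energy_joules(power))                       # 350.0
    print(percent_of_peak(4.5, 9.0))                  # 50.0
    print(idle_fraction([(0.0, 2.0), (3.0, 4.0)], 6.0))  # 0.5
```

Reporting usage as a ratio to the theoretical peak, rather than as a raw FLOP count, keeps CPU and GPU nodes comparable even when their absolute capabilities differ.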