Merge branch 'master' into application_testing
cscjlan committed Jun 25, 2024
2 parents 6a81385 + 77b1cc0 commit a81272c
Showing 296 changed files with 8,859 additions and 4,466 deletions.
25 changes: 25 additions & 0 deletions .github/workflows/pages-html.yml
@@ -0,0 +1,25 @@
name: Deploy HTML slides to Pages

on:
  # Runs on pushes targeting the default branch
  push:
    branches:
      - "master"
    paths:
      - "*/docs/**"
      - ".github/workflows/pages.yml"

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
  contents: read
  pages: write
  id-token: write

jobs:
  pages-html:
    uses: ./.github/workflows/pages.yml
    with:
      include_pdf: false
17 changes: 17 additions & 0 deletions .github/workflows/pages-pdf.yml
@@ -0,0 +1,17 @@
name: Deploy HTML and PDF slides to Pages

on:
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
  contents: read
  pages: write
  id-token: write

jobs:
  pages-pdf:
    uses: ./.github/workflows/pages.yml
    with:
      include_pdf: true
65 changes: 65 additions & 0 deletions .github/workflows/pages.yml
@@ -0,0 +1,65 @@
# Script based on examples in https://github.com/actions/starter-workflows/tree/main/pages
name: Deploy slides to Pages

on:
  workflow_call:
    inputs:
      include_pdf:
        required: true
        type: boolean

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
  contents: read
  pages: write
  id-token: write

# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  build:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/csc-training/slidefactory:3.1.0-beta.4
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Pages
        id: pages
        uses: actions/configure-pages@v4
      - name: Build slides
        env:
          INCLUDE_PDF: ${{ inputs.include_pdf }}
        run: |
          git config --global --add safe.directory $PWD
          GIT_SHORT_SHA=$(git rev-parse --short $GITHUB_SHA)
          GIT_DATE=$(git show -s --format=%ci $GITHUB_SHA)
          ARGS=""
          [[ "$INCLUDE_PDF" == "true" ]] && ARGS="--with-pdf"
          slidefactory pages about.yml build \
            --filters tools/mpi_links.py \
            --info_content "Updated for [$GIT_SHORT_SHA]($GITHUB_SERVER_URL/$GITHUB_REPOSITORY/commit/$GITHUB_SHA) ($GIT_DATE)" \
            $ARGS
      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: ./build

  deploy:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    needs: build
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
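
Note on the `Build slides` step in `pages.yml`: the boolean `include_pdf` input reaches the script as the string `INCLUDE_PDF`, and only the exact value `true` adds `--with-pdf`. The sketch below isolates that flag logic so it can be tried outside CI. It is an illustration under stated assumptions, not code from the commit: `build_args` is a hypothetical helper name, the test is written in POSIX `sh` form rather than the workflow's `[[ ... ]]`, and `slidefactory` itself is not invoked.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): prints the optional
# argument that the workflow would append to the slidefactory command.
build_args() {
    # Only the literal string "true" enables PDF output, mirroring the
    # comparison in the workflow's run script.
    if [ "$1" = "true" ]; then
        echo "--with-pdf"
    fi
}

# Show what each caller workflow would pass on.
echo "pages-html (include_pdf: false): [$(build_args false)]"
echo "pages-pdf  (include_pdf: true):  [$(build_args true)]"
```

Any other value, including an unset variable or `TRUE`, yields no extra argument, which matches the HTML-only default used by the push-triggered workflow.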
2 changes: 2 additions & 0 deletions .gitignore
@@ -7,6 +7,8 @@
*.html
*.pdf
*.sif
build/
core

#ignore default binary name
a.out
15 changes: 7 additions & 8 deletions README.md
@@ -1,12 +1,15 @@
# CSC Summer School in High-Performance Computing 2023
# CSC Summer School in High-Performance Computing 2024

This is the material repository for the high-performance computing summer school by CSC - Finnish IT Center for Science.
This is the material repository for the High-Performance Computing Summer School organized by [CSC - IT Center for Science](https://csc.fi/en/).

Feel free to fork this repository and work through the exercises, see more details in
[exercise instructions](exercise-instructions.md). You can also add general notes to yourself (like how to compile files etc.) at the end of this readme file (`README.md`).
Feel free to fork this repository to work through the exercises.

Versions from previous years can be found in tags.

## Presentation slides

The slides are available [here](https://github.com/csc-training/summerschool/releases/).

## Exercises

- [General instructions](exercise-instructions.md)
@@ -16,7 +19,3 @@ Versions from previous years can be found in tags.
- [Hybrid MPI/OpenMP](hybrid)
- [GPU programming with OpenMP](gpu-openmp)
- [GPU programming with HIP](gpu-hip)

---
## Notes
- [x] Have fun!
13 changes: 13 additions & 0 deletions about.yml
@@ -0,0 +1,13 @@
# This file is used in the generation of the web page
title: CSC Summer School in High-Performance Computing
modules:
- intro-to-hpc
- computer-platforms
- mpi
- parallel-io
- hybrid
- gpu-openmp
- gpu-hip
- application-performance
- application-testing
- build-systems
2 changes: 1 addition & 1 deletion application-design/README.md
@@ -1,3 +1,3 @@
## Exercieses
## Exercises

- [Continuous integration](ci/)
3 changes: 3 additions & 0 deletions application-design/about.yml
@@ -0,0 +1,3 @@
# This file is used in the generation of the web page
title: Application Design
slidesdir: docs
2 changes: 1 addition & 1 deletion application-design/docs/01_design_choices.md
@@ -1,6 +1,6 @@
---
title: Application design
event: CSC Summer School in High-Performance Computing 2023
event: CSC Summer School in High-Performance Computing 2024
lang: en
---

2 changes: 1 addition & 1 deletion application-design/docs/02-documentation-testing.md
@@ -1,6 +1,6 @@
---
title: Documenting and testing
event: CSC Summer School in High-Performance Computing 2023
event: CSC Summer School in High-Performance Computing 2024
lang: en
---

2 changes: 1 addition & 1 deletion application-design/docs/03-collaboration-release.md
@@ -1,6 +1,6 @@
---
title: Collaboration and release
event: CSC Summer School in High-Performance Computing 2023
event: CSC Summer School in High-Performance Computing 2024
lang: en
---

3 changes: 3 additions & 0 deletions application-performance/about.yml
@@ -0,0 +1,3 @@
# This file is used in the generation of the web page
title: Application Performance
slidesdir: docs
@@ -1,6 +1,6 @@
---
title: Introduction to Application Performance
event: CSC Summer School in High-Performance Computing 2023
event: CSC Summer School in High-Performance Computing 2024
lang: en
---

@@ -41,7 +41,7 @@ Have you profiled your code?

- Profiled the code: 99.9% of the execution time was being spent on these lines:

```fortran
```fortranfree
do i=1,n ! Removing these unnecessary loop iterations reduced the
do j=1,m ! wall-time of one simulation run from 17 hours to 3 seconds…
do k=1,fact(x)
98 changes: 47 additions & 51 deletions application-performance/docs/02-gpu-performance.md
@@ -1,27 +1,30 @@
---
title: Single node performance optimization
event: CSC Summer School in High-Performance Computing 2023
title: GPU performance optimization
event: CSC Summer School in High-Performance Computing 2024
lang: en
---

# GPU performance optimization {.section}

# Introduction
- GPUs (Graphics Processing Units) are widely used in High-Performance Computing (HPC) applications.
- GPUs are powerful and complex processors designed for parallel computing.
- GPUs require explicit expression of parallelism by the programmer.

- GPUs (Graphics Processing Units) are widely used in High-Performance Computing (HPC) applications
- GPUs are powerful and complex processors designed for parallel computing
- GPUs require explicit expression of parallelism by the programmer

# General Principles for High GPU Performance

<div class=column>
:::::: {.columns}
::: {.column width="50%"}
- Keep all the compute resources busy (idle resources are a waste)
- Minimize the synchronization at all levels
- Minimize the data transfers between host and device
- Keep the data in faster memory and use an appropriate access pattern
</div>
<div class=column>
![](img/lumi_node.png){.center width=40%}
</div>
:::
::: {.column width="50%"}
![](img/lumi_node.png){.center width=50%}
:::
::::::

# GPU performance analysis {.section}

@@ -30,6 +33,7 @@ lang: en
![](img/perf-analysis-single-gpu.svg){.center width=60%}

# Measuring performance

- Don’t speculate about performance – measure it!
- Performance analysis tools help to
- Find hot-spots
@@ -43,12 +47,10 @@ lang: en

# Hardware performance counters

- Hardware performance counters are special registers on CPU \& GPU that count
hardware events
- They enable more accurate statistics and low overhead
- Special registers on CPU \& GPU that count hardware events
- Enable more accurate statistics and low overhead
- In some cases they can be used for tracing without any extra
instrumentation

- Number of counters is much smaller than the number of events that can be
recorded
- Different devices have different counters
@@ -71,71 +73,65 @@ lang: en
- Start with an overview!
- Call tree information, what routines are most expensive?

# Sampling vs. Tracing
# [Sampling]{.underline} vs. Tracing

<div class=column>
Sampling
:::::: {.columns}
::: {.column width="50%"}

- Application is stopped at predetermined intervals
- Information is collected about the state of application
- Lightweight, but may give skewed results
- Statistical information

</div>
<div class=column>
Tracing
:::
::: {.column width="50%"}

- Records events, e.g., every function call
- Requires usually modification to the executable *i.e.* instrumentation
- More accurate, but may affect program behavior
- Generates often lots of data

</div>
![](img/sampling.png){.left width=100%}

# Sampling vs. Tracing
:::
::::::

<div class=column>
Sampling
# Sampling vs. [Tracing]{.underline}

![](img/sampling.png){.left width=80%}
:::::: {.columns}
::: {.column width="50%"}

</div>
<div class=column>
Tracing
- Records events, e.g., every function call
- Requires usually modification to the executable: code instrumentation
- More accurate than sampling, but may affect program behavior
- Generates often lots of data

![](img/tracing.png){.left width=80%}
:::
::: {.column width="50%"}

</div>
![](img/tracing.png){.left width=100%}

# Tau Analysis Utilities
:::
::::::

<small>
# Tau Analysis Utilities

- TAU is a powerful performance evaluation toolkit
- <https://www.cs.uoregon.edu/research/tau/home.php>
- A performance evaluation toolkit
- Runs on all HPC platforms, relatively easy to install
- Targets all parallel programming/execution paradigms (GPU, MPI, OpenMP, pthreads, ...)
- Programming languages: Fortran, C, C++, UPC, Java, Python, ...
- Programming languages: Fortran, C, C++, UPC, Java, Python, ...

# Tau Analysis Utilities cont.

- TAU has instrumentation, measurement and analysis tools
- User-friendly graphical interface
- Profiling: Measures total time spent in each routine
- Tracing: Shows events and their timings across processes on a timeline
- I/O performance evaluation
- Memory debugging

</small>

# Omniperf Tools

- <https://amdresearch.github.io/omniperf/getting_started.html>
- system performance profiling tool for machine
learning/HPC workloads running on AMD MI GPUs.
- presently targets usage on MI100 and MI200 accelerators.
learning/HPC workloads running on AMD MI GPUs
- presently targets usage on MI100 and MI200 accelerators
- profiling, roofline model, tracing
- built on top of `roctracer` and `rocprof`
- supports both a web-based GUI and a command-line analyzer for user convenience.


# Web Resources
- TAU homepage
- <https://www.cs.uoregon.edu/research/tau/home.php>
- Omniperf
- <https://amdresearch.github.io/omniperf/getting_started.html>
- supports both a web-based GUI and a command-line analyzer for user convenience
2 changes: 1 addition & 1 deletion application-performance/docs/03-mpi-performance.md
@@ -1,6 +1,6 @@
---
title: MPI performance analysis
event: CSC Summer School in High-Performance Computing 2023
event: CSC Summer School in High-Performance Computing 2024
lang: en
---

@@ -1,6 +1,6 @@
---
title: Single node performance optimization
event: CSC Summer School in High-Performance Computing 2023
event: CSC Summer School in High-Performance Computing 2024
lang: en
---
