Update Albany/Trilinos and fix preinstalled kokkos logic #6514

Merged (4 commits) on Jul 25, 2024


jewatkins
Contributor

@jewatkins jewatkins commented Jul 18, 2024

This updates preinstalled Albany/Trilinos on pmcpu/chrysalis to versions that are compatible with e3sm/kokkos (https://github.com/E3SM-Project/kokkos/tree/e3sm-kokkos-4.2.00). e3sm/kokkos is first preinstalled and is then used as a tpl to build trilinos.

This also fixes preinstalled kokkos logic so that the preinstalled e3sm/kokkos is only used when USE_ALBANY or USE_TRILINOS is enabled (I think this only happens when MALI is built). This is based on #6473 and https://github.com/E3SM-Project/E3SM/tree/bartgol/find-kokkos-after-albany so we can close those once this is pushed.

SMS.ne30pg2_r05_IcoswISC30E3r5_gis20.BGWCYCL1850.chrysalis_gnu.allactive-gis20km works with this PR.

[non-BFB] only for tests with active MALI

jgfouca and others added 4 commits July 18, 2024 15:00
If Trilinos is being used, we should use the kokkos that comes
with it. This removes the need to explicitly set Kokkos_ROOT in
config_machines.xml to the Trilinos one.
…t yet

If it contained a line like

find_dependency(Trilinos REQUIRED)

then things would work correctly. Until then, find Trilinos first.
 - also update trilinos/albany installs on pmcpu/chrysalis
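
The lookup order described in the commits above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only `USE_ALBANY` and `USE_TRILINOS` are named in the description, and the assumption here is that finding Trilinos first lets the later Kokkos lookup resolve to the installation Trilinos was built against.

```cmake
# Illustrative sketch only; the actual E3SM CMake files may be organized differently.
if (USE_ALBANY OR USE_TRILINOS)
  # Find Trilinos before Kokkos, so that the Kokkos used as a TPL by
  # Trilinos is (presumably) the one picked up by the lookup below.
  find_package(Trilinos REQUIRED)
endif()

# With Trilinos already located, this is expected to find the same Kokkos
# installation; without Albany/Trilinos it falls back to the normal search.
find_package(Kokkos REQUIRED)
```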
@jewatkins
Contributor Author

chrysalis gcc installs are currently using gcc 9.2. Should I update those to gcc/11.2.0?

also, baselines might change with this PR since it's a new install.

Contributor

@mperego mperego left a comment

Thanks @jewatkins !

@jewatkins
Contributor Author

@jonbob anything we need to do to get this pushed? Steve tested the SMS BG case on pmcpu and it worked.

@jonbob
Contributor

jonbob commented Jul 24, 2024

I'll test it as I merge -- but I'll make it a priority for today

@jonbob added the non-BFB label (PR makes roundoff changes to answers) on Jul 24, 2024
jonbob added a commit that referenced this pull request Jul 24, 2024
Update Albany/Trilinos and fix preinstalled kokkos logic

This updates preinstalled Albany/Trilinos on pmcpu/chrysalis to versions
that are compatible with e3sm/kokkos. e3sm/kokkos is first preinstalled
and is then used as a tpl to build trilinos.

This also fixes preinstalled kokkos logic so that the preinstalled
e3sm/kokkos is only used when USE_ALBANY or USE_TRILINOS is enabled (I
think this only happens when MALI is built).

[non-BFB] only for tests with active MALI
@jonbob
Contributor

jonbob commented Jul 24, 2024

Successfully ran all tests in e3sm_landice_developer using no compiler on chrysalis, including:

SMS.ne30pg2_r05_IcoswISC30E3r5_gis20.BGWCYCL1850.chrysalis_gnu.allactive-gis20km

which previously had been failing. The tests did show expected DIFFs, but only with active MALI

Also passed as a test of not impacting other E3SM cases:
ERP_Ld3.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel.allactive-pioroot1

merged to next

Contributor

@bartgol bartgol left a comment

@jewatkins are those installations of trilinos/albany serial? Or do they enable openmp and serial? There is a possible issue down the road, related to whether compile_threaded is ON or OFF and whether EAM uses Kokkos. But we can cross that bridge when we get there.

@jewatkins
Contributor Author

> @jewatkins are those installations of trilinos/albany serial? Or do they enable openmp and serial? There is a possible issue down the road, related to whether compile_threaded is ON or OFF and whether EAM uses Kokkos. But we can cross that bridge when we get there.

they are serial-only. we'll have to do some additional work on the albany side to support compile_threaded since openmp would be the default host space.

@bartgol
Contributor

bartgol commented Jul 24, 2024

> > @jewatkins are those installations of trilinos/albany serial? Or do they enable openmp and serial? There is a possible issue down the road, related to whether compile_threaded is ON or OFF and whether EAM uses Kokkos. But we can cross that bridge when we get there.
>
> they are serial-only. we'll have to do some additional work on the albany side to support compile_threaded since openmp would be the default host space.

I think this is something for the devops team. Namely, we need to decide whether compile_threaded should force all components to use threads or not. IMHO, we'll need new, more fine-grained options (both in CIME and, consequently, in CMake) to allow different components to use different threading backends. This means we will have to figure out the threading choice of each component (serial, openmp, hip/cuda/sycl, ...) and then link to a kokkos installation that supports all of those. It seems complicated, so the devops team should first decide which cases it wants to support. EAM/EAMxx+MALI will be a good test case for this potential mix of kokkos devices.
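
As a rough illustration of what such a per-component check could look like on the CMake side, here is a hypothetical sketch. The per-component variables (`EAMXX_KOKKOS_BACKEND`, `MALI_KOKKOS_BACKEND`) do not exist in E3SM or CIME today, and the sketch assumes the Kokkos package config exposes its enabled backends as an uppercase list in `Kokkos_DEVICES`.

```cmake
cmake_minimum_required(VERSION 3.16)  # new enough for the IN_LIST operator

# Hypothetical per-component choices; these variable names are assumptions.
set(EAMXX_KOKKOS_BACKEND "OPENMP")
set(MALI_KOKKOS_BACKEND  "SERIAL")  # the Albany/Trilinos installs are serial-only

# Union of the backends requested by all components.
set(REQUIRED_BACKENDS ${EAMXX_KOKKOS_BACKEND} ${MALI_KOKKOS_BACKEND})
list(REMOVE_DUPLICATES REQUIRED_BACKENDS)

find_package(Kokkos REQUIRED)

# Fail early if the Kokkos installation found lacks a requested backend.
# Assumes Kokkos_DEVICES holds entries such as SERIAL, OPENMP, CUDA.
foreach(backend IN LISTS REQUIRED_BACKENDS)
  if (NOT backend IN_LIST Kokkos_DEVICES)
    message(FATAL_ERROR
      "Kokkos found at ${Kokkos_DIR} does not enable backend '${backend}'")
  endif()
endforeach()
```

A check like this would only detect a mismatch; choosing which combinations to support is still the policy decision described above.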

@jonbob jonbob merged commit 99e16b2 into E3SM-Project:master Jul 25, 2024
21 checks passed
@jonbob
Contributor

jonbob commented Jul 25, 2024

merged to master and expected DIFFs blessed

@mperego
Contributor

mperego commented Jul 25, 2024

This is a big milestone for MALI in E3SM we have been after for a couple of years.
We finally have coupled BG tests enabled and running with an up-to-date version of Albany and Kokkos.
Thanks @stephenprice and @jonbob for setting up the tests, @jewatkins for updating the Albany and Trilinos libraries and fixing the cmake logic, @bartgol for updating Kokkos in E3SM, and @jgfouca for the work in cmake.

Labels
Machine Files, mpas-albany-landice, non-BFB (PR makes roundoff changes to answers)
5 participants