Updating cuda code to improve performance and to work around compiler issue #14

Merged Sep 18, 2014 (1 commit)

worleyph (Contributor) commented Sep 5, 2014

Updating cuda_mod.F90 to improve performance and to work
around a new PGI compiler issue when compiling for the GPU.
The optimization work also eliminates a double allocation bug.

  1. Modified implementation of Pack-Exchange-Unpack
    to improve performance. (M. Norman)

  2. Renamed rrearth as rrearth_d to work around a
    (recently observed) PGI bug that confused the
    locally defined rrearth with a variable in the
    physical constants module. (M. Norman and P. Worley)

(performance optimization does not change numerics;
bit-for-bit after fixing compiler and code bugs)
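
For readers unfamiliar with the Pack-Exchange-Unpack pattern named in item 1, here is a minimal single-process sketch. It is not the code in cuda_mod.F90: the function names, the two-rank layout, and the in-memory stand-in for the MPI exchange are all assumptions made for illustration.

```python
# Illustrative only: a single-process model of pack-exchange-unpack between
# two hypothetical ranks. None of these names come from cuda_mod.F90.

def pack(field, edge_indices):
    """Copy the edge values of a field into one contiguous send buffer."""
    return [field[i] for i in edge_indices]

def exchange(send_buffers):
    """Stand-in for the MPI exchange: each rank receives its neighbor's buffer."""
    return {0: list(send_buffers[1]), 1: list(send_buffers[0])}

def unpack(field, halo_indices, recv_buffer):
    """Scatter received values into the halo slots of the local field."""
    for i, value in zip(halo_indices, recv_buffer):
        field[i] = value

def halo_update(fields, edge, halo):
    """Run one pack-exchange-unpack cycle over both ranks."""
    send = {rank: pack(fields[rank], edge[rank]) for rank in fields}
    recv = exchange(send)
    for rank in fields:
        unpack(fields[rank], halo[rank], recv[rank])
    return fields

# Rank 0 keeps its halo cell at the right end, rank 1 at the left end.
fields = {0: [1.0, 2.0, 3.0, 0.0], 1: [0.0, 4.0, 5.0, 6.0]}
edge = {0: [2], 1: [1]}   # interior cells each rank sends
halo = {0: [3], 1: [0]}   # halo cells each rank fills
halo_update(fields, edge, halo)
print(fields[0])  # [1.0, 2.0, 3.0, 4.0]
print(fields[1])  # [3.0, 4.0, 5.0, 6.0]
```

All three stages in this sketch only copy values and do no arithmetic on them.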

agsalin (Member) commented Sep 18, 2014

Changes specific to GPU. Developers certify that code changes are bit-for-bit.
Some syntax changes for PGI compiler.

agsalin added a commit that referenced this pull request Sep 18, 2014
Updating cuda code to improve performance and to work around compiler issue
agsalin merged commit fffb533 into master Sep 18, 2014
agsalin deleted the worleyph/cam/se-gpu-restoration branch September 18, 2014 16:48
douglasjacobsen assigned agsalin and unassigned mt5555 Sep 18, 2014
douglasjacobsen pushed a commit that referenced this pull request Aug 27, 2015
jgfouca mentioned this pull request Oct 23, 2015
apcraig pushed a commit to apcraig/E3SM that referenced this pull request Jul 26, 2022
…ctic/DODport01

Update pe layout for TL319_oEC60to30v3 for DOD machines
yunpengshan2014 pushed a commit that referenced this pull request Dec 6, 2022
Add ZM enhancement to v3atm

This branch adds the following ZM scheme enhancements to the
NGD_v3atm branch:

. Multiscale Coherent Structure Parameterization - implemented to
represent the effects of organized mesoscale convective systems.

. ZM convective cloud microphysics - developed from a two-moment four-class
(cloud water, cloud ice, rain, and snow) convective microphysics scheme
(Song and Zhang 2011, Song et al. 2012) by further representing the microphysical
processes related to the fifth hydrometeor species, graupel, and the interaction
between microphysics and cloud thermodynamics. The scheme is linked to aerosols
through cloud ice nucleation and droplet activation parameterizations. Currently,
the wet aerosol scavenging by convective clouds is not considered in the microphysics
scheme since it is already considered in the aerosol module. This one-way coupling
between microphysics and aerosol module enables the microphysics scheme to represent
the impact of aerosols on convective clouds and avoid the double counting of the
impact of convective clouds on aerosols.

. ZM mass flux adjustment - developed to better couple convection and circulation.

[Stealth]
[BFB]
AllenHuAtmoSci pushed a commit to AllenHuAtmoSci/PerlmutterMAM5 that referenced this pull request Apr 24, 2023
AaronDonahue pushed a commit that referenced this pull request May 9, 2023
cce/15.0.0 with GPU MPI buffers can crash in a system lib like this:

#4  0x00007fffe159e35b in (anonymous namespace)::do_free_with_callback(void*, void (*)(void*)) [clone .constprop.0] () from /opt/cray/pe/cce/15.0.0/cce/x86_64/lib/libtcmalloc_minimal.so.1
#5  0x00007fffe15a8f16 in tc_free () from /opt/cray/pe/cce/15.0.0/cce/x86_64/lib/libtcmalloc_minimal.so.1
#6  0x00007fffe99c2bcd in _dlerror_run () from /lib64/libdl.so.2
#7  0x00007fffe99c2481 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#8  0x00007fffea7bce42 in _ad_cray_lock_init () from /opt/cray/pe/lib64/libmpi_cray.so.12
#9  0x00007fffed7eb37a in call_init.part () from /lib64/ld-linux-x86-64.so.2
#10 0x00007fffed7eb496 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#11 0x00007fffed7dc58a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#12 0x0000000000000001 in ?? ()
#13 0x00007fffffff42e7 in ?? ()
#14 0x0000000000000000 in ?? ()

Work around this by using cce/14.0.3.
philipwjones added a commit to philipwjones/E3SM that referenced this pull request May 17, 2023
OkayHughes pushed a commit to OkayHughes/E3SM that referenced this pull request Jun 21, 2023
…dera-path-cime-config

Add CLDERA_PATH to gnu_mappy.cmake mach file