Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FC5AV1C-04P2 is non-BFB with threading turned on #1203

Closed
singhbalwinder opened this issue Jan 4, 2017 · 76 comments
Closed

FC5AV1C-04P2 is non-BFB with threading turned on #1203

singhbalwinder opened this issue Jan 4, 2017 · 76 comments

Comments

@singhbalwinder
Copy link
Contributor

@wlin7 recently identified that the model is non-BFB when run with threading turned on. He has reproduced this problem on Eidson, Anvil and Cori with ne30 and ne120 grids. I (Balwinder) was also able to reproduce this on Eos with ne16 and ne30 grids. To reproduce this, I ran a smoke test on Eos:

SMS_Ln5.ne16_ne16.FC5AV1C-04P2

Resubmitting this test (./case.submit) produces non-BFB results when compared against the initial run.

@singhbalwinder
Copy link
Contributor Author

I ran few tests and identified the following line in cam/src/physics/cam/physpkg.F90 which might be causing this (line 997):
!$OMP PARALLEL DO PRIVATE (C, phys_buffer_chunk)

@singhbalwinder
Copy link
Contributor Author

@amametjanov ran some tests on Mira and found the results to be BFB. This may imply that the non-BFB behavior is only associated with the Intel compiler. I am not sure if @wlin7 has tested it with any other compiler. I have only run tests with the Intel compiler. Following is the email I got from @amametjanov :

The issue is probably isolated to Intel compiler . I did two runs of 
`-compset FC5AV1C-04P2 -res ne30_ne30 -mach mira` and compared atm.log's of the two runs; 
values were BFB: e.g.

run1:
 nstep, te       46   0.33526714831201005E+10   0.33526842034517279E+10   0.70337333877628250E-03   0.98533311649086012E+05
 nstep, te       47   0.33526160259059653E+10   0.33526285871776705E+10   0.69457809221461578E-03   0.98533311029669741E+05
 nstep, te       48   0.33525611951309304E+10   0.33525737290326490E+10   0.69306464339303055E-03   0.98533313845669283E+05
run2:
 nstep, te       46   0.33526714831201005E+10   0.33526842034517279E+10   0.70337333877628250E-03   0.98533311649086012E+05
 nstep, te       47   0.33526160259059653E+10   0.33526285871776705E+10   0.69457809221461578E-03   0.98533311029669741E+05
 nstep, te       48   0.33525611951309304E+10   0.33525737290326490E+10   0.69306464339303055E-03   0.98533313845669283E+05

Cprnc of cam.r.0001-01-02-00000.nc and cam.rh0.0001-01-02-00000.nc also shows the two runs to be 
identical. Both runs used 1024 nodes with 675 nodes and 4 MPI tasks x 16 OMP threads per node 
for the ATM.

So threaded F-cases are still BFB. :)

@wlin7
Copy link
Contributor

wlin7 commented Jan 5, 2017

@singhbalwinder , @amametjanov , for record, my non-BFB runs were also using intel compiler.

@worleyph
Copy link
Contributor

worleyph commented Jan 5, 2017

FYI - I'm comparing pgi and intel on Titan at the moment. Two runs with the Intel compiler were not BFB. PGI job will be submitted next.

@mt5555
Copy link
Contributor

mt5555 commented Jan 5, 2017

I'll add some results here from Intel (v15) compiler on the SNL institutional cluster skybridge.
I have not been able to reproduce the problem on this machine.

Note that @singhbalwinder isolated the problem to the physics parameterizations, so that would suggest this is not due to the initialization, and thus it should be detected by ERP tests (which compare a restart run to a full run).

ERP_Ld3_P8x4.ne4_ne4.FC5AV1C (PASS)
ERP_Ld3_P32x4.ne16_ne16.FC5AV1C PASS
ERP_P8x4.ne4_ne4.FC5AV1C PASS
ERP_Ld3_P32x4.ne16_ne16.FC5AV1C-04P2 PASS

SMS_Ld3_P32x4.ne16_ne16.FC5AV1C (compare against baseline - PASS)

@worleyph
Copy link
Contributor

worleyph commented Jan 5, 2017

@mt5555 , FYI - physics parameterizations are called during initialization.

@worleyph
Copy link
Contributor

worleyph commented Jan 5, 2017

On Titan, PGI was B4B for two identical runs, while Intel was not., using the 675x4 for ATM and 676x4 everything else for

 -compset FC5AV1C-04P2 -res ne30_ne30

@worleyph
Copy link
Contributor

worleyph commented Jan 5, 2017

Note that setting BFBFLAG to TRUE did not fix the problem.

@singhbalwinder
Copy link
Contributor Author

Thanks all for running these tests. So far it seems like the issue only exists when running with the Intel compiler on all machines but Skybridge. I am using intel/16.0.1.150, so it might just be some specific versions of the compiler.

In my tests, the problem appears to be somewhere in the radiation codes. If I comment out the physics_update call after radiation, I get BFB answers for my five time steps runs. I am currently looking into the radiation codes.

@wlin7
Copy link
Contributor

wlin7 commented Jan 6, 2017

Hi all,

Per Shaocheng's request, I have created a confluence page to track this problem. The problem and the main findings up to now are copied there. Since not everybody can see the github notice, main findings/updates here will be summarized and posted on the confluence until the problem is resolved. Thanks.

@ndkeen
Copy link
Contributor

ndkeen commented Jan 6, 2017

I can test on Cori with Intel 17. Is it just one of the tests?
I see: SMS_Ln5.ne16_ne16.FC5AV1C-04P2
I can run that with two diff thread PE layouts, but then how to determine if they are BFB or not?

@mt5555
Copy link
Contributor

mt5555 commented Jan 6, 2017

you should be able to run an ERP test which will compare a restart run with a full run.
To use SMS test, you would do the following:

create a baseline:
./create_test -g SMS_Ln5.ne16_ne16.FC5AV1C-04P2

make a new run, and compare against the baseline:
./create_test -c SMS_Ln5.ne16_ne16.FC5AV1C-04P2

I believe @wlin7 isn't using the test system, but instead just making two runs, and than diffing the atm.log file.

@singhbalwinder
Copy link
Contributor Author

Thanks @wlin7 for setting up the confluence page.

@ndkeen : To reproduce this bug, do the following:

Run SMS.ne16_ne16.FC5AV1C-04P2 using a PE layout which has more than one thread. This run will generate atm.log.* file in the run directory. After this run finishes, go to the case directory of this test and issue ./case.submit again. You will get a new atm.log.* file from this second run. Compare the two files and look if there are differences in the global stats reported in these files. The global stats looks like the following:

 nstep, te        2   0.33448292351395354E+10   0.33448008133551283E+10  -0.15716762386893993E-02   0.98527804155681632E+05
 nstep, te        3   0.33449462574350176E+10   0.33449131757767444E+10  -0.18293639217668925E-02   0.98527556425431263E+05
 nstep, te        4   0.33450519447836037E+10   0.33450230985236235E+10  -0.15951560136756191E-02   0.98527371879217841E+05
 nstep, te        5   0.33451305205696392E+10   0.33451101504293952E+10  -0.11264401849883554E-02   0.98527266615589455E+05
 nstep, te        6   0.33452377748902602E+10   0.33452111422874856E+10  -0.14727470926464590E-02   0.98527165389324291E+05

If you see differences in the numbers after nstep, te between the two runs, the runs are non-BFB.

@worleyph
Copy link
Contributor

worleyph commented Jan 6, 2017

And in my runs the difference first shows up on nstep 3, so you do not need a long run to see this. I've been using 1 day, though 5 step is probably enough.

@mt5555
Copy link
Contributor

mt5555 commented Jan 6, 2017

I ran this test on Edison, with acme master Jan 6:

ERP_Ld3_P8x4.ne4_ne4.FC5AV1C FAIL

So this confirms that the problem does show up in an ERP test and at very low resolution (ne4).

This identical test will PASS on skybridge. These are both xeon systems, the main difference being intel compiler versions.

@worleyph
Copy link
Contributor

worleyph commented Jan 6, 2017

I tried FC5AV1C-04P (not 04P2) and the differences showed up with nstep 1. I went back and looked again, and the same was true for FC5AV1C-04P2 (not nstep 3). Sorry for the misinformation.

@worleyph
Copy link
Contributor

worleyph commented Jan 6, 2017

@mt5555 , the Titan results are from using intel/15.0.2.164 .

@ndkeen
Copy link
Contributor

ndkeen commented Jan 6, 2017

Ack, I got:

35: forrtl: severe (154): array index out of bounds
35: Image              PC                Routine            Line        Source             
35: acme.exe           0000000003DFB9E1  Unknown               Unknown  Unknown
35: acme.exe (deleted  0000000003DF9B1B  Unknown               Unknown  Unknown
35: acme.exe           0000000003DA1634  Unknown               Unknown  Unknown
35: acme.exe (deleted  0000000003DA1446  Unknown               Unknown  Unknown
35: acme.exe           0000000003D1FE86  Unknown               Unknown  Unknown
35: acme.exe (deleted  0000000003D2BA13  Unknown               Unknown  Unknown
35: acme.exe           00000000039D8D20  Unknown               Unknown  Unknown
35: acme.exe (deleted  00000000010F3451  Unknown               Unknown  Unknown
35: acme.exe           00000000010F9CC2  Unknown               Unknown  Unknown
35: acme.exe (deleted  0000000000E4971D  Unknown               Unknown  Unknown
35: acme.exe           00000000004FB6D9  cam_comp_mp_cam_r         240  cam_comp.F90
35: acme.exe           00000000004EAC4F  atm_comp_mct_mp_a         341  atm_comp_mct.F90
35: acme.exe           000000000042705E  component_mod_mp_         227  component_mod.F90
35: acme.exe           000000000041C577  cesm_comp_mod_mp_        1926  cesm_comp_mod.F90
35: acme.exe           000000000042408D  MAIN__                     62  cesm_driver.F90
35: acme.exe (deleted  000000000040A8DE  Unknown               Unknown  Unknown
35: acme.exe           0000000003E1B7F0  Unknown               Unknown  Unknown

This is with a master a few days ago, but still with cime5

@rljacob
Copy link
Member

rljacob commented Jan 6, 2017

Skybridge has Intel 15.0.1 and Mark reports it passes. Titan has 15.0.2 and Pat reports a fail. Someone should try Blues (15.0.0).

@jayeshkrishna
Copy link
Contributor

I will try ERP_Ld3_P8x4.ne4_ne4.FC5AV1C on blues

@wlin7
Copy link
Contributor

wlin7 commented Jan 6, 2017

Noel, I saw this "array index out of bounds" during initialization on cori-knl when doing ne120 simulations, after successful runs using the same executable. Rerun was ok, so no further effort to dig into it. Don't know if it could be related to the current issue. You may also set "cosp_lite=.true." in user_nl_cam to reduce memory usage if the model is built with COSP.

@jayeshkrishna
Copy link
Contributor

I ran ERP_Ld3_P8x4.ne4_ne4.FC5AV1C on blues with intel 15.0 (the default with --compiler=intel) and it crashed for me. Can someone else try it on blues?

@rljacob
Copy link
Member

rljacob commented Jan 6, 2017

Note: There's a new test type in cime5 called "REP" which will simply do 2 identical runs and compare results. Try REP_Ld1_P8x4.ne4_ne4.FC5AV1C.

@wlin7
Copy link
Contributor

wlin7 commented Jan 6, 2017

@jayeshkrishna , for the blues. are you getting error message like

HYD_pmcd_pmip_control_cmd_cb (pm/pmiserv/pmip_cb.c:912): assert (!closed) failed
HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:0@b511] main (pm/pmiserv/pmip.c:206): demux engine error waiting for event

I have been seeing such failure lately, a lot, with the model build previously working fine.

@jayeshkrishna
Copy link
Contributor

The error message above (HYD_pmcd_... ) can occur whenever a process crashes and the process manager is unable to clean up resources (close sockets etc) .
I have seen issues mentioned in #1206 on blues though.

@ndkeen
Copy link
Contributor

ndkeen commented Jan 6, 2017

I re-ran on cori, and got the same problem (array index out of bounds). I too have seen this and I think sometimes it was related to using too much memory on the node. This case was asking for 64 MPI's on 1 node, so I just submitted asking for 64 MPI's across 4 nodes -- this also failed, different error, but not helpful. All I can tell is it happened before the previous one. I have run other tests similar to this on cori without a problem. I will try a debug run.

OK, with DEBUG=TRUE run, the simulation is progressing. I have actually seen this behaviour before with a ne120 F compset on Cori -- where it runs DEBUG, but not optimized, however the error message is different.

For the DEBUG run, it ran for 13 ATM steps. I submitted again and after 6 steps, all values are identical. At what point might it be not equal? Ah, I did not change env_mach_pes to use more than 1 thread. I am still trying to debug why it stops for me with 1 thread, but will also run with more.

@singhbalwinder
Copy link
Contributor Author

@worleyph : Are you using the same PE- layout for all your tests?

@worleyph
Copy link
Contributor

@singhbalwinder , yes. See

 /autofs/nccs-svm1_home1/worley/ACPI/SVN/ACME/master/DEBUG2/ACME/cime/scripts/FC5AV1C-04P_ne30_ne30_intel_workcheck

Two threads should be enough, but @wlin7 started with 4 threads, and I have been sticking with that.

@worleyph
Copy link
Contributor

Did a diff of the two directories. In cam/physics, the only differences are in

 ACME/components/cam/src/physics/cam/clubb_intr.F90
 ACME/components/cam/src/physics/cam/micro_mg2_0.F90
 ACME/components/cam/src/physics/cam/micro_mg_cam.F90
 ACME/components/cam/src/physics/cam/micro_mg_utils.F90
 ACME/components/cam/src/physics/cam/phys_control.F90

Looks like it is supposed to be BFB:

 logical           :: mg_prc_coeff_fix     = .false.
 logical           :: rrtmg_temp_fix       = .false.

but perhaps something snuck in anyway? @singhbalwinder , which PR brought in these fixes?

@worleyph
Copy link
Contributor

@singhbalwinder , the only PR between #1127 (worked) and #1147 (failed) is #1128, which should be irrelevant. Looks to be part of #1147, and all of the source code changes are listed in the previous comment (cam/physics). I'll test #1128, just to be sure.

@worleyph
Copy link
Contributor

PR #1128 is also BFB by this test (rerunning the same problem twice). So the problem first shows up in #1147 .

@worleyph
Copy link
Contributor

worleyph commented Jan 15, 2017

@singhbalwinder , I commented out the differences between PR #1128 and PR #1147 in three files:

 clubb_intr.F90
 micro_mg_cam.F90
 micro_mg_utils.F90

and recovered BFB behavior. Looking at these changes, nothing looks wrong to me, so probably a compiler issue? The micro_mg code makes heavy use of pure and elemental functions and subroutines. I tried removing elemental from one routine in the above, and it made no difference. I tried removing more, and it would require removing LOTS more and changing many of the calls as well, so did not pursue this further.
I'll keep poking, and see if changing only one or two of these files is sufficient.

@worleyph
Copy link
Contributor

Backing out the changes to micro_mg_cam.F90 was sufficient (in this one test):

 !pw         cldfsnow_logic = ( cldfsnow(i,k) .lt. 1.e-4_r8 )
 !pw         if (rrtmg_temp_fix) then
 !pw            cldfsnow_logic = ( cldfsnow(i,k) .le. 1.e-4_r8 )
 !pw         endif
 !pw         if( cldfsnow_logic .and. ( qsout(i,k) .gt. 1.e-6_r8 )) then
 !pw++
          if( ( cldfsnow(i,k) .lt. 1.e-4_r8 ) .and. ( qsout(i,k) .gt. 1.e-6_r8 ) ) then
 !pw--

@mt5555
Copy link
Contributor

mt5555 commented Jan 16, 2017 via email

@gold2718
Copy link

@mt5555 , Yes, this sort of variable initialization in a subroutine or function sets the SAVE attribute. This practice is not allowed in CAM (at least not on purpose, a few may have slipped through code review), could it be added to the ACME coding standards?

@worleyph
Copy link
Contributor

I'll let someone else determine the fix.

@worleyph
Copy link
Contributor

Actually, there is no reason to initialize this variable at all. It is set before it is used.

I'll try this real quick.

@worleyph
Copy link
Contributor

worleyph commented Jan 16, 2017

@mt5555

 (edit: I see above your original message already gave the workaround).

No - this was just restoring the original code, so not supporting the rrtmg_temp_fix option. Your fix was relevant, but initializing to .false. is not necessary. The use of the variable cldfsnow_logic is not really necessary either. The following should be fine as well, I think.

     if (rrtmg_temp_fix) then
        if( ( cldfsnow(i,k) .le. 1.e-4_r8 ) .and. ( qsout(i,k) .gt. 1.e-6_r8 )) then
           cldfsnow(i,k) = 0.25_r8
        endif
     else
        if( ( cldfsnow(i,k) .lt. 1.e-4_r8 ) .and. ( qsout(i,k) .gt. 1.e-6_r8 )) then
           cldfsnow(i,k) = 0.25_r8
        endif
     endif

and this will all go away when the rrtmg_temp_fix is adopted, I assume. I'd vote for this latter solution, just to eliminate the cldfsnow_logic variable, but I don't really have a vote here.

@worleyph
Copy link
Contributor

Both "fixes" worked (using my one test).

@mt5555
Copy link
Contributor

mt5555 commented Jan 16, 2017

@gold2718 : regarding the compiler forcing the 'SAVE' attribute: is that required by the standard or up to the compiler? Perhaps the PGI compiler keeping it a stack variable and adding initialization code so it works correctly?

@singhbalwinder
Copy link
Contributor Author

Thanks a lot @worleyph ! This is great!

These were some temporary codes which are meant to be removed once we settle down on a configuration. I am familiar with this 'SAVE' attribute being assigned to all variable initialized at module level. I always assumed that it is in the standard rather than specific to a compiler. I am not sure why other compilers (PGI and XLF) are not complaining about it. Intel compiler on several machines like Eos, Skybridge etc. was also fine. On Eos, Intel compiler started showing non-BFB answers after CIME5 merge (not sure why).

It would have been okay (although still a bad idea) if this logical were to initialize at the beginning and stayed the same for the entire simulation. This particular logical is based on cldfsnow(i,k) value and should be private for each thread.

I will follow @worleyph suggestion and fix the code.

@gold2718
Copy link

The Fortran standard specifies that all variables initialized in a routine (function or subroutine) inherit the SAVE attribute. Module-level variables are a bit more confusing since they were required to have an explicit SAVE attribute until the 2008 standard where the SAVE attribute became the default. I'm still using explicit SAVE statements at module level since I never know which standard a particular compiler is using.

As for PGI and XLF, they have routinely ignored this sort of fine point in the standards. I still have #ifdef CPRPGI lines in the code to do things in a slower, broken way to work around places where PGI does not implement Fortran correctly.

BTW, another reason to avoid SAVE variables in routines is that in many cases, it hurts performance.

singhbalwinder added a commit that referenced this issue Jan 16, 2017
This PR addresses an issue which makes the model non-deterministic
(i.e. non-BFB) when run with more than one thread. PR #1147 introduced
a logical variable (cldfsnow_logic) which was declared and assigned at
module level. This kind of declaration automatically sets a variable
with 'SAVE' attribute which in turn makes the variable a shared variable
(to be shared by all the threads). This PR removes this variables and
retain the same functionality.

Fixes #1203

[BFB] - Bit-For-Bit
singhbalwinder added a commit that referenced this issue Jan 17, 2017
This PR addresses an issue which makes the model non-deterministic
(i.e. non-BFB) when run with more than one thread. PR #1147 introduced
a logical variable (cldfsnow_logic) which was declared and assigned at
module level. This kind of declaration automatically sets a variable
with 'SAVE' attribute which in turn makes the variable a shared variable
(to be shared by all the threads). This PR removes this variables and
retain the same functionality.

Fixes #1203

[BFB] - Bit-For-Bit
wlin7 added a commit that referenced this issue Jan 18, 2017
Merge branch 'singhbalwinder/atm/fix-non-BFB-threading-runs' into next

This PR addresses an issue which makes the model non-deterministic
(i.e. non-BFB) when run with more than one thread. PR #1147 introduced
a logical variable (cldfsnow_logic) which was declared and assigned at
module level. This kind of declaration automatically sets a variable
with 'SAVE' attribute which in turn makes the variable a shared variable
(to be shared by all the threads). This PR removes this variables and
retain the same functionality.

Fixes #1203

[BFB] - Bit-For-Bit
wlin7 added a commit that referenced this issue Jan 19, 2017
Merge branch 'singhbalwinder/atm/fix-non-BFB-threading-runs' (PR #1219)

This PR addresses an issue which makes the model non-deterministic
(i.e. non-BFB) when run with more than one thread. PR #1147 introduced
a logical variable (cldfsnow_logic) which was declared and assigned at
module level. This kind of declaration automatically sets a variable
with 'SAVE' attribute which in turn makes the variable a shared variable
(to be shared by all the threads). This PR removes this variables and
retain the same functionality.

Fixes #1203

[BFB] - Bit-For-Bit
rljacob pushed a commit that referenced this issue Feb 27, 2017
This PR addresses an issue which makes the model non-deterministic
(i.e. non-BFB) when run with more than one thread. PR #1147 introduced
a logical variable (cldfsnow_logic) which was declared and assigned at
module level. This kind of declaration automatically sets a variable
with 'SAVE' attribute which in turn makes the variable a shared variable
(to be shared by all the threads). This PR removes this variables and
retain the same functionality.

Fixes #1203

[BFB] - Bit-For-Bit
rljacob pushed a commit that referenced this issue Feb 27, 2017
Merge branch 'singhbalwinder/atm/fix-non-BFB-threading-runs' (PR #1219)

This PR addresses an issue which makes the model non-deterministic
(i.e. non-BFB) when run with more than one thread. PR #1147 introduced
a logical variable (cldfsnow_logic) which was declared and assigned at
module level. This kind of declaration automatically sets a variable
with 'SAVE' attribute which in turn makes the variable a shared variable
(to be shared by all the threads). This PR removes this variables and
retain the same functionality.

Fixes #1203

[BFB] - Bit-For-Bit
agsalin pushed a commit that referenced this issue Apr 13, 2017
Fix bug in ERP namelist comparison.

The following pattern was broken...
% ./create_test -o -g ERP.f45_g37_rx1.A.melvin_gnu -b jimtest
% ./create_test -c -n ERP.f45_g37_rx1.A.melvin_gnu -b jimtest

In the first command, the cmpgen_namelists used to occur during the RUN
phase and therefore occurs after the ERP build phase which impacts
the contents of the namelist files. In the second command, the cmpgen_namelists
occurs after the setup phase and therefore will not match the first.

With this change, we have essentially re-introduced all the namelist
creations that I tried to remove a few months ago in order to
improve performance. The SETUP phase is back up to 15 seconds on melvin.

Test suite: scripts_regression_tests
Test baseline:
Test namelist changes:
Test status: bit for bit

Fixes #1102

User interface changes?: Namelists in baselines are relative to post-SETUP phase

Code review: @jedwards4b
rljacob pushed a commit that referenced this issue Apr 16, 2021
This PR addresses an issue which makes the model non-deterministic
(i.e. non-BFB) when run with more than one thread. PR #1147 introduced
a logical variable (cldfsnow_logic) which was declared and assigned at
module level. This kind of declaration automatically sets a variable
with 'SAVE' attribute which in turn makes the variable a shared variable
(to be shared by all the threads). This PR removes this variables and
retain the same functionality.

Fixes #1203

[BFB] - Bit-For-Bit
rljacob pushed a commit that referenced this issue Apr 16, 2021
Merge branch 'singhbalwinder/atm/fix-non-BFB-threading-runs' (PR #1219)

This PR addresses an issue which makes the model non-deterministic
(i.e. non-BFB) when run with more than one thread. PR #1147 introduced
a logical variable (cldfsnow_logic) which was declared and assigned at
module level. This kind of declaration automatically sets a variable
with 'SAVE' attribute which in turn makes the variable a shared variable
(to be shared by all the threads). This PR removes this variables and
retain the same functionality.

Fixes #1203

[BFB] - Bit-For-Bit
rljacob pushed a commit that referenced this issue May 6, 2021
This PR addresses an issue which makes the model non-deterministic
(i.e. non-BFB) when run with more than one thread. PR #1147 introduced
a logical variable (cldfsnow_logic) which was declared and assigned at
module level. This kind of declaration automatically sets a variable
with 'SAVE' attribute which in turn makes the variable a shared variable
(to be shared by all the threads). This PR removes this variables and
retain the same functionality.

Fixes #1203

[BFB] - Bit-For-Bit
rljacob pushed a commit that referenced this issue May 6, 2021
Merge branch 'singhbalwinder/atm/fix-non-BFB-threading-runs' (PR #1219)

This PR addresses an issue which makes the model non-deterministic
(i.e. non-BFB) when run with more than one thread. PR #1147 introduced
a logical variable (cldfsnow_logic) which was declared and assigned at
module level. This kind of declaration automatically sets a variable
with 'SAVE' attribute which in turn makes the variable a shared variable
(to be shared by all the threads). This PR removes this variables and
retain the same functionality.

Fixes #1203

[BFB] - Bit-For-Bit
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants