Fix shared library error in tc_analysis #180
Comments
There appears to be a related concurrency/parallelism issue with E3SM Diags using TC Analysis: |
Got |
Also note that if you set
and |
Hey Ryan, I can also try troubleshooting. Can you provide information on how to reproduce the problem?
@chengzhuzhang Thanks! So the issue is that the error appears somewhat random. My concern is that I have forced How you can test/debug:
If you want the full list of steps I did, here they are:
I got an error parsing years:
@chengzhuzhang Can you point me to your config file?
Yes, /home/ac.zhang40/test_zppy/tc_analysis.cfg on Chrysalis.
Thanks, it says |
Hey Ryan, after correcting my |
Thanks, I will try a few more runs. My guess is the error occurs randomly... which makes debugging a challenge. Yes it writes intermediate files on |
I have run the following config file in four configurations: each combination of (running with zppy dev environment / running with E3SM Unified 1.6.0rc4 environment) x (one year-set / multiple year-sets). However, I could not replicate the error in any of the 4 configurations.
I think writing to |
@forsyth2, do you have a way of dumping out all of your environment variables ( |
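For comparing a failing run against a successful one, a minimal Python sketch along these lines could dump and diff the environment. The helper names `format_environment` and `diff_environments` are hypothetical and not part of zppy or e3sm_diags; variables such as `LD_LIBRARY_PATH` are only examples of what might be worth comparing.

```python
import os


def format_environment(env=None):
    """Return the environment as sorted 'KEY=value' lines.

    If env is None, the current process environment is used.
    """
    env = dict(os.environ) if env is None else env
    return [f"{key}={env[key]}" for key in sorted(env)]


def diff_environments(env_a, env_b):
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ.

    A key missing from one side is reported with None on that side.
    """
    differences = {}
    for key in sorted(set(env_a) | set(env_b)):
        value_a = env_a.get(key)
        value_b = env_b.get(key)
        if value_a != value_b:
            differences[key] = (value_a, value_b)
    return differences


if __name__ == "__main__":
    # Print the current environment, one variable per line.
    print("\n".join(format_environment()))
```

Saving the output of each run to a file and diffing the two files would serve the same purpose.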
Maybe we should consider closing this issue if it cannot be reproduced later?
Ok. I'll run a TC analysis test in parallel to be sure, and if there are no errors, I guess we can close this issue.
Thanks! If I understand correctly, there was a workaround to force tc_analysis tasks and e3sm_diags tasks to run sequentially due to |
Yeah, that seems reasonable. I'm also wondering if your changes in E3SM-Project/e3sm_diags#824 (notably the file name changes) may help as well (i.e., by not directing parallel processes to change the same data).
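One way to keep parallel processes from changing the same data is to give each year-set its own scratch directory. This is only a sketch of that general idea, not how e3sm_diags or zppy actually organize their intermediate files; the function name `make_scratch_dir` and the `tc_` prefix are hypothetical.

```python
import os
import tempfile


def make_scratch_dir(base_dir, year_set):
    """Create and return a unique scratch directory for one year-set.

    tempfile.mkdtemp guarantees a fresh directory on every call, so two
    tasks started for the same year-set still get separate directories.
    """
    os.makedirs(base_dir, exist_ok=True)
    return tempfile.mkdtemp(prefix=f"tc_{year_set}_", dir=base_dir)
```

Each task would then write its intermediate files under the directory it received, so no two tasks share a path.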
There is a
GenerateConnectivityFile: error while loading shared libraries: libnetcdf.so.11: cannot open shared object file: No such file or directory
error when multiple years_sets are run simultaneously.
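A quick way to check whether the dynamic loader can find the library from a given environment is to try loading it with `ctypes`. This is a minimal sketch: the helper name `can_load_shared_library` is hypothetical, and a successful load from Python only suggests (it does not prove) that `GenerateConnectivityFile` would resolve the library the same way, since the two processes may run with different environments.

```python
import ctypes


def can_load_shared_library(name):
    """Return True if the dynamic loader can open the named library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library cannot be found or opened.
        return False
    return True


if __name__ == "__main__":
    library = "libnetcdf.so.11"  # the name taken from the error message above
    status = "found" if can_load_shared_library(library) else "NOT found"
    print(f"{library}: {status}")
```

Running this inside each parallel task, just before the failing step, could help show whether the library disappears from the search path only intermittently.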