
Broken pipe error when processing top boundary conditions #9

Open
Gabrielsvc opened this issue Apr 4, 2022 · 4 comments
Labels
bug Something isn't working

Comments

@Gabrielsvc

Processing top boundary conditions...
0%| | 0/5 [00:00<?, ?it/s]
Killed
(wrf4palm) isi-user@eolic-support:~/Repos/WRF4PALM$ Process ForkPoolWorker-121:
Traceback (most recent call last):
File "/home/isi-user/miniconda3/envs/wrf4palm/lib/python3.9/site-packages/multiprocess/pool.py", line 131, in worker
put((job, i, result))
File "/home/isi-user/miniconda3/envs/wrf4palm/lib/python3.9/site-packages/multiprocess/queues.py", line 381, in put
self._writer.send_bytes(obj)
File "/home/isi-user/miniconda3/envs/wrf4palm/lib/python3.9/site-packages/multiprocess/connection.py", line 208, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/home/isi-user/miniconda3/envs/wrf4palm/lib/python3.9/site-packages/multiprocess/connection.py", line 413, in _send_bytes
self._send(buf)
File "/home/isi-user/miniconda3/envs/wrf4palm/lib/python3.9/site-packages/multiprocess/connection.py", line 376, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

This happened with both max_pool=4 and max_pool=12, using the wrf4palm environment set up with conda. I will look further into why this is happening. Our namelist is attached to the issue.
namelist_wrf4palm.txt
And here is the wrfout file, which contains the first day of January with 48 timesteps, one every 30 minutes.
Drive link for wrfout file

@dongqi-DQ dongqi-DQ added the bug Something isn't working label Apr 4, 2022
@Gabrielsvc
Author

Huh, running on a more powerful machine got me one more iteration before the process was killed (2 out of 5, instead of 0 or 1 out of 5 iterations). Maybe it's breaking due to insufficient hardware?

@Gabrielsvc
Author

Found the culprit. Memory was blowing up during this step, as shown by running htop alongside run_config_wrf4palm.py. The extra iteration was only reached thanks to the extra 8 GB of RAM on the new machine. I've increased the swap size from 4 GB to 18 GB.

Maybe add a check at some point in the code so that future users will know what caused the problem?
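For example, a minimal sketch of such a check, assuming psutil is available in the environment (the function name, threshold estimate, and message below are illustrative, not existing WRF4PALM code):

```python
# Hypothetical pre-flight memory check (illustrative only, not WRF4PALM code).
# Warns the user before a memory-heavy step if available RAM looks too small.
import psutil


def check_available_memory(required_bytes, label="top boundary processing"):
    """Warn if the estimated memory for a step exceeds what is currently free."""
    available = psutil.virtual_memory().available
    if available < required_bytes:
        print(
            f"WARNING: {label} may need ~{required_bytes / 1e9:.1f} GB of RAM, "
            f"but only {available / 1e9:.1f} GB is available. The process may be "
            "killed by the OOM killer; consider adding swap or reducing max_pool."
        )
        return False
    return True


# Example usage: estimate the size of the array about to be built and check it.
# estimated_bytes = n_timesteps * nz * ny * nx * 4  # float32
# check_available_memory(estimated_bytes)
```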

@dongqi-DQ
Owner

The RAM usage has long been an issue... I will try to figure out how to optimise this further and will add more info to the code and the documentation.

@dongqi-DQ
Owner

I've modified the code for top boundary processing (see commit d6fc6e2). Previously, the code loaded the entire dataset, which could be very large, into RAM. Testing on my side shows a 40% drop in RAM usage. When very fine grid spacing is used, the problem might still exist, so I will leave this issue open for now and see if I can figure out how to optimise this further.
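Roughly, the idea is to avoid pulling the full wrfout variables into memory at once and instead work through them one timestep at a time. A minimal sketch of that pattern with xarray (not the actual change in d6fc6e2; the file name and variable name are placeholders, and lazy chunking requires dask to be installed):

```python
# Illustrative only: process the top boundary timestep by timestep instead of
# loading the whole dataset into RAM. Names are placeholders, not WRF4PALM code.
import numpy as np
import xarray as xr

# Open lazily and chunk along the time dimension so only one step is read at a time.
ds = xr.open_dataset("wrfout_d01_2022-01-01.nc", chunks={"Time": 1})

w = ds["W"]  # vertical velocity, dims (Time, bottom_top_stag, south_north, west_east)
n_time = ds.sizes["Time"]
out = np.empty((n_time,) + w.shape[2:], dtype=np.float32)

for t in range(n_time):
    # .isel(...).values materialises only this single timestep at the model top,
    # keeping peak memory roughly 1/n_time of the eager approach.
    out[t] = w.isel(Time=t, bottom_top_stag=-1).values

ds.close()
```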
