Xarray Dataset load memory issue #2
Thanks for reporting this. That line is redundant and I have removed it (WRF4PALM/run_config_wrf4palm.py, line 340 in 2cf661f).
Can you provide your domain configuration? I will try to run WRF4PALM with the same domain size and resolution and see where I can reduce the RAM usage. The multiprocessing package I use needs the data to be loaded into RAM; otherwise, Python returns NaNs. The same applies when writing to netCDF files. I will try to crop the WRF domain vertically and see whether that helps with the RAM issue. Worst case, more nesting inside a parent domain with coarse grid spacing may be the only solution.
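The vertical cropping idea mentioned above can be sketched with `Dataset.isel`. This is only an illustration: the dimension name `bottom_top` follows WRF convention, the 30-level cutoff is an arbitrary assumption, and the synthetic dataset stands in for a real WRF output file.

```python
import numpy as np
import xarray as xr

def crop_vertical(ds, max_level=30):
    """Keep only the lowest `max_level` model levels along `bottom_top`."""
    return ds.isel(bottom_top=slice(0, max_level))

# Synthetic stand-in for a WRF output file: 50 levels, small horizontal grid.
wrf = xr.Dataset({"T": (("bottom_top", "y", "x"), np.zeros((50, 4, 4)))})
cropped = crop_vertical(wrf)  # 20 of 50 levels dropped before any .load()
```

Because `isel` produces a lazy view, the discarded levels are never read into RAM when the cropped dataset is later loaded.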
The lines after that which use the .load() function also crash; it's because of excessive RAM usage.
Thanks for the files. The domain is indeed big. Interpolating 1 km WRF data onto a 30 m PALM grid with that many grid points can certainly cost a lot of RAM. Based on your namelist, WRF4PALM also needs to interpolate 120 timestamps (12 hours at a 10-minute frequency), which can significantly increase the RAM usage as well. I will look into how I can further reduce the RAM usage, but I cannot guarantee how much optimization is possible or when an update will be done. That said, I recommend not updating the boundary conditions every 10 minutes in offline nesting, because this can hinder PALM's own features; the simulation results will look more like high-resolution WRF. The optimal boundary-condition update frequency varies case by case, and it may be good to give PALM some relaxation so that its own land-surface and canopy models can show some impact.
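Thinning the boundary-condition update frequency, as suggested above, also shrinks the number of timestamps that must be interpolated and held in RAM. A minimal sketch, assuming the WRF-style `Time` dimension name and using synthetic data in place of real output:

```python
import numpy as np
import xarray as xr

# 120 timestamps at 10-minute intervals, as in the namelist discussed above.
ds = xr.Dataset({"U": (("Time", "z"), np.random.rand(120, 10))})

# Taking every 3rd timestamp turns 10-minute updates into 30-minute updates
# and cuts the interpolation workload (and peak RAM) roughly threefold.
ds_30min = ds.isel(Time=slice(None, None, 3))
```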
Why is .load() slow compared to the other stages, in your opinion? I interpolate some 9 WRF nodes in the XY plane and 15 along the Z axis onto 32x32x32 LES nodes. .load() takes 5-10 minutes and is a single-core operation.
Just my first guess: xarray doesn't read the data into memory and doesn't do any computation until the data are loaded into RAM when `.load()` is called.
Only one thread is busy approximately 90% of the time, according to htop, while the load method executes. So far that suits me.
ParaView works the same way: first the filters are set up for application to the data, then their application is launched.
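The deferred-execution behaviour discussed above can be demonstrated in a few lines. This is a minimal sketch with a throwaway file name: `open_dataset` reads only metadata, and the actual disk read is postponed until `.load()` forces the values into memory, which is why `.load()` can appear to absorb nearly all of the runtime in a single thread.

```python
import numpy as np
import xarray as xr

# Write a tiny netCDF file to stand in for real WRF output.
xr.Dataset({"T": (("z",), np.arange(5.0))}).to_netcdf("demo_lazy.nc")

ds = xr.open_dataset("demo_lazy.nc")  # cheap: metadata only, no array data read
loaded = ds["T"].load()               # the actual disk read happens here
```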
I'm running a big domain, and at this line (WRF4PALM/run_config_wrf4palm.py, line 340 in 2cf661f) the process is killed:

```
Reading WRF
cfg file is saved: madeira_30m_offline
Start horizontal interpolation
Calculating soil temperature and moisture from WRF
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 952/952 [17:30<00:00, 1.10s/it]
Start vertical interpolation
[1] 9349 killed python3.9 run_config_wrf4palm.py
/Users/ricardofaria/miniconda3/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
```
Apparently loading everything into memory is not the best approach for big domains?
Edit: I was tracking memory usage and it reached 60 GB of RAM plus swap.
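One lower-RAM alternative to a single big `.load()` is to process the time axis in slabs so that only one slab is resident at a time. This is a sketch under stated assumptions: the dimension names, the slab size of 8, and the `mean` stand-in for the real interpolation step are all illustrative, and synthetic data replaces real WRF output. (With dask installed, the same idea is expressed by opening the file with `chunks={...}` and keeping the pipeline lazy.)

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for WRF output: 120 timestamps, 40 vertical levels.
ds = xr.Dataset({"U": (("Time", "z"), np.random.rand(120, 40))})

out = []
for t0 in range(0, ds.sizes["Time"], 8):        # process 8 timestamps per slab
    slab = ds["U"].isel(Time=slice(t0, t0 + 8))  # lazy view, small when loaded
    out.append(slab.mean("z"))                   # stand-in for interpolation
result = xr.concat(out, dim="Time")              # reassemble the full series
```

Peak memory is then bounded by one slab plus the accumulated (much smaller) results, instead of the full interpolated array.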