
Xarray Dataset load memory issue #2

Open
ricardo88faria opened this issue Nov 2, 2021 · 7 comments
@ricardo88faria
Collaborator

ricardo88faria commented Nov 2, 2021

I'm running a big domain and at this line:

ds_interp["Z"] = ds_interp["Z"].load()

the code crashes with the following error output:
Reading WRF
cfg file is saved: madeira_30m_offline
Start horizontal interpolation
Calculating soil temperature and moisture from WRF
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 952/952 [17:30<00:00, 1.10s/it]
Start vertical interpolation
[1] 9349 killed python3.9 run_config_wrf4palm.py
/Users/ricardofaria/miniconda3/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

Apparently loading it into memory is not the best approach for big domains?

Edit:
I was tracking memory usage and it reached 60 GB of RAM plus swap.
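Not speaking for WRF4PALM's internals, but one generic way to cap the peak RAM is to avoid a whole-array `.load()` and process one timestamp at a time. A minimal numpy sketch, where the doubling lambda is just a stand-in for the real interpolation:

```python
import numpy as np

def per_timestep(data, fn):
    """Apply fn to one time slice at a time so only a single
    slice (plus its result) is ever resident in memory."""
    return np.stack([fn(data[t]) for t in range(data.shape[0])])

# toy (time, y, x) array standing in for a WRF field
arr = np.arange(24, dtype=np.float32).reshape(4, 3, 2)
out = per_timestep(arr, lambda s: s * 2)
print(out.shape)  # (4, 3, 2)
```

With per-slice processing the peak footprint scales with one timestamp, not the full time series.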

@dongqi-DQ
Owner

Thanks for reporting this. This line is redundant and I have removed it:

ds_interp["Z"] = ds_interp["Z"].load()

Can you provide your domain configuration? I will try to run WRF4PALM with the same domain size and resolution and see where I can reduce the RAM usage.

The multiprocessing package I used requires the data to be loaded into RAM; otherwise, Python will return NaNs. The same applies when writing netCDF files. I will try cropping the WRF domain vertically to see whether that alleviates the RAM issue. Worst case scenario, more nesting inside a parent domain with coarse grid spacing may be the only solution.
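As a rough illustration of the vertical-cropping idea in plain numpy (the level heights and PALM domain top below are made-up numbers; the real code would crop the WRF dataset before anything is loaded):

```python
import numpy as np

z = np.linspace(0.0, 20000.0, 50)              # assumed model-level heights (m)
field = np.zeros((10, 50, 8, 8), np.float32)   # (time, level, y, x) stand-in
palm_top = 3000.0                              # assumed PALM domain top + margin

keep = z <= palm_top
cropped = field[:, keep]                       # drop levels above the PALM domain
print(cropped.shape[1], "of", field.shape[1], "levels kept")
```

Discarding levels well above the PALM domain top shrinks every later array (and every `.load()`) by the same factor.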

@ricardo88faria
Collaborator Author

The lines after that one which use the .load() function also crash; it's because of excessive RAM usage.
Here is the link with the files of the case I'm trying to run: http://oomdata.arditi.pt:8080/thredds/catalog/oompalm01/WRF4PALM_test/catalog.html

@dongqi-DQ
Owner

> The lines after that use .load() function also crash, it's because of excessive ram usage. Here is the link with the files of the case I'm trying to run: http://oomdata.arditi.pt:8080/thredds/catalog/oompalm01/WRF4PALM_test/catalog.html

Thanks for the files. The domain is big indeed. Interpolating 1 km WRF data onto a 30 m PALM grid with a great number of grid points can certainly cost a lot of RAM. Based on your namelist, WRF4PALM will also need to interpolate 120 timestamps (12 hours with a frequency of 10 minutes), which can also significantly increase RAM usage.

I will look into how I can further reduce the RAM usage, but I cannot guarantee how much optimization is possible or when a new update will be ready.

That said, I recommend not updating the boundary conditions every 10 minutes in offline nesting, because this can hinder PALM's own features; the simulation results will look more like a high-resolution WRF run. The optimal boundary condition update frequency varies case by case, and it may be good to give PALM some relaxation so that its own land surface and canopy models can show some impact.

@xanfus
Contributor

xanfus commented Mar 21, 2022

Why is .load() slow, in your opinion, compared to the other stages? I interpolate some 9 WRF nodes in the XY plane and 15 along the Z axis onto 32x32x32 LES nodes; .load() takes 5-10 minutes and is a single-core operation.

@dongqi-DQ
Owner

> Why .load() is slow in your opinion, compared to other stages? I interpolate some 9 WRF nodes in XY plane and 15 in Z axis over 32x32x32 LES nodes. .load() takes 5-10 minutes and is single core operation.

Just my first guess: xarray doesn't read the data into memory or perform any computation until .load() is called. So when .load() executes, a lot of time could be spent on the deferred calculations. Also, I think one of xarray's advantages is parallel computing with dask, so a single-core operation may slow xarray down. But by "single core operation" do you mean the multiprocessing functions rather than xarray itself?
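The deferred-execution behavior being guessed at here can be mimicked in plain Python. This is only an illustration of lazy evaluation in general, not xarray's actual machinery:

```python
class Lazy:
    """Minimal stand-in for deferred computation: operations are
    recorded, and nothing runs until .load() is called."""
    def __init__(self, fn):
        self._fn = fn
    def map(self, op):
        return Lazy(lambda: op(self._fn()))
    def load(self):
        return self._fn()  # all queued work happens here, at once

calls = []
src = Lazy(lambda: calls.append("read") or [1, 2, 3])
pipeline = src.map(lambda xs: [x * 2 for x in xs])
assert calls == []          # nothing has been read yet
result = pipeline.load()    # the read and the math both happen only now
```

This is why a `.load()` line can look like the slow step in a profile: it pays for all the computation queued before it.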

@xanfus
Contributor

xanfus commented Mar 21, 2022

> "single core operation"

Only one thread is busy approximately 90% of the time according to htop while the load method executes. So far that suits me.

@xanfus
Contributor

xanfus commented Mar 22, 2022

> Just my first guess that xarray doesn't read the data into memory and doesn't do any computation until the data are loaded into RAM when .load() is used. So when .load() is executed, a lot of computation time could be used for the calculations.

Paraview works the same way: filters are first attached to the data, and only then is execution launched.
I wouldn't call the interpolation computationally greedy. Also, just to acquire some 32×32×32×100×5×4 B ≈ 65 MB of data, 20+ GB are loaded into RAM. I have to see how the PALM dimensions are applied to the original dataset.
