Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default cache directory should respect XDG Base Directory Specification. #1456

Closed
gerritholl opened this issue Nov 26, 2020 · 5 comments · Fixed by #1887
Closed

Default cache directory should respect XDG Base Directory Specification. #1456

gerritholl opened this issue Nov 26, 2020 · 5 comments · Fixed by #1887

Comments

@gerritholl
Copy link
Member

Feature Request

Is your feature request related to a problem? Please describe.

As a default cache directory, satpy uses '.':

satpy/satpy/resample.py

Lines 426 to 432 in a18b6e2

def _create_cache_filename(self, cache_dir=None, prefix='',
fmt='.zarr', **kwargs):
"""Create filename for the cached resampling parameters."""
cache_dir = cache_dir or '.'
hash_str = self.get_hash(**kwargs)
return os.path.join(cache_dir, prefix + hash_str + fmt)

This is potentially problematic: the current directory may be unwriteable or otherwise a poor choice to create files.

Describe the solution you'd like

I would like for satpy to respect the XDG Base Directory Specification and use the $XDG_CACHE_HOME directory (probably $XDG_CACHE_HOME/satpy) as a cache directory if none is specified.

Describe any changes to existing user workflow

Users who have $XDG_CACHE_HOME set will find cache files where they previously didn't.

Simply taking $XDG_CACHE_HOME will not introduce additional dependencies. Satpy could consider to use appdirs.user_cache_dir("satpy") using the appdirs package. This solution would add appdirs as a satpy dependency.

Additional context

I can pass cache_dir=appdirs.user_cache_dir("satpy") explicitly to .resample(...)

@djhoese
Copy link
Member

djhoese commented Nov 26, 2020

Pyspectral is already using appdirs and if we are going to follow any specification it needs to work/apply to all three of the major platforms we support (linux, mac, windows) so I vote for appdirs if we have to.

Side question: Satpy also defaults to putting output files (from Scene.save_dataset/s) in the current directory. Do you see us needing to follow a different standard practice/specification for this as well?

@gerritholl
Copy link
Member Author

I'm not aware of an existing standard for where to put output files. I personally have a small utility

def plotdir(basedir=None, create=False):
    pd = (pathlib.Path(
            (basedir or
             os.environ.get("PLOT_BASEDIR") or
             "/media/nas/x21308/plots_and_maps/")) /
          datetime.datetime.now().strftime("%Y/%m/%d"))
    if create:
        pd.mkdir(parents=True, exist_ok=True)
    return pd

that I pass when I'm creating plots or maps, but that's a bit too specific for my personal taste (in particular the sorting into the current date) to suit as a generic default. For logfiles I deliberately don't use appdirs.user_log_dir() because I don't agree that logfiles should be in the cache directory (I want my logfiles to persist).

@djhoese
Copy link
Member

djhoese commented Nov 27, 2020

Although appdirs isn't necessarily a standard, it at least tries to use semi-standard directories for things. I agree that the log thing is a little odd (issue on their github?).

For output files, I think the current directory is "good enough" for now and pretty expected by most people.

@djhoese
Copy link
Member

djhoese commented Dec 4, 2020

I realized this today, Satpy doesn't cache anything by default does it? All of the uses of cache_dir in resampling are triggered by the user passing an actual cache_dir. So the real issue here is that that _create_cache_filename shouldn't have cache_dir be a keyword argument at all. It should be a required positional argument.

@pnuu @mraspaud Am I missing a use case of this function?

@pnuu
Copy link
Member

pnuu commented Jan 7, 2021

Nothing is cached by default in satpy.resample. And indeed, cache_dir should be a normal argument in _create_cache_filename, and the cache_dir = cache_dir or '.' line removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants