This repository has been archived by the owner on Nov 21, 2023. It is now read-only.

[Feature] Allow specifying alternative extraction / temporary directory #20

Closed
pdcastro opened this issue May 11, 2021 · 20 comments

Comments

@pdcastro

pdcastro commented May 11, 2021

The Readme file reads:

To Where Are the Packages Uncompressed at Runtime?
It depends on the operating system. You may find the location on your system with:
$ node -p "require(\"os\").tmpdir()"
Look for a directory named caxa in there.

Temporary directories are, well, temporary, and may get deleted when the user logs out or reboots the machine, depending on system configuration. I'd like to avoid that, to avoid the occasional performance hit, which would be bad UX for end users (and might be "mysterious" to them). My app has hundreds of megabytes in node_modules and the runtime extraction time is non-trivial (in excess of 10 seconds on a fairly fast machine with an SSD, possibly 30+ seconds on a slow machine).

  • A flag of the caxa command line could allow specifying an alternative folder (other than "require(\"os\").tmpdir()"). Say, --extract-dir /usr/local/lib/myApp.
  • In addition, a {{PREFIX}} string could be replaced at runtime with the value of the PREFIX environment variable, with a default value specified as {{PREFIX=/default/prefix}}.

Examples:

$ caxa --extract-dir "C:\Program Files\myApp" ...
$ caxa --extract-dir "/opt/myApp" ...
$ caxa --extract-dir "{{PREFIX=/usr/local}}/lib/myApp" ...

In the last example above, at runtime, if the PREFIX environment variable was defined and set to '/usr', then the app would be extracted to /usr/lib/myApp. If the PREFIX environment variable was not defined, or if it was defined and set to /usr/local, then the app would be extracted to /usr/local/lib/myApp.
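The placeholder substitution proposed above could be sketched roughly as follows. This is only an illustration of the idea: the `expandPlaceholders` function is hypothetical, and the `{{NAME=default}}` syntax is the one suggested in this issue, not something caxa implements.

```javascript
// Hypothetical helper: expands {{NAME}} and {{NAME=default}} placeholders
// in an extraction path using environment variables. Neither the syntax
// nor this function exists in caxa; both illustrate the proposal above.
function expandPlaceholders(template, env = process.env) {
  return template.replace(
    /\{\{([A-Za-z_][A-Za-z0-9_]*)(?:=([^}]*))?\}\}/g,
    (match, name, fallback) => {
      const value = env[name];
      if (value !== undefined) return value; // variable is defined: use it
      if (fallback !== undefined) return fallback; // otherwise use the default
      throw new Error(`Environment variable ${name} is not set and has no default`);
    }
  );
}

console.log(expandPlaceholders("{{PREFIX=/usr/local}}/lib/myApp", { PREFIX: "/usr" }));
// prints "/usr/lib/myApp"
console.log(expandPlaceholders("{{PREFIX=/usr/local}}/lib/myApp", {}));
// prints "/usr/local/lib/myApp"
```

Throwing when a placeholder has neither a value nor a default is a design choice of this sketch; a real implementation might prefer to fall back to the temporary directory instead.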

I appreciate that extracting to a folder like /usr/local would require superuser privileges / sudo. That's OK.


Rationale
I was thinking of using caxa as a sort of "installer" for my app. I would set the --command entrypoint to a short JavaScript module (part of the app) that would additionally create a symlink or file at, say, /usr/local/bin/myApp that would look like:

#!/usr/bin/env sh
# Forward any command-line arguments to the app.
exec /usr/local/lib/myApp/node_modules/.bin/node /usr/local/lib/myApp/bin/main.js "$@"

Or some variation of this. Then running myApp on a command prompt would execute /usr/local/bin/myApp, and in turn /usr/local/lib/myApp/node_modules/.bin/node .... One could even consider that caxa could provide a standard "installer mode" feature that would do just that, but this is probably for some other feature request. :-)
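A minimal sketch of what such an entrypoint module might do is shown here. The function names `launcherScript` and `installLauncher` are hypothetical, the directory layout follows the example paths in this comment, and the sketch assumes the install path contains no double quotes or other characters that are special inside a double-quoted shell string.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: builds the text of a POSIX launcher script for an
// app extracted to installDirectory. The layout (node_modules/.bin/node,
// bin/main.js) is taken from the example paths in the comment above.
function launcherScript(installDirectory) {
  const node = path.posix.join(installDirectory, "node_modules/.bin/node");
  const main = path.posix.join(installDirectory, "bin/main.js");
  // "$@" forwards the launcher's arguments to the app.
  return `#!/usr/bin/env sh\nexec "${node}" "${main}" "$@"\n`;
}

// Writes the launcher and marks it executable (rwxr-xr-x).
function installLauncher(installDirectory, launcherPath) {
  fs.mkdirSync(path.dirname(launcherPath), { recursive: true });
  fs.writeFileSync(launcherPath, launcherScript(installDirectory));
  fs.chmodSync(launcherPath, 0o755);
}
```

Calling `installLauncher("/usr/local/lib/myApp", "/usr/local/bin/myApp")` would need superuser privileges, as noted earlier for extracting under /usr/local.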

"If you just want caxa to extract / install the app, why not just use a traditional installer?" Well, that's what I'm currently doing on Windows (NSIS) and macOS (pkgbuild), but I am currently considering the options on Linux and every option I came across had some drawbacks, for example:

  • .AppImage (squashfs) depends on fuse which in turn requires a Docker container to run in privileged mode.
  • apt install myApp à la Spotify requires setting up repository keys and is not available on some Linux distributions.
  • pkg has the complications listed in caxa's Readme and the latest versions have some bugs related to fs promises and native node modules. I have used pkg for years but found myself spending a lot of time chasing bugs because of the modifications that pkg makes to Node.js.
  • Plain shell scripts tend to snowball into large files full of corner cases depending on whether curl or wget or bash or POSIX sh or apt or apk are available, "unless it's Arch Linux or Fedora or"... And I'd rather code in JavaScript, like the rest of the app.
  • Snap and Flatpak depend on daemons / systemd / d-bus ...

At which point I came across caxa and thought, "this might just do the trick!" Simple enough and universal enough, especially with the upcoming ARM support in issue #4. :-)

@leafac
Owner

leafac commented May 19, 2021

Idea given by @xemle:

When we pursue this, we might also give an option to uninstall previous versions.

@rnreekez
Contributor

rnreekez commented Jul 20, 2021

Currently running into this with an application that is intended to be a long running process. Our use case may be niche but I wouldn't be surprised if others run into this as well. From what I can tell the following is happening...

  1. User runs application, caxa successfully unpacks and the process is left in a good, running state.
  2. Temp cleanup occurs and, for reasons we can only speculate about, the application stops functioning properly. The current theory is that one of the tasks we run relies on reading a file from disk, and when that read fails the application ends up in an unstable state. Here's the interesting thing we're seeing: because the application is currently running, the node executable cannot be deleted, so it's only a partial cleanup. Since the node binary was not deleted, the directory structure stays in place but all JS files are removed. 😧
  3. The user notices that the application has stopped responding and attempts to kill and restart it. The app throws errors about missing modules due to the partial cleanup.

I'll likely make a temporary fork to resolve our issue by allowing the unpacking location to be specified, but I don't have a great solution for cleaning up previous releases, especially if we want to hold on to a couple of prior versions. I'll push up what I come up with; feel free to take it or leave it. Thanks again for the project!
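One possible shape for cleaning up previous releases while holding on to the most recent few is sketched here. It is not part of caxa or of the fork mentioned above; `pruneOldExtractions` is a hypothetical helper, and it assumes that every extracted version lives in its own subdirectory of a single root folder.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: keeps the `keep` most recently modified
// subdirectories of `root` and deletes the rest. Assumes each extracted
// version lives in its own subdirectory of `root`.
function pruneOldExtractions(root, keep) {
  const directories = fs
    .readdirSync(root, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => {
      const directory = path.join(root, entry.name);
      return { directory, modified: fs.statSync(directory).mtimeMs };
    })
    .sort((a, b) => b.modified - a.modified); // newest first

  const removed = [];
  for (const { directory } of directories.slice(keep)) {
    fs.rmSync(directory, { recursive: true, force: true });
    removed.push(directory);
  }
  return removed; // paths that were deleted
}
```

Deleting a version that is still running may fail or only partially succeed, which is the same situation described in step 2 above, so a real cleanup would probably need to skip the version currently in use.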

@leafac
Owner

leafac commented Aug 15, 2021

Thanks for using caxa and for the detailed report.

The root of the issue seems to be that the system is cleaning up the temporary directory in the middle of operation. I don’t think that’s standard behavior, is it? As far as I know, the temporary directory is only supposed to be cleaned up on a reboot. Did your users configure something different on their machines? Doesn’t this cause issues with other applications? I suppose caxa isn’t the only application that relies on the temporary directory that way.

That being said, I’m looking forward to seeing what you come up with. What do you have to show?

Another possible workaround is to create a folder at <temporary-directory>/caxa/locks/<your-application-identifier>/<number-that-matches-the-extraction-attempt-at-caxa/applications-folder>. This will force the next activation of caxa to think that the extraction is corrupt and extract again.
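As a rough sketch, the lock-folder workaround could be scripted from Node like this. The helper names are hypothetical, the identifier and attempt number must be looked up for the specific packaged binary (this sketch does not discover them), and it assumes that Node's `os.tmpdir()` resolves to the same temporary directory the stub uses.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Builds the lock path described above:
// <temporary-directory>/caxa/locks/<identifier>/<attempt>
function lockDirectory(identifier, attempt, temporaryDirectory = os.tmpdir()) {
  return path.join(temporaryDirectory, "caxa", "locks", identifier, String(attempt));
}

// Creates the lock folder so that, per the workaround above, the next
// start treats the matching extraction as corrupt and extracts again.
function forceReextraction(identifier, attempt, temporaryDirectory = os.tmpdir()) {
  const lock = lockDirectory(identifier, attempt, temporaryDirectory);
  fs.mkdirSync(lock, { recursive: true });
  return lock;
}
```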

@TheBotlyNoob

👍 Please Add This!

@dadiborn

I'll join that party! I also need an option to change the temp dir to a custom one. Please add it, thank you!

@rnreekez
Contributor

rnreekez commented Oct 21, 2021

Sorry for the delay!

As far as I know, the temporary directory is only supposed to be cleaned up on a reboot.

This is correct. In this particular instance, we have an end user who aggressively cleans their temp files as a matter of policy.

This isn't the most elegant solution, but I kludged this together in a forked repo because I needed to solve it short term for a project I'm working on. The same issues still apply: previous runs still need to be cleaned up, which is something I'm hoping to introduce at a later date.

At a high level, two new options are introduced: temporary-directory (default) and the inverse no-temporary-directory. If temporary-directory is used, then caxa works as usual. If the other option is set, then we use an alternate, permanent location.

I needed this to be as flexible as possible across platforms, so I tried to use something native to Go, which turned out to be os.UserCacheDir(). The location varies by platform, but Go provides a value for all of them. I opted not to let a custom directory be specified, because I would have to choose it at build time and it would then be used at runtime in the end user's environment, with an unpredictable filesystem and permission set. Relying on a particular directory seemed iffy.
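For readers more at home in Node than in Go, the per-platform rules that Go documents for os.UserCacheDir() can be approximated as below. This is a sketch and not the code from the fork: `userCacheDirectory` is a hypothetical name, Plan 9 is left out, and `platform` and `env` are parameters only so that each rule can be checked on any machine.

```javascript
const path = require("node:path");

// Approximates the rules documented for Go's os.UserCacheDir():
// %LocalAppData% on Windows, $HOME/Library/Caches on macOS, and
// $XDG_CACHE_HOME (or $HOME/.cache) on other Unix systems.
function userCacheDirectory(platform = process.platform, env = process.env) {
  if (platform === "win32") {
    if (!env.LOCALAPPDATA) throw new Error("%LocalAppData% is not defined");
    return env.LOCALAPPDATA;
  }
  if (platform === "darwin") {
    if (!env.HOME) throw new Error("$HOME is not defined");
    return path.posix.join(env.HOME, "Library", "Caches");
  }
  if (env.XDG_CACHE_HOME) return env.XDG_CACHE_HOME;
  if (!env.HOME) throw new Error("neither $XDG_CACHE_HOME nor $HOME is defined");
  return path.posix.join(env.HOME, ".cache");
}

console.log(userCacheDirectory("darwin", { HOME: "/Users/me" }));
// prints "/Users/me/Library/Caches"
console.log(userCacheDirectory("linux", { HOME: "/home/me" }));
// prints "/home/me/.cache"
```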

Honestly, I hadn't even thought about just extracting to a hidden folder next to the binary. That's a pretty good option too. Also, I just saw your suggested workaround. The issue I ran into is that the application does not start exhibiting these behaviors until it attempts to read a file from disk that has been purged by temp file cleanup. At that point the running application starts acting erratically, and the user has to stop and restart it to get caxa to fix itself. In this scenario, I should probably just read the file into memory and hold it, but that's the current state of the app right now.

Anyway here is the commit if anyone has feedback: rnreekez@3628edd

@leafac
Owner

leafac commented Oct 22, 2021

Hmmm, that’s actually an interesting idea: What if instead of using the temporary directory, we used the cache directory? (See https://github.com/LinusU/node-cachedir and https://pkg.go.dev/os#UserCacheDir) Do you think that would work better in all scenarios?

@pdcastro
Author

pdcastro commented Oct 22, 2021

What if instead of using the temporary directory, we used the cache directory?

You mean, always using the cache directory and never the temporary directory? Interesting.

Well, it might be good enough for some people, but it would not quite address what was originally requested in this GitHub issue, specifically the ability to specify a custom extraction folder, e.g. --extract-dir "/opt/myApp", which I consider a requirement in order to use caxa as a kind of application installer that would be a "serious alternative" to the likes of apt install myApp, AppImage, snap, flatpak, etc.

I think there are really two issues being discussed in this GitHub issue:

  • Some people (not me) are happy with "the caxa-powered executable being the app", and they just want to avoid the performance hit of the temporary folder being deleted. In this scenario, the OS cache dir is just a "less temporary" kind of temporary dir. If the OS cache dir were emptied (to free disk space), the only consequence would be a performance hit the next time the app is executed.

  • Other people (like me) wish to use the caxa-powered executable as an installer for the app (as I described in the original post). After the app is installed, the caxa-powered executable can be deleted.

Having in mind the latter scenario (this GitHub issue as originally described):

  • It would not be "good form" to "install" a 500MB app in an out-of-sight $HOME/.cache/ folder (much like a temporary folder). End users might think: Where did all my disk space go? Where was that app installed?
  • If the end user runs out of disk space and they google "disk space full", they might come across some advice on the web for them to empty their cache folder (much like temporary folders). This would effectively uninstall the app, unintentionally.
  • In my communication with the end users of my app (for support and debugging), I would like to be able to refer to a specific folder where the app was installed, e.g. "/opt/myApp" on Linux or "~/Applications/myApp" on macOS or "C:\Program Files\myApp" on Windows, rather than "the OS cache dir" (even if the cache dir was in a fairly predictable location).

Therefore, if rnreekez@3628edd was merged to caxa, or if temporary folders were simply replaced with the OS cache dir, I would not consider that this issue could be closed as "resolved". Not saying that those are bad ideas! Just that they do not fully solve this issue as originally described (note the "Rationale" section of the original post).

@leafac
Owner

leafac commented Nov 8, 2021

Oh, yeah. I agree with your points 100%. One feature doesn’t subsume the other.

@mikekoetter

+1 for the alternate extraction/tmp folder feature

@jhuckaby

I really need this feature in order to consider using caxa. My app absolutely must run from /opt/MYAPPNAME for a variety of reasons. It cannot run from a temporary directory. Consider this my huge +1 for this feature request.

I suppose I could "detect" that it is running from a temp directory and copy everything to the proper location on startup, but that feels like a hack. I would much rather avoid the temp directory entirely.
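The detection half of that hack might look something like the sketch below. Both function names are hypothetical, and the check compares paths as strings without resolving symlinks, so on systems where the temporary directory is reached through a symlink it may need `fs.realpathSync` on both sides first.

```javascript
const os = require("node:os");
const path = require("node:path");

// Returns true when `candidate` is located somewhere below `parent`.
// A path is not considered to be inside itself.
function isInsideDirectory(candidate, parent) {
  const relative = path.relative(path.resolve(parent), path.resolve(candidate));
  return (
    relative !== "" &&
    relative !== ".." &&
    !relative.startsWith(".." + path.sep) &&
    !path.isAbsolute(relative)
  );
}

// Hypothetical startup check: is this module running from below the
// operating system's temporary directory?
function runningFromTemporaryDirectory(scriptDirectory = __dirname) {
  return isInsideDirectory(scriptDirectory, os.tmpdir());
}
```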

@dadiborn

dadiborn commented Mar 1, 2022

While a custom path is not yet implemented, I override the environment variable for the temp directory (temporarily, only for the current cmd session). On Windows I run the app from CMD with: set TMP=d:\caxa-project\tmp && d:\caxa-project\final_caxa_build.exe and the running app extracts itself into my custom dir.
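The same environment-variable workaround can be applied when launching the packaged binary from a Node script. This is a sketch under assumptions: the helper names are hypothetical, and it sets TMPDIR as well as TMP and TEMP on the assumption that Unix-like systems read the former and Windows the latter two.

```javascript
const { spawn } = require("node:child_process");

// Returns a copy of baseEnv with the usual temporary-directory variables
// pointing at `directory`. baseEnv itself is not modified.
function environmentWithTemporaryDirectory(directory, baseEnv = process.env) {
  return { ...baseEnv, TMPDIR: directory, TMP: directory, TEMP: directory };
}

// Starts a packaged binary with the overridden variables. `binary` is
// whatever executable caxa produced for your project.
function runWithTemporaryDirectory(binary, directory, args = []) {
  return spawn(binary, args, {
    env: environmentWithTemporaryDirectory(directory),
    stdio: "inherit",
  });
}
```

As with the CMD version, the override only affects the process that is started this way; it does not change the system-wide setting.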

@gkinsman

gkinsman commented Mar 4, 2022

The Windows local application data folder is designed for this purpose. It wouldn't be surprising for files unpacked there to take up space, as that's how other deployment systems work as well.

Without this, caxa apps break on every restart, and there's no current workaround.

@leafac
Owner

leafac commented Mar 9, 2022

Thanks for all the information. At this point I think we have a clear path ahead:

  1. Add a command-line flag that allows you to specify an extraction path.
  2. Embed that information in the JSON footer of the binary generated by caxa.
  3. In the stubs (Go, shell, and so forth), honor that option.
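Steps 2 and 3 can be illustrated with a small round trip. Everything here is an assumption made for the sake of the example: the footer layout (a newline followed by one line of JSON at the end of the file), the `extractDirectory` field, and the helper names are hypothetical and do not describe caxa's actual format or options.

```javascript
// Step 2 (build time): append a JSON footer to the packaged bytes.
// Hypothetical layout: "\n" followed by a single line of JSON.
function appendFooter(binary, footer) {
  return Buffer.concat([binary, Buffer.from("\n" + JSON.stringify(footer))]);
}

// Step 3 (run time): a stub would read the footer back from the end of
// the file. JSON.stringify never emits raw newlines, so the footer is
// everything after the last "\n".
function readFooter(packaged) {
  const start = packaged.lastIndexOf("\n") + 1;
  return JSON.parse(packaged.subarray(start).toString("utf8"));
}

// Honor the option when present, otherwise keep the current default.
function resolveExtractDirectory(footer, defaultDirectory) {
  return footer.extractDirectory ?? defaultDirectory;
}

const packaged = appendFooter(Buffer.from("stub\narchive"), {
  identifier: "myApp",
  extractDirectory: "/opt/myApp", // hypothetical field
});
console.log(resolveExtractDirectory(readFooter(packaged), "/tmp/caxa"));
// prints "/opt/myApp"
```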

Could y’all get the ball rolling on this and send a pull request?

Thanks!

@gkinsman

Would you be able to please point me at where the folder resolution logic is currently? I can't find it 😂.

Also, would it possibly be wise as a shorter term win to change the default to use AppData instead of a temp dir, so that the defaults are safe?

@leafac
Owner

leafac commented Jul 25, 2022

@gkinsman

Would you be able to please point me at where the folder resolution logic is currently? I can't find it 😂.

Folder resolution occurs in three locations:

JavaScript: To create the binary. See

os.tmpdir(),

Go: To extract the binary (the Go stub is the default stub that most people use). See

lock := path.Join(os.TempDir(), "caxa/locks", footer.Identifier, strconv.Itoa(extractionAttempt))

Bash: To extract the binary (the Bash stub is a nice little gimmick). See

export CAXA_TEMPORARY_DIRECTORY="$(dirname $(mktemp))/caxa"

Also, would it possibly be wise as a shorter term win to change the default to use AppData instead of a temp dir, so that the defaults are safe?

Yes, I agree. I used the temporary directory in caxa because I thought that operating systems would only clean up the temporary directory on reboot. But then I learned that operating systems may clean the temporary directory at any point. So I think we should stop using the temporary directory.

On Windows AppData seems like a reasonable place, but perhaps there’s a better option that I don’t know of. Is there some place that’s equivalent to macOS’s Library/Caches?

In macOS Library/Caches may be the best location.

On Linux perhaps /tmp is okay after all. As far as I remember people didn’t run into issues on Linux.

One thing to think about is how we can avoid accumulating garbage on people’s disks. The temporary directory guaranteed that naturally because it would clean things on reboot. I think that macOS’s Library/Caches should do the same, but we should investigate.

In any case, do you know what would be a nice way to go about this?

@gkinsman

Thanks for those pointers! I might take a look if I get some free time this week.

I suppose it's an interesting question of what the purpose of this library is. Why do files created by caxa need to be cleaned up at all? I'd expect that if a user runs an executable built by caxa, that it would perform a one-time extraction to a known location (like AppData or ProgramData) and then forever be ok to function.

Once an app itself is running, it can create its own garbage wherever it wants and that would necessitate using a temp dir, but surely for installation we want to put it somewhere that it will definitely stick around in?

In fact, one of the major .NET deployment tools, Squirrel.Windows, uses LocalAppData to store deployments to avoid requiring administrative privileges to install. I suggest we follow its model 🙂.

@leafac
Owner

leafac commented Aug 31, 2022

@gkinsman

Why do files created by caxa need to be cleaned up at all?

Hmmm, that’s a thought-provoking question! 😁

I guess it depends on how you think of the binaries generated by caxa:

Option 1: Self-contained binary: Much like a Go application (think, Caddy, for example). You download it, you run it, you delete it, and it leaves as little traces on your machine as possible.

Option 2: Installer: You’ll be running the application over and over, so it’s okay if parts of the application stick around.

I created caxa to help me deploy web applications to servers. In that context, you’ll be deploying new versions often, and old versions that stick around end up just filling the disk for no good reason.

So I went with Option 1 and made caxa use the temporary directory because then a reboot works as a convenient cleanup mechanism.

But Option 2 is reasonable as well, and perhaps we could expand caxa’s scope a bit to embrace that. Particularly now that we learned about the bad behavior of temporary directories under some circumstances.

I guess the key would be to include an “uninstall” solution, because at some point people will want to clean things up. How do you think this should be done?


Also, note @maxb2’s work on #60

@Eren561

Eren561 commented May 26, 2023

Is this feature still in development? This would be incredibly useful and would lead me to immediately adopt caxa as my go-to Node packager. I tried using the changes in this thread, but caxa still kept looking in TMPDIR. I also tried changing my TMPDIR environment variable as a quick fix, but alas it gets changed back for some reason :(

@leafac
Owner

leafac commented Nov 21, 2023

Hi y’all,

Thanks for using caxa and for the conversation here.

I’ve been thinking about the broad strategy employed by caxa and concluded that there is a better way to solve the problem. It doesn’t use temporary directories at all, so can you please give it a try and report back?

It’s a different enough approach that I think it deserves a new name, and it’s part of a bigger toolset that I’m building, which I call Radically Straightforward · Package.

I’m deprecating caxa and archiving this repository. I invite you to continue the conversation in Radically Straightforward’s issues.

Best.

@leafac leafac closed this as completed Nov 21, 2023

9 participants