-
-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use restore keys to restore cache from partial key matches + cache manifest install directory #94
Conversation
I am testing this here: https://github.com/tenacityteam/tenacity/pull/228 |
643f73f
to
f9e61bb
Compare
5400eaf
to
0193e0f
Compare
The force-pushing makes Github notifications be useless for this since one can't see any of the changes :( |
Sorry for all the force pushing. I think I finally have the package caching working right. |
src/vcpkg-utils.ts
Outdated
if (isWindows) { | ||
paths.push(process.env.LOCALAPPDATA + "\\vcpkg"); | ||
} else { | ||
paths.push(process.env.HOME + "/.cache/vcpkg"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried caching both VPCKG_ROOT/packages
and CMAKE_BINARY_DIR/vcpkg_installed
but neither worked. Only caching these locations outside of both the vcpkg root and the CMake build directory works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use the approach described in the other comment about VCPKG_DEFAULT_BINARY_CACHE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if i look this right, the variable VCPKG_DEFAULT_BINARY_CACHE
is not being set by the action.
My suggestion is that this action sets the env var value in order to explicitly drive where the vcpkg
is going to install the built libraries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overriding that variable does not seem right. I think the most robust way to handle this would be checking if VCPKG_DEFAULT_BINARY_CACHE
is set, and if so cache that location, otherwise use the default. https://vcpkg.io/en/docs/users/binarycaching.html#configuration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many default locations that vcpkg can write the binary caching to, so the run-vcpkg
action cannot know 100% what was the place vcpkg
put the binary cache. This is the reason that the action must drive and be certain of the location.
I agree that as you stated, if VCPKG_DEFAULT_BINARY_CACHE
is set and it is valid (assuming valid. means existing and writeable, something the action must check), the action must not override it. On the other hand if not set, the action should set it to a sensible and valid value.
} | ||
else if (userProvidedCommitId) { | ||
core.info(`Using user provided vcpkg's Git commit id='${userProvidedCommitId}', adding it to the cache's key.`); | ||
key += "localGitId=" + Utils.hashCode(userProvidedCommitId); | ||
key += `_vcpkgGitCommit=${userProvidedCommitId}`; | ||
restoreKeys.push(key); | ||
} | ||
else { | ||
core.info(`No vcpkg's commit id was provided, does not contribute to the cache's key.`); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to put the hash of the vcpkg.json file before the commit hash (#95) in the cache key. Once a project is setup I would think the vcpkg.json file would change infrequently. Updating the packages would simply be a matter of changing the commit hash/updating the submodule. When doing so, the cache should hit for the last cache with the same vcpkg.json to avoid having to rebuild every package. But this needs to wait for the hashFiles
function to be published in the @actions/glob NPM package (actions/toolkit#830).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the hashFiles
function really necessary? Or any hash function should be a good fit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose another hash function could do 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still see that your point of using eactly the same hash function has its benefits: it would not invalidate existing keys. On the other hands, this PR already invalidate existing keys as the key changed its format, so no concern here. Any hash function would be a good fit.
f9e61bb
to
914f819
Compare
src/vcpkg-utils.ts
Outdated
"!" + vcpkgRootDir + directorySeparator + "buildtrees", | ||
"!" + vcpkgRootDir + directorySeparator + "downloads" ]; | ||
|
||
if (isWindows) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of pushing hardcoded path, let's have an input
that has a default value, and that can be set by the user as well.
Based on that input, the value of the environment variable VCPKG_DEFAULT_BINARY_CACHE
must be set to the path where the cache is going to be stored. E.g., $VCPKG_ROOT/cache
could be a valid default path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's a need to add more complexity to this action. vcpkg already allows the user to control the sources used for binary caching with the VCPKG_BINARY_SOURCES environment variable. I think this action should unconditionally cache the default directories of the binary caching. If the user wants to use a different path for whatever reason, that path can be specified with VCPKG_BINARY_SOURCES and passed to the additionalCachedPaths
parameter of this action. In this case I think the default location would be empty so there wouldn't be a harm in caching it. If vcpkg does still fall back to the default cache path, the user can add clear
to the end of their VCPKG_BINARY_SOURCES environment variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think my other comment about VCPKG_DEFAULT_BINARY_CACHE
is valid here as well. But since GitHub view is not really helpful in highlighting the exact lines this comment was referring too.
src/vcpkg-utils.ts
Outdated
if (isWindows) { | ||
paths.push(process.env.LOCALAPPDATA + "\\vcpkg"); | ||
} else { | ||
paths.push(process.env.HOME + "/.cache/vcpkg"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use the approach described in the other comment about VCPKG_DEFAULT_BINARY_CACHE
After more experimentation I found that the way it was before without caching the vcpkg binary cache archives was indeed slightly better. Please take another look at the code. The issue was that the action was not caching the |
This contribution is great, thanks for submitting it! I will certainly take a look, and I totally agree on avoiding the I'd be a bit slower in answering over next couple of days (hopefully not), but I look forward for finalizing this work. |
From the GitHub Actions documentation: https://docs.github.com/en/actions/guides/caching-dependencies-to-speed-up-workflows#matching-a-cache-key You can provide a list of restore keys to use when there is a cache miss on key. You can create multiple restore keys ordered from the most specific to least specific. The cache action searches for restore-keys in sequential order. When a key doesn't match directly, the action searches for keys prefixed with the restore key. If there are multiple partial matches for a restore key, the action returns the most recently created cache.
src/vcpkg-utils.ts
Outdated
const vcpkg_installed_path = path.resolve(vcpkgRootDir + "/../vcpkg_installed"); | ||
paths.push(vcpkg_installed_path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would better be added to the getOrdinaryCachedPaths
function in the runvcpkglib but I do not know how to edit that library and use that branch in this action for testing. This is my first time using Node other than quick tests with the REPL and my first time writing TypeScript. So maybe you could take care of changing the library and cherry pick just the first commit of this branch. In the meantime I will keep using this branch for Tenacity.
Is this something you would take care of? I don't know how to do that. |
Upon further testing, it seems that caching the |
I have opened an issue for vcpkg to clarify if searching the binary cache before vcpkg_installed is by design or not. Refer to the issue for how to reproduce that locally. |
vcpkg rebuilds all packages for a new CMake build directory when caching only vcpkg_install in manifest mode unless they are in the binary cache. microsoft/vcpkg#19424
I see two improvements/changes in this PR: My understanding is that in my testing caching Regarding point 2, there are many variables, and it can be low value. Whenever a project has established the dependencies, they may change once each month, and whenever rebuilding all cache from scratch takes six hours vs three hours, it may not be worth it. I am open to discussion and especially interested to see the results in workflow runs that validates the benefits of point 1 and 2. |
The reason is that two different location of the
Basically we should not mix the " |
That is really confusing.
This action does not. I added a |
@Be-ing thanks for this great contribution! I have not the time to validate nor test the changes, nor I got all the answers to my previous questions (see this comment). Anyhow the idea of using fallback keys is great, and also the one to automatically hash the |
From the GitHub Actions documentation:
https://docs.github.com/en/actions/guides/caching-dependencies-to-speed-up-workflows#matching-a-cache-key
You can provide a list of restore keys to use when there is a
cache miss on key. You can create multiple restore keys ordered
from the most specific to least specific. The cache action
searches for restore-keys in sequential order. When a key doesn't
match directly, the action searches for keys prefixed with the
restore key. If there are multiple partial matches for a restore
key, the action returns the most recently created cache.