excessive cache invalidation in ccache #1323
Some data points: it looks like we are loading really old caches instead of the most recent cache that was uploaded; see https://github.com/llvm/torch-mlir/runs/8129353328?check_suite_focus=true for a run that restored from an old cache. So we need to debug / fix this caching, because pinning the build won't help if we load an old cache.
Importing an ONNX graph fails if an output is also used by another node. This happens because the output ValueInfo gets registered first, and then an error is thrown that it already exists when the internal ValueInfos are imported. The solution is to import the internal ValueInfos before importing the output ValueInfos. Resolves llvm#1376 Signed-off-by: Michael Holman <michhol@microsoft.com>
Building on top of the previous findings, I realized that PyTorch uses precompiled headers, which, to work with ccache, require build flags that we might have to upstream to PyTorch. However, we can perhaps work around these limitations by leveraging the fact that we don't clear the VM disk between consecutive CI runs, although we do remove the PyTorch build files. More precisely, we could change this snippet (in build_libtorch.sh):

```shell
# Copy over all of the cmake files
mv build/lib*/torch/share libtorch/
mv build/lib*/torch/include libtorch/
mv build/lib*/torch/lib libtorch/
# Copy over all lib files
mv build/lib/* libtorch/lib/
# Copy over all include files
mv build/include/* libtorch/include/
```

to use copies instead of moves, so the build tree survives for the next run.
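A minimal sketch of the copy-instead-of-move idea. `cp -a` is my guess at the truncated suggestion, not confirmed by the thread, and the paths below are placeholders (the real script uses a `build/lib*` glob):

```shell
#!/usr/bin/env bash
# Hypothetical sketch: replacing `mv` with `cp -a` so the PyTorch build
# tree stays on disk for the next CI run. Paths are stand-ins.
set -euo pipefail

# Fabricate a tiny stand-in build tree for illustration only.
mkdir -p build/lib.linux/torch/share build/lib.linux/torch/include \
         build/lib.linux/torch/lib build/lib build/include libtorch
touch build/lib/libfoo.so build/include/foo.h

# `cp -a` preserves timestamps and permissions (which matters for ccache
# and incremental builds) and, unlike `mv`, leaves build/ intact.
cp -a build/lib.linux/torch/share libtorch/
cp -a build/lib.linux/torch/include libtorch/
cp -a build/lib.linux/torch/lib libtorch/
cp -a build/lib/. libtorch/lib/
cp -a build/include/. libtorch/include/

# build/ is still present afterwards:
test -e build/lib/libfoo.so && echo "build tree preserved"
```

The trade-off is disk usage: the artifacts exist twice until the next cleanup, which is presumably acceptable on a CI VM whose disk is not cleared anyway.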
I am OK with that change. However, I am not sure we should assume we don't clear artifacts between VM invocations in the CI. I thought it was supposed to be a clean run; maybe it was a transient bug?
Lucky for us, when you and Maksim wrote the build_libtorch.sh script, y'all added a code path to handle both cases: one where the PyTorch source is already checked out and one where it isn't.

```shell
checkout_pytorch() {
  if [[ ! -d "$PYTORCH_ROOT" ]]; then
    ...
  else
    cd "${PYTORCH_ROOT}"
    git fetch --depth=1 origin "${TORCH_MLIR_SRC_PYTORCH_BRANCH}"
    git reset --hard FETCH_HEAD
  fi
}
```

Combined with the fact that we don't pass …
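The reason the `else` branch can preserve ccache-relevant state is that `git reset --hard` only rewrites tracked files; an untracked build directory survives the update. A small self-contained demonstration in a throwaway repo (names are placeholders):

```shell
#!/usr/bin/env bash
# Demonstrates that `git reset --hard` leaves untracked files alone.
set -euo pipefail

git init -q demo-repo
cd demo-repo
git -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m init

mkdir -p build
touch build/artifact.o            # untracked build output

git reset --hard -q HEAD          # analogous to `git reset --hard FETCH_HEAD`

test -e build/artifact.o && echo "untracked build/ survives"
```

So as long as the build outputs are untracked, an incremental checkout like the one above keeps them available across CI runs.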
Ahh, we added that path to support local customer forks of PyTorch source builds.
Investigate why we get ccache invalidation (especially when building from source) in the CI.

TODO: test the local behaviour of ccache for a day's worth of PyTorch changes and validate that we see similar behaviour in the CI.
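The TODO above could be driven with ccache's own statistics; a sketch, where the build commands and commit selection are placeholders, not the project's actual invocation:

```shell
#!/usr/bin/env bash
# Sketch of the proposed local experiment: measure ccache hit rates across
# roughly a day's worth of PyTorch changes.
set -euo pipefail
command -v ccache >/dev/null || { echo "ccache not installed; skipping"; exit 0; }

ccache --zero-stats        # reset counters before the baseline build

# 1) Baseline build to populate the cache, e.g.:
#    CC="ccache cc" CXX="ccache c++" python setup.py develop

# 2) Advance the checkout by about a day of upstream commits, e.g.:
#    git -C "$PYTORCH_ROOT" checkout <commit-from-next-day>

# 3) Rebuild with the same wrapper, then inspect hit/miss counters:
ccache --show-stats        # the local hit rate should mirror what CI sees
```

If the local hit rate is high but the CI's is not, that points at cache restore ordering rather than at ccache itself.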