-
-
Notifications
You must be signed in to change notification settings - Fork 647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Relative imports not mapping to correct sources when multiple files have identical contents #19618
Comments
I can reproduce this on my machine; thanks for providing a self-contained reproducer (and of course, sorry for the trouble)! Unfortunately, I haven't had time to dig into the root cause fully yet... I'm hoping it's not the new Rust parser in 2.17, but I'm suspicious 🤔 I may or may not get a chance to look in more detail over the next day or two. |
Okay, yes, this is does indeed caused by the Rust parser. This means that there's a work around: continuing to use the old parser Here's the reproduction script I was using: # easier management/clearing of the cache
export PANTS_LOCAL_STORE_DIR=$(mktemp -d)
echo "*** Before"
git checkout example-dep-inf-issue
pants peek :: > /dev/null
echo "*** After"
git checkout example-dep-inf-issue-2
pants peek :: > /dev/null
echo "*** New cleared cache"
export PANTS_LOCAL_STORE_DIR=$(mktemp -d)
pants peek :: > /dev/null Running
Running it with This is the diff between the two branches ( diff --git a/app/module_1/break_pants/__init__.py b/app/module_1/break_pants/__init__.py
index 3c2673c..8b38b6b 100644
--- a/app/module_1/break_pants/__init__.py
+++ b/app/module_1/break_pants/__init__.py
@@ -1 +1 @@
-from ._example import x
\ No newline at end of file
+from module_2.break_pants import *
\ No newline at end of file
diff --git a/app/module_2/break_pants/BUILD b/app/module_2/break_pants/BUILD
new file mode 100644
index 0000000..db46e8d
--- /dev/null
+++ b/app/module_2/break_pants/BUILD
@@ -0,0 +1 @@
+python_sources()
diff --git a/app/module_2/break_pants/__init__.py b/app/module_2/break_pants/__init__.py
new file mode 100644
index 0000000..3c2673c
--- /dev/null
+++ b/app/module_2/break_pants/__init__.py
@@ -0,0 +1 @@
+from ._example import x
\ No newline at end of file
diff --git a/app/module_1/break_pants/_example.py b/app/module_2/break_pants/_example.py
similarity index 100%
rename from app/module_1/break_pants/_example.py
rename to app/module_2/break_pants/_example.py As implied by having to nuke all pants artifacts, this appears to be a on-disk caching issue, not just an in-memory pantsd one: running |
Okay, mechanism: this is inappropriate caching when there's two files in different locations that have identical contents, the relative import is resolved and cached in the old location, and isn't recomputed for the new location. Moving a file is a good way to get that in practice, as in the old reproducer. Reduced reproducer (thanks again @njgrisafi, your initial repo was crucial): set -x
cd $(mktemp -d)
# avoid corrupting the normal cache
export PANTS_LOCAL_STORE_DIR=$(mktemp -d)
# Setup:
cat > pants.toml <<EOF
[GLOBAL]
pants_version = "2.17.0rc3"
backend_packages = ["pants.backend.python"]
[source]
root_patterns = ["/"]
[python]
interpreter_constraints = ["==3.9.*"]
[python-infer]
use_rust_parser = true
EOF
mkdir a b
echo 'python_sources()' > a/BUILD
echo 'python_sources()' > b/BUILD
# two identical files with relative imports:
echo 'from .file import x' > a/__init__.py
echo 'from .file import x' > b/__init__.py
# only one of those imports is valid
touch a/file.py
pants dependencies a/__init__.py
# OK: a/file.py
pants dependencies b/__init__.py
# BUG: a/file.py
# should be an error because there's no b/file.py (and also, it's not what Python would load, if b/file.py did exist) |
…19630) This adds the file path for the file being dependency-inferred to the cache key for the Rust dep inference. Without this, the same file contents appearing multiple times in different places will give the wrong results for relative imports, because the dep inference process mashes together the file path and the relative import. The circumstances here seem most likely to occur in the real world if a file is moved, with the inference results from before the move reused for the file _after_ the move too. I've tested this manually on the reproducer in #19618 (comment), in addition to comparing the before (fails) and after (passes) behaviour of the new test. To try to make this code more resilient to this sort of mistake, I've removed the top level `PreparedInferenceRequest.path` key, in favour of using the new `input_file_path` on the request protobuf struct, at the cost of an additional `String` -> `PathBuf` conversion when doing the inference. Fixes #19618
…herry-pick of #19630) (#19640) This adds the file path for the file being dependency-inferred to the cache key for the Rust dep inference. Without this, the same file contents appearing multiple times in different places will give the wrong results for relative imports, because the dep inference process mashes together the file path and the relative import. The circumstances here seem most likely to occur in the real world if a file is moved, with the inference results from before the move reused for the file _after_ the move too. I've tested this manually on the reproducer in #19618 (comment), in addition to comparing the before (fails) and after (passes) behaviour of the new test. To try to make this code more resilient to this sort of mistake, I've removed the top level `PreparedInferenceRequest.path` key, in favour of using the new `input_file_path` on the request protobuf struct, at the cost of an additional `String` -> `PathBuf` conversion when doing the inference. Fixes #19618 This is a cherry pick of #19630, but is essentially a slightly weird rewrite that partially cherry-picks #19001 too, by copying over the whole `DependencyInferenceRequest` protobuf type as is (even with some fields that are unused) because that's the easiest way to generate appropriate bytes for turning into the digest for use in the cache-key. I think it's okay to have this "weird" behaviour in the 2.17 branch, with the real/normal code in main/2.18?
…19630) This adds the file path for the file being dependency-inferred to the cache key for the Rust dep inference. Without this, the same file contents appearing multiple times in different places will give the wrong results for relative imports, because the dep inference process mashes together the file path and the relative import. The circumstances here seem most likely to occur in the real world if a file is moved, with the inference results from before the move reused for the file _after_ the move too. I've tested this manually on the reproducer in #19618 (comment), in addition to comparing the before (fails) and after (passes) behaviour of the new test. To try to make this code more resilient to this sort of mistake, I've removed the top level `PreparedInferenceRequest.path` key, in favour of using the new `input_file_path` on the request protobuf struct, at the cost of an additional `String` -> `PathBuf` conversion when doing the inference. Fixes #19618
…herry-pick of #19630) (#19690) This adds the file path for the file being dependency-inferred to the cache key for the Rust dep inference. Without this, the same file contents appearing multiple times in different places will give the wrong results for relative imports, because the dep inference process mashes together the file path and the relative import. The circumstances here seem most likely to occur in the real world if a file is moved, with the inference results from before the move reused for the file _after_ the move too. I've tested this manually on the reproducer in #19618 (comment), in addition to comparing the before (fails) and after (passes) behaviour of the new test. To try to make this code more resilient to this sort of mistake, I've removed the top level `PreparedInferenceRequest.path` key, in favour of using the new `input_file_path` on the request protobuf struct, at the cost of an additional `String` -> `PathBuf` conversion when doing the inference. Fixes #19618 Co-authored-by: Huon Wilson <huon@exoflare.io>
Describe the bug
Some background can be found on slack https://pantsbuild.slack.com/archives/C046T6T9U/p1692230510929189
When a renaming/moving files with relative imports, pants dependency inference fails on the new file with
UnownedDependencyError
. The error only occurs on relative imports and what's interesting is pants says the source for the relative import is the old file that was moved.Pants version
2.17.0rc1
OS
Both MacOS and Linux
Additional info
This issue is a bit difficult to reproduce. There's some specific order of operations that causes pants to get into this state.
I pushed this a branch to my example repo https://github.com/njgrisafi/pants-example/tree/example-dep-inf-issue to hopefully help with narrowing down reproducing this.
Simple reproduction
git clone git@github.com:njgrisafi/pants-example.git
git checkout example-dep-inf-issue
./pants peek :: >> /dev/null
git checkout example-dep-inf-issue-2
./pants peek :: >> /dev/null
You should see the following error
To fix the issue you can nuke all the pants artifacts
Rerunning dependency inference will now succeed as expected.
Detailed reproduction
Detailed steps to repro this behavior (first checkout repo / branch
example-dep-inf-issue
):./pants peek :: >> /dev/null
app/module_2/break_pants
app/module_2/break_pants/__init__.py
(with contentfrom ._example import x
)app/module_1/break_pants/_example.py
->app/module_2/break_pants/_example.py
app/module_1/break_pants/__init__.py
(with contentfrom module_2.break_pants import *
)./pants peek :: >> /dev/null
You should see an error like
The expected behavior is pants would map the relative import to
module_2.break_pants._example.x
.The text was updated successfully, but these errors were encountered: