Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't abort resolution on transitive updates #5180

Merged
merged 2 commits into from
Mar 15, 2018

Conversation

alexcrichton
Copy link
Member

This commit is directed at fixing #4127, allowing the resolver to automatically
perform transitive updates when required. A few use casese and tagged links are
hanging off #4127 itself, but the crux of the issue happens when you either add
a dependency or update a version requirement in Cargo.toml which conflicts
with something listed in your Cargo.lock. In this case Cargo would previously
provide an obscure "cannot resolve" error whereas this commit updates Cargo to
automatically perform a conservative re-resolution of the dependency graph.

It's hoped that this commit will help reduce the number of "unresolvable"
dependency graphs we've seen in the wild and otherwise make Cargo a little more
ergonomic to use as well. More details can be found in the source's comments!

Closes #4127

@rust-highfive
Copy link

r? @matklad

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Copy link
Member Author

cc @Eh2406, most of the work you've been doing has been in the resolver itself and while this only lightly touches the resolver you might be interested to see how lock files interact with the resolver!

@Eh2406
Copy link
Contributor

Eh2406 commented Mar 14, 2018

Thanks for the cc!

In general I am really glad this bug is getting dealt with! I have helped a number of persons on irc with this. Most recently, I depend on A witch depends on B. Now I, for some inexplicable reason, want to specify a version of B that is not the one in the lock file. So I add B = "=..." to the toml and done. Well not done, hit this bug. But regenerate the lock file and done.

Copy link
Member

@matklad matklad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely like idea of "let prefer already locked packages to the newer ones"!

However, the heuristic of "let's avoid locking of the things that might change" somewhat non-robust (see the comment with failing tests).

I wonder if we could, more generally, "don't look foo if that will fail resolution process"? Looks NP-complicated at the first sight.

However... What if don't lock anything at all, and instead just rely on the "prefer packages from the previous lockfile" heuristic? Perhaps, in practice, that will give us the results we want? This is also curiously similar to the "pick minimal version of dependencies" issue.

@@ -187,7 +187,7 @@ impl EncodableResolve {
}
}

fn build_path_deps(ws: &Workspace) -> HashMap<String, SourceId> {
fn build_path_deps(ws: &Workspace) -> (HashMap<String, SourceId>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: useless parens

/// should be printed
///
/// * `print_warnings` - whether or not to print backwards-compatibility
/// warnings and such
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about the actual implementation, but at least the docs hint that config and print_wornings might be folded into one Option argument?

Not related to the PR though!

fn new(registry: &'a mut Registry, replacements: &'a [(PackageIdSpec, Dependency)],) -> Self {
fn new(registry: &'a mut Registry,
replacements: &'a [(PackageIdSpec, Dependency)],
try_to_use: &'a [&'a PackageId]) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: could be &[&'a PackageId]

@@ -678,7 +712,11 @@ impl<'a> RegistryQueryer<'a> {
// When we attempt versions for a package, we'll want to start at
// the maximum version and work our way down.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's touch up the comment to say about in_previous?

@@ -67,6 +71,8 @@ pub struct Workspace<'cfg> {
// needed by the current configuration (such as in cargo install). In some
// cases `false` also results in the non-enforcement of dev-dependencies.
require_optional_deps: bool,

loaded_packages: RefCell<HashMap<PathBuf, Package>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a short doc-comment for the field?

fn register_previous_locks<'a>(ws: &Workspace,
registry: &mut PackageRegistry,
resolve: &'a Resolve,
keep: &Fn(&&'a PackageId) -> bool) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep signature looks weird, but we use it three times as filter(keep), where the signature has to us double references... No idea how to make this better...

//
// And finally, with all that in mind, hopefully this loop makes sense! If
// not, of course feel free to ask on IRC :). Like the function, this is
// directly targeted at solving #4127
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fascinating read! 👍

Ok(p) => p,
Err(_) => continue,
};
for dep in resolve.deps_not_replaced(node) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, here we are solving the case of "don't lock potentially changed dependencies". Another similar scenario might happen, I think, when the version of the pacakge itself changes (you literally change Cargo.toml from [pacakge] name = foo version = 1 to [package] name = foo version = 2). So, perhaps we need something like

if pkg.package_id() != node {
  avoid_locking.insert(node);
}

is needed as well?

// that. As a result the update of `serde` here shouldn't update to `serde
// 0.1.1` as that would also force an update to `log 0.1.1`.
//
// Also note that this is probably counterintuitive and weird. We may wish
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most definitely weird. Would be producing a warning here challenging?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah for now I think we want to keep this as is (afaik we've never gotten a bug report about this?) and is "easily fixable" with cargo update --aggressive, albeit not exactly a discoverable solution.

};
let pkg = match ws.load(&path.join("Cargo.toml")) {
Ok(p) => p,
Err(_) => continue,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this looks for path dependencies, already present in resolve. But about newly added path dependencies? They probably would be handled when their parent crate is processed. Hm, nope, the following test fails:

#[test]
fn update_via_new_dep() {
    Package::new("log", "0.1.0").publish();
    let p = project("foo")
        .file("Cargo.toml", r#"
            [package]
            name = "bar"
            version = "0.0.1"
            authors = []

            [dependencies]
            log = "0.1"
            # foo = { path = "foo" }
        "#)
        .file("src/lib.rs", "")
        .file("foo/Cargo.toml", r#"
            [package]
            name = "foo"
            version = "0.0.1"
            authors = []

            [dependencies]
            log = "0.1.1"
        "#)
        .file("foo/src/lib.rs", "")
        .build();

    assert_that(p.cargo("build"), execs().with_status(0));
    Package::new("log", "0.1.1").publish();

    File::create(p.root().join("Cargo.toml")).unwrap().write_all(br#"
        [package]
        name = "bar"
        version = "0.0.1"
        authors = []

        [dependencies]
        log = "0.1"
        foo = { path = "foo" }
    "#).unwrap();

    assert_that(p.cargo("build"), execs().with_status(0));
}

A similar situation happens when you add a new workspace member (which isn't necessary a path dependency of anything):

#[test]
fn update_via_new_member() {
    Package::new("log", "0.1.0").publish();
    let p = project("foo")
        .file("Cargo.toml", r#"
            [package]
            name = "bar"
            version = "0.0.1"
            authors = []

            [workspace]
            # members = [ "foo" ]

            [dependencies]
            log = "0.1"
        "#)
        .file("src/lib.rs", "")
        .file("foo/Cargo.toml", r#"
            [package]
            name = "foo"
            version = "0.0.1"
            authors = []

            [dependencies]
            log = "0.1.1"
        "#)
        .file("foo/src/lib.rs", "")
        .build();

    assert_that(p.cargo("build"), execs().with_status(0));
    Package::new("log", "0.1.1").publish();

    File::create(p.root().join("Cargo.toml")).unwrap().write_all(br#"
            [package]
            name = "bar"
            version = "0.0.1"
            authors = []

            [workspace]
            members = [ "foo" ]

            [dependencies]
            log = "0.1"
    "#).unwrap();

    assert_that(p.cargo("build"), execs().with_status(0));
}

@alexcrichton
Copy link
Member Author

Some excellent thinking! This is gonna be a trick one... I think that would explain this as well where it seems that empirically this patch doesn't handle everything. I don't think it's 100% related to only adding new path dependencies as well, on IRC I believe it was just adding a dependency locally to a preexisting package. I'll try to come up with a test case for that as well before pushing another update.

As to how to fix this... I'm not sure! Unfortunately the logic for preferring previous version isn't quite so easy as it also affects whether we update a source (aka print Updating registry ...) which is something we generally avoid at all costs. The way the lockfile interacts with resolve has grown more and more complicated over time, so it's difficult to imagine a hugely different way to do it at this point. I'm gonna see if I can tack it on, hopefully there's something salvageable here...

@alexcrichton
Copy link
Member Author

Ok I've pushed an update which I believe should be much more robust yet still preserve Cargo's original properties (and also fixes #5182 to boot).

I think that this commit is basically starting to hammer the nails in the coffin for the current strategy of dealing with lock files. At a fundamental level if you add any dependency then you have to reresolve the entire dependency graph as that new dependency could bring in anything from any source, conflicting with a previous activation that we don't actually want to conflict.

This iteration leverages @matklad's idea of leveraging the "sort candidates based on whether they were previously in the lock file" idea. If any dependency isn't resolvable with the current set of packages in the lock file then the dependency's source is "poisoned" in the lock file. All lock file entries from the same source aren't actually locked (aka become candidate for update). We still try to avoid updating most of them via the sorting strategy, however.

I realized as I was typing this up, however, that this still isn't quite right. While I think it's unambiguously better than what we have today it still doesn't handle cases, for example, where you add a git dependency which then adds new constraints on crates.io dependencies.

@alexcrichton
Copy link
Member Author

I think I'd prefer to fix what's here and land it now as I think it's a huge improvement over what we have today (sometimes you add a crates.io dependency and Cargo refuses to budge). Fixing the git issue I think is going to be very invasive...

@bors
Copy link
Contributor

bors commented Mar 15, 2018

☔ The latest upstream changes (presumably #5176) made this pull request unmergeable. Please resolve the merge conflicts.

ret.sort_unstable_by(|a, b| {
b.summary.version().cmp(a.summary.version())
let a_in_previous = self.try_to_use.contains(a.summary.package_id());
let b_in_previous = self.try_to_use.contains(b.summary.package_id());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, maybe let’s change try_to_use to hash set to have better O?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah it is indeed here! The argument coming in is a slice but it's switched to a HashSet for precisely this reason :)

let source = dep.source_id();

// If this is a path dependency then try to push it onto our worklist
if let Ok(path) = source.url().to_file_path() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the comment is not 100% accurate. In particular, git dependencies for local repositories will also pass this check! This I think is important, because the git dependencies in Cargo own tests are local, and, so, during the tests, would use a non-realistic code path. Lets add an if source.is_path() condition, which checks kind?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha excellent point

@matklad
Copy link
Member

matklad commented Mar 15, 2018

r=me with #5180 (comment) and #5180 (comment) addressed

This commit removes the `Source::for_path` constructor in favor of
`Workspace::load`. This prevents re-parsing manifests multiple times as Cargo
loads up as this can sometimes be an expensive operation. Instead the
`Workspace` now retains a cache of packages that can be updated as it goes
along. Finally, this should mean that we're only parsing path dependencies at
most once rather than multiple times.
This commit is directed at fixing rust-lang#4127, allowing the resolver to automatically
perform transitive updates when required. A few use casese and tagged links are
hanging off rust-lang#4127 itself, but the crux of the issue happens when you either add
a dependency or update a version requirement in `Cargo.toml` which conflicts
with something listed in your `Cargo.lock`. In this case Cargo would previously
provide an obscure "cannot resolve" error whereas this commit updates Cargo to
automatically perform a conservative re-resolution of the dependency graph.

It's hoped that this commit will help reduce the number of "unresolvable"
dependency graphs we've seen in the wild and otherwise make Cargo a little more
ergonomic to use as well. More details can be found in the source's comments!

Closes rust-lang#4127
Closes rust-lang#5182
@alexcrichton
Copy link
Member Author

@bors: r=matklad

@bors
Copy link
Contributor

bors commented Mar 15, 2018

📌 Commit 51d2356 has been approved by matklad

@bors
Copy link
Contributor

bors commented Mar 15, 2018

⌛ Testing commit 51d2356 with merge b3df526...

bors added a commit that referenced this pull request Mar 15, 2018
Don't abort resolution on transitive updates

This commit is directed at fixing #4127, allowing the resolver to automatically
perform transitive updates when required. A few use casese and tagged links are
hanging off #4127 itself, but the crux of the issue happens when you either add
a dependency or update a version requirement in `Cargo.toml` which conflicts
with something listed in your `Cargo.lock`. In this case Cargo would previously
provide an obscure "cannot resolve" error whereas this commit updates Cargo to
automatically perform a conservative re-resolution of the dependency graph.

It's hoped that this commit will help reduce the number of "unresolvable"
dependency graphs we've seen in the wild and otherwise make Cargo a little more
ergonomic to use as well. More details can be found in the source's comments!

Closes #4127
@bors
Copy link
Contributor

bors commented Mar 15, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: matklad
Pushing b3df526 to master...

@bors bors merged commit 51d2356 into rust-lang:master Mar 15, 2018
@Eh2406
Copy link
Contributor

Eh2406 commented Mar 15, 2018

@alexcrichton you may want to do a perf run with x.py incase this removes some optimizations. Specifically I think, a lot more deps with has_another leading to more clones.

@alexcrichton alexcrichton deleted the transitive-update branch March 15, 2018 18:02
@alexcrichton
Copy link
Member Author

@Eh2406 true! My intention, though, was that this behavior (aka lots more has_another) only happens when the dependency graph changes, in theory none of this logic should kick in if manifests are unchanged

@alexcrichton
Copy link
Member Author

Ok yeah looks like ./x.py isn't much slower, and the perf run shows cloning nowhere near the top.

It still takes 4 seconds for a null build, which is crazy long, but that's another problem for another day

@Eh2406
Copy link
Contributor

Eh2406 commented Mar 15, 2018

Thanks for checking!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Transitive minor updates of deps shouldn't fail cargo update
6 participants