Slow solving with `build-tool-depends` on local packages with tests #7472

michaelpj · 2021-07-02T10:07:09Z

Describe the bug
Suppose I have a local package B which provides a build tool, and which either has tests or depends on a package A that has tests. Finally, I have another local package C which depends on B's build tool.

Finally, I have tests: true in my cabal.project, since I naturally want to test my local packages.

The result of this is that the solver becomes slower, because (I hypothesize) it tries to create a build plan for A without tests to satisfy the build-tool dependency, then finds that it can't do that (due to the project constraint) and has to redo it.

That looks something like this (from cabal configure -v3):

[_13] trying: C:B:exe.A~>A-0.1.0.0 (dependency of C:B:exe.B)
[_14] rejecting: C:B:exe.A:!test (dependencies not linked: stanza "test" incompatible; conflict set: A, B, C:B:exe.A, C:B:exe.B, A:test, C:B:exe.A:test)

This is not a big deal in a small project. However, if we have a long chain of packages up to the build tool dependency like so:

A->B-> ... ->Z->Tool

all of which with tests, then the solver will redo most of the work over and over until it finally realises it has to build all of A to Z with tests. This can cause a very large slowdown in solving time.

cabal freeze does not help, since it only pins the versions and the test/no-test incompatibility still bites.

To Reproduce

I made a repository that exhibits the behaviour I described in the small: https://github.com/michaelpj/cabal-build-tool-depends-solver-repro

Changing tests:true to tests:false in cabal.project stops the solver from failing and having to backtrack.

Expected behavior

I don't understand how cabal's solver works at all, but I would like this to not be slow! Maybe the test/no-tests constraint needs to be pushed down eagerly?

System information

NixOS
cabal 3.4.0.0, ghc 8.10.4

The text was updated successfully, but these errors were encountered:

michaelpj · 2021-07-02T10:57:01Z

Quick check for "does this affect me":

cabal configure -v3 > out.txt
cat out.txt | grep 'stanza "test"' | wc -l

michaelpj · 2021-07-03T15:10:02Z

A few more notes for the interested:

We observed this in the wild on some Cardano projects, notably https://github.com/input-output-hk/plutus and https://github.com/input-output-hk/cardano-node.
- It's not appallingly slow, but you can observe the test/non-test confusions in the solver output.
The problem is especially acute on those repositories with cabal 3.2, since in those days tests: true applied to source-repository-packages, multiplying the number of test/non-test confusions that get hit.
It's also acute for haskell.nix (Resolving dependencies is much slower for haskell.nix with lots of source-repository-package stanzas input-output-hk/haskell.nix#1156), which replaces source-repository-package stanzas with local packages in order to avoid network access, but thus ends up in the same place as cabal 3.2.

michaelpj · 2021-07-03T15:12:20Z

For newer versions of cabal which don't run the solver when calling configure, the easiest way to see the issue is probably cabal build --dry -v3 all.

gbaz · 2021-07-22T23:34:35Z

Here's the result of my investigation thus far. The planPackages call constructs a DepResolverParams (this gets printed out at v3) that in turn includes "preferences" which include, in this example:

  A [BenchStanzas]
  B [BenchStanzas]
  C [BenchStanzas]

Those preferences say try with bench enabled before bench disabled. In this case we should set the preferences to also have TestStanzas.

Since this section also sets up constraints, and tests are definitely included in the constraints, all the data should be there to include them in preferences as well.

If this is done the search tree will be reordered to try packages first with test enabled (the success path!) and only then disabled. Not quite sure how to tweak the code to do this though, if anyone else wants to take a run at it first :-)

gbaz · 2021-07-23T15:35:18Z

Anyone affected by this issue is welcome to try out the linked PR an see if it improves things. (Assuming it gets a good review, I think it should probably be merged even if there's no noticeable speedup since it seems slightly more correct, and at worst will remove some "line noise" from solver output by eliminating some unnecessary local backjumps.

grayjay · 2021-07-27T03:58:26Z

I ran the reproduction, but I'm not sure it reproduces the performance issue, even though it contains the log message about incompatible stanzas. I didn't see any backtracking with cabal 3.4.0.0.

Full solver log:

[__0] trying: A-0.1.0.0 (user goal)
[__1] trying: base-4.14.1.0/installed-4.14.1.0 (dependency of A)
[__2] trying: rts-1.0/installedrts (dependency of base)
[__3] trying: integer-gmp-1.0.3.0/installed-1.0.3.0 (dependency of base)
[__4] trying: ghc-prim-0.6.1/installed-0.6.1 (dependency of base)
[__5] rejecting: A:!test (constraint from config file, command line flag, or user target requires opposite flag selection)
[__5] trying: A:*test
[__6] trying: B-0.1.0.0 (user goal)
[__7] trying: C-0.1.0.0 (user goal)
[__8] trying: C:B:exe.B~>B-0.1.0.0 (dependency of C)
[__9] trying: C:B:exe.base~>base-4.14.1.0/installed-4.14.1.0 (dependency of C:B:exe.B)
[_10] trying: C:B:exe.rts~>rts-1.0/installedrts (dependency of C:B:exe.base)
[_11] trying: C:B:exe.integer-gmp~>integer-gmp-1.0.3.0/installed-1.0.3.0 (dependency of C:B:exe.base)
[_12] trying: C:B:exe.ghc-prim~>ghc-prim-0.6.1/installed-0.6.1 (dependency of C:B:exe.base)
[_13] trying: C:B:exe.A~>A-0.1.0.0 (dependency of C:B:exe.B)
[_14] rejecting: C:B:exe.A:!test (dependencies not linked: stanza "test" incompatible; conflict set: A, B, C:B:exe.A, C:B:exe.B, A:test, C:B:exe.A:test)
[_14] trying: C:B:exe.A:*test
[_15] done

The log shows that cabal is trying to use the same instance of A for both the top-level build target and C's build tool dependency (level 13), which is an optimization. Using the same instance of A means that the two uses must make the same choice for enabling tests. The solver first tries disabling tests for the build tool dependency at level 14 but immediately encounters the conflict and enables tests instead. This is how I would expect the solver to behave, even with more packages in the chain of dependencies.

I think I would need to see an example of how increasing the chain of dependencies causes backtracking, especially if it causes the solver to make the same choice multiple times.

I also think that cabal should add the same preferences for all local packages as in #7472 (comment), for consistency.

michaelpj · 2021-08-03T08:28:05Z

I've attached the actual constraint-solver log from the project in question. It's quite long, but perhaps illuminating.

out.txt

michaelpj · 2021-08-03T08:35:26Z

In particular it has:

Lots of steps even in the happy path
Lots of backracking when it hits the test/non-test conflict

michaelpj · 2021-08-03T08:43:37Z

(Sorry that the reproduction wasn't obviously slow, I was trying to just show the test/non-test conflict issue, which seemed more self-contained!)

grayjay · 2021-08-10T00:49:45Z

Thanks for the logs! I graphed the level in the search tree vs line number and saw that there were four main instances of backtracking:

All four have similar conflicts involving build tools and enabling testing, so I focused on the first one, which has these relevant lines:

[1016] trying: plutus-pab:cardano-node:exe.plutus-tx-0.1.0.0 (dependency of plutus-pab:cardano-node:exe.plutus-ledger-api)
[1017] trying: plutus-pab:cardano-node:exe.plutus-tx:!test
...
[1510] trying: plutus-tx~>plutus-pab:cardano-node:exe.plutus-tx-0.1.0.0 (user goal)
[1511] rejecting: plutus-tx:!test (constraint from config file, command line flag, or user target requires opposite flag selection)
[1511] rejecting: plutus-tx:*test (dependencies not linked: stanza "test" incompatible; conflict set: plutus-ledger-api, plutus-tx, plutus-pab:cardano-cli:exe.plutus-ledger-api, plutus-pab:cardano-cli:exe.plutus-tx, plutus-pab:cardano-node:exe.plutus-ledger-api, plutus-pab:cardano-node:exe.plutus-tx, plutus-tx:test, plutus-pab:cardano-cli:exe.plutus-tx:test, plutus-pab:cardano-node:exe.plutus-tx:test)
[1511] fail (backjumping, conflict set: plutus-ledger-api, plutus-tx, plutus-pab:cardano-cli:exe.plutus-ledger-api, plutus-pab:cardano-cli:exe.plutus-tx, plutus-pab:cardano-node:exe.plutus-ledger-api, plutus-pab:cardano-node:exe.plutus-tx, plutus-tx:test, plutus-pab:cardano-cli:exe.plutus-tx:test, plutus-pab:cardano-node:exe.plutus-tx:test)
[1510] rejecting: plutus-tx-0.1.0.0 (multiple instances)
[1510] fail (backjumping, conflict set: plutus-ledger-api, plutus-tx, plutus-pab:cardano-cli:exe.plutus-ledger-api, plutus-pab:cardano-cli:exe.plutus-tx, plutus-pab:cardano-node:exe.plutus-ledger-api, plutus-pab:cardano-node:exe.plutus-tx, plutus-tx:test, plutus-pab:cardano-cli:exe.plutus-tx:test, plutus-pab:cardano-node:exe.plutus-tx:test)
...
[1017] trying: plutus-pab:cardano-node:exe.plutus-tx:*test
[1018] trying: plutus-tx~>plutus-pab:cardano-node:exe.plutus-tx-0.1.0.0 (user goal)
[1019] rejecting: plutus-tx:!test (constraint from config file, command line flag, or user target requires opposite flag selection)
[1019] trying: plutus-tx:*test

This conflict differs from the simple example and leads to backtracking because of a difference in goal order. The solver chose the version of the build tool dependency plutus-tx (plutus-pab:cardano-node:exe.plutus-tx) before the version of the top level build target plutus-tx. Only the top level build target is constrained to enable testing, so the solver initially chose to disable tests for plutus-pab:cardano-node:exe.plutus-tx. Then when it chose the version for the top level build target, it tried linking plutus-tx to plutus-pab:cardano-node:exe.plutus-tx, which requires both to make the same choice for enabling testing. The solver was unable to resolve the conflict by unlinking plutus-tx, because cabal doesn't currently support building the same version of a package twice. It was also unable to choose an alternative version for plutus-tx, because it is a local package. Those conflicts meant that the solver had to backtrack many levels to enable testing in plutus-pab:cardano-node:exe.plutus-tx.

I think that the solution is to remove the restriction on building the same version of a package twice. Then the solver could have unlinked the two goals and avoided backtracking. If I understand correctly, cabal could still build each component of the package at most once (avoiding increasing build time) while fulfilling the requirement to build both with and without tests. There is already an issue for removing the restriction to only build each version of a package once: #6459

Avoiding backtracking would only decrease the number of steps by about 60%, so there may be other performance issues related to solving for build tools on the happy path.

gbaz · 2021-08-10T01:13:03Z

@grayjay thanks for the detailed analysis!

note that my patch (#7490) would add a preference to build-tool dependency to enable testing as well, which tends to bypass this path that you pointed out -- and I think in fact that's what accounts for its improved performance. Looking at the cabal project now, I note some other places besides tests enabled that would induce the same backjumping. In particular, there are ghc options and flags also set at the top level alone:

package cardano-api
  ghc-options: -Werror

package cardano-cli
  ghc-options: -Werror

package cardano-config
  ghc-options: -Werror

package cardano-node
  ghc-options: -Werror

package cardano-node-chairman
  ghc-options: -Werror

package tx-generator
  ghc-options: -Werror

package cryptonite
  -- Using RDRAND instead of /dev/urandom as an entropy source for key
  -- generation is dubious. Set the flag so we use /dev/urandom by default.
  flags: -support_rdrand

Inspecting the logs, I noticed the cryptonite flag in particular led to some significant backjumping.

The current patch only pushes stanza prefs through. Do you think it would make sense to have a patch to push through options and flags explicitly as well?

gbaz · 2021-08-10T01:28:55Z

In particular I'm proposing that at

cabal/cabal-install/src/Distribution/Client/ProjectPlanning.hs

Line 1055 in 2bd0588

(PackageConstraint (scopeToplevel pkgname)

and the following definition

we change scopeToplevel to ScopeAnyQualifier.

grayjay · 2021-08-10T09:45:11Z

@gbaz I think that how we treat ghc options and flags should depend on the meaning of a package section. I'm not sure whether the section is supposed to apply to only the top level use of the package or all uses. I had thought that there was an issue about this distinction, but the closest thing I can find now is issues related to specifying local packages vs all packages, which are listed in #3720. I think that whichever behavior we choose should be made consistent across all options that can be specified in a package section. Do you know how ghc options are currently handled?

One issue with making constraints in a package section apply to all uses of the package is that it might seem inconsistent with the way that constraints are applied with the --constraint flag. --constraint="my-package == 1.0" only constrains the top-level use of my-package, and --constraint="any.my-package == 1.0" is required to constrain all uses.

gbaz · 2021-08-11T18:05:26Z

Merged #7490 but still not done. Empirical testing shows that the change I suggest above -- which applies flag constraints at all scopes -- yields significant further speedup. However it is not a correct change, as there may be times when constraints need not be uniform (when the dependency is solely on the binary). Further, while we can add constraints such as any.foo to indicate constraints across all scopes for version bounds, we cannot add such constraints for flags, so there's no workaround in clever use of cabal.project syntax.

One thing we should do, eventually, is enable that scope qualified syntax for flag constraints as well as version constraints. However, that's pretty invasive.

I think a lightweight change would be to apply qualified version and flag constraints both as preferences at other scopes -- this is a uniform improvement to the solver with no changes to the semantics of the solution space. It just means we try harder, earlier, to keep things uniform. I'll work on a PR towards this goal.

gbaz · 2021-08-11T20:16:26Z

Ok, with that last breakthrough, now we're done!

michaelpj · 2021-08-15T13:54:00Z

Finally got the chance to try this out, and it looks like the patches perform as advertised! Thanks again! 🙇

michaelpj mentioned this issue Jul 2, 2021

Cabal takes 3 minutes to resolve dependencies #7466

Closed

michaelpj mentioned this issue Jul 2, 2021

Resolving dependencies is much slower for haskell.nix with lots of source-repository-package stanzas input-output-hk/haskell.nix#1156

Closed

gbaz mentioned this issue Jul 23, 2021

Attempt to speed solving when test is enable projectwide #7490

Merged

grayjay added cabal-install: solver type: bug labels Jul 24, 2021

Mikolaj mentioned this issue Aug 3, 2021

Feature Request: New component-test-depends section #7505

Closed

gbaz closed this as completed in #7490 Aug 11, 2021

gbaz reopened this Aug 11, 2021

gbaz mentioned this issue Aug 11, 2021

defer build-tools-depends choices as well as setup choices #7532

Merged

gbaz closed this as completed in #7532 Aug 11, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slow solving with `build-tool-depends` on local packages with tests #7472

Slow solving with `build-tool-depends` on local packages with tests #7472

michaelpj commented Jul 2, 2021

michaelpj commented Jul 2, 2021

michaelpj commented Jul 3, 2021

michaelpj commented Jul 3, 2021

gbaz commented Jul 22, 2021

gbaz commented Jul 23, 2021

grayjay commented Jul 27, 2021

michaelpj commented Aug 3, 2021

michaelpj commented Aug 3, 2021

michaelpj commented Aug 3, 2021

grayjay commented Aug 10, 2021

gbaz commented Aug 10, 2021

gbaz commented Aug 10, 2021 •

edited

Loading

grayjay commented Aug 10, 2021

gbaz commented Aug 11, 2021

gbaz commented Aug 11, 2021 •

edited

Loading

michaelpj commented Aug 15, 2021

Slow solving with build-tool-depends on local packages with tests #7472

Slow solving with build-tool-depends on local packages with tests #7472

Comments

michaelpj commented Jul 2, 2021

michaelpj commented Jul 2, 2021

michaelpj commented Jul 3, 2021

michaelpj commented Jul 3, 2021

gbaz commented Jul 22, 2021

gbaz commented Jul 23, 2021

grayjay commented Jul 27, 2021

michaelpj commented Aug 3, 2021

michaelpj commented Aug 3, 2021

michaelpj commented Aug 3, 2021

grayjay commented Aug 10, 2021

gbaz commented Aug 10, 2021

gbaz commented Aug 10, 2021 • edited Loading

grayjay commented Aug 10, 2021

gbaz commented Aug 11, 2021

gbaz commented Aug 11, 2021 • edited Loading

michaelpj commented Aug 15, 2021

Slow solving with `build-tool-depends` on local packages with tests #7472

Slow solving with `build-tool-depends` on local packages with tests #7472

gbaz commented Aug 10, 2021 •

edited

Loading

gbaz commented Aug 11, 2021 •

edited

Loading