Cabal/Haskell: Importing generated path module leads to large closure #164630
Setting |
Strangely, changing |
I agree with the problem description. Sadly, I am not sure this issue is actionable, i.e., what could we change to fix this without breaking anything else?
I thought setting Could cabal2nix be smart enough to recognize when |
It does.
You should always set That a reference to GHC is generated seems to be a weird edge case (maybe related to the fact that a |
enableSeparateDataOutput does not seem to solve the closure size issue in my case, where I am not even using the Paths_* modules (note that I have remove-references-to in place and forgot to turn it off, but the references were there in the first place). It's notable that warp-3.3.21-data has nothing in it, so I'm really not sure what's going on there.
It looks as though this PR should allow replacing those paths imports for programs that just want to report version info for various dependencies or themselves: haskell/cabal#8534. It is in need of qualified reviewers (which I am not).
So, I'm still hitting this issue with git-annex. Exploring the |
Yeah, this is always the same issue, and there is no general fix for this:
This means that the binary contains a reference to the store path of the library, which contains references to a lot of stuff, including GHC. Possible solutions:
Current workaround:
The advantage of all three solutions over the workaround is that they can determine at compile time whether the store paths we don't want referenced are safe to remove.
We can always remove such hacks again. I think it'd be worth trying out after the release.
Is that something each library must choose to do individually?
Could we do that by default on all haskellPackages? I checked again and all of git-annex's direct dependencies depend directly on GHC. I would not be surprised if this issue affected >90% of haskellPackages.
Yes, sadly.
No, because it means one remove-references-to call for every unwanted dependency. I mean we could theoretically "grep through" the resulting binary to somehow detect these cases, but that feels terribly brittle, and because it can theoretically cause runtime errors, this workaround should be opt-in.
Maybe you are misunderstanding the problem, maybe you aren't. Either way, I am going to explain: git-annex, like every package, has build-time dependencies and runtime dependencies. Nix tracks those independently. Build-time dependencies are (the runtime closure of) all store paths mentioned in the .drv. Runtime dependencies are all build-time dependencies that are mentioned in an output of the build.
git-annex, like basically every Haskell package, has a lot of Haskell libraries as build-time dependencies. Generally, all Haskell libraries have a huge runtime closure because they reference all their dependencies and our ghc package, which is quite large. Thus, all of git-annex's direct dependencies having a huge closure is not the core issue here; it is to be expected.
The aim of justStaticExecutables is to create a binary that has none of its Haskell build-time dependencies as runtime dependencies. Thus the issue is that git-annex has such runtime dependencies in the first place. That is (probably) caused by the bug described here and can be solved as described above: solutions 2 and 3 would fix the problem on the side of all libraries in the git-annex closure that cause the bug (by preventing each library's store-path reference to itself in its compiled output), whereas solution 1 and the workaround fix the problem by preventing the inclusion of those references in the resulting git-annex binary. Thus the workaround can be applied locally in the git-annex package.
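The rule that runtime dependencies are the build-time dependencies mentioned in an output can be illustrated with a small sketch. It assumes a reference counts as "mentioned" when the hash part of the store path (the characters between `/nix/store/` and the first `-`) occurs anywhere in the file's bytes. The helper name `runtime_refs` is hypothetical and not part of Nix.

```shell
#!/usr/bin/env bash
# Sketch only: print the candidate store paths that a build output mentions.
# Assumption: a path is "mentioned" if its hash part occurs in the file.

# runtime_refs FILE STORE_PATH...   (hypothetical helper, not a Nix command)
runtime_refs() {
  local file=$1; shift
  local path base hash
  for path in "$@"; do
    base=${path#/nix/store/}   # strip the store prefix
    hash=${base%%-*}           # keep everything before the first '-'
    # -a: treat binary files as text, -q: quiet, -F: match the hash literally
    if grep -aqF -- "$hash" "$file"; then
      printf '%s\n' "$path"
    fi
  done
}
```

Called with a compiled binary and a list of build-time store paths, this prints the subset that would be kept as runtime references. A binary that embeds a library's store path through a `Paths_*` module would list that library here, and with it the library's whole closure.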
Hm, after rereading your comment and my message, my message does not feel as clear as it could be. You cannot detect whether a Haskell library triggers this bug by looking at its runtime closure. As I said, the closure is expected to be huge. A Haskell library triggers this bug if it contains a store-path reference to itself in its compilation output. (A library might also contain self-references in other parts of its outputs, e.g., in the docs, but that's not an issue because those don't get linked into the final binary.) I assume you could try to figure that out with the |
This makes `justStaticExecutables` error if the produced store path contains references to GHC. This is almost always erroneous and due to the generated `Paths_*` module being imported. This helps prevent `justStaticExecutables` from producing binaries with closure sizes in the gigabytes. See: NixOS#164630 Co-authored-by: sternenseemann <sternenseemann@systemli.org>
This makes `justStaticExecutables` error if the produced store path contains references to GHC. This is almost always erroneous and due to the generated `Paths_*` module being imported. This helps prevent `justStaticExecutables` from producing binaries with closure sizes in the gigabytes. See: #164630 Co-authored-by: sternenseemann <sternenseemann@systemli.org> (cherry picked from commit d261882) (minus release note)
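The kind of check this commit message describes can be sketched in shell. This is not the nixpkgs implementation, which is not shown in this thread. It is a hypothetical helper, `check_no_ghc_refs`, which assumes a GHC reference looks like `/nix/store/<32-character hash>-ghc-<version>...` and which scans every file under an output directory.

```shell
#!/usr/bin/env bash
# Sketch only: fail if any file under DIR embeds a store path whose name
# starts with "ghc-<digit>". The path pattern is an assumption.

# check_no_ghc_refs DIR   (hypothetical helper, not part of nixpkgs)
check_no_ghc_refs() {
  local dir=$1 hits
  # -r: recurse, -a: treat binaries as text, -o: print only the match
  hits=$(grep -raoE '/nix/store/[0-9a-z]{32}-ghc-[0-9][0-9A-Za-z._-]*' "$dir" | sort -u)
  if [ -n "$hits" ]; then
    printf 'error: output references GHC:\n%s\n' "$hits" >&2
    return 1
  fi
}
```

Running `check_no_ghc_refs` on a build output directory returns non-zero and lists each offending file and path when a GHC reference is present. Failing loudly at this point is the design choice the commit message argues for, since otherwise the oversized closure is only noticed later.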
Describe the bug
When an executable uses the "paths" module exported by a library (due to the `data-dir` or `data-files` directive in the library's cabal file), the closure of that executable is very large. It includes packages such as GHC and its documentation.
Steps To Reproduce
Steps to reproduce the behavior:
nix-store --query -R --include-outputs $(nix-build -A app)
nix-store --query -R --include-outputs $(nix-build -A big-app)
The closure of big-app includes GHC paths (e.g. /nix/store/...-ghc-8.10.7-doc).
Expected behavior
I would not expect that using the exported "paths" module from a library would make the closure so much larger. This is pretty impactful when building docker images. In the above repo, the docker image for `big-app` is 425MB, while the docker image for `app` is only 24MB.
Additional context
`app/Main.hs` conditionally imports the `Paths_library` module from the `library` package. If the module is imported and used, the closure blows up. This is due to a dependency found in the `package.conf.d` directory installed with `library`.
The `app` executable does not have any dependency on GHC.
Note that the `app` and `big-app` derivations compile the same source, but `big-app` passes a CPP flag so that the `Paths_library` module gets imported and used in `app/Main.hs`.
This bug has come up before, though not with the `data-dir` aspect, as NixOS/cabal2nix#539 and #155924.
Notify maintainers
@sternenseemann @maralorn
Metadata
(Current nixpkgs revision is 3e644bd)