mesa: 21.0.1 -> 21.0.3 #120325

Merged
merged 1 commit into from
May 20, 2021

primeos
Member

@primeos primeos commented Apr 23, 2021

Motivation for this change

Builds fine but still causes regressions, see #118753 (comment). This time I've tested it using the r600 driver and Sway doesn't launch either. Given that this hasn't been fixed since 21.0.2, it seems like a packaging / Nixpkgs-specific problem (I suspected the shader cache could be causing it, but removing ~/.cache/mesa_shader_cache doesn't help either). I can hopefully take a look at it later, but help is welcome.

Unfortunately I'm not getting much output (even with sway --debug, LIBGL_DEBUG=verbose, and MESA_DEBUG=1) so I might have to analyze the stack trace:

00:00:00.000 [INFO] [sway/main.c:346] Sway version 1.6
00:00:00.004 [INFO] [sway/main.c:154] Linux quorra 5.10.27 #1-NixOS SMP Tue Mar 30 12:32:09 UTC 2021 x86_64 GNU/Linux
00:00:00.005 [INFO] [sway/main.c:170] Contents of /etc/os-release:
00:00:00.005 [INFO] [sway/main.c:154] NAME=NixOS
00:00:00.005 [INFO] [sway/main.c:154] ID=nixos
00:00:00.005 [INFO] [sway/main.c:154] VERSION="21.05.git.d235056d6d6 (Okapi)"
00:00:00.005 [INFO] [sway/main.c:154] VERSION_CODENAME=okapi
00:00:00.005 [INFO] [sway/main.c:154] VERSION_ID="21.05.git.d235056d6d6"
00:00:00.005 [INFO] [sway/main.c:154] PRETTY_NAME="NixOS 21.05 (Okapi)"
00:00:00.005 [INFO] [sway/main.c:154] LOGO="nix-snowflake"
00:00:00.005 [INFO] [sway/main.c:154] HOME_URL="https://nixos.org/"
00:00:00.005 [INFO] [sway/main.c:154] DOCUMENTATION_URL="https://nixos.org/learn.html"
00:00:00.005 [INFO] [sway/main.c:154] SUPPORT_URL="https://nixos.org/community.html"
00:00:00.005 [INFO] [sway/main.c:154] BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
00:00:00.005 [INFO] [sway/main.c:142] LD_LIBRARY_PATH=
00:00:00.005 [INFO] [sway/main.c:142] LD_PRELOAD=
00:00:00.005 [INFO] [sway/main.c:142] PATH=/run/wrappers/bin:/home/michael/.nix-profile/bin:/etc/profiles/per-user/michael/bin:/nix/var/nix/profiles/default/bin:/run/current-system/sw/bin
00:00:00.005 [INFO] [sway/main.c:142] SWAYSOCK=
00:00:00.005 [DEBUG] [sway/server.c:49] Preparing Wayland server initialization
00:00:00.005 [INFO] [wlr] [backend/session/logind.c:572] Selecting session from XDG_SESSION_ID: 8
00:00:00.030 [INFO] [wlr] [backend/session/logind.c:706] Successfully loaded logind session
00:00:00.032 [INFO] [wlr] [backend/backend.c:167] Found 1 GPUs
00:00:00.032 [INFO] [wlr] [backend/drm/backend.c:155] Initializing DRM backend for /dev/dri/card0 (radeon)
00:00:00.032 [DEBUG] [wlr] [backend/drm/drm.c:65] Atomic modesetting unsupported, using legacy DRM interface
00:00:00.032 [DEBUG] [wlr] [backend/drm/drm.c:82] ADDFB2 modifiers unsupported
00:00:00.032 [INFO] [wlr] [backend/drm/drm.c:244] Found 6 DRM CRTCs
00:00:00.032 [INFO] [wlr] [backend/drm/drm.c:171] Found 6 DRM planes
libGL: Can't open configuration file /etc/drirc: No such file or directory.
libGL: Can't open configuration file /home/michael/.drirc: No such file or directory.
[...]
[dmesg:]
[ 2690.144483] sway[68675]: segfault at a0 ip 00007f6dfcd665c6 sp 00007ffc7ecf7d90 error 4 in r600_dri.so[7f6dfc843000+d9c000]
[ 2690.144491] Code: 64 48 8b 04 25 28 00 00 00 48 89 44 24 58 31 c0 48 85 c9 74 03 48 8b 19 b8 01 00 00 00 89 f1 48 c7 44 24 30 01 00 00 00 d3 e0 <23> 85 a0 00 00 00 48 c7 44 24 38 00 00 00 00 c7 44 24 44 01 00 00
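
As a starting point for analyzing the crash, the dmesg line can be reduced to an offset inside the mapped r600_dri.so. The following is a minimal Python sketch, assuming the usual x86 kernel segfault line format; the helper name `fault_offset` is made up for illustration. The offset is relative to the start of the mapping the kernel reports, which is not necessarily the file offset or the symbol address, so treat it as a hint for a disassembler or GDB rather than a final answer.

```python
import re

# Matches the kernel's x86 "segfault at ... ip ... in lib[base+size]" line.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>\d+) in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (object name, ip offset into the mapping, faulting address)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    offset = ip - base
    if not 0 <= offset < size:
        raise ValueError("instruction pointer lies outside the reported mapping")
    return m["obj"], offset, int(m["addr"], 16)


# The line from the dmesg excerpt above.
line = (
    "[ 2690.144483] sway[68675]: segfault at a0 ip 00007f6dfcd665c6 "
    "sp 00007ffc7ecf7d90 error 4 in r600_dri.so[7f6dfc843000+d9c000]"
)
obj, offset, addr = fault_offset(line)
print(f"{obj}: ip at mapping offset {offset:#x}, faulting address {addr:#x}")
```

The small faulting address (0xa0) would be consistent with reading a field through a NULL pointer, but that is an interpretation and not something the log confirms.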

Edit: GDB backtrace:

(gdb) run
Starting program: /nix/store/sv9ijg1zzqj7z35fjkf1s6jsk3dqgp9i-sway-unwrapped-1.6/bin/sway
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/nix/store/1jn6apz0fa9h9x7rl3v6vwiymwnjznwv-glibc-2.32-40/lib/libthread_db.so.1".
[Detaching after vfork from child process 13323]
[New Thread 0x7fffef24a640 (LWP 13324)]
[New Thread 0x7fffee8d3640 (LWP 13325)]
[New Thread 0x7fffee0d2640 (LWP 13326)]
[New Thread 0x7fffed8d1640 (LWP 13327)]
[New Thread 0x7fffed0d0640 (LWP 13328)]

Thread 1 "sway" received signal SIGSEGV, Segmentation fault.
0x00007ffff58e65c6 in driCreateContextAttribs () from /run/opengl-driver/lib/dri/r600_dri.so
(gdb) bt
#0  0x00007ffff58e65c6 in driCreateContextAttribs () from /run/opengl-driver/lib/dri/r600_dri.so
#1  0x00007ffff58e6a6e in driCreateNewContext () from /run/opengl-driver/lib/dri/r600_dri.so
#2  0x00007fffec8560c6 in dri2_setup_extensions ()
   from /nix/store/83q47kb6kvls9vkd4yg87c50rhkgfw5y-mesa-21.0.3-drivers/lib/libEGL_mesa.so.0
#3  0x00007fffec85bca6 in dri2_initialize_drm ()
   from /nix/store/83q47kb6kvls9vkd4yg87c50rhkgfw5y-mesa-21.0.3-drivers/lib/libEGL_mesa.so.0
#4  0x00007fffec854258 in dri2_initialize ()
   from /nix/store/83q47kb6kvls9vkd4yg87c50rhkgfw5y-mesa-21.0.3-drivers/lib/libEGL_mesa.so.0
#5  0x00007fffec84cb0c in eglInitialize ()
   from /nix/store/83q47kb6kvls9vkd4yg87c50rhkgfw5y-mesa-21.0.3-drivers/lib/libEGL_mesa.so.0
#6  0x00007ffff78d1167 in wlr_egl_create () from /nix/store/jvkgicprzd0sq9akf6kjx9fgcyfljibz-wlroots-0.13.0/lib/libwlroots.so.8
#7  0x00007ffff78d4241 in wlr_renderer_autocreate_with_drm_fd ()
   from /nix/store/jvkgicprzd0sq9akf6kjx9fgcyfljibz-wlroots-0.13.0/lib/libwlroots.so.8
#8  0x00007ffff78dd00c in init_drm_renderer ()
   from /nix/store/jvkgicprzd0sq9akf6kjx9fgcyfljibz-wlroots-0.13.0/lib/libwlroots.so.8
#9  0x00007ffff78d87a2 in wlr_drm_backend_create ()
   from /nix/store/jvkgicprzd0sq9akf6kjx9fgcyfljibz-wlroots-0.13.0/lib/libwlroots.so.8
#10 0x00007ffff78d70a2 in attempt_drm_backend ()
   from /nix/store/jvkgicprzd0sq9akf6kjx9fgcyfljibz-wlroots-0.13.0/lib/libwlroots.so.8
#11 0x00007ffff78d78b8 in wlr_backend_autocreate ()
   from /nix/store/jvkgicprzd0sq9akf6kjx9fgcyfljibz-wlroots-0.13.0/lib/libwlroots.so.8
#12 0x000000000041a5c2 in server_privileged_prepare (server=server@entry=0x47b0c0 <server>) at ../sway/server.c:52
#13 0x000000000041a371 in main (argc=1, argv=0x7fffffffbfa8) at ../sway/main.c:378
Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@primeos primeos mentioned this pull request Apr 23, 2021
10 tasks
@primeos
Member Author

primeos commented May 2, 2021

FWIW, the llvmpipe driver interestingly seems to be fine (I noticed that while working on a VM test for Sway that should also be useful for testing Mesa):
image

I also wondered if it's related to one of those but didn't have time for a closer look so far:

@primeos
Member Author

primeos commented May 6, 2021

Ok, I can confirm that #119558 (comment) applies here as well.

So when using a single Nixpkgs revision for the whole system this will work as expected but at least hardware.opengl.package no longer works for testing Mesa updates and I guess this will have additional implications (like not being able to use a Sway built against an older Nixpkgs revision via another channel, Nix Flakes, etc.).

Problems like this were already caused by glibc updates (#95808), and it seems like we can now add Mesa to that list (which would be bad, as we update Mesa much more often). Unfortunately, I don't know what caused this regression, whether it's avoidable, or whether it only affects wlroots-based Wayland compositors or also other Wayland compositors and X11.
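
One way to check whether a process has ended up with mixed Mesa builds is to look at which Mesa store paths it has mapped. The following is a rough Python sketch under several assumptions: that /proc/<pid>/maps shows resolved /nix/store paths (not the /run/opengl-driver symlink), that Mesa store paths look like `<hash>-mesa-<version>[-<output>]`, and that a version mismatch is a good enough proxy for a build mismatch. The helper name `mesa_versions`, the sample addresses, and the placeholder hash are made up for illustration; only the 21.0.3 drivers path is taken from the backtrace above.

```python
import re
from collections import defaultdict

# Assumed store path shape: /nix/store/<32-char hash>-mesa-<version>[-<output>]/...
STORE_MESA_RE = re.compile(
    r"/nix/store/[0-9a-z]{32}-mesa-(?P<version>\d+(?:\.\d+)+)(?:-\w+)?(?P<path>/\S+)"
)


def mesa_versions(maps_text):
    """Map each Mesa version found in maps-style text to the libraries loaded from it."""
    found = defaultdict(set)
    for line in maps_text.splitlines():
        m = STORE_MESA_RE.search(line)
        if m:
            found[m["version"]].add(m["path"].rsplit("/", 1)[-1])
    return dict(found)


# Hypothetical sample input: the addresses and the second hash are placeholders.
drivers = "/nix/store/83q47kb6kvls9vkd4yg87c50rhkgfw5y-mesa-21.0.3-drivers"
older = "/nix/store/" + "a" * 32 + "-mesa-21.0.1"
sample = "\n".join([
    f"7f0000000000-7f0000100000 r-xp 00000000 00:1b 1 {drivers}/lib/dri/r600_dri.so",
    f"7f0000200000-7f0000300000 r-xp 00000000 00:1b 2 {drivers}/lib/libEGL_mesa.so.0",
    f"7f0000400000-7f0000500000 r-xp 00000000 00:1b 3 {older}/lib/libgbm.so.1",
    "7f0000600000-7f0000700000 r-xp 00000000 00:1b 4 /usr/lib/unrelated.so",
])

versions = mesa_versions(sample)
if len(versions) > 1:
    print("mixed Mesa versions mapped:", sorted(versions))
```

For a running process, the same function could be fed the contents of `/proc/<pid>/maps`. Two builds of the same Mesa version would not be flagged by this check; comparing the full store paths instead would be stricter.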

Not sure what to do about this PR. Should we merge it or not and do we even have a choice?

@GrahamcOfBorg eval
@GrahamcOfBorg test sway

@primeos primeos marked this pull request as ready for review May 6, 2021 18:24
@ofborg ofborg bot requested a review from vcunat May 6, 2021 18:38
@primeos
Member Author

primeos commented May 11, 2021

> Not sure what to do about this PR. Should we merge it or not and do we even have a choice?

Ok, so my current plan is the following: wait for the NixOS 21.05 branch-off on May 21st, then merge this (likely 21.0.4 by then) into master, and merge #119558 a few days/weeks after that (also depending on how it goes).
(Alternatively, we could already merge this and revert it for NixOS 21.05, but that would likely only make testing more difficult.)

Note: The update to Mesa 21.0.2 was reverted (25ae1fd) because it
caused major issues with Sway (segfault on startup [0]).
This is still the case and might affect all packages that directly
depend on "mesa" (for libgbm or libglapi) but it only causes issues when
the package depends on a "mesa" version that differs from "mesa.drivers"
used for "/run/opengl-driver/". I've noticed this while testing Mesa
updates with the NixOS option "hardware.opengl.package" (as usual)
instead of rebuilding my whole system (which would work). Unfortunately
this can/will likely also cause issues when mixing different channels,
using Flakes/Overlays, etc.

The cause of this should be similar to [1] ("mesa" updates now cause the
same issues that "glibc" updates already do, maybe triggered by certain
Mesa changes) and some additional discussion is in [2],[3].

Note: Don't backport this to NixOS 21.05, at least not without careful
consideration.

[0]: NixOS#118753 (comment)
[1]: NixOS#95808
[2]: NixOS#120325
[3]: NixOS#119558
@primeos
Member Author

primeos commented May 20, 2021

I've updated the commit message to make this more discoverable and to briefly summarize the issues/implications.
I'll merge this into staging now (instead of master as previously planned) to have some cached builds (which simplifies a few tests a bit), and we can still merge #119558 into master (I could take care of the resulting merge conflict).

Edit: Forgot to mention: Launching Sway also works with the following configuration:

  system.replaceRuntimeDependencies = [
    ({ original = pkgs.mesa; replacement = (import /srv/nixpkgs-test { }).pkgs.mesa; })
    ({ original = pkgs.mesa.drivers; replacement = (import /srv/nixpkgs-test { }).pkgs.mesa.drivers; })
  ];
  # Or instead of mesa.drivers: hardware.opengl.package = (import /srv/nixpkgs-test { }).pkgs.mesa.drivers;

@primeos primeos merged commit 9a4eaf1 into NixOS:staging May 20, 2021
@primeos primeos linked an issue May 25, 2021 that may be closed by this pull request
fabaff pushed a commit to fabaff/nixpkgs that referenced this pull request Jun 11, 2021
vcunat pushed a commit that referenced this pull request Jul 25, 2021
Successfully merging this pull request may close these issues.

Mesa 21.0: VAAPI recording broken