Crash in dyld during program initialization on macOS 10.15.6 Beta (19G36e) #10

Closed
reuben opened this issue Jun 9, 2020 · 11 comments
Labels
bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@reuben

reuben commented Jun 9, 2020

$ cat test.cpp
#include <stdio.h>

int main() {
  printf("hello world\n");
  return 0;
}
$ clang test.cpp
$ ./a.out
hello world
$ valgrind ./a.out
==70481== Memcheck, a memory error detector
==70481== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==70481== Using Valgrind-3.16.0.GIT and LibVEX; rerun with -h for copyright info
==70481== Command: ./a.out
==70481==
--70481-- run: /usr/bin/dsymutil "./a.out"
warning: no debug symbols in executable (-arch x86_64)
==70481== Invalid read of size 8
==70481==    at 0x10005840E: cerror_nocancel (in /usr/lib/dyld)
==70481==    by 0x1000588B3: kdebug_is_enabled (in /usr/lib/dyld)
==70481==    by 0x10001591D: dyld3::kdebug_trace_dyld_marker(unsigned int, dyld3::kt_arg, dyld3::kt_arg, dyld3::kt_arg, dyld3::kt_arg) (in /usr/lib/dyld)
==70481==    by 0x1000050C7: dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) (in /usr/lib/dyld)
==70481==    by 0x100005024: _dyld_start (in /usr/lib/dyld)
==70481==  Address 0x8 is not stack'd, malloc'd or (recently) free'd
==70481==
==70481==
==70481== Process terminating with default action of signal 11 (SIGSEGV)
==70481==  Access not within mapped region at address 0x8
==70481==    at 0x10005840E: cerror_nocancel (in /usr/lib/dyld)
==70481==    by 0x1000588B3: kdebug_is_enabled (in /usr/lib/dyld)
==70481==    by 0x10001591D: dyld3::kdebug_trace_dyld_marker(unsigned int, dyld3::kt_arg, dyld3::kt_arg, dyld3::kt_arg, dyld3::kt_arg) (in /usr/lib/dyld)
==70481==    by 0x1000050C7: dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) (in /usr/lib/dyld)
==70481==    by 0x100005024: _dyld_start (in /usr/lib/dyld)
==70481==  If you believe this happened as a result of a stack
==70481==  overflow in your program's main thread (unlikely but
==70481==  possible), you can try to increase the size of the
==70481==  main thread stack using the --main-stacksize= flag.
==70481==  The main thread stack size used in this run was 8388608.
==70481==
==70481== HEAP SUMMARY:
==70481==     in use at exit: 0 bytes in 0 blocks
==70481==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==70481==
==70481== All heap blocks were freed -- no leaks are possible
==70481==
==70481== For lists of detected and suppressed errors, rerun with: -s
==70481== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
[1]    70481 segmentation fault  valgrind ./a.out
@reuben
Author

reuben commented Jun 9, 2020

Built from source on 36f444c

@reuben
Author

reuben commented Jun 9, 2020

(I understand it's a beta OS version and thus not supported, just reporting in case others think it's something particular to their machine)

LouisBrunner added the bug label Jun 9, 2020
@LouisBrunner
Owner

Hi @reuben,

Thanks a lot for your report!

As I don't have an environment to test this myself, it will be tricky to debug, at least until Apple releases the new dyld source code. However, I will keep you up to date if I can think of any fix or test.

LouisBrunner added the help wanted label Jun 9, 2020
@reuben
Author

reuben commented Jun 9, 2020

@LouisBrunner thanks a lot for your work on this project!

@LouisBrunner
Owner

Maybe you could run
valgrind -v -v -v -v -v -v -v -d -d -d -d -d -d -d --trace-syscalls=yes --trace-flags=11111111 --trace-children=yes --trace-signals=yes ./a.out > vg.log 2>&1
and forward me the resulting vg.log, the compiled a.out and /usr/lib/dyld? Then I might be able to work out what is going on.
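
One detail to check when capturing the log: Valgrind writes its diagnostics to stderr, and the shell applies redirections left to right, so their order decides whether the diagnostics end up in vg.log. A minimal sketch of the difference, where `emit` is a hypothetical stand-in for Valgrind (it is not part of any tool):

```shell
# emit is a hypothetical stand-in for a program that, like Valgrind,
# writes its diagnostics to stderr and the program's own output to stdout.
emit() {
  echo "program output"
  echo "diagnostic output" >&2
}

tmpdir=$(mktemp -d)

# stderr is pointed at the current stdout first, and only then is stdout
# sent to the file: the file receives "program output" only.
emit 2>&1 > "$tmpdir/wrong.log"

# stdout is sent to the file first, then stderr is pointed at it:
# the file receives both lines.
emit > "$tmpdir/right.log" 2>&1
```

With the second form, the whole trace lands in the log file instead of scrolling past on the terminal.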

@reuben
Author

reuben commented Jun 10, 2020

Sure, here it is: stuff.zip

LouisBrunner pinned this issue Jun 17, 2020
@LouisBrunner
Owner

Just to keep you up to date, I think I have found the cause of the crash.
However, I don't know (and probably never will) why the behavior differs between 10.15.4 and 10.15.6, or how to fix it properly.

It seems to be related to a similar bug that appeared in 10.13 or 10.14, tied to TLS (the %gs register), which made all pthread-related code crash. As I have already fixed that bug in this fork, I don't know why it would be a problem now.

I will try to look into it, but the lack of an easy way to debug and test makes it a bit difficult. As always, I will tell you if there is anything you can do to help.

@LouisBrunner
Owner

@reuben I upgraded to 10.15.6 and couldn't replicate the issue. Are you seeing the same? It might just have been a glitch in the beta...

@NicMcPhee

NicMcPhee commented Sep 11, 2020

I use valgrind in a university course, and because everyone's online, students are having to install the toolset on their own computers, which vary widely. I use a Mac, and my installation of this version of valgrind (via brew) works fine for me. One of my students (@iwata008) also has a Mac, and ends up with output similar to that reported here (and similar to #15), with loads of dyld output that I don't get. We have the same versions of macOS (10.15.6), Xcode (11.7), gcc (10.2.0 from brew), etc., so I'm rather baffled.

Do folks have suggestions for things to check the versions of? Things to report the details of that might be useful? Any help folks have would be greatly appreciated.

@LouisBrunner
Owner

Hi @NicMcPhee,

This issue is specifically about a crash on a beta version of 10.15.6 while #15 is about extra warnings in OS libraries which Valgrind doesn't suppress (yet). For maximum support, it would be better for your student to create a new issue with their complete output.
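
As a rough sketch of what could go into such an issue, something like the following collects the versions of the tools involved. The helper names `show_version` and `collect_env` and the list of tools are assumptions for illustration, not an official checklist; tools that are not installed are reported rather than treated as errors.

```shell
# show_version (hypothetical helper): run a tool with the given arguments
# if it is installed, and print at most the first three lines it emits.
show_version() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "== $*"
    "$@" 2>&1 | head -n 3
  else
    echo "== $1: not installed"
  fi
}

# collect_env (hypothetical helper): the tools listed here are a guess at
# what matters for a Valgrind report on macOS.
collect_env() {
  show_version sw_vers               # macOS product and build version
  show_version uname -a              # kernel version and architecture
  show_version xcodebuild -version   # Xcode version
  show_version clang --version
  show_version gcc --version
  show_version valgrind --version
}
```

Running `collect_env > env.txt` on both machines and comparing the two files with `diff` might show where the setups actually differ.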

On another note, Valgrind hasn't had macOS support since roughly 10.12 and while this fork allows some features, it is by no means stable or complete. Multiple features are still missing:

  • usage of threads and signals together (I am pretty close to having it done though!)
  • wqthread (which is the backbone of any GUI application on macOS)
  • helgrind has a lot of noise and drd just doesn't work
  • probably more!

Depending on what your course entails (grading, labs, etc.), I would recommend running the Linux version of Valgrind using Docker or a virtual machine. Otherwise you might run into strange behaviors on macOS and potentially huge discrepancies with students running Linux.
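
A minimal sketch of the Docker route, assuming Docker is installed and using the `ubuntu:22.04` image with its `g++` and `valgrind` packages as an illustrative choice. `valgrind_in_docker` is a hypothetical helper name, and the `DOCKER` variable exists only so the command can be previewed with `DOCKER=echo`.

```shell
# valgrind_in_docker (hypothetical helper): build a C++ source file from the
# current directory inside a Linux container and run it under Memcheck.
valgrind_in_docker() {
  src=$1                      # source file, relative to the current directory
  image=${2:-ubuntu:22.04}    # assumption: an apt-based image
  # Install the toolchain, build with debug info, then run Valgrind.
  inner="apt-get update -qq && apt-get install -y -qq g++ valgrind && g++ -g -O0 '$src' -o /tmp/a.out && valgrind --leak-check=full /tmp/a.out"
  # Mount the current directory at /src so the container can see the source.
  "${DOCKER:-docker}" run --rm -v "$PWD":/src -w /src "$image" sh -c "$inner"
}
```

For example, `valgrind_in_docker test.cpp` runs the whole pipeline, while `DOCKER=echo valgrind_in_docker test.cpp` only prints the arguments that would be passed to Docker. Installing the packages on every run is slow; a prebuilt image would be the obvious refinement.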

LouisBrunner unpinned this issue Sep 11, 2020
@NicMcPhee

Thanks a lot @LouisBrunner for the helpful feedback. We're "only" using it for basic memory leaks, so we don't need a lot, and at least my experience on my Mac was that your patched version worked fine. I was thus surprised when my student seemed to have a totally different experience when I thought we had essentially the same setup (at least where I thought it mattered).

I'm meeting with her again this afternoon, and we'll try to create a more detailed issue. In the meantime, she's going to ssh into one of the Linux machines in our lab and do the work on the command line there.
