This repository has been archived by the owner on Dec 14, 2023. It is now read-only.

Tracking issue: debug information #49

Closed
quarnster opened this issue Sep 24, 2013 · 15 comments

Comments

@quarnster
Contributor

It would be really cool if binaries compiled with llgo correctly resolved variable info, function arguments, source code lines, etc. when run through a debugger.

@axw
Member

axw commented Sep 24, 2013

Absolutely. I started on this a while ago (you can see some remnants lying about the place in llgo and gollvm).

This is another reason for implementing #42; LLVM is a bit draconian about DWARF constants when writing LLVM IR, and segfaults when trying to use newer constants it doesn't know about (e.g. DW_LANG_Go). I think this is only a problem in writing the IR, and not in reading.

@quarnster
Contributor Author

What do you think about solving #43 via implementing this issue? DWARF should contain everything (and more) needed for that issue, solves this issue, doesn't invent a new format, and the standard debug/dwarf package could probably be used when reading the data back. Sure, it might take a bit longer to get #43 up and running, but I believe in the end it's a much better solution than any alternative.

I'll start looking into this issue (again can't assign myself, but please assign me if you'd like), unless this was something you really wanted to do yourself :). Let me know.

@axw
Member

axw commented Sep 28, 2013

I was in the past thinking along the same lines (though I was thinking it would be based on LLVM debug metadata), pretty much for the same reasons you've come up with. The only reservation I have with using debug information is imposing unnecessary overhead for an importer. Maybe it's nothing to worry about, and anyway, it doesn't preclude doing something else later if needed. Go for it :)

If it's going to be DWARF-based, and not LLVM debug metadata-based, where will the debug info come from? Packages are currently built to bitcode, not native code.

@axw
Member

axw commented Sep 28, 2013

Ah just one other thing: I mentioned one problem I came across with LLVM spitting the dummy on DW_LANG_Go; there's another, more difficult problem: AFAIK, the C API doesn't provide a way of creating recursive metadata structures.

So there are several options that I see:

  • Extend the C API to add whatever's necessary (and add DW_LANG_Go, assuming constants from new versions of DWARF will be accepted). The biggest downside is that this won't be generally available until the next release of LLVM, and requires going back to SVN builds.
  • Use the LLVM C++ API. This would require Go 1.2 (or SWIG?)
  • Write a pure-Go LLVM IR writer. Obviously a fair amount of work, though there are various advantages (no cgo performance hit, no native library dependencies). A sub-option would be to implement a subset of the IR writer, specifically for writing debug metadata; llgo would generate debug metadata in this way and link it into the final output.

@quarnster
Contributor Author

> If it's going to be DWARF-based, and not LLVM debug metadata-based, where will the debug info come from? Packages are currently built to bitcode, not native code.

I'm confused; to me LLVM debug metadata is DWARF based, but perhaps you mean that the information is stored in the LLVM bitcode format in a way that isn't immediately accessible as it would be when compiling to a native .o?

> I mentioned one problem I came across with LLVM spitting the dummy on DW_LANG_Go

I haven't run into this. I'm currently battling with some debug information that ends up in the .bc but is later stripped out by clang, so it doesn't actually end up in the binary. I'm doing some trial and error, comparing with the bitcode clang generates when it compiles similar C code, and slowly moving forward.

> AFAIK, the C API doesn't provide a way of creating recursive metadata structures.

For function arguments which want to reference back to the function as the context, I solved this by first creating all the arguments with a nil context, then creating the function, then iterating over the function's argument llvm.Values calling ReplaceAllUsesWith providing a new Value with a proper context reference now that the function has been created.
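
The placeholder-then-replace approach described above can be sketched in plain Go. This is only an illustration of the pattern, not the llgo code: `Node`, `replaceAllUsesWith` and `buildFunction` are hypothetical stand-ins for metadata nodes and for the ReplaceAllUsesWith call on llvm.Value, and no LLVM binding is involved.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for a debug metadata node.
// It is NOT the gollvm llvm.Value type.
type Node struct {
	Name     string
	Context  *Node   // enclosing scope, e.g. the function an argument belongs to
	Operands []*Node // nodes referenced by this node, e.g. a function's arguments
}

// replaceAllUsesWith rewrites every operand slot in users that points at old
// so that it points at repl instead (a toy version of the LLVM operation).
func replaceAllUsesWith(users []*Node, old, repl *Node) {
	for _, u := range users {
		for i, op := range u.Operands {
			if op == old {
				u.Operands[i] = repl
			}
		}
	}
}

// buildFunction creates a function node whose argument nodes refer back to it.
func buildFunction(name string, argNames []string) *Node {
	// Step 1: create the arguments with a nil context.
	placeholders := make([]*Node, len(argNames))
	for i, n := range argNames {
		placeholders[i] = &Node{Name: n}
	}
	// Step 2: create the function, referencing the placeholder arguments.
	fn := &Node{Name: name, Operands: append([]*Node(nil), placeholders...)}
	// Step 3: swap each placeholder for a node whose context is the function.
	for _, old := range placeholders {
		repl := &Node{Name: old.Name, Context: fn}
		replaceAllUsesWith([]*Node{fn}, old, repl)
	}
	return fn
}

func main() {
	fn := buildFunction("add", []string{"a", "b"})
	for _, arg := range fn.Operands {
		fmt.Printf("%s -> context %s\n", arg.Name, arg.Context.Name)
	}
}
```

The cycle (function refers to argument, argument refers to function) is only closed in step 3, which is why nothing recursive has to be constructed in a single call.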

@axw
Member

axw commented Sep 29, 2013

> I'm confused; to me LLVM debug metadata is DWARF based, but perhaps you mean that the information is stored in the LLVM bitcode format in a way that isn't immediately accessible as it would be when compiling to a native .o?

Well, I mean it's not encoded in the same way, so you couldn't just use a debug/dwarf.Reader. You'd need to load the entire LLVM module, and construct debug/dwarf structs from the metadata nodes. Right?
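
For comparison, reading DWARF back from a native object takes only a few lines with the standard library. The following is a minimal sketch, assuming a Mach-O object file (debug/elf exposes an equivalent DWARF() method for ELF); the helper names `subprogramNames` and `namesFromMachO` are illustrative and not part of llgo. Nothing comparable works directly on bitcode, where the module would have to be loaded and its metadata nodes walked instead.

```go
package main

import (
	"debug/dwarf"
	"debug/macho"
	"fmt"
	"os"
)

// subprogramNames walks all DWARF entries and collects the names of
// DW_TAG_subprogram entries (functions).
func subprogramNames(d *dwarf.Data) ([]string, error) {
	var names []string
	r := d.Reader()
	for {
		e, err := r.Next()
		if err != nil {
			return nil, err
		}
		if e == nil {
			// A nil entry with a nil error marks the end of the data.
			return names, nil
		}
		if e.Tag != dwarf.TagSubprogram {
			continue
		}
		if name, ok := e.Val(dwarf.AttrName).(string); ok {
			names = append(names, name)
		}
	}
}

// namesFromMachO opens a Mach-O file and lists the functions described
// by its DWARF sections.
func namesFromMachO(path string) ([]string, error) {
	f, err := macho.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	d, err := f.DWARF()
	if err != nil {
		return nil, err
	}
	return subprogramNames(d)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: dwarfnames <mach-o file>")
		return
	}
	names, err := namesFromMachO(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```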

> > I mentioned one problem I came across with LLVM spitting the dummy on DW_LANG_Go
>
> I haven't run into this.

IIRC, it only manifested when you try to dump the IR in textual form (e.g. with llgo -dump). It may be fixed, though.

> For function arguments which want to reference back to the function as the context, I solved this by first creating all the arguments with a nil context, then creating the function, then iterating over the function's argument llvm.Values calling ReplaceAllUsesWith providing a new Value with a proper context reference now that the function has been created.

Ah, nice. I hadn't thought of using ReplaceAllUsesWith with metadata nodes.

@quarnster
Contributor Author

I haven't looked into the bitcode format so I don't know how easy it'd be to extract this info. If it's not straightforward, then storing the information in a minimal custom format in a different file is probably best.

@quarnster
Contributor Author

Teaser:
Teaser image

@axw
Member

axw commented Sep 29, 2013

Nice!

@quarnster
Contributor Author

I've been battling with a Mach-O specific issue that I can't think of a clean way to solve. In short, if I understand things correctly, debug information is not linked into a Mach-O executable. Instead, the executable either references the individual .o files, or a special dSYM bundle is created from those .o files once the binary has been linked, after which the .o files can be deleted.

The problem is that linking together multiple LLVM bitcode files appears to break dSYM bundle generation (it hits an assert and the compilation aborts), even if I just have clang compile two C files into two separate bitcode files that are then linked together. As this happens with plain vanilla C and clang, I'm not sure this is something that can be fixed from our side.

So the other option is to first create a .o file and then link that single .o file, but the .o file needs to stick around to have any useful debug information. In other words, if we just write the .o file to the temporary directory and it's then deleted, there's no debug info.

I've been googling for hours without finding a way to get the debug information linked into the final executable, so unless someone knows how to do this I've given up on that.

Currently I'm just thinking of naming the .o file outputname.symbols or something, but I'm open to suggestions and wisdom if you have any on this issue.

@axw
Member

axw commented Sep 30, 2013

It sounds to me like the dSYM bundle (using dsymutil?) route produces something closer to what's done on Linux. Since debug symbols are moderately important (for stack traces), I'd say this is preferable. Having to distribute .o files with a binary doesn't sound great.

What does gc do?

@quarnster
Contributor Author

dsymutil indeed, which is what asserts and thus unfortunately makes clang assert too.

gc uses its own custom linker, which links the DWARF info into the final executable. Alas, it has different expectations of the object files given to it, so it doesn't look like it'd be straightforward to use.

@quarnster
Contributor Author

I'll see if I can throw together a command which merges the final output with the dwarf symbols in the .o file.
Useful references:
http://wiki.dwarfstd.org/index.php?title=Apple's_%22Lazy%22_DWARF_Scheme
https://developer.apple.com/library/mac/documentation/developertools/conceptual/machoruntime/Reference/reference.html
http://golang.org/pkg/debug/macho/

@axw
Member

axw commented Sep 30, 2013

> I'll see if I can throw together a command which merges the final output with the dwarf symbols in the .o file.

Sounds good. Don't feel like you have to do it immediately to get this merged in, though. Something's better than nothing, as long as it's going in the right direction.

quarnster added a commit to quarnster/llgo that referenced this issue Sep 30, 2013
axw added a commit that referenced this issue Oct 1, 2013
llgo: Emit more debug metadata. For issue #49.
@quarnster
Contributor Author

I've split this one into a couple of separate issues up for claim for anyone wanting to chime in. Closing this one, let's track specific debug issues in separate issues.

2 participants