Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: instrument julia to get per-method-instance llvm optimization timings #31616

Closed
wants to merge 17 commits into from

Conversation

NHDaly
Copy link
Member

@NHDaly NHDaly commented Apr 4, 2019

This is a WIP PR that adds Dtrace timings to julia to allow printing timing information per method instance.

This was pair-programmed with @vchuravy 😊

We're currently trying to measure the llvm optimization times, but long-term it'd be nice to do lots of different sections.
Also, dtrace might only work on macOS/solaris, but there's an equivalent tracing tool for linux that should work with the same configuration files.

@NHDaly
Copy link
Member Author

NHDaly commented Apr 4, 2019

This is the D script we're currently running to record the timings:

julia*:::compile-start
{
  self->ts = timestamp;
}
julia*:::compile-end
/timestamp - self->ts > 1000/
{
  @timings[copyinstr(arg0)] = quantize((timestamp - self->ts)/1000);  // microseconds
}


object::OwningBinary<object::ObjectFile> JuliaOJIT::CompilerT::operator()(Module &M)
{
// Instrument the time spent in this block for this method instance
CompileTiming _c(M.getName().str());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JeffBezanson do you know where I can find the caller of this function in a context that would have the name of the method instance?

Copy link
Member Author

@NHDaly NHDaly Apr 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, okay, i put a breakpoint and got a stacktrace, and it gives me:

    frame #2: 0x00000001001b5f45 libjulia.1.2.dylib`JuliaOJIT::CompilerT::operator(this=0x0000000105039938, M=0x000000011131b280)(llvm::Module&) at jitlayers.cpp:355 [opt]
    frame #3: 0x00000001001b6c06 libjulia.1.2.dylib`llvm::orc::IRCompileLayer<llvm::orc::RTDyldObjectLinkingLayer, JuliaOJIT::CompilerT>::addModule(this=0x0000000105039930, M=<unavailable>, Resolver=std::__1::shared_ptr<llvm::JITSymbolResolver>::element_type @ 0x000000011131c810 strong=1 weak=1) at IRCompileLayer.h:57:48 [opt]
    frame #4: 0x00000001001b6ab7 libjulia.1.2.dylib`JuliaOJIT::addModule(this=<unavailable>, M=unique_ptr<llvm::Module, std::__1::default_delete<llvm::Module> > @ 0x00007ffeefbfe630) at jitlayers.cpp:477:41 [opt]
    frame #5: 0x00000001001b73a3 libjulia.1.2.dylib`jl_finalize_function(llvm::StringRef) [inlined] jl_add_to_ee(m=unique_ptr<llvm::Module, std::__1::default_delete<llvm::Module> > @ scalar) at jitlayers.cpp:702:25 [opt]
    frame #6: 0x00000001001b738f libjulia.1.2.dylib`jl_finalize_function(F=<unavailable>) at jitlayers.cpp:710 [opt]
    frame #7: 0x00000001001596e9 libjulia.1.2.dylib`getAddressForFunction(fname=<unavailable>) at codegen.cpp:1338:5 [opt]
    frame #8: 0x0000000100159a77 libjulia.1.2.dylib`::jl_generate_fptr(output=0x000000010c443c10) at codegen.cpp:1432:39 [opt]
    frame #9: 0x00000001000ddb01 libjulia.1.2.dylib`jl_compile_method_internal(mi=0x000000010c524350, world=25987) at gf.c:1795:5 [opt]
    frame #10: 0x00000001000e02ab libjulia.1.2.dylib`jl_apply_generic(args=0x00007ffeefbfe790, nargs=2) at gf.c:2196:16 [opt]

But i'm surprised; isn't finalize happening after the optimization passes have finished? Why is the JL_TIMING(LLVM_OPT); block there?

@vchuravy
Copy link
Member

vchuravy commented Apr 4, 2019

I would love to have these kinds of instrumentation points, they are really useful for doing application specific probing and this just turned out to be an interesting test-bed to do this.
Right now this is Mac only, but in theory we should be able to use bpftrace on Linux as well.

Notes to self:

@NHDaly
Copy link
Member Author

NHDaly commented Apr 4, 2019

Pasting the updated script I've been using to get a .csv of timings:

test.d

#pragma D option strsize=10000

julia*:::compile-start
{
  self->ts = timestamp;
}
julia*:::compile-end
{
  @timings[copyinstr(arg0)] = max((timestamp - self->ts)/1000);  // microseconds
}
END
{
  printf("\n");
  printf("\n---- DTRACE ----\n");
  printa("%s\t%@d\n", @timings);
}

Then I run it like this, and delete everything up to the ---- DTRACE ----:

$ sudo dtrace -s ~/src/julia/test.d -c '/Users/nathan.daly/src/julia/julia --project test-script.jl' > out.csv

@NHDaly
Copy link
Member Author

NHDaly commented Apr 4, 2019

The one remaining question I have before sending this for review is how to fix the Makefile.

  • Currently, modifying the uprobes.d file doesn't cause subsequent makes to rebuild. You can to do cd src; make.
    Do I maybe also need to add uprobes.d to a rule in the toplevel Makefile?

@vchuravy
Copy link
Member

vchuravy commented Apr 5, 2019

For those following along at home I prototyped a version that can be used to create and trigger probes from Julia functions https://github.com/vchuravy/UProbes.jl

It's a tad bit more expensive than the C version (has to do a ccall instead of a nop). Maybe I can claw back that performance, but I am not to concerned with that right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants