gperftools for ruby code
(c) 2012 Aman Gupta (tmm1)
http://www.ruby-lang.org/en/LICENSE.txt
gperftools (formerly known as google-perftools): http://gperftools.googlecode.com
require 'rack/perftools_profiler'
config.middleware.use ::Rack::PerftoolsProfiler, :default_printer => 'gif'
Simply add profile=true to profile a request:
curl -o 10_requests_to_homepage.gif "http://localhost:3000/homepage?profile=true&times=10"
Run the profiler with a block:
require 'perftools'
PerfTools::CpuProfiler.start("/tmp/add_numbers_profile") do
5_000_000.times{ 1+2+3+4+5 }
end
Start and stop the profiler manually:
require 'perftools'
PerfTools::CpuProfiler.start("/tmp/add_numbers_profile")
5_000_000.times{ 1+2+3+4+5 }
PerfTools::CpuProfiler.stop
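When starting and stopping manually, an exception raised between the two calls would leave the profiler running. A minimal sketch of one way to guard against that is shown below. The `with_cpu_profile` helper is hypothetical (it is not part of perftools.rb); it only uses the `start` and `stop` calls shown above and falls back to running the block unprofiled if the gem is not installed.

```ruby
# Hypothetical helper, not part of perftools.rb: wraps the manual
# start/stop calls so that stop always runs, even if the block raises.
def with_cpu_profile(path)
  profiling = false
  begin
    require 'perftools'
    PerfTools::CpuProfiler.start(path)
    profiling = true
  rescue LoadError
    warn "perftools.rb not installed; running without profiling"
  end
  yield
ensure
  # Only stop if start actually succeeded.
  PerfTools::CpuProfiler.stop if profiling
end

result = with_cpu_profile("/tmp/add_numbers_profile") do
  (1..1000).inject(:+)
end
puts result
```

The block's return value is passed through, so the helper can wrap existing code without changing what it returns.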
Profile an existing ruby application without modifying it:
$ CPUPROFILE=/tmp/my_app_profile \
RUBYOPT="-r`gem which perftools | tail -1`" \
ruby my_app.rb
The profiler can be run in one of many modes, set via an environment variable before the library is loaded:
- CPUPROFILE_REALTIME=1
  Use walltime instead of cputime profiling. This will capture all time spent in a method, even if it does not involve the CPU. For example, sleep() is not expensive in terms of cputime, but very expensive in walltime. Walltime will also show functions spending a lot of time in network i/o.
- CPUPROFILE_OBJECTS=1
  Profile object allocations instead of cpu/wall time. Each sample represents one object created inside that function.
- CPUPROFILE_METHODS=1
  Profile method calls. Each sample represents one method call made inside that function.
The sampling interval of the profiler can be adjusted to collect more (for better profile detail) or fewer samples (for lower overhead):
- CPUPROFILE_FREQUENCY=500
  Default sampling interval is 100 times a second. Valid range is 1-4000.
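Because these variables must be set before the library is loaded, one option is to set them from Ruby itself ahead of the require. The sketch below assumes the variables are read at load time, as the text above implies. The range check and the `/tmp/sleep_profile` path are illustrative, and the profile run is skipped if the gem is not installed.

```ruby
# Assumption: perftools.rb reads these variables when it is loaded,
# so they are set before the require below.
ENV['CPUPROFILE_REALTIME']  = '1'    # walltime instead of cputime
ENV['CPUPROFILE_FREQUENCY'] = '500'  # 500 samples per second

# Illustrative sanity check against the documented 1-4000 range.
frequency = Integer(ENV['CPUPROFILE_FREQUENCY'])
raise ArgumentError, "frequency out of range" unless (1..4000).cover?(frequency)

# Time between samples, in milliseconds, at the chosen frequency.
interval_ms = 1000.0 / frequency
puts "sampling every #{interval_ms} ms"

begin
  require 'perftools'
  # sleep costs almost no cputime, so it should only stand out in walltime mode.
  PerfTools::CpuProfiler.start("/tmp/sleep_profile") { sleep 0.2 }
rescue LoadError
  warn "perftools.rb not installed; skipping profile run"
end
```

At 500 samples per second the profiler takes a sample every 2 ms, compared with every 10 ms at the default of 100.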
pprof.rb --text /tmp/add_numbers_profile
pprof.rb --pdf /tmp/add_numbers_profile > /tmp/add_numbers_profile.pdf
pprof.rb --gif /tmp/add_numbers_profile > /tmp/add_numbers_profile.gif
pprof.rb --callgrind /tmp/add_numbers_profile > /tmp/add_numbers_profile.grind
kcachegrind /tmp/add_numbers_profile.grind
pprof.rb --gif --focus=Integer /tmp/add_numbers_profile > /tmp/add_numbers_custom.gif
pprof.rb --text --ignore=Gem /tmp/my_app_profile
For more options, see the pprof documentation.
Total: 1735 samples
1487 85.7% 85.7% 1735 100.0% Integer#times
248 14.3% 100.0% 248 14.3% Fixnum#+
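Each row of the text report has six columns: samples in the function itself, that count as a percentage of the total, the running sum of those percentages, samples in the function plus its callees, that count as a percentage, and the function name. The sketch below parses the sample report above to make the columns explicit. The hash keys are illustrative names, not pprof terminology, and the column spacing in the heredoc is an assumption.

```ruby
# Sample report copied from the text above (column spacing is illustrative).
report = <<~REPORT
  Total: 1735 samples
      1487  85.7%  85.7%     1735 100.0% Integer#times
       248  14.3% 100.0%      248  14.3% Fixnum#+
REPORT

total = report[/Total: (\d+) samples/, 1].to_i

# Split each row into its six columns; the limit of 6 keeps a
# function name containing spaces in one piece.
rows = report.lines.drop(1).map do |line|
  self_n, self_pct, cum_pct, incl_n, incl_pct, name = line.strip.split(' ', 6)
  {
    name: name,
    self_samples: self_n.to_i,
    self_pct: self_pct.to_f,
    cumulative_pct: cum_pct.to_f,
    inclusive_samples: incl_n.to_i,
    inclusive_pct: incl_pct.to_f
  }
end

rows.each do |row|
  # Recompute the self percentage from the raw sample counts.
  recomputed = (100.0 * row[:self_samples] / total).round(1)
  puts format("%-15s self=%d reported=%.1f%% recomputed=%.1f%%",
              row[:name], row[:self_samples], row[:self_pct], recomputed)
end
```

The self-sample counts add up to the total (1487 + 248 = 1735), while the inclusive count for Integer#times covers every sample because the additions run inside its block.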
- Simple require 'rubygems' profile
- Comparing redis-rb with and without SystemTimer based socket timeouts
- C-level profile of EventMachine + epoll + Ruby threads before and after a 6 line EM bugfix
- C-level profile of a ruby/rails vm
  - 12% time spent in re_match_exec because of excessive calls to rb_str_sub_bang by Date.parse
Just install the gem, which will download, patch and compile gperftools for you:
sudo gem install perftools.rb
Or build your own gem:
git clone git://github.com/tmm1/perftools.rb
cd perftools.rb
gem build perftools.rb.gemspec
gem install perftools.rb
Use via a Gemfile:
gem 'perftools.rb', :git => 'git://github.com/tmm1/perftools.rb.git'
You'll also need graphviz to generate call graphs using dot:
brew install graphviz ghostscript # osx
sudo apt-get install graphviz ghostscript # debian/ubuntu (ghostscript provides ps2pdf)
If graphviz fails to build on OSX Lion, you may need to recompile libgd, see here
- Sampling profiler
  - perftools samples your process using setitimer() so it can be used in production with minimal overhead.
To profile C code, download and build an unpatched perftools (libunwind or ./configure --enable-frame-pointers required on x86_64).
Download:
wget https://github.com/gperftools/gperftools/releases/download/gperftools-2.8/gperftools-2.8.tar.gz
tar zxvf gperftools-*.tar.gz
cd gperftools-*
Compile:
./configure --prefix=/opt
make
sudo make install
Profile:
export LD_PRELOAD=/opt/lib/libprofiler.so # for linux
export DYLD_INSERT_LIBRARIES=/opt/lib/libprofiler.dylib # for osx
CPUPROFILE=/tmp/ruby_interpreter.profile ruby -e' 5_000_000.times{ "hello world" } '
Report:
pprof `which ruby` --text /tmp/ruby_interpreter.profile
- Add support for heap profiling to find memory leaks (PerfTools::HeapProfiler)
- Allow both C and Ruby profiling
- Add setter for the sampling interval