-
Notifications
You must be signed in to change notification settings - Fork 147
Code Profiling with cProfile
cProfile is a python library that measures the speed of functions and the call frequency. This allows us to see which functions where potential bottle necks are in the code. We used this library to create a code profiling decorator that allows us to add it to any function we want to profile. Below are the steps.
We added this module to the LRS util folder. The decorator is not applied to any function by default, but adding it is very simple.
First, change PROFILE_LOG_BASE
to point to a location where you want the profile files stored.
Next, open the module that contains the function you want to profile and add the following import.
from lrs.util.profile_decorator import profile_func
Then decorate the function with the profile_func
decorator and pass the decorator a base name for the profile docs.
@profile_func('test_function.cprof')
def parse(request, more_id=None):
Although it isn't necessary to use the file extension of cprof
, it will allow you to use the profile aggregation script described below without modification.
Once the PROFILE_LOG_BASE path is configured, the import is added, and the decorator is on the target function you can run your tests. The profiles will be written to the profile log base with the default file name format you passed to the decorator. The decorator adds a timestamp to the filename so that it is possible to run the function more than once without overwriting the profile.
It is possible to generate a cumulative file of the results. This can be accomplished using the pstats
python library. This library is capable of reading the profile data and presenting it in a human readable format. With the number of times we were running the profiling, we needed a way to accumulate the results into a single output. This was done with the following script, which uses pstats to pull all profiles into one set and output the results into cProfile and KCacheGrind formats.
#!/usr/bin/python
import sys
import time
import os
import pstats
from pyprof2calltree import convert
PROFILE_DIR = 'profiles/'
def gather_stats(p):
agg_prof = None
cprof_cnt = 0
for f in os.listdir(p):
if f.endswith('.cprof'):
cprof_cnt += 1
path = f[:-9]
prof = pstats.Stats(os.path.join(p, f))
print("Processing %s" % f)
if agg_prof:
agg_prof.add(prof)
else:
agg_prof = prof
agg_name = os.path.join(p, "%s.agg.prof" % path)
# only sort if you're gonna print next.. sort isn't preserved on write
#agg_prof.sort_stats('time', 'calls')
agg_prof.dump_stats(agg_name)
convert(agg_prof, os.path.join(p, 'agg_kcg_profiler'))
print "processed %s files" % cprof_cnt
def move_em(the_in, new_path_on_the_block):
import shutil
if not os.path.isdir(new_path_on_the_block):
try:
os.makedirs(new_path_on_the_block)
except (OSError, IOError, Error), e:
print >>sys.stderr, 'Error making %s: %s' % (new_path_on_the_block, e)
os.chdir(the_in)
for f in os.listdir("."):
if not os.path.isdir(f):
shutil.move(os.path.join(the_in,f), os.path.join(new_path_on_the_block, f))
print "moved files to %s" % new_path_on_the_block
if __name__ == "__main__":
the_in = sys.argv[1]
new_dir = time.strftime("%Y%m%dT%H%M%S", time.gmtime()) + "_agg/"
new_path_on_the_block = os.path.join(the_in, new_dir)
print "___ Aggregating Stats __"
gather_stats(the_in)
print "___ Moving files ___"
move_em(the_in, new_path_on_the_block)
The pyprof2calltree
dependency is for converting to KCacheGrind, which is a visualization tool for profiles. If you are not planning on using this you can comment out the convert(agg_prof, os.path.join(p, 'agg_kcg_profiler'))
line that creates the KCacheGrind profile.
pstats also provides an interactive prompt that allows you to sort and display profile results. Run it with the path of your profile as an argument. It will start the interactive session and load the profile you passed in. Then you may use built in commands to sort and view the profile stats.
# point to pstats and to the profile
$> python /usr/lib/python2.7/pstats.py ./agent_gen-20130321T11H11M58S958.agg.prof
Welcome to the profile statistics browser.
./agent_gen-20130321T11H11M58S958.agg.prof%
# type help for a list of commands, type help command for details about the command
./agent_gen-20130321T11H11M58S958.agg.prof% sort cumulative
./agent_gen-20130321T11H11M58S958.agg.prof% strip
./agent_gen-20130321T11H11M58S958.agg.prof% stats 20
Thu Mar 21 11:13:56 2013 ./agent_gen-20130321T11H11M58S958.agg.prof
5093770 function calls (4823856 primitive calls) in 12.063 seconds
Random listing order was used
List reduced from 291 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function)
29321 0.047 0.000 0.079 0.000 datastructures.py:170(values)
...