Python: introducing tprof, a targeting profiler

Can you hit your target?

Profilers measure the performance of a whole program to identify where most of the time is spent. But once you’ve found a target function, re-profiling the whole program to see if your changes helped can be slow and cumbersome: the profiler adds overhead to the entire run, and you have to pick out the stats for the one function you care about from the report. I have often gone through this loop while optimizing client or open source projects, such as when I optimized Django’s system checks framework (previous post).

The pain here inspired me to create tprof, a targeting profiler for Python 3.12+ that only measures the time spent in specified target functions. Use it to measure your program before and after an optimization to see if it made any difference, with a quick report on the command line.

For example, say you’ve realized that creating pathlib.Path objects is the bottleneck for your code. You could run tprof like so:

[Screenshot: tprof in action measuring pathlib.Path performance.]
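In text form, that invocation might look something like this, reusing the -t (target) and -m (module) options shown later in this post. The pathlib:Path target syntax and the myscript module name are illustrative assumptions, not verbatim tprof usage:

$ tprof -t pathlib:Path -m myscript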

Benchmark with comparison mode

Sometimes when optimizing code, you want to compare several functions, such as “before” and “after” versions of a function you’re optimizing. tprof supports this with its comparison mode, which adds a “delta” column to the report showing how much faster or slower each function is compared to a baseline.

For example, given this code:

def before():
    total = 0
    for i in range(100_000):
        total += i
    return total


def after():
    return sum(range(100_000))


for _ in range(100):
    before()
    after()

…you can run tprof like this to compare the two functions, using -x for comparison mode, -t to name each target, and -m to run the module:

$ tprof -x -t before -t after -m example
🎯 tprof results:
 function         calls total  mean ± σ      min … max   delta
 example:before()   100 227ms   2ms ± 34μs   2ms … 2ms   -
 example:after()    100  86ms 856μs ± 15μs 835μs … 910μs -62.27%

The delta column shows that, in this run, after() was about 62% faster than before(), measured relative to the baseline (the first target, whose delta shows as a dash).

Python API

tprof also provides a Python API via a context manager / decorator, tprof(). Use it to profile functions within a specific block of code.

For example, to recreate the previous benchmarking example within a self-contained Python file:

from tprof import tprof


def before():
    total = 0
    for i in range(100_000):
        total += i
    return total


def after():
    return sum(range(100_000))


with tprof(before, after, compare=True):
    for _ in range(100):
        before()
        after()

…which produces output like:

$ python example.py
🎯 tprof results:
 function          calls total  mean ± σ      min … max delta
 __main__:before()   100 227ms   2ms ± 83μs   2ms … 3ms -
 __main__:after()    100  85ms 853μs ± 22μs 835μs … 1ms -62.35%
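Since tprof() is a context manager and decorator in one, you could presumably also wrap the benchmark loop in a function and decorate it. Here’s a sketch of how the decorator form might look; I haven’t confirmed this usage against tprof’s documentation:

from tprof import tprof

# before() and after() defined as in the example above.


@tprof(before, after, compare=True)
def run_benchmark():
    for _ in range(100):
        before()
        after()


run_benchmark()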

How it works

tprof uses Python’s sys.monitoring, a new API introduced in Python 3.12 for triggering events when functions or lines of code execute. sys.monitoring allows tprof to register callbacks for only specific target functions, meaning it adds no overhead to the rest of the program. Timing is done in C to further reduce overhead.
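tprof’s timing happens in C, but the core sys.monitoring technique can be sketched in pure Python. Here’s a minimal illustration of timing a single target function with per-code-object events. It is not tprof’s actual implementation, just the shape of the approach:

import statistics
import sys
import time

mon = sys.monitoring


def demo():
    return sum(range(100_000))


# Claim the tool ID that sys.monitoring reserves for profilers.
mon.use_tool_id(mon.PROFILER_ID, "mini-tprof")

starts = []
durations = []


def on_start(code, instruction_offset):
    starts.append(time.perf_counter())


def on_return(code, instruction_offset, retval):
    durations.append(time.perf_counter() - starts.pop())


mon.register_callback(mon.PROFILER_ID, mon.events.PY_START, on_start)
mon.register_callback(mon.PROFILER_ID, mon.events.PY_RETURN, on_return)

# Activate events for demo’s code object only; every other function
# in the program runs unmonitored.
mon.set_local_events(
    mon.PROFILER_ID,
    demo.__code__,
    mon.events.PY_START | mon.events.PY_RETURN,
)

for _ in range(100):
    demo()

mon.set_local_events(mon.PROFILER_ID, demo.__code__, mon.events.NO_EVENTS)
mon.free_tool_id(mon.PROFILER_ID)

# (A real profiler would also handle PY_UNWIND, for exits via exception.)
print(f"mean: {statistics.mean(durations) * 1_000_000:.0f}μs over {len(durations)} calls")

The key call is set_local_events(), which scopes monitoring to one code object, and is presumably how tprof leaves the rest of the program untouched.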

Thanks to Mark Shannon for contributing sys.monitoring to CPython! This is the second time I’ve used it—the first time was for tracking down an unexpected mutation (see previous post).

Fin

If tprof sounds useful to you, please give it a try and let me know what you think! Install tprof from PyPI with your favourite package manager.
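For example, with pip:

$ python -m pip install tprof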

May you hit your Q1 targets,

—Adam



