Python: spy for changes with sys.monitoring

Python 3.12 introduced sys.monitoring, a new framework for “monitoring” tools like debuggers and profilers to hook into. It provides fine-grained control so tools can listen only to certain events on specific lines of code. The framework came from PEP 669, thanks to Mark Shannon of the Faster CPython team.
I recently had a problem with a list that changed unexpectedly, and I couldn’t figure out where the mutation occurred. I reached for sys.monitoring to help me “spy” on the list at every instruction and open pdb when it changed. Whilst the API is verbose, it worked surprisingly well, and I quickly tracked down the culprit.
This post will show you the technique I used. I’ve adapted it for a small example, spying for changes to the list sys.path. This global list controls where modules are imported from, so it’s not uncommon to debug unexpected changes to it.
Okay, here’s the monitoring code:
import sys

# Define a monitor function to run on every instruction
path_length = len(sys.path)


def monitor(code, instruction_offset):
    # INSTRUCTION callbacks receive the code object and the bytecode offset
    # of the instruction that is about to run.
    global path_length
    if len(sys.path) != path_length:
        print("🛎️ sys.path changed length")
        path_length = len(sys.path)
        breakpoint()
    else:
        # Disable for a given instruction, assuming it won't change sys.path if
        # re-executed.
        return sys.monitoring.DISABLE


# Register a debugger
sys.monitoring.use_tool_id(sys.monitoring.DEBUGGER_ID, "debugging")

# Enable our debugger for instruction events
sys.monitoring.set_events(
    sys.monitoring.DEBUGGER_ID,
    sys.monitoring.events.INSTRUCTION,
)

# Run monitor() on every instruction
sys.monitoring.register_callback(
    sys.monitoring.DEBUGGER_ID,
    sys.monitoring.events.INSTRUCTION,
    monitor,
)
A quick breakdown:
- monitor() is our function to monitor sys.path. It runs on every instruction, the individual steps of executing Python code. monitor() checks if sys.path has changed length to detect mutations. That won’t spot when elements are changed in place, but it’s sufficient for the example here, as sys.path is typically only prepended to or removed from.
- When a change is detected, monitor() prints a bell emoji and opens a debugger through breakpoint().
- If sys.path didn’t change length, monitor() returns sys.monitoring.DISABLE. This constant tells sys.monitoring to avoid calling it again for that instruction. Doing so unlocks one of the key performance benefits of sys.monitoring: it can avoid repeated, unnecessary calls to monitoring tools. Without this step, the overhead of our monitor function would massively slow down the program, maybe to the point of being impractical.
- use_tool_id() enables the tool with ID DEBUGGER_ID, and the arbitrary name “debugging”. sys.monitoring supports only six active tools, which must use unique ID values. In this case, we grab the “debugger” ID.
- set_events() tells sys.monitoring to enable the debugger tool for “instruction” events. This means our tool is enabled for those events only.
- Finally, register_callback() tells sys.monitoring to call monitor() on every instruction event.
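If you do need to catch in-place changes, one option is to compare against a saved copy of the list rather than its length. Here’s a minimal sketch of that idea. The ListWatcher class and its changed() method are names I’ve made up for illustration, not part of sys.monitoring, and the demo uses a throwaway list rather than sys.path:

```python
class ListWatcher:
    """Detect any change to a list by comparing it with a saved copy."""

    def __init__(self, target):
        self.target = target
        self.snapshot = list(target)

    def changed(self):
        """Return True once per change, then refresh the saved copy."""
        if self.target == self.snapshot:
            return False
        self.snapshot = list(self.target)
        return True


# Demo on a stand-in list (hypothetical data)
paths = ["a", "b"]
watcher = ListWatcher(paths)
paths[0] = "z"  # Same length, different contents
print(watcher.changed())  # True
print(watcher.changed())  # False
```

Inside monitor(), a call like watcher.changed() could stand in for the length comparison. The trade-off is an element-by-element comparison on every check, which should be fine for a short list like sys.path but may hurt for large ones.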
Okay, now to see it in action. Below is a tiny example that mutates sys.path, which you can paste at the end of the previous code. Imagine instead a large tangled blob of packages with non-obvious modifications to sys.path.
# Demo program
print("🐒 ooh ooh aah aah")
sys.path.insert(0, "animals")
print("🐍 hisssssss")
Run the whole program and pdb opens just after the sys.path.insert() call:
$ python example.py
🐒 ooh ooh aah aah
🛎️ sys.path changed length
> /.../example.py(46)<module>()
-> print("🐍 hisssssss")
(Pdb)
Following the program’s monkey message, there’s the bell message from monitor(), and then pdb’s prompt. Note that pdb opens on the instruction after the mutation occurred, the next print() line. This happens because sys.monitoring runs the monitor function before each instruction, so the change only shows up on the check that precedes the following instruction.
You need to go one step back to find the actual culprit instruction. In this case, that means the line before, but it could be trickier after things like a large if block or a function return.
With pdb open, we can use any debugging commands we wish. For example, pp to pretty-print sys.path:
(Pdb) pp sys.path
['animals',
 '/Users/chainz/Documents/Projects/_mine/adamj.eu',
 '/Users/chainz/.local/share/uv/python/cpython-3.13.1-macos-aarch64-none/lib/python313.zip',
 '/Users/chainz/.local/share/uv/python/cpython-3.13.1-macos-aarch64-none/lib/python3.13',
 '/Users/chainz/.local/share/uv/python/cpython-3.13.1-macos-aarch64-none/lib/python3.13/lib-dynload',
 '/Users/chainz/.local/share/uv/python/cpython-3.13.1-macos-aarch64-none/lib/python3.13/site-packages']
Or use l (list) to see the surrounding code:
(Pdb) l
39
40 print("🐒 ooh ooh aah aah")
41
42 sys.path.insert(0, "animals")
43
44 -> print("🐍 hisssssss")
[EOF]
Running c (continue) makes pdb continue the program:
(Pdb) c
🐍 hisssssss
If there were other mutations to sys.path, the monitor would also spot them and open pdb again.
Conclusion
Overall, I found using sys.monitoring very powerful. I only used the “instruction” event here, but I can see the others being useful too, like RAISE to hook into any time an exception is raised.
The API is a bit verbose, especially compared to its predecessor, sys.settrace(). I guess this is the cost of fine-grained control. And, to be fair, it’s designed for use in big tools, like Coverage.py, rather than end-user debugging like this. That said, it’s clear, and hopefully, the example here is easy to copy-paste and adapt.
Also, I gave an example that opens pdb, but print-debugging also works fine. It’s probably necessary when dealing with threads or asynchronous code.
Fin
sys.monitoring the pesky bug you seek,—Adam