Working Around Memory Leaks in Your Django Application

Several large Django applications that I’ve worked on ended up with memory leaks at some point. The Python processes slowly increased their memory consumption until crashing. Not fun. Even with automatic restart of the process, there was still some downtime.
Memory leaks in Python typically happen in module-level variables that grow unbounded. This might be an lru_cache with an infinite maxsize, or a simple list accidentally declared in the wrong scope.
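For illustration, here's a minimal sketch of both patterns (the names are made up):

from functools import lru_cache

# Unbounded cache: every distinct user_id ever passed in stays in memory.
@lru_cache(maxsize=None)
def get_user_preferences(user_id):
    ...

# Module-level list: grows by one entry per request and is never cleared.
_seen_request_ids = []

def track_request(request_id):
    _seen_request_ids.append(request_id)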
Leaks don’t need to happen in your own code to affect you either. For example, see this excellent write-up by Peter Karp at BuzzFeed where he found one inside Python’s standard library (since fixed!).
Workarounds
The below workarounds all restart worker processes after so many requests or jobs. This is a simple way to clear out any potential infinitely-accumulating Python objects. If your web server, queue worker, or similar has this ability but isn’t featured, let me know and I’ll add it!
Even if you don’t see any memory leaks right now, adding these will increase your application’s resilience.
Gunicorn
If you’re using Gunicorn as your Python web server, you can use the --max-requests setting to periodically restart workers. Pair it with its sibling --max-requests-jitter to prevent all your workers restarting at the same time. This helps reduce the worker startup load.
For example, on a recent project I configured Gunicorn to start with:
gunicorn --max-requests 1000 --max-requests-jitter 50 ... app.wsgi
For the project’s level of traffic, number of workers, and number of servers, this would restart workers about every 1.5 hours. The jitter of 5% was enough to de-correlate the restart load.
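To make that concrete, here’s a rough back-of-the-envelope calculation; the traffic and worker counts are illustrative, not the real project’s:

servers = 3                       # illustrative numbers
workers_per_server = 9
total_requests_per_second = 5

per_worker_rate = total_requests_per_second / (servers * workers_per_server)
seconds_between_restarts = 1000 / per_worker_rate  # --max-requests 1000
print(f"Each worker restarts roughly every {seconds_between_restarts / 3600:.1f} hours")
# With these numbers: about every 1.5 hours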
uWSGI
If you’re using uWSGI, you can use its similar max-requests setting. This also restarts workers after so many requests.
For example, on a previous project I used this setting in the uwsgi.ini file like:
[uwsgi]
master = true
module = app.wsgi
...
max-requests = 500
uWSGI also provides the max-requests-delta setting for adding some jitter. But since it’s an absolute number, it’s more annoying to configure than Gunicorn’s jitter. If you change the number of workers or the value of max-requests, you will need to recalculate max-requests-delta to keep your jitter at a certain percentage.
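For example (illustrative, not from a real config), to make the delta 5% of max-requests you could set:

[uwsgi]
...
max-requests = 500
; 5% of max-requests - recalculate if max-requests or the worker count changes
max-requests-delta = 25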
If you’re using the uWSGI Spooler for background tasks, you’ll also want to set the spooler-max-tasks setting (https://uwsgi-docs.readthedocs.io/en/latest/Options.html#spooler-max-tasks). This restarts a spooler process after it has processed so many background tasks. This is also set in uwsgi.ini:
[uwsgi]
...
spooler-max-tasks = 500
Celery
Celery provides a couple of settings that can help with memory leaks.
First, there’s the worker_max_tasks_per_child setting. This restarts worker child processes after they have processed so many tasks. There’s no option for jitter, but Celery tasks tend to have a wide range of run times, so there will be some natural jitter.
For example:
from celery import Celery

app = Celery("myapp")
app.conf.worker_max_tasks_per_child = 100
Or if you’re using Django settings:
CELERY_WORKER_MAX_TASKS_PER_CHILD = 100
100 jobs is smaller than I suggested above for web requests. In the past I’ve ended up using smaller values for Celery because I saw more memory consumption in background tasks. (I think I also came upon a memory leak in Celery itself.)
The other setting you could use is worker_max_memory_per_child. This specifies the maximum kilobytes of memory a child process can use before the parent replaces it. It’s a bit more complicated, so I’ve not used it.
If you do use worker_max_memory_per_child, you should probably calculate it as a percentage of your total memory, divided per child process. This way, if you change the number of child processes or your servers’ available memory, it automatically scales. For example (untested):
import psutil

# Allocate up to 75% of total system memory to Celery, in kilobytes.
celery_max_mem_kilobytes = (psutil.virtual_memory().total * 0.75) / 1024
# Split that allowance evenly between the worker's child processes.
app.conf.worker_max_memory_per_child = int(
    celery_max_mem_kilobytes / app.conf.worker_concurrency
)
This uses psutil to find the total system memory. It allocates up to 75% (0.75) to Celery, which you’d only want if it’s a dedicated Celery server.
Tracking Down Leaks
Debugging memory leaks in Python isn’t the easiest, since any function could allocate a global object in any module. They might also occur in extension code integrated with the C API.
Some tools I have used:
- The standard library module tracemalloc (see the sketch after this list).
- The objgraph and guppy3 packages, which both pre-date tracemalloc and try to do similar things. They’re both a bit less user friendly, but I’ve used them successfully before.
- Scout APM, which instruments every “span” (request, SQL query, template tag, etc.) with CPython memory allocation counts. Few APM solutions do this. Disclosure: I maintain the Python integration.
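As a minimal sketch of using tracemalloc to compare snapshots taken before and after exercising a suspect code path:

import tracemalloc

tracemalloc.start()

before = tracemalloc.take_snapshot()
# ... exercise the suspected leaky code path here ...
after = tracemalloc.take_snapshot()

# Show the ten source locations whose allocations grew the most.
for stat in after.compare_to(before, "lineno")[:10]:
    print(stat)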
Some other useful blog posts:
- Buzzfeed Tech’s write-up, a how-to guide to using tracemalloc on a production Python web service.
- Fugue’s write-up, also using tracemalloc.
- Benoit Bernard’s “Freaky Python Memory Leak” post, where he uses a variety of tools to track down a C-level leak.