How to optimize PostgreSQL queries from Django using pgMustard

2022-06-21 Well, that cuts the mustard.

Slow queries happen, and when they do, it can be tough to dissect why they’re slow. This difficulty is compounded by using Django’s ORM, since it generates the SQL for you, so you may have little idea of the actual queries “under the hood”.

In this post we’ll look at what pgMustard does and how to use it with the Django ORM.

(Disclosure: I know pgMustard co-founder Michael Christofides, but I have not been paid for this post. I just like the product.)

pgMustard… Is This a Mustard?

PostgreSQL has an EXPLAIN statement to obtain a query plan for any given SELECT query. This plan contains structured performance data, and can contain both planned and actual metrics. Query plans are great, but interpreting them requires a broad understanding of PostgreSQL, and knowledge of what to focus on first. I often find understanding query plans overwhelming.

This is where pgMustard enters the picture. It takes a query plan, visualizes it, and provides interpretation of the results. It ranks opportunities for optimization out of five stars (five = most optimizable) and provides detailed information about what you can do next.

For example, here’s one of their demo visualizations:

pgMustard example query visualization, showing a tree diagram with explanation on the left. The main hint is that the query planner estimated nearly 200 times too few rows, with a recommendation to try using ANALYZE on the table to correct the statistics.

This query took 55 seconds to execute. pgMustard’s visualization shows the query plan’s steps in a hierarchical display. The sidebar reveals the why it was slow, with a short explanation showing action points that are expanded on in a blog post.

I like how pgMustard gives you informed optimizations for your queries and tables. It’s like having a knowledgeable DBA on hand.

It Costs… Money!

pgMustard is a paid product, with a free trial that allows you to explain five query plans (no credit card required). This doesn’t include queries with no optimization suggestions.

At current it costs €95 (~$105) a year for a single developer, or €500 (~$525) a year for a team (no limit on members). Both plans allow you to explain an unlimited amount of queries. I think this works well, as optimization is something you may focus on only a few times a year.

There are free tools for visualizing EXPLAIN output; see the list on the PostgreSQL Wiki. I’ve used two of them before: explain.dalibo.com (the original PEV version) and explain.depesz.com. They’re handy, but they’re focused on visualizing, so they don’t have many interpretation features. For me, pgMustard’s prioritized explanations are the main feature. We’ll revisit the free tools at the end of this post.

Django → pgMustard Workflow

Alright, let’s look at how to get from a Django QuerySet to a Query Plan to a pgMustard explanation.

1. Use `QuerySet.explain()` and All the Options

You use the EXPLAIN statement by prefixing it to the target statement, like:

EXPLAIN SELECT * FROM auth_user;

Django’s ORM provides the QuerySet.explain() method to do this for any data-fetching query. It returns the query plan as a string (on PostgreSQL). For example:

In [2]: User.objects.all().explain()
Out[2]: 'Seq Scan on auth_user  (cost=0.00..699.32 rows=25632 width=109)'

Note that you can only use .explain() with methods that return QuerySets. Thus if you want to explain a .count(), .aggregate(), .exists(), etc., try explaining the QuerySet without that last method. This slightly changes the query, but it’s often accurate enough.

PostgreSQL has many options for EXPLAIN, of which pgMustard asks you to use at least:

format json - output in JSON format for machine consumption, rather than the default text format intended for humans
analyze - run the query and gain actual performance data, not just predicted
buffers - track disk read metrics
verbose - add some miscellaneous details

There are also two newer options that pgMustard can use the data from:

settings - PostgreSQL 12+, show settings that affect the query plan
wal - PostgreSQL 13+, add data on the Write Ahead Log, an important data structure

You can pass these options as arguments to QuerySet.explain(), with the value True, except format="json":

User.objects.all().explain(
    format="json", analyze=True, buffers=True, verbose=True, settings=True, wal=True
)

You then want to copy the output, in order to paste it into pgMustard.

The output from explain() is a string, which is a bit fiddly for copying as you don’t want the surrounding quote marks, and it may include escaping. You can print() the string to avoid these problems. Here’s a function you can use to run explain with all the options and provide copy-able output:

def show_explain(qs):
    print(
        qs.explain(
            format="json",
            analyze=True,
            buffers=True,
            verbose=True,
            settings=True,
            wal=True,
        )
    )

(Remove wal if you’re using PostgreSQL < 13, and remove settings if you’re using PostgreSQL < 12.)

Pass the QuerySet you’re interested in, and you’ll get back the query plan as a big chunk of JSON:

In [39]: show_explain(User.objects.filter(email__endswith="@example.org"))
[{"Plan": {"Node Type": "Gather", "Parallel Aware": false, "Startup Cost": 1000.0, "Total Cost": 191384.91, "Plan Rows": 1031, "Plan Width": 72, "Actual Startup Time": 36.64, "Actual Total Time": 4307.309, "Actual Rows": 100001, "Actual Loops": 1, "Output": ["id", "password", "last_login", "is_superuser", "username", "first_name", "last_name", "email", "is_staff", "is_active", "date_joined"], "Workers Planned": 2, "Workers Launched": 2, "Single Copy": false, "Shared Hit Blocks": 16148, "Shared Read Blocks": 120436, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Plans": [{"Node Type": "Seq Scan", "Parent Relationship": "Outer", "Parallel Aware": true, "Relation Name": "auth_user", "Schema": "public", "Alias": "auth_user", "Startup Cost": 0.0, "Total Cost": 190281.81, "Plan Rows": 430, "Plan Width": 72, "Actual Startup Time": 130.523, "Actual Total Time": 3996.232, "Actual Rows": 33334, "Actual Loops": 3, "Output": ["id", "password", "last_login", "is_superuser", "username", "first_name", "last_name", "email", "is_staff", "is_active", "date_joined"], "Filter": "((auth_user.email)::text ~~ '%@example.org'::text)", "Rows Removed by Filter": 3300013, "Shared Hit Blocks": 16148, "Shared Read Blocks": 120436, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Workers": [{"Worker Number": 0, "Actual Startup Time": 177.384, "Actual Total Time": 3897.37, "Actual Rows": 31986, "Actual Loops": 1, "JIT": {"Functions": 4, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 12.21, "Inlining": 0.0, "Optimization": 20.075, "Emission": 147.532, "Total": 179.817}}, "Shared Hit Blocks": 5394, "Shared Read Blocks": 38399, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}, {"Worker Number": 1, "Actual Startup Time": 179.124, "Actual Total Time": 3900.698, "Actual Rows": 32096, "Actual Loops": 1, "JIT": {"Functions": 4, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 12.141, "Inlining": 0.0, "Optimization": 19.935, "Emission": 149.706, "Total": 181.782}}, "Shared Hit Blocks": 5358, "Shared Read Blocks": 38514, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}]}]}, "Settings": {"search_path": "\"$user\", public, tiger"}, "Planning": {"Shared Hit Blocks": 0, "Shared Read Blocks": 0, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0}, "Planning Time": 2.314, "Triggers": [], "JIT": {"Functions": 12, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 36.606, "Inlining": 0.0, "Optimization": 42.797, "Emission": 328.02, "Total": 407.423}}, "Execution Time": 4355.994}]

Boom, now you can select and copy query plan.

Bug with olde Djangos

Note: format="json" only works properly on Django 4.0+, due to a bug reported in Ticket #32226. On older versions of Django use this adapted version of show_explain() that undoes the broken formatting:

import ast
import json


def show_explain(qs):
    plan = qs.explain(
        format="json",
        analyze=True,
        buffers=True,
        verbose=True,
        settings=True,
        wal=True,
    )
    print(json.dumps(ast.literal_eval(plan)))

But also, upgrade 😉

2. Analyze with pgMustard

pgMustard’s “New plan” screen has a big text box to paste the query plan into:

pgMustard’s “New Plan” page, with a text box for pasting the plan and a “Review” button.

After doing so and pushing “Review”, you will see the visualization and recommendations:

pgMustard’s “Plan Analysis” page, showing a tree breakdown of the query and three recommendations.

For this example query, there are three recommendations, two of which have five stars, meaning a lot of potential for speeding up the query. Expanding the first one, it says:

5.0 ⭐️ Index Potential
100,002 out of 10,000,041 table rows are returned after filtering (1.0%)
Sequential scans are often slow when not all rows are used.
Adding an index may allow the planner to look up the rows required more efficiently. Consider adding an index, or investigating why the query planner considers any existing index(es) unhelpful.
The table being scanned is:
auth_user
The filter being used is:
((auth_user.email)::text ~~ '%@example.org'::text)

So here you may consider adding an index to the email field. Note that a vanilla index cannot help with this end-of-string matching. You could use a functional index to store the split-out email domain, and adjust the query accordingly to use the same function calls. (Specifically it would be Substr("email", StrIndex("email", Value("@"))).)

By default, pgMustard stores your analyzed query plans in your browser’s local storage, keeping them private. Query plans can include sensitive data in table names, filters, etc. so it’s nice that it defaults to keeping this secret. You also have the option to “publish” a plan, which stores it on their server and gives it a random public URL you can share with colleagues:

pgMustard’s “Publish plan” dialog, warning about the implications and asking for a name before publishing.

Here’s the link for this plan: app.pgmustard.com/#/explore/aea44714-89ef-4de6-95b2-1da18d595ebc. Great stuff.

Analyzing Other Types of Queries

QuerySet.explain() is very handy, but its restriction to methods that return QuerySets can limit your analysis. It’s possible to EXPLAIN most PostgreSQL statements, such as INSERT (from e.g. Model.save()).

Below is a snippet of code containing a context manager to perform the same EXPLAIN on any query and output the query plan. This uses database instrumentation to modify the SQL and capture the explain result.

Keep in mind that the ANALYZE option actually executes the query. So if you’re running something that modifies data, like Model.save(), QuerySet.delete(), etc., the changes are applied. This can also make it tricky to analyze a modification query after it has run, since it may no longer do anything.

Also beware that some Django ORM methods map onto multiple queries. This snippet will only analyze the first one, since it uses an exception to stop execution.

import json
from contextlib import contextmanager

from django.db import connection


class ExplainResult(Exception):
    pass


def explain_hook(execute, sql, params, many, context):
    sql = "EXPLAIN (ANALYZE, FORMAT JSON, BUFFERS, VERBOSE, SETTINGS, WAL) " + sql
    execute(sql, params, many, context)
    result = context["cursor"].fetchone()[0]
    raise ExplainResult(result)


@contextmanager
def explain():
    try:
        with connection.execute_wrapper(explain_hook):
            yield
    except ExplainResult as exc:
        print(json.dumps(exc.args[0]))

Use the explain() context manager to wrap a code block that executes your query, and you may get back a wall of JSON:

In [59]: with explain():
    ...:     User.objects.count()
[{"Plan": {"Node Type": "Aggregate", "Strategy": "Plain", "Partial Mode": "Finalize", "Parallel Aware": false, "Startup Cost": 191282.03, "Total Cost": 191282.04, "Plan Rows": 1, "Plan Width": 8, "Actual Startup Time": 4159.731, "Actual Total Time": 4178.435, "Actual Rows": 1, "Actual Loops": 1, "Output": ["count(*)"], "Shared Hit Blocks": 16145, "Shared Read Blocks": 120439, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Plans": [{"Node Type": "Gather", "Parent Relationship": "Outer", "Parallel Aware": false, "Startup Cost": 191281.81, "Total Cost": 191282.02, "Plan Rows": 2, "Plan Width": 8, "Actual Startup Time": 4156.926, "Actual Total Time": 4178.27, "Actual Rows": 3, "Actual Loops": 1, "Output": ["(PARTIAL count(*))"], "Workers Planned": 2, "Workers Launched": 2, "Single Copy": false, "Shared Hit Blocks": 16145, "Shared Read Blocks": 120439, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Plans": [{"Node Type": "Aggregate", "Strategy": "Plain", "Partial Mode": "Partial", "Parent Relationship": "Outer", "Parallel Aware": false, "Startup Cost": 190281.81, "Total Cost": 190281.82, "Plan Rows": 1, "Plan Width": 8, "Actual Startup Time": 3950.091, "Actual Total Time": 3950.119, "Actual Rows": 1, "Actual Loops": 3, "Output": ["PARTIAL count(*)"], "Shared Hit Blocks": 16145, "Shared Read Blocks": 120439, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Workers": [{"Worker Number": 0, "Actual Startup Time": 3846.571, "Actual Total Time": 3846.613, "Actual Rows": 1, "Actual Loops": 1, "JIT": {"Functions": 3, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 7.396, "Inlining": 0.0, "Optimization": 11.115, "Emission": 116.999, "Total": 135.51}}, "Shared Hit Blocks": 5405, "Shared Read Blocks": 38639, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}, {"Worker Number": 1, "Actual Startup Time": 3847.801, "Actual Total Time": 3847.84, "Actual Rows": 1, "Actual Loops": 1, "JIT": {"Functions": 3, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 7.155, "Inlining": 0.0, "Optimization": 10.949, "Emission": 118.094, "Total": 136.199}}, "Shared Hit Blocks": 5246, "Shared Read Blocks": 38227, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}], "Plans": [{"Node Type": "Seq Scan", "Parent Relationship": "Outer", "Parallel Aware": true, "Relation Name": "auth_user", "Schema": "public", "Alias": "auth_user", "Startup Cost": 0.0, "Total Cost": 179542.25, "Plan Rows": 4295825, "Plan Width": 0, "Actual Startup Time": 97.703, "Actual Total Time": 2546.515, "Actual Rows": 3333347, "Actual Loops": 3, "Shared Hit Blocks": 16145, "Shared Read Blocks": 120439, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0, "Workers": [{"Worker Number": 0, "Actual Startup Time": 136.892, "Actual Total Time": 2509.312, "Actual Rows": 3223800, "Actual Loops": 1, "Shared Hit Blocks": 5405, "Shared Read Blocks": 38639, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}, {"Worker Number": 1, "Actual Startup Time": 138.042, "Actual Total Time": 2489.045, "Actual Rows": 3182375, "Actual Loops": 1, "Shared Hit Blocks": 5246, "Shared Read Blocks": 38227, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "WAL Records": 0, "WAL FPI": 0, "WAL Bytes": 0}]}]}]}]}, "Settings": {"search_path": "\"$user\", public, tiger"}, "Planning": {"Shared Hit Blocks": 0, "Shared Read Blocks": 0, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0}, "Planning Time": 4.296, "Triggers": [], "JIT": {"Functions": 11, "Options": {"Inlining": false, "Optimization": false, "Expressions": true, "Deforming": true}, "Timing": {"Generation": 24.741, "Inlining": 0.0, "Optimization": 23.089, "Emission": 251.59, "Total": 299.42}}, "Execution Time": 4190.409}]

Ta-daa! Ready to be pasted into pgMustard.

Skip the Legwork With `auto_explain`

auto_explain is a PostgreSQL module that can run EXPLAIN on slow queries automatically (based on a threshold of execution time). It outputs the explain plans in the PostgreSQL log file. This way you can gather query plans from your production environment without having to go through the steps above. It is also great for capturing query plans of queries that are only slow occasionally, e.g. with certain data.

pgMustard have a guide on enabling auto_explain with appropriate options. This can save a human time, at the cost of a small overhead. I would think it’s generally worth it.

Using the Free Tools

If you finish the free trial on pgMustard and find it beyond your price range, you can use one of the free tools mentioned earlier. They’re listed on the PostgreSQL Wiki page “Using EXPLAIN”. All of the above advice on obtaining EXPLAIN output still applies, as they also take query plans in JSON format

For example, here’s that first query plan analyzed by explain.dalibo.com.

explain.dalibo.com showing a query plan visualization.

I made this plan public: explain.dalibo.com/plan/RYV.

explain.dalibo.com offers a little interpretation, in terms of some icons that highlight the scan as slow and expensive. But it doesn’t give any advice on what to do to fix this. Still, with some research you can get ideas on what to do next.

Fin

May your queries run ever leaner and meaner,

—Adam

😸😸😸 Check out my new book on using GitHub effectively, Boost Your GitHub DX! 😸😸😸

One summary email a week, no spam, I pinky promise.

Related posts:

Tags: postgresql, django