Adam Johnson

Home | Blog | Projects | Colophon

"Create Table As Select" in Django

2019-04-29

Learn How to Create This Table!

A discussion recently came up on django-developers mailing list about how to build the SQL query CREATE TABLE ... AS SELECT (CTAS) with Django’s ORM. This statement and its cousin INSERT ... SELECT are useful for re-shaping data inside your database, using SELECT queries.

They can be useful for building aggregate tables from complicated, slow queries, as a way of imitating materialized views on databases that don’t support them. Django doesn’t support these at current, but I found it’s not hard to implement using some Django ORM internals, and have an example function below. (Support might have changed by the time you’re reading this - check the linked mailing list thread!)

Example Use Case

For example, you might have a widget_summary table for analytics purposes built with a SELECT from your widget table joined with flimflams. You can then regularly rebuild the table with that CTAS query.

We do this in a transaction on databases with transactional DDL (PostgreSQL, SQLite). Start a transaction, drop the table, recreate it with CTAS, and commit.

On databases without transactional DDL, we need a temporary table. Create the table with a new name, hot-swap it in place with the atomic RENAME, then drop the old one.

Let’s look at the non-transactional method. The SQL looks like:

/* Drop temporary tables in case the process got cut off before */
DROP TABLE IF EXISTS widget_summary_old, widget_summary_new;
/* Create the new table from the summary query */
CREATE TABLE widget_summary_new AS
SELECT
   widget.id,
   widget.name,
   flimflam.garbles
FROM
   products
   INNER JOIN
      flimflam
      ON widget.flimflam_id = flimflam.id;
/* Hot-swap rename */
RENAME TABLE widget_summary TO widget_summary_old,
             widget_summary_new TO widget_summary;
/* Drop old table */
DROP TABLE widget_summary_old;

In Django

Here’s a function that allows CTAS in Django today:

from django.db import DEFAULT_DB_ALIAS, connections


def create_table_as(table_name, queryset, using=DEFAULT_DB_ALIAS):
    compiler = queryset.query.get_compiler(using=using)
    sql, params = compiler.as_sql()
    connection = connections[DEFAULT_DB_ALIAS]
    sql = "CREATE TABLE " + connection.ops.quote_name(table_name) + " AS " + sql
    with connection.cursor() as cursor:
        cursor.execute(sql, params)

I’ve tested it on Django 2.2 but it should work with older versions.

It works by:

Modifying this code to do INSERT ... SELECT should need changes only to the sql = line.

With this, we can recreate the above widgets example like so:

with connection.cursor() as cursor:
    # /Drop temporary tables in case the process got cut off before
    cursor.execute(
        "DROP TABLE IF EXISTS widget_summary_old, widget_summary_new"
    )
    # Create the new table from the summary query
    create_table_as(
        "widget_summary_new",
        Widget.objects.values_list('id', 'name', 'flimflam__garbles'),
    )
    # Hot-swap rename
    cursor.execute("""\
        RENAME TABLE widget_summary TO widget_summary_old,
                     widget_summary_new TO widget_summary
    """)
    # Drop old table
    cursor.execute("DROP TABLE widget_summary_old")

The QuerySet here is quite simple, but you can use all the ORM’s features to extend it to dizzying heights.

Fin

Hope this helps! If we add such a function into Django I’ll update this post accordingly.

—Adam


Tags: django

Subscribe to my weekly digest email