"Create Table As Select" in Django2019-04-29
A discussion recently came up on django-developers mailing list about how to build the SQL query
CREATE TABLE ... AS SELECT (CTAS) with Django’s ORM.
This statement and its cousin
INSERT ... SELECT are useful for re-shaping data inside your database, using
They can be useful for building aggregate tables from complicated, slow queries, as a way of imitating materialized views on databases that don’t support them. Django doesn’t support these at current, but I found it’s not hard to implement using some Django ORM internals, and have an example function below. (Support might have changed by the time you’re reading this - check the linked mailing list thread!)
Example Use Case
For example, you might have a
widget_summary table for analytics purposes built with a
SELECT from your
widget table joined with
You can then regularly rebuild the table with that CTAS query.
We do this in a transaction on databases with transactional DDL (PostgreSQL, SQLite). Start a transaction, drop the table, recreate it with CTAS, and commit.
On databases without transactional DDL, we need a temporary table.
Create the table with a new name, hot-swap it in place with the atomic
RENAME, then drop the old one.
Let’s look at the non-transactional method. The SQL looks like:
Here’s a function that allows CTAS in Django today:
I’ve tested it on Django 2.2 but it should work with older versions.
It works by:
- Getting the connection and creating a cursor for the database named by
- Using the
queryproperty to get its
compiler. This is part of Django’s internal API, meaning it’s undocumented and may change between Django versions without warning. In practice, the SQL compilation is quite stable, and I don’t believe these have changed much in a while. For an internal example, see
- Building SQL with the compiler, but not executing it. This is the main job of the compiler. See the full code if you dare!
- Modifying the SQL with the CTAS prefix.
We use the
quote_namemethod to ensure the table name is escaped on the database backend. This is something the Django ORM always does to avoid accidental self-SQL injection. Table names can contain spaces!
- Executing the SQL with its parameters using the standard Python DB-API.
Modifying this code to do
INSERT ... SELECT should need changes only to the
sql = line.
With this, we can recreate the above widgets example like so:
QuerySet here is quite simple, but you can use all the ORM’s features to extend it to dizzying heights.
Hope this helps! If we add such a function into Django I’ll update this post accordingly.
Working on a Django project? Check out my book Speed Up Your Django Tests which covers loads of best practices so you can write faster, more accurate tests.
One summary email a week, no spam, I pinky promise.
- Getting a Django Application to 100% Test Coverage
- Working Around Memory Leaks in Your Django Application
- Synchronizing Django model definitions
© 2020 All rights reserved.