How to Modernize a Django Index Definition with Zero Downtime2020-07-27
If you’ve read the Django documentation for
Model.Meta.index_together recently, you may have noticed this note:
indexesoption instead. The newer
indexesoption provides more functionality than
index_togethermay be deprecated in the future.
Django historically provided index control for a single field with
Field(db_index=True), and for multiple fields in
These are good for specifying indexes for one or more fields, but they don’t give you access to the full power of database indexes.
Meta.indexes option was added in Django 1.11 (2017) to allow use of more index features through the
Index() added support for indexes with descending ordering.
It now supports
db_tablespace to control storage,
opclasses to use PostgreSQL’s various operator classes for indexes, and
condition to create partial indexes that don’t contain every row.
So, how do you “upgrade” from
Well first, this isn’t necessary.
Neither feature is actually deprecated, and they’re not likely to be either.
If you have an old project using either
Meta.index_together, you’re best leaving it in place and using
indexes for new indexes.
But this change is a good example of how to make a “zero downtime” migration, with low risk. It can be a good be a nice exercise for learning more about Django’s migrations.
Let’s take this model:
Status class is using Django 3.0’s new enumeration types.)
Our model uses
index_together, which we’ll change to use
The process should be similar to change
Field(db_index=True) to use
We’ll look at two methods. The first uses a rebuild of the index, which can take some time to run on large tables. The second retains the existing index for “zero downtime.”
Note we’ll not be changing the definition of the index at all.
If you want to upgrade an index to use any of the extra features of
Index(), such as
condition, databases typically cannot change the index in-place.
You’ll need to add a new index in one migration, then remove the original index in a second migration.
To rebuild, we’d need only to drop
index_together and add
indexes with an equivalent
When we run
makemigrations, we’ll end up with a migration file like this:
This is functional.
But if we run
sqlmigrate, we’ll see that it does
DROP INDEX followed by
This isn’t great for large tables, where creating an index might take hours.
Additionally PostgreSQL and SQLite will lock the table for writes whilst they make the new index.
On PostgreSQL, we could swap
AddIndexConcurrently (Django 3.0+) to prevent the lock.
Let’s look at the second method that avoids the work of recreating the index.
Zero Downtime Method
To achieve zero down, we need to add the new
Index() definition using the existing index name, and then write a migration that tells Django nothing needs to change in the database.
The first thing we need is the name that Django auto-generated for the index.
This combines the table name, included field names, and a hash.
The hashing algorithm has changed a couple of times in Django’s history versions, so to be safe we’ll retrieve the index name from the database.
We can do this with a little SQL in
For example, on SQLite, we can run the
.indexes command to list the indexes on our model’s table, and pick ours from the list:
On MariaDB/MySQL, the query to run is:
On PostgreSQL, the query to run is:
It’s worth checking the index name is identical across all your environments (development, staging, production). The name might differ between environments if their databases were initially created with different Django versions, and thus different hashing algorithms. If the names do differ, we’d probably want to rename the index on all environments to match production.
Second, we want to move this into an
Index() definition, inside
Meta.indexes, using the found name:
If we run the
check command at this point, we’ll see an error:
Index() restricts its names to 30 characters to be compatible with Oracle.
This is fair enough, and especially applicable to Django core and third party packages which should be compatible with all database backends.
If you’re using Oracle, the old
index_together name should be < 30 characters.
For other backends we have more characters to work with:
- SQLite - 1,000,000
- PostgreSQL - 63
- MariaDB/MySQL - 64
In this case, we can safely disable the check.
Do this by adding the check ID to the
This is a little bit dangerous as it removes the check for every index. However tests should discover if any future index has an overly long index name, because the database should raise an error during migrations.
(N.B. there’s an open ticket to allow more granular system check silencing.)
check will show it is now silenced:
We should then run
makemigrations with flags to make a new migration:
--name to avoid the automatic migration name.)
Our new migration is identical to the previous downtime-inducing one. We need to modify it to allow Django’s migrations to consider these changes as applied, without running any SQL.
This operation class takes two lists of migration operations.
database_operations is compiled to SQL and run on the database.
state_operations is applied to the in-memory version of models.
This separation allows us to perform some operations in the database, and tell Django “another” thing happened to the actual model classes.
It’s useful for changes that can’t be auto-detected correctly, for example Changing a ManyToManyField to use a through model.
In our case, we want to do nothing to the database, so we provide
In the state layer, we want to tell Django that we’ve “removed”
index_together and “added” the new index in the database.
To do this, we can move the auto-generated
AddIndex operations into our
Our migration ends up like this:
We can verify both sets of operations before running the migration.
First, we can check
database_operations really does nothing with
Great - no SQL statements there, except the normal
Second, we can check that
state_operations does tell Django our migrations match the latest defition of our models.
We do this by running
makemigrations --dry-run to ensure the autodetector doesn’t find anything to change:
This should now be ready to deploy, after our normal test suite passes :)
I hope this helps you understand this improved way of defining indexes, and how to write zero downtime migrations,
Working on a Django project? Check out my book Speed Up Your Django Tests which covers loads of best practices so you can write faster, more accurate tests.
One summary email a week, no spam, I pinky promise.
© 2020 All rights reserved.