How to Add Database Modifications Beyond Migrations to Your Django Project

On several Django projects I’ve worked on, there has been a requirement for performing database modifications beyond Django migrations. For example:
- Managing stored procedures
- Managing check constraints, which weren’t supported before Django 2.2
- Importing static data from a file
- Recording migration operations in a log
Let’s look at three approaches to extending Django to do this as neatly as possible.
1. Use Django migrations
Often I find developers have only been taught how to use Django migrations for model operations. They might know some SQL, but since they haven’t used it within Django, they assume migrations can’t use SQL directly. They can! Many of the uses of custom migration-style SQL code that I’ve seen could be better implemented within migrations.
Let’s take a look at an example we’ll use for the rest of the article. Imagine you’re running a Django version before 2.2 that doesn’t support database check constraints. You might add one to a database table with this SQL:
ALTER TABLE
myapp_book
ADD
CONSTRAINT percent_lovers_haters_sum CHECK (
(percent_lovers + percent_haters) = 100
);
This constraint makes the database raise an error for any row added or updated in the myapp_book table whose percentages of lovers and haters don’t sum to 100. Neat!
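To see that behaviour without a Django project, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration under assumptions: the column types are guesses, and because SQLite cannot add a constraint with ALTER TABLE, the same CHECK is declared in CREATE TABLE instead.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# SQLite has no "ALTER TABLE ... ADD CONSTRAINT", so the check is declared
# up front. The column types are assumptions made for this sketch.
conn.execute(
    """
    CREATE TABLE myapp_book (
        id INTEGER PRIMARY KEY,
        percent_lovers INTEGER NOT NULL,
        percent_haters INTEGER NOT NULL,
        CONSTRAINT percent_lovers_haters_sum CHECK (
            (percent_lovers + percent_haters) = 100
        )
    )
    """
)


def try_insert(lovers, haters):
    """Return True if the row is accepted, False if the constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO myapp_book (percent_lovers, percent_haters) VALUES (?, ?)",
                (lovers, haters),
            )
    except sqlite3.IntegrityError:
        return False
    return True


ok_row = try_insert(60, 40)   # sums to 100
bad_row = try_insert(60, 50)  # sums to 110
```

Only the first row ends up in the table; the second is refused by the database itself, regardless of what the application code does.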
With Django’s default table naming, the table myapp_book would be created for a model called Book inside the app myapp. We’ll use those names in this article too.
The SQL above could be run using django.db.connection:
from django.db import connection

with connection.cursor() as cursor:
    cursor.execute(
        """
        ALTER TABLE
          myapp_book
        ADD
          CONSTRAINT percent_lovers_haters_sum CHECK (
            (percent_lovers + percent_haters) = 100
          );
        """
    )
This works, but it’s a bit ad hoc. You might run it using manage.py shell or, better, in a custom management command (more on those later!).
Instead of doing those though, we can turn it into a migration.
First, you can create a new migration with:
$ python manage.py makemigrations --empty myapp
Migrations for 'myapp':
  myapp/migrations/0101_auto_20190715_1057.py
You can then edit that migration file to use the RunSQL operation:
from django.db import migrations


class Migration(migrations.Migration):
    dependencies = [
        ("myapp", "0100_add_book"),
    ]
    operations = [
        migrations.RunSQL(
            """
            ALTER TABLE
              myapp_book
            ADD
              CONSTRAINT percent_lovers_haters_sum CHECK (
                (percent_lovers + percent_haters) = 100
              );
            """
        )
    ]
A little bonus step is to rename the migration file to something descriptive. For example, instead of 0101_auto_20190715_1057.py we could call this 0101_add_book_percentage_sum_constraint.py. This helps a lot in the long term.
To improve the migration beyond our initial SQL, we can add the reverse_sql argument. This tells Django how to reverse the migration. You’ll rarely need to reverse a migration, but when you do… you really do! Be prepared, as I learned in the Scouts.
For our example, we can expand the migration operation to:
migrations.RunSQL(
    sql="""
    ALTER TABLE
      myapp_book
    ADD
      CONSTRAINT percent_lovers_haters_sum CHECK (
        (percent_lovers + percent_haters) = 100
      );
    """,
    reverse_sql="ALTER TABLE myapp_book DROP CONSTRAINT percent_lovers_haters_sum",
)
This will remove the check constraint when reversing the operation.
If the code you want to run is more complex, for example you want to run something on every model/table, you might use the RunPython operation. With it you can do pretty much anything - the power of Python!
It also accepts a reverse function, through its reverse_code argument, which is worth adding - check out the docs.
Evaluation
This approach fits many use-cases, and it also helps provide a lifetime for any objects created by your custom SQL. For example, if we wanted to drop our constraint at a later date, we could create a similar migration with DROP CONSTRAINT as the forwards SQL. Beautiful symmetry!
One drawback is that a migration only runs once per database. This might not make sense if you have something that needs running frequently, for example after every migrate, as you would need to keep adding migrations containing the same operations.
You might consider writing a custom migration operation or hooking into the pre_migrate signal. However, in my experience, the following two approaches would be easier.
2. Override the ‘migrate’ management command
This feels like a bit of a secret feature, given how few projects I’ve seen use it. However, I have found it handy on several occasions.
Django allows you to override management commands by adding another with the same name. You can override commands from Django core, or those in other apps. This is documented in “custom management commands”.
When running a command, Django searches through all the apps in INSTALLED_APPS, and then the core. The first place it finds a management command with the given name wins.
Thus, your apps can override any built-ins.
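The precedence rule can be modelled in a few lines of plain Python. This is a toy model of the rule described above, not Django's actual implementation; the function name and the example apps are made up for illustration.

```python
def resolve_commands(installed_apps, app_commands, core_commands):
    """Map each command name to whichever place ends up providing it."""
    # Start with the built-ins...
    resolved = {name: "django.core" for name in core_commands}
    # ...then let apps overwrite them. Walking the list in reverse means
    # that apps listed earlier are applied last, so they win.
    for app in reversed(installed_apps):
        for name in app_commands.get(app, ()):
            resolved[name] = app
    return resolved


providers = resolve_commands(
    installed_apps=["myapp", "otherapp"],
    app_commands={"myapp": ["migrate"], "otherapp": ["migrate", "report"]},
    core_commands=["migrate", "runserver"],
)
```

Here both apps ship a migrate command, and the one from myapp is chosen because it comes first in the list, while runserver still resolves to the core version.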
To override migrate and add your custom behaviour, you’ll want to create myapp/management/commands/migrate.py (replacing myapp with the name of one of your apps). Inside that file you can then subclass the built-in migrate command and add your own behaviour. For example:
from django.core.management.commands.migrate import Command as CoreMigrateCommand

from myapp.db import create_constraints


class Command(CoreMigrateCommand):
    def handle(self, *args, **options):
        # Do normal migrate
        super().handle(*args, **options)
        # Then our custom extras
        create_constraints()
(The custom management commands documentation is likely helpful for writing your own command.)
Overriding works because all throughout Django’s code, migrate is called as a command rather than imported directly. This is done with the call_command function. So, if you’ve overridden migrate, your new version is called instead of the one from core.
We can see this in Django’s test framework code. Its setup_databases function calls each connection’s create_test_db method. These in turn run call_command('migrate') like so (as of version 2.2.3):
# We report migrate messages at one level lower than that requested.
# This ensures we don't get flooded with messages during testing
# (unless you really ask to be flooded).
call_command(
    "migrate",
    verbosity=max(verbosity - 1, 0),
    interactive=False,
    database=self.connection.alias,
    run_syncdb=True,
)
You can override any built-in command.
Evaluation
This approach is more flexible than using RunSQL in migrations. We can add any code we want before or after migrate runs - or even “during” with a context manager.
The major drawback here is we can only override once, in a single app, so it could feel a bit clumsy if we have several app-specific extensions. However, for most projects, I’d recommend keeping it simple.
Having a single “project app” can work really well - I endorse the recommendation in Kristian Glass’ Unofficial FAQ. If you already have multiple apps, you can make one the “core,” have it contain your custom migrate, and then import code from the others. This will be just fine.
That said, sometimes we want looser coupling, for example when creating third party packages. So let’s look at a final approach.
3. Add a post_migrate signal handler
This approach is slightly more advanced again. It uses Django’s signals, which have a mixed reputation due to their “action at a distance.”
Django sends the post_migrate signal at the very end of migration operations. You can see this in the migrate source code.
To run some extra code at that point, write it as a signal handler. Registering a signal handler is best done in an AppConfig.ready() method, which Django will call at initialization time.
For an example, let’s look at Django’s contenttypes framework. This is included as django.contrib.contenttypes. It uses a post_migrate signal handler to create one ContentType model instance for each model.
The create_contenttypes handler is registered in its AppConfig.ready() like so:
class ContentTypesConfig(AppConfig):
    name = "django.contrib.contenttypes"
    verbose_name = _("Content Types")

    def ready(self):
        pre_migrate.connect(inject_rename_contenttypes_operations, sender=self)
        post_migrate.connect(create_contenttypes)
        checks.register(check_generic_foreign_keys, checks.Tags.models)
        checks.register(check_model_name_lengths, checks.Tags.models)
The handler is defined in the app’s management/__init__.py. This is not the most descriptive filename to contain a signal handler - I’d normally use a handlers.py within the app. The contenttypes framework has it there for historical reasons.
In Django 2.2.3, create_contenttypes is defined like so:
def create_contenttypes(
    app_config,
    verbosity=2,
    interactive=True,
    using=DEFAULT_DB_ALIAS,
    apps=global_apps,
    **kwargs
):
    """
    Create content types for models in the given app.
    """
    if not app_config.models_module:
        return
    app_label = app_config.label
    try:
        app_config = apps.get_app_config(app_label)
        ContentType = apps.get_model("contenttypes", "ContentType")
    except LookupError:
        return
    content_types, app_models = get_contenttypes_and_models(
        app_config, using, ContentType
    )
    if not app_models:
        return
    cts = [
        ContentType(
            app_label=app_label,
            model=model_name,
        )
        for (model_name, model) in app_models.items()
        if model_name not in content_types
    ]
    ContentType.objects.using(using).bulk_create(cts)
    if verbosity >= 2:
        for ct in cts:
            print("Adding content type '%s | %s'" % (ct.app_label, ct.model))
There is quite a lot of logic here. We can ignore most of it right now though, as it’s use-case specific, but you can get the gist by reading it.
The interesting thing to look at is the function signature.
Signal handlers are only called with keyword arguments. For forwards compatibility, they should accept any extras at the end in **kwargs, so that unrecognized arguments added by the sender don’t break the handler - more loose coupling.
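The difference is easy to demonstrate with a plain-Python stand-in for a signal send. Nothing here is Django code: the send function, both handlers, and the extra new_argument are hypothetical and exist only to show the failure mode.

```python
def strict_handler(sender, app_config):
    """Accepts only the arguments that exist today."""
    return f"strict saw {app_config}"


def tolerant_handler(sender, app_config, **kwargs):
    """Accepts today's arguments plus anything added later."""
    return f"tolerant saw {app_config}"


def send(handlers, sender, **named):
    """Call every handler with keyword arguments only, collecting outcomes."""
    outcomes = []
    for handler in handlers:
        try:
            outcomes.append(handler(sender=sender, **named))
        except TypeError as exc:
            # An unexpected keyword argument ends up here.
            outcomes.append(exc)
    return outcomes


handlers = [strict_handler, tolerant_handler]

# The sender as it behaves today: both handlers cope.
today = send(handlers, sender="migrate", app_config="myapp")

# The sender after gaining a new argument: only the tolerant handler copes.
tomorrow = send(handlers, sender="migrate", app_config="myapp", new_argument=1)
```

In the second call the strict handler fails with a TypeError about an unexpected keyword argument, while the handler with **kwargs carries on unchanged.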
The arguments listed here are all as per the post_migrate documentation:
- app_config is the current AppConfig - the signal is sent once for each app.
- verbosity is the current logging level.
- interactive tells us if it’s safe to ask the user for input.
- using is the database connection alias, which will vary from DEFAULT_DB_ALIAS when using multiple databases.
- apps is an application registry containing the specific state of all models after the migrations have run. Because the user might not have run every migration available, this should be used to access model classes, instead of direct imports.
For writing most custom handlers, I think two of these are the most useful.
First, app_config can be used to restrict your handler to only run for a specific app or set of apps. You can also do this with the sender argument to Signal.connect().
Second, using is worth passing through to any database operations you perform, if only for future proofing. Even if your project uses a single database now, it might not in the future, so you should make sure you operate on the same connection that was migrated.
Evaluation
As we’ve discussed, the benefit of this approach is the looser coupling. If you’re writing a third party package, this is probably the way to go, as it reduces the number of things that users need to install. Signals do require caution, but since we’ve seen that Django itself uses this in a contrib app, it’s a sanctioned use.