Using Django Check Constraints to Limit the Range of an IntegerField

2021-05-08 Measure twice, cut out bad data.

Another way to use database constraints via Django’s CheckConstraint class.

A classic bit of data validation is to check input values lie within the expected range. This can prevent obvious data accidents, such as the NHS recently recording a journalist’s height as 6cm, and thus calculating his BMI as 28,000(!).

Django’s built-in numerical fields have ranges that match the limits that databases support. For example, IntegerField supports the range −2,147,483,648 (−231) to 2,147,483,647 (231 − 1). Most real world numbers lie in much more limited ranges, so we can have our application reject obviously wrong numbers.

Example

Imagine we have a Book model with a field for the page count, which we know only for some books. We know the page count cannot be negative, so we would use a PositiveIntegerField:

from django.db import models


class Book(models.Model):
    ...
    page_count = models.PositiveIntegerField(null=True)

This is a great start but the maximum value of 231 − 1 is still really high.

With a little bit of research we can find Wikipedia’s list of longest novels page. This pegs the longest (work-in-progress) novel at 22,400 pages (Venmurasu). If we round this figure up to 25,000 pages for our upper bound, we can reject outlandishly wrong values. A common mistake could be mixing up word count and page count, which such a bound would prevent as most books have more than 25,000 words.

We also know that books have at least 1 page, so we can add that as a lower bound. For books where the page count is not known, we want to use NULL rather than 0.

Adding a check constraint and form validation

We can set up these bounds in a CheckConstraint.

First, we add the constraint to Meta.constraints:

from django.db import models


class Book(models.Model):
    ...
    page_count = models.PositiveIntegerField(null=True)

    class Meta:
        constraints = [
            models.CheckConstraint(
                name="%(app_label)s_%(class)s_page_count_range",
                check=models.Q(page_count__range=(1, 25_000)),
            ),
        ]

Our check here uses a range lookup. This does an inclusive check for the value between the two bounds, which we pass as a tuple. We don’t need to special-case the NULL values - any comparison with NULL results in NULL, which the constraint interprets as “pass”.

Second, we generate the migration:

$ ./manage.py makemigrations core
Migrations for 'core':
  example/core/migrations/0002_book_core_book_page_count.py
    - Create constraint core_book_page_count on model book

We open the migration to check it looks correct:

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ("core", "0001_initial"),
    ]

    operations = [
        migrations.AddConstraint(
            model_name="book",
            constraint=models.CheckConstraint(
                check=models.Q(("page_count__range", (1, 25000))),
                name="core_book_page_count_range",
            ),
        ),
    ]

All looks good. There are minor differences in the CheckConstraint definition, as the migrations framework has normalized the order of arguments and the construction of the Q() object.

Third, we can add a couple of tests to ensure that our constraint works as expected:

from django.db import IntegrityError
from django.test import TestCase

from example.core.models import Book


class BookTests(TestCase):
    def test_page_count_range_constraint_too_few_pages(self):
        constraint_name = "core_book_page_count_range"
        with self.assertRaisesMessage(IntegrityError, constraint_name):
            Book.objects.create(page_count=0)

    def test_page_count_range_constraint_too_many_pages(self):
        constraint_name = "core_book_page_count_range"
        with self.assertRaisesMessage(IntegrityError, constraint_name):
            Book.objects.create(page_count=25_001)

The tests attempt to store out-of-range values to trigger the IntegrityError from the database. We assert that the error message contains the constraint name, to ensure that we aren’t accidentally triggering a different IntegrityError.

Fourth, we need to consider what we should do with existing bad data. If we try and apply our new migration with bad data in the database, it will crash with another IntegrityError.

For this example, let’s assume we can discard out-of-range page counts and replace them with NULL. We can add a short RunPython operation to our migration to do this:

from django.db import migrations, models


def forwards_func(apps, schema_editor):
    Book = apps.get_model("core", "Book")
    db_alias = schema_editor.connection.alias
    Book.objects.using(db_alias).exclude(page_count__range=(1, 25_000)).update(
        page_count=None
    )


class Migration(migrations.Migration):

    dependencies = [
        ("core", "0001_initial"),
    ]

    operations = [
        migrations.RunPython(
            code=forwards_func,
            reverse_code=migrations.RunPython.noop,
            elidable=True,
        ),
        migrations.AddConstraint(
            model_name="book",
            constraint=models.CheckConstraint(
                check=models.Q(("page_count__range", (1, 25000))),
                name="core_book_page_count_range",
            ),
        ),
    ]

Note:

Now the migration can run when bad data exists.

Fifth, we should consider how to present friendly error messages to users when entering values outside of the range. By default, if we try to store data rejected by a CheckConstraint, Django will only raise the IntegrityError exception. We want to add validation for forms so that users can see and correct mistakes.

IntegerField and its subclasses already add form validators to check values lie in their supported ranges. We can follow this example and add our own validators for the new, more limited range. Adding the validators on the model field means Django can copy them to any form fields derived from the model with ModelForm.

We can add the validators to Field.validators like so:

from django.core.validators import MaxValueValidator, MinValueValidator
from django.db import models


class Book(models.Model):
    ...
    page_count = models.PositiveIntegerField(
        null=True,
        validators=[MinValueValidator(1), MaxValueValidator(25_000)],
    )

    class Meta:
        constraints = [
            models.CheckConstraint(
                name="%(app_label)s_%(class)s_page_count_range",
                check=models.Q(page_count__range=(1, 25_000)),
            ),
        ]

Because we’ve changed the field definition, we need to generate a new migration:

$ ./manage.py makemigrations core
Migrations for 'core':
  example/core/migrations/0003_alter_book_page_count.py
    - Alter field page_count on book

This migration uses AlterField to redefine the field with the new validators argument:

import django.core.validators
from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ("core", "0002_book_core_book_page_count"),
    ]

    operations = [
        migrations.AlterField(
            model_name="book",
            name="page_count",
            field=models.PositiveIntegerField(
                null=True,
                validators=[
                    django.core.validators.MinValueValidator(1),
                    django.core.validators.MaxValueValidator(25000),
                ],
            ),
        ),
    ]

We can show the SQL for this migration and check it won’t actually modify the database:

$ ./manage.py sqlmigrate core 0003
BEGIN;
--
-- Alter field page_count on book
--
COMMIT;

Nothing but the standard transaction BEGIN and END - great!

Fin

May your data always be true,

—Adam


🤩 Django-Related Sales for Black Friday Week 🤩


Subscribe via RSS, Twitter, or email:

One summary email a week, no spam, I pinky promise.

Related posts:

Tags: django