Synchronizing Django model definitions
2018-02-25
This is about a small problem we faced with the models used for customers in YPlan, now Time Out Checkout.
Customers are stored in two models:
Customer for active customers, and
RemovedCustomer for inactive customers.
When a customer closes their account, a subset of the fields are copied to
RemovedCustomer, to comply with data retention policies, and then the original
Customer is wiped.
The two models are defined something like this:
    class Customer(models.Model):
        name = models.CharField(max_length=128, blank=True)
        email = models.CharField(max_length=128, null=True, unique=True)
        # etc.


    class RemovedCustomer(models.Model):
        name = models.CharField(max_length=128, blank=True)
        email = models.CharField(max_length=128, null=True)
        # etc.
RemovedCustomer.email is not unique, because the same email address could be used for multiple accounts that get removed one after another.
The problem we faced was keeping the definitions of these fields synchronized, aside from intentional differences like unique on email.
Initially the two model classes were declared in the usual way, as above, with the field definitions copy-pasted.
This meant that changes to one model needed copying to the other.
Unfortunately this got forgotten when a field on Customer had its max_length extended, so the change wasn't copied to RemovedCustomer, and the account close function broke for customers using the new, longer max_length, as their data couldn't be copied into RemovedCustomer.
The solution was obvious: we wanted a way to declare that one field should be the same as another, allowing for overrides like unique=False.
Firstly, there is nothing special about constructing a field in a Django model’s class body.
Python class bodies are code contexts like any other, populating a
dict that goes on to become the class.
Any model ‘magic’ from Django happens after the class body finishes executing, when its Model metaclass rearranges the fields in the class
dict and does other processing.
Therefore we don’t need to use field classes to create field objects - we can use a function that returns one instead, for example:
    class RemovedCustomer(models.Model):
        name = plz_clone_field(Customer, "name")
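To see why this works, here is a minimal plain-Python sketch with no Django involved. ToyField, ToyModelMeta and make_field are hypothetical names invented for illustration; the metaclass only loosely imitates the idea of Django's model metaclass, which does far more. The point is that the metaclass runs after the class body and only sees the resulting objects, not how they were built.

```python
class ToyField:
    """Hypothetical stand-in for a Django field; not Django's API."""

    def __init__(self, max_length):
        self.max_length = max_length


class ToyModelMeta(type):
    """Toy metaclass: runs only after the class body has finished executing."""

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        # Collect whichever field objects ended up in the class namespace.
        cls.field_names = [
            key for key, value in namespace.items() if isinstance(value, ToyField)
        ]
        return cls


def make_field(max_length):
    # An ordinary function call; the metaclass cannot tell the difference.
    return ToyField(max_length=max_length)


class Example(metaclass=ToyModelMeta):
    direct = ToyField(max_length=128)
    via_function = make_field(128)
```

Both attributes end up in Example.field_names, because by the time the metaclass runs they are just ToyField instances sitting in the class namespace.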
Secondly, Django fields are fairly easy to clone.
They can’t be copied with
copy.deepcopy(), because they get ‘attached’ to the model class by the model meta ‘magic’.
However, they do have a handy method called
deconstruct(), used for serializing in migrations, which returns a 4-tuple that describes how to reconstruct the field object.
Using this we can create a fresh clone of a field, doing something like:
    from django.utils.module_loading import import_string

    name, klass_path, fargs, fkwargs = field.deconstruct()
    field_class = import_string(klass_path)
    new_field = field_class(*fargs, **fkwargs)
In our code we created a simple function
clone_field based on this snippet.
Given a model class, the name of a field to clone from it, and any keyword-arg overrides, it returns a clone of that field.
Using it for our models above, it looks like:
    from django.utils.module_loading import import_string


    def clone_field(model_class, name, **kwargs):
        name, klass_path, fargs, fkwargs = model_class._meta.get_field(name).deconstruct()
        fkwargs.update(kwargs)
        field_class = import_string(klass_path)
        return field_class(*fargs, **fkwargs)


    class RemovedCustomer(models.Model):
        name = clone_field(Customer, "name")
        email = clone_field(Customer, "email", unique=False)
This elegantly declares what to copy with any differences.
Because this happens at class definition time, it can’t affect any of the model meta ‘magic’, as the fields ‘look’ as if they were normally constructed.
And this prevents the bug we saw - a change in, e.g.,
Customer.name would be synchronized to
RemovedCustomer.name automatically, and Django migrations would detect it for both models equally.
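The deconstruct-and-rebuild round trip, including how overrides are merged, can also be tried outside a Django project. The following is a toy sketch under stated assumptions: ToyField, TOY_CLASSES and toy_clone_field are hypothetical names, the dotted path is made up, and a dictionary lookup stands in for import_string. It only mirrors the shape of the 4-tuple described above, not Django's actual implementation.

```python
class ToyField:
    """Hypothetical stand-in for a Django field; not Django's API."""

    PATH = "toy.fields.ToyField"  # made-up dotted path, for illustration only

    def __init__(self, max_length, unique=False):
        self.max_length = max_length
        self.unique = unique

    def deconstruct(self):
        # Same 4-tuple shape as in the article: name, path, args, kwargs.
        kwargs = {"max_length": self.max_length, "unique": self.unique}
        return None, self.PATH, [], kwargs


# Stand-in for import_string: maps a dotted path to a class.
TOY_CLASSES = {ToyField.PATH: ToyField}


def toy_clone_field(field, **overrides):
    _name, path, args, kwargs = field.deconstruct()
    kwargs.update(overrides)  # overrides win over the source field's arguments
    field_class = TOY_CLASSES[path]
    return field_class(*args, **kwargs)


source_email = ToyField(max_length=254, unique=True)
cloned_email = toy_clone_field(source_email, unique=False)
```

The clone is a separate object that picks up max_length from the source, so changing the source definition changes the clone too, while unique stays overridden.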