What happens when you run manage.py test?

2020-09-05 A tangle of classes

This is a blog post version of the talk I gave at DjangoCon Australia 2020 today. The video is on YouTube and the slides are on GitHub (including full example code).

You run your tests with manage.py test. You know what happens inside your tests, since you write them. But how does the test runner work to execute them, and put the dots, E’s, and F’s on your screen?

When you learn how Django middleware works, you unlock a huge number of use cases, such as changing cookies, setting global headers, and logging requests. Similarly, learning how your tests run will help you customize the process, for example loading tests in a different order, configuring test settings without a separate file, or blocking outgoing HTTP requests.

In this post, we’ll make a vital customization of our test run’s output - we’ll swap the “dots and letters” default style to use emojis to represent test success, failure, etc:

$ python manage.py test
Creating test database for alias 'default'...
System check identified no issues (0 silenced).
💥❎❌⏭✅✅✅✳️

...

----------------------------------------------------------------------
Ran 8 tests in 0.003s

FAILED (failures=1, errors=1, skipped=1, expected failures=1, unexpected successes=1)
Destroying test database for alias 'default'...

But before we can write that, we need to deconstruct the testing process.

Test Output

Let’s investigate the output from a test run. We’ll use a project containing only this vacuous test:

from django.test import TestCase


class ExampleTests(TestCase):
    def test_one(self):
        pass

When we run the tests, we get some familiar output:

$ python manage.py test
Creating test database for alias 'default'...
System check identified no issues (0 silenced).
.
----------------------------------------------------------------------
Ran 1 test in 0.001s

OK
Destroying test database for alias 'default'...

To investigate what’s going on, we can ask for more detail by increasing the verbosity to maximum with -v 3:

$ python manage.py test -v 3
Creating test database for alias 'default' ('file:memorydb_default?mode=memory&cache=shared')...
Operations to perform:
  Synchronize unmigrated apps: core
  Apply all migrations: (none)
Synchronizing apps without migrations:
  Creating tables...
    Running deferred SQL...
Running migrations:
  No migrations to apply.
System check identified no issues (0 silenced).
test_one (example.core.tests.test_example.ExampleTests) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.004s

OK
Destroying test database for alias 'default' ('file:memorydb_default?mode=memory&cache=shared')...

Okay great, that’s plenty! Let’s take it apart.

The first line, “Creating test database…”, reports the creation of a test database. If your project uses multiple databases, you’ll see one line per database.

I’m using SQLite in this example project, so Django has automatically set mode=memory in the database address. This makes database operations about ten times faster. Other databases like PostgreSQL don’t have such modes, but there are other techniques to run them in-memory.
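We can demonstrate this shared in-memory mode with Python’s sqlite3 module directly (the database name here is illustrative, not the one Django uses):

```python
import sqlite3

# Shared in-memory SQLite, as in Django's test database address.
# The database name "memorydb_demo" is illustrative.
uri = "file:memorydb_demo?mode=memory&cache=shared"

conn1 = sqlite3.connect(uri, uri=True)
conn1.execute("CREATE TABLE example (id INTEGER)")
conn1.execute("INSERT INTO example VALUES (1)")
conn1.commit()

# A second connection to the same URI sees the same in-memory database.
conn2 = sqlite3.connect(uri, uri=True)
rows = conn2.execute("SELECT id FROM example").fetchall()
print(rows)  # [(1,)]
```

The database exists only while at least one connection to it stays open, so nothing ever touches the disk.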

The second “Operations to perform” line and several following lines are the output of the migrate command on our test databases. This output is identical to what we’d see running manage.py migrate on an empty database. Here I’m using a small project with no migrations - on a typical project you’d see one line for each migration.

After that, we have the line saying “System check identified no issues”. This output is from Django’s system check framework, which runs a number of “preflight checks” to ensure your project is well configured. You can run it alone with manage.py check, and it also runs automatically as part of most management commands. Typically the checks run first, before a command’s main work, but for tests they’re deferred until after the test databases are ready, since some checks use database connections.

You can write your own checks to detect configuration bugs. Because they run first, they’re sometimes a better fit than writing a test. I’d love to go into more detail, but that would be another post’s worth of content.

The following lines cover our tests. By default the test runner only prints a single character per test, but with a higher verbosity Django prints a whole line per test. Here we only have one test, “test_one”, and as it finished running, the test runner appended its status “ok” to the line.

To signify the end of the run, there’s a divider made with many dashes: “----------”. If we had any failures or errors, their stack traces would appear before this divider. This is followed by a summary of the tests that ran, their total runtime, and “OK” to indicate the test run was successful.

The final line reports the destruction of our test database.

This gives us a rough order of steps in a test run:

  1. Create the test databases.
  2. Migrate the databases.
  3. Run the system checks.
  4. Run the tests.
  5. Report on the test count and success/failure.
  6. Destroy the test databases.

Let’s track down which components inside Django are responsible for these steps.

Django and unittest

As you may be aware, Django’s test framework extends the unittest framework from the Python standard library. Every component responsible for the above steps is either built into unittest or one of Django’s extensions. We can represent this with a basic diagram (included in the slides).

We can find the components on each side by tracing through the code.

The “test” Management Command

The first place to look is the test management command, which Django finds and executes when we run manage.py test. This lives in django.core.management.commands.test.

As management commands go, it’s quite short - under 100 lines. Its handle() method is mostly concerned with handing off to a “Test Runner”. Simplifying it down to three key lines:

def handle(self, *test_labels, **options):
    TestRunner = get_runner(settings, options['testrunner'])
    ...
    test_runner = TestRunner(**options)
    ...
    failures = test_runner.run_tests(test_labels)
    ...

(Full source.)

So what’s this TestRunner class? It’s a Django component that coordinates the test process. It’s customizable, but the default class, and the only one in Django itself, is django.test.runner.DiscoverRunner. Let’s look at that next!

The DiscoverRunner Class

DiscoverRunner is the main coordinator of the test process. It handles adding extra arguments to the management command, creating and handing off to sub-components, and doing some environment setup.

It starts like this:

class DiscoverRunner:
    """A Django test runner that uses unittest2 test discovery."""

    test_suite = unittest.TestSuite
    parallel_test_suite = ParallelTestSuite
    test_runner = unittest.TextTestRunner
    test_loader = unittest.defaultTestLoader

(Documentation, Full source.)

These class-level attributes point to other classes that perform different steps in the test process. You can see most of them are unittest components.

Note that one of them is called test_runner, so we have two distinct concepts called “test runner” - Django’s DiscoverRunner and unittest’s TextTestRunner. DiscoverRunner does a lot more than TextTestRunner and has a different interface. Perhaps Django could have used a different name for DiscoverRunner, like TestCoordinator, but it’s probably too late to change that now.

The main flow in DiscoverRunner is in its run_tests() method. Stripping out a bunch of details, run_tests() looks something like this:

def run_tests(self, test_labels, extra_tests=None, **kwargs):
    self.setup_test_environment()
    suite = self.build_suite(test_labels, extra_tests)
    databases = self.get_databases(suite)
    old_config = self.setup_databases(aliases=databases)
    self.run_checks(databases)
    result = self.run_suite(suite)
    self.teardown_databases(old_config)
    self.teardown_test_environment()
    return self.suite_result(suite, result)

That’s quite a few steps! Many of the called methods correspond with steps on our above list:

  - setup_databases() creates the test databases (step 1) and runs migrations on them (step 2).
  - run_checks() runs the system checks (step 3).
  - suite_result() reports on the test count and success/failure (step 5).
  - teardown_databases() destroys the test databases (step 6).

A couple of other methods are things we may have expected: setup_test_environment() and teardown_test_environment() make testing-specific changes to global state, such as swapping in an in-memory email backend, and get_databases() works out which databases the test suite actually uses.

All these methods are useful to investigate for customizing those parts of the test process. But they’re all part of Django itself. The remaining methods hand off to components in unittest - build_suite() and run_suite(). Let’s investigate those in turn.

build_suite()

build_suite() is concerned with finding the tests to run, and putting them into a “suite” object. It’s again a long method, but simplified, it looks something like this:

def build_suite(self, test_labels=None, extra_tests=None, **kwargs):
    suite = self.test_suite()
    test_labels = test_labels or ['.']

    for label in test_labels:
        tests = self.test_loader.loadTestsFromName(label)
        suite.addTests(tests)

    if self.parallel > 1:
        suite = self.parallel_test_suite(suite, self.parallel, self.failfast)

    return suite

This method uses three of the four classes that we saw DiscoverRunner refers to:

  - test_suite (unittest.TestSuite) - the container for the found tests.
  - test_loader (unittest.defaultTestLoader) - loads tests from each label.
  - parallel_test_suite (ParallelTestSuite) - wraps the suite when running tests in parallel.
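The simplified flow above can be reproduced with plain unittest, here loading tests from an inline test case rather than from a dotted label:

```python
import unittest


# An inline stand-in for a discovered test case.
class ExampleTests(unittest.TestCase):
    def test_one(self):
        pass


# Mirror build_suite(): the loader finds tests, the suite collects them.
loader = unittest.defaultTestLoader
suite = unittest.TestSuite()
suite.addTests(loader.loadTestsFromTestCase(ExampleTests))
print(suite.countTestCases())  # 1
```

Django’s version uses loadTestsFromName() so that labels like “example.core.tests” work, but the loader-and-suite shape is the same.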

run_suite()

The other DiscoverRunner method to look into is run_suite(). We don’t need to simplify this one - its entire implementation looks like this:

def run_suite(self, suite, **kwargs):
    kwargs = self.get_test_runner_kwargs()
    runner = self.test_runner(**kwargs)
    return runner.run(suite)

Its only concern is constructing a test runner and telling it to run the constructed test suite. This is the final one of the unittest components referred to by a class attribute: unittest.TextTestRunner, the default runner for outputting results as text, as opposed to, for example, an XML file for communicating results to your CI system.
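We can drive TextTestRunner directly with plain unittest, much as run_suite() does, using an inline test case:

```python
import io
import unittest


class ExampleTests(unittest.TestCase):
    def test_one(self):
        pass


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)

# The runner writes its textual report to a stream (stderr by default).
stream = io.StringIO()
runner = unittest.TextTestRunner(stream=stream)
result = runner.run(suite)
print(result.wasSuccessful())  # True
```

The report written to the stream is the familiar “Ran 1 test in …s / OK” output we saw earlier.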

Let’s finish our investigation by looking inside that class.

The TextTestRunner Class

This component inside unittest takes a test case or suite, and executes it. It starts:

class TextTestRunner(object):
    """A test runner class that displays results in textual form.
    It prints out the names of tests as they are run, errors as they
    occur, and a summary of the results at the end of the test run.
    """
    resultclass = TextTestResult

    def __init__(self, ..., resultclass=None, ...):

(Full source.)

Similarly to DiscoverRunner, it uses a class-level attribute to refer to another class. The default TextTestResult class is the thing that actually writes the text-based output. Unlike DiscoverRunner’s class references, we can override resultclass by passing an alternative to TextTestRunner.__init__().
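Here’s a sketch of that override with plain unittest, using a hypothetical result class that writes the word “pass” for each success:

```python
import io
import unittest


class WordyResult(unittest.TextTestResult):
    # Hypothetical result class: write "pass" for each success.
    def addSuccess(self, test):
        super().addSuccess(test)
        self.stream.write("pass")


class ExampleTests(unittest.TestCase):
    def test_one(self):
        pass


stream = io.StringIO()
runner = unittest.TextTestRunner(stream=stream, resultclass=WordyResult)
result = runner.run(
    unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
)
print("pass" in stream.getvalue())  # True
```

This is exactly the seam we’ll use for the emoji output: pass a custom result class and the runner calls it for every test event.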

We’re now ready to customize the test process. But first, let’s review our investigation.

A Map

We can now expand our map to show the classes we’ve found (see the slides).

There’s certainly more detail we could add, such as the contents of several important methods in DiscoverRunner. But what we have is more than enough to implement many useful customizations.

How to Customize

Django offers us two ways to customize the test running process:

  1. Overriding the test management command with our own version.
  2. Pointing the TEST_RUNNER setting at a custom test runner class.

Because the test command is so simple, most of the time we’ll customize by overriding DiscoverRunner. Since DiscoverRunner refers to the unittest components via class-level attributes, we can replace them by redefining the attributes in our custom subclass.

Super Fast Test Runner

For a basic example, imagine we want to skip all our tests and report success every time. We can do this by creating a DiscoverRunner subclass with a new run_tests() method that doesn’t call its super() method:

# example/test.py
from django.test.runner import DiscoverRunner


class SuperFastTestRunner(DiscoverRunner):
    def run_tests(self, *args, **kwargs):
        print("All tests passed! A+")
        failures = 0
        return failures

Then we use it in our settings file like so:

TEST_RUNNER = "example.test.SuperFastTestRunner"

When we run manage.py test now, it completes in record time!

$ python manage.py test
All tests passed! A+

Great, that is very useful.

Let’s finally get on to our much more practical emoji example.

Emoji-Based Output

From our investigation we found that the unittest component TextTestResult is responsible for performing the output. We can replace it in DiscoverRunner by having it pass a value for resultclass to TextTestRunner.

Django already has some options to swap resultclass, for example the --debug-sql option which prints executed queries for failing tests.

DiscoverRunner.run_suite() constructs TextTestRunner with arguments from the DiscoverRunner.get_test_runner_kwargs() method:

def get_test_runner_kwargs(self):
    return {
        'failfast': self.failfast,
        'resultclass': self.get_resultclass(),
        'verbosity': self.verbosity,
        'buffer': self.buffer,
    }

This in turn calls get_resultclass(), which returns a different class if one of two test command options (--debug-sql or --pdb) have been used:

def get_resultclass(self):
    if self.debug_sql:
        return DebugSQLTextTestResult
    elif self.pdb:
        return PDBDebugResult

If neither option is set, this method implicitly returns None, telling TextTestRunner to use its default resultclass, TextTestResult. We can detect this None in our custom subclass and replace it with our own TextTestResult subclass:

class EmojiTestRunner(DiscoverRunner):
    def get_resultclass(self):
        klass = super().get_resultclass()
        if klass is None:
            return EmojiTestResult
        return klass

Our EmojiTestResult class extends TextTestResult and replaces the default “dots” output with emoji. It ends up being quite long since it has one method for each type of test result:

class EmojiTestResult(unittest.TextTestResult):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # If the "dots" style was going to be used, show emoji instead
        self.emojis = self.dots
        self.dots = False

    def addSuccess(self, test):
        super().addSuccess(test)
        if self.emojis:
            self.stream.write('✅')
            self.stream.flush()

    def addError(self, test, err):
        super().addError(test, err)
        if self.emojis:
            self.stream.write('💥')
            self.stream.flush()

    def addFailure(self, test, err):
        super().addFailure(test, err)
        if self.emojis:
            self.stream.write('❌')
            self.stream.flush()

    def addSkip(self, test, reason):
        super().addSkip(test, reason)
        if self.emojis:
            self.stream.write("⏭")
            self.stream.flush()

    def addExpectedFailure(self, test, err):
        super().addExpectedFailure(test, err)
        if self.emojis:
            self.stream.write("❎")
            self.stream.flush()

    def addUnexpectedSuccess(self, test):
        super().addUnexpectedSuccess(test)
        if self.emojis:
            self.stream.write("✳️")
            self.stream.flush()

    def printErrors(self):
        if self.emojis:
            self.stream.writeln()
        super().printErrors()

After pointing the TEST_RUNNER setting at EmojiTestRunner, we can run tests and see our emoji:

$ python manage.py test
Creating test database for alias 'default'...
System check identified no issues (0 silenced).
💥❎❌⏭✅✅✅✳️

...

----------------------------------------------------------------------
Ran 8 tests in 0.003s

FAILED (failures=1, errors=1, skipped=1, expected failures=1, unexpected successes=1)
Destroying test database for alias 'default'...

Yay! 👍

No Composition

After our spelunking, we’ve seen that the unittest design is relatively straightforward. We can swap classes for subclasses to change any behaviour in the test process.

This works for some project-specific customizations, but it’s not easy to combine customizations from different sources. This is because the design uses inheritance rather than composition. To combine customizations, we have to use multiple inheritance across the web of classes, and whether that works depends very much on how each customization is implemented. Because of this, there isn’t really a plugin ecosystem for unittest.

I know of only two libraries providing custom DiscoverRunner subclasses. I haven’t tried, but combining them may not work, since they both override parts of the output process.

In contrast, pytest has a flourishing ecosystem with over 700 plugins. This is because its design uses composition with hooks, which act similarly to Django signals. Plugins register for only the hooks they need, and pytest calls every registered hook function at the corresponding point of the test process. Many of pytest’s built-in features are even implemented as plugins.
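As a taste of the hook style, a minimal pytest plugin is just a function with a recognized hook name. This sketch, which would live in a conftest.py, reverses the collected test order:

```python
# A minimal sketch of a pytest hook, as would live in conftest.py.
# pytest discovers the function by its name alone - no subclassing needed.
def pytest_collection_modifyitems(config, items):
    # Run the collected tests in reverse order.
    items.reverse()
```

Because it’s a standalone function rather than an override in a shared class hierarchy, it composes freely with any other installed plugins.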

If you’re interested in heavier customization of your test process, do check out pytest.

Fin

Thank you for joining me on this tour. I hope you’ve learned something about how Django runs your tests, and can make your own customizations.

—Adam


Working on a Django project? Check out my book Speed Up Your Django Tests which covers loads of best practices so you can write faster, more accurate tests.

