Solving Algorithmic Problems in Python with pytest

2019-04-21 pytest Flute

I occasionally enjoy solving algorithmic problems, for example Project Euler or The Advent of Code. I’ve been doing them since I was a pimply PHP-slinging teenager, competing with my peers across the UK in the British Informatics Olympiad. It was terribly nerdy and terribly fun.

Algorithmic problems are often used in technical interviews, which I find less fun. Strict deadlines stress me out.

In either case, using automated tests makes these problems way easier. I wish I knew this earlier!

Online algorithmic platforms, such as Codility, provide their own automated tests. You use these for basic validation on some examples, but edge cases are sneakily reserved for final marking. To ensure your solution covers all the bases, you need to write your own tests.

In Python, there’s no better test framework than pytest - I use it on all my open source projects. It makes it easy for you to add tests for any Python code. And it’s much better than Python’s built-in unittest.

unittest is an old horse and cart, while pytest is the batmobile.

An Example

Let’s look at a small problem together using pytest. Here’s the problem statement:

Write a function solve(items) that takes items, a list of integers, and returns the smallest integer greater than 0 that appears in the list, or 0 if there is none.
For example solution([1, 2]) should return 1, and solution([-1]) should return 0.
The list items will contain up to 1,000,000 integers.

First, let’s write the most basic sketch of a solve function. Since 0 is a default value to return, if we always return that we can score some easy points. Create a file called example.py with the contents:

def solve(items):
    return 0

Let’s test the function manually by running it in the Python 3 interpreter. Open the command line and run the examples:

$ python
Python 3.7.3 (default, Apr 14 2019, 21:08:36)
[Clang 10.0.1 (clang-1001.0.46.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from example import solve
>>> solve([1, 2])
0
>>> solve([-1])
0

So we failed on the first example, but succeeded with the second.

Doing this each time we modify the code would get tiresome quickly. We have to restart the interpreter, rerun the import and tests, and manually compare the outputs. Writing automated tests now, before we’ve tried to solve the problem properly, makes the whole loop faster. It also makes us think about all the cases up front.

Let’s write the tests and solve the problem with the Red, Green, Refactor pattern.

Red

First, let’s add tests that fail.

pytest can work with tests in the same file as our code, as simple functions with names starting test_. We use Python’s assert statement to compare the results of our code with our expectations.

Add tests for the two examples:

def solve(items):
    return 0


def test_example_1():
    assert solve([1, 2]) == 1


def test_example_2():
    assert solve([-1]) == 0

Then install pytest and run the tests:

$ python -m pip install pytest
Collecting pytest
  Downloading
...
Successfully installed atomicwrites-1.3.0 attrs-19.1.0 more-itertools-7.0.0 pluggy-0.9.0 py-1.8.0 pytest-4.4.1 six-1.12.0
$ pytest example.py
============================= test session starts ==============================
platform darwin -- Python 3.7.3, pytest-4.4.1, py-1.8.0, pluggy-0.chainz.0
rootdir: /Users/chainz/tmp/pytest_algos
collected 2 items

example.py F.                                                            [100%]

=================================== FAILURES ===================================
________________________________ test_example_1 ________________________________

    def test_example_1():
>       assert solve([1, 2]) == 1
E       assert 0 == 1
E        +  where 0 = solve([1, 2])

example.py:5: AssertionError
====================== 1 failed, 1 passed in 0.03 seconds ======================

In the middle of the output, we see the tests in example.py running. pytest outputs an F for the failing test, and a dot . for the passing one.

The FAILURES section expands on the failure with the line that failed. It then shows that solve([1, 2]) returned 0 which doesn’t compare equal to the expected value 1.

The tests we have are a start but they don’t cover all the cases. We don’t have any lists that contain 0. Also two extremes are missing: the empty list and a list of 1,000,000 items. Long lists could also contain lots of negative or positive integers, so we should try both of these.

Let’s add more tests. We can make a list of 1,000,000 items in Python by multiplying a short list. Add these tests at the end of the file:

def test_empty_list():
    assert solve([]) == 0


def test_list_with_zero():
    assert solve([0, 2]) == 2


def test_long_positive_list():
    assert solve([1, 2, 3, 4] * 250_000) == 1


def test_long_negative_list():
    assert solve([-1, -2, -3, -4] * 250_000) == 0

Run them again:

============================= test session starts ==============================
platform darwin -- Python 3.7.3, pytest-4.4.1, py-1.8.0, pluggy-0.9.0
rootdir: /Users/chainz/tmp/pytest_algos
collected 6 items

example.py F..FF.                                                        [100%]

=================================== FAILURES ===================================
________________________________ test_example_1 ________________________________

    def test_example_1():
>       assert solve([1, 2]) == 1
E       assert 0 == 1
E        +  where 0 = solve([1, 2])

example.py:6: AssertionError
_____________________________ test_list_with_zero ______________________________

    def test_list_with_zero():
>       assert solve([0, 2]) == 2
E       assert 0 == 2
E        +  where 0 = solve([0, 2])

example.py:18: AssertionError
___________________________ test_long_positive_list ____________________________

    def test_long_positive_list():
>       assert solve([1, 2, 3, 4] * 250_000) == 1
E       assert 0 == 1
E        +  where 0 = solve(([1, 2, 3, 4] * 250000))

example.py:22: AssertionError
====================== 3 failed, 3 passed in 0.08 seconds ======================

There are more failures - which is good!

Green

Now we can work to a solution that makes all the tests pass. Replace solve with this first attempt:

def solve(items):
    minimum = 0
    for i in items:
        if minimum == 0 or i < minimum:
            minimum = i
    return minimum

Re-run the tests:

============================= test session starts ==============================
platform darwin -- Python 3.7.3, pytest-4.4.1, py-1.8.0, pluggy-0.9.0
rootdir: /Users/chainz/tmp/pytest_algos
collected 6 items

example.py .F...F                                                        [100%]

=================================== FAILURES ===================================
________________________________ test_example_2 ________________________________

    def test_example_2():
>       assert solve([-1]) == 0
E       assert -1 == 0
E        +  where -1 = solve([-1])

example.py:14: AssertionError
___________________________ test_long_negative_list ____________________________

    def test_long_negative_list():
>       assert solve([-1, -2, -3, -4] * 250_000) == 0
E       assert -4 == 0
E        +  where -4 = solve(([-1, -2, -3, -4] * 250000))

example.py:30: AssertionError
====================== 2 failed, 4 passed in 0.17 seconds ======================

We have two different failures now. It seems we have yet to ignore negative numbers, whoops!

Modify the solution to do so, by adding the i > 0 condition:

def solve(items):
    minimum = 0
    for i in items:
        if i > 0 and (minimum == 0 or i < minimum):
            minimum = i
    return minimum

Now re-run the tests:

============================= test session starts ==============================
platform darwin -- Python 3.7.3, pytest-4.4.1, py-1.8.0, pluggy-0.9.0
rootdir: /Users/chainz/tmp/pytest_algos
collected 6 items

example.py ......                                                        [100%]

=========================== 6 passed in 0.11 seconds ===========================

Yay, they’re all green!

Refactor

This code we have is functional and clear. However, it’s not the most idiomatic Python. We can use built-ins to do much of the work here.

Built-ins work as higher-level building blocks that make this algorithm clearer and can run a bit faster. Using them also improves our ability to work on harder problems.

Since we have tests, we can refactor the code and verify it again quickly. And we can always undo!

Refactor the code to use a generator to remove the negative numbers, and min to find the smallest, with the default argument to return the default 0 (added in Python 3.4):

def solve(items):
    return min((i for i in items if i > 0), default=0)

Re-running the tests produces the same, all-passing output as above. Yay, we got the code down to one line!

Fin

I hope this tutorial has been useful, and you use this technique to level up your algorithmic game,

—Adam

Thanks to Mafalda Marques for testing this tutorial.

😸😸😸 Check out my new book on using GitHub effectively, Boost Your GitHub DX! 😸😸😸

One summary email a week, no spam, I pinky promise.

Related posts:

Tags: pytest, python