How to Build a Webhook Receiver in Django
2021-05-09
A common way to receive data in a web application is with a webhook. The external system pushes data to yours with an HTTP request.
Correctly receiving and processing webhook data can be vital to your application working. In this post we’ll create a Django view to receive incoming webhook data.
Example use case
Imagine our site receives messages via webhook from a system at the infamous Acme Corporation. They follow the convention of sending POST requests with JSON bodies to a path on our site that we provide. They send a header with a secret token which we can use to authenticate their requests.
For the purposes of the example, we’ll ignore what we do with these messages and instead focus on the “scaffolding”.
Message log model
Before we start building a view, we should consider storing all incoming messages. Logging all incoming messages allows us to debug failures, check their structure is as documented, and otherwise audit what’s happening.
We could use any data store for the messages, but the simplest solution is to use a database model. This provides all the benefits of Django’s ORM and our database server’s durability guarantees.
The messages are JSON, so we can store them directly in a JSONField. Since Django 3.1 this works for all database backends.
We should also store the time we received the message, and index it to improve query performance. This will allow us to see the messages in order. We can also use it to clear old messages, avoiding indefinite table growth.
Combining these requirements we get this model:
Note we’re using models.Index, the modern way to define indexes.
Our view should verify the request, receive the incoming message, store it, process it, and reply with a success response. We can do these steps like so:
@csrf_exempt disables Django’s default cross-site request forgery (CSRF) protection. Normally we wouldn’t want to accept a POST request without a CSRF token, as it could indicate a user being tricked into submitting a malicious form to our site from another. But for webhooks, we verify requests with different authentication schemes, so we can disable CSRF.
@require_POST blocks non-POST requests.
@non_atomic_requests disables ATOMIC_REQUESTS (transaction-per-request) for this view. Using ATOMIC_REQUESTS is normally a good idea, and a straightforward way of adding transactions to your Django application. Here, we’re using direct transaction control (an atomic block around process_webhook_payload) to ensure that if our business logic crashes, we’ve at least saved the AcmeWebhookMessage for debugging. Therefore we don’t want a transaction around the whole view.
Acme’s system provides some authentication with a token in the Acme-Webhook-Token header. We check this header against the token they should be using, which we store in an environment variable and read in our settings. If the two do not match, we can reject the incoming message.
We use secrets.compare_digest() to perform the comparison. Unlike normal string comparison, its running time doesn’t depend on where the two strings first differ. This prevents timing attacks from retrieving our secret token. (Thanks to Florian Apolloner for reminding me to add this protection.)
Authentication is very important for webhook receivers since they are on the public web, and anyone could potentially discover them. Since there’s no real standard for webhooks, different callers use different authentication methods. If you’re adapting this code, check your caller’s documentation.
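To illustrate the token comparison on its own, here’s a small standard-library sketch. The function name token_is_valid and the abc123 token are placeholders. compare_digest() accepts either two bytes-like objects or two ASCII-only strings, so this version encodes both sides to bytes; that way a header containing non-ASCII characters is simply rejected instead of raising a TypeError.

```python
from secrets import compare_digest

EXPECTED_TOKEN = "abc123"  # placeholder; load the real token from settings


def token_is_valid(given_token: str) -> bool:
    """Compare the caller's token to ours without leaking where they differ."""
    # Encode both sides: compare_digest() rejects non-ASCII str arguments.
    return compare_digest(
        given_token.encode("utf-8"),
        EXPECTED_TOKEN.encode("utf-8"),
    )


if __name__ == "__main__":
    print(token_is_valid("abc123"))
    print(token_is_valid("wrong-token"))
```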
Before storing the new message, we clean up stored messages older than a week. This is a simple way to remove old data.
If our webhook ends up running frequently, executing this delete query each time may get expensive. In this case we could move the deletion out to a periodic background task, similar to Django’s clearsessions command.
We use json.loads() to load the request body. We do this without any checking of the Content-Type header or error handling if the body isn’t valid JSON. If an error does occur, the view will crash, and our error reporting software (e.g. Sentry) will alert us.
This is a fine failure mode for our example. Since we’ve verified the message is from Acme, if the body is not JSON, something has gone wrong, and we’d like to know about it.
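As a quick illustration of that failure mode, json.loads() accepts the raw bytes of a request body and raises json.JSONDecodeError (a subclass of ValueError) when they aren’t valid JSON. The helper name load_payload and the sample bodies are made up for this sketch; the try/except is only here to show the error, since the view would deliberately let it propagate.

```python
import json


def load_payload(body: bytes):
    """Decode a request body as JSON, letting any decoding error propagate."""
    return json.loads(body)  # json.loads() accepts bytes directly


if __name__ == "__main__":
    print(load_payload(b'{"event": "ping"}'))
    try:
        load_payload(b"<html>not json</html>")
    except json.JSONDecodeError as exc:
        # The view would not catch this; it would surface as a server error.
        print(f"Would crash the view: {exc}")
```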
We store the data in the AcmeWebhookMessage model before attempting to process it. This ensures we have it logged even if we crash later.
We call our business logic handler. This has a stub implementation, left empty for the purposes of this example. In a real world application we’d add some code here. That said, deploying a first version with an empty handler is a good way to test messages are being received correctly.
We return a plain-text OK response from our view. Typically webhook callers check only the status code, so we can keep the body minimal.
We can add a URL mapping to our view with the standard path() syntax:
The path contains a random string, generated with a password manager. This adds a little extra security-by-obscurity, since we won’t provide this URL to anyone but Acme. This prevents at least URL enumeration attacks from discovering our receiver.
Random strings in URLs don’t provide real protection. URLs often get copied to insecure places, such as logs, emails, or sticky notes. Unfortunately some webhook callers do not support any authentication mechanism, so this can be the best option.
To test our webhook view, we can make requests to it with Django’s test client:
We use @override_settings to replace the token setting for every test in the test case. This means we don’t need to set a value in our test settings, nor use the sensitive real token, which we should not save in our code base.
To check the @csrf_exempt decorator, we create our test client with the enforce_csrf_checks flag on. If we accidentally removed the decorator from the view, the CSRF check would reject our test requests and the success test would fail.
We first test the view’s various failure modes before testing its success case. Testing both the missing and bad token cases is not strictly necessary for coverage, but done for completeness in case the code changes.
When making assertions on the response status codes, we compare them with the HTTPStatus enum from the Python standard library.
To send the Acme-Webhook-Token header, we have to use the slightly unfriendly HTTP_ACME_WEBHOOK_TOKEN keyword argument, which is the header’s name in the WSGI environ.
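The test client’s keyword arguments follow the WSGI/CGI naming convention for request headers: uppercase the name, swap hyphens for underscores, and add an HTTP_ prefix (Content-Type and Content-Length are the exceptions and take no prefix). The helper below is a hypothetical illustration of that convention, not part of Django.

```python
def wsgi_environ_key(header_name: str) -> str:
    """Return the WSGI environ key for an HTTP request header name."""
    key = header_name.upper().replace("-", "_")
    # These two headers are passed without the HTTP_ prefix.
    if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        return key
    return "HTTP_" + key


if __name__ == "__main__":
    print(wsgi_environ_key("Acme-Webhook-Token"))
```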
There are many ways we might need to improve our webhook receiver, beyond finishing its business logic. Here are some ideas:
We might extract more data from the JSON body into separate fields on our AcmeWebhookMessage model. For example, if there are multiple types of message we might want to be able to query them.
The caller might, by necessity, send us messages more than once. We’d want to guard against reprocessing repeat messages, to behave idempotently. We can do this by querying past messages, but we’d need more fields and maybe an index.
We could offload the processing of the messages to a background task, so we don’t make the caller wait for our success response. To do this we could extend our AcmeWebhookMessage model with more fields. Background processing could also make us more robust, allowing retries etc.
We could prevent the caller system from overwhelming us by adding rate-limiting, using django-ratelimit.
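As a small illustration of the idempotency idea above, here’s a standard-library sketch that fingerprints each payload and skips ones it has already seen. It assumes that a byte-for-byte equivalent JSON body means a repeat, which may not hold for every caller (many send a unique message ID, which would be a better key). The names payload_fingerprint and SeenMessages are hypothetical; in Django the in-memory set would more likely be a unique, indexed field on the message model.

```python
import hashlib
import json


def payload_fingerprint(payload) -> str:
    """Hash a canonical JSON encoding so key order doesn't affect the result."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SeenMessages:
    """In-memory stand-in for a unique fingerprint column in the database."""

    def __init__(self):
        self._seen = set()

    def is_new(self, payload) -> bool:
        fingerprint = payload_fingerprint(payload)
        if fingerprint in self._seen:
            return False  # repeat delivery: skip processing
        self._seen.add(fingerprint)
        return True
```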
I hope you’re web-hooked to my blog!