How to download a documentation website with Wget

Time to grab a whole document tree.

This post is an adapted extract from my book Boost Your Django DX, available now.

Sometimes you want to download a whole website so you have a local copy that you can browse offline. When programming, this is often useful for documentation sites that do not provide downloadable versions and are not available in offline tools like DevDocs.

One tool for making such copies is Wget (from “web get”). With the right flags, Wget can download a whole website and convert it for offline browsing.

Install Wget

Wget is widely available from platform package managers.

On macOS, you can use Homebrew:

$ brew install wget

On Windows, you can use Chocolatey:

> choco install wget

On Linux, most distributions have Wget pre-installed. If not, it’s normally installable from a wget package.

How to download a website

You can invoke this single big Wget command to download a site, replacing <website> with the URL of the site:

$ wget --mirror --convert-links --adjust-extension --page-requisites --no-parent <website>

The URL may be either a full domain or include a path prefix. (We’ll take apart all those flags in a few sections.)

Downloading a website can take a little while, even on a fast connection. This is because Wget downloads pages one at a time, in order to discover links as it goes.

Wget stores the downloaded pages in a directory named after the website’s domain name. After Wget has completed, you can open pages from there in your web browser and navigate as usual.
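To find that directory from a script, you can derive its name from the URL. Here is a minimal sketch, assuming a plain URL without credentials or a custom port. The helper name mirror_dir and the example.com URL are made up for illustration.

```shell
# Hypothetical helper: print the directory name Wget would use for a URL.
# Assumes a plain URL with no credentials and the default port.
mirror_dir() {
  url=$1
  host=${url#*://}    # drop the scheme, e.g. "https://"
  host=${host%%/*}    # drop the path, keeping only the host
  printf '%s\n' "$host"
}

mirror_dir "https://docs.example.com/guide/"
```

For the made-up URL above, this prints docs.example.com, the name of the folder to look for in the directory where you ran Wget.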

Example: the Django REST Framework documentation

The DRF documentation is available on DevDocs, but it can be out of date. And unfortunately, the DRF site doesn’t provide downloads.

You can use the above Wget command to download the Django REST Framework documentation like so:

$ wget --mirror --convert-links --adjust-extension --page-requisites --no-parent

Wget prints a lot of output, starting:

--2021-10-27 10:56:12--
Resolving (,,, ...
Connecting to (||:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 30663 (30K) [text/html]
Saving to: ‘’

www.django-rest-fr 100%[==============>]  29.94K  --.-KB/s    in 0.002s

2021-10-27 10:56:12 (13.2 MB/s) - ‘’ saved [30663/30663]

Loading robots.txt; please ignore errors.
--2021-10-27 10:56:12--
Reusing existing connection to
HTTP request sent, awaiting response... 404 Not Found
2021-10-27 10:56:12 ERROR 404: Not Found.

…after downloading every file, Wget finishes by converting links:

Converting links in 109.
Converting links in nothing to do.
Converting links in 2.
Converting links in 1.
Converting links in nothing to do.
Converted links in 73 files in 0.3 seconds.

…and it’s done.

Once Wget has finished, you can check the downloaded files:

$ ls
api-guide  css        index.html search     tutorial
community  img        js         topics

Things seem in place. To read the offline copy, you can open index.html in the browser, and browse away as usual.
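For a slightly stronger check than ls, you could count the HTML pages that were saved. This is only a sketch: the helper name count_pages is hypothetical, and the right number depends on the site you mirrored.

```shell
# Hypothetical helper: count the HTML pages saved under a mirror directory.
count_pages() {
  # tr strips the padding that some versions of wc add around the number.
  find "$1" -type f -name '*.html' | wc -l | tr -d ' '
}

# Usage, from the directory where you ran Wget:
#   count_pages <mirror-directory>
```

A count of zero or one suggests that Wget did not follow any links, which is worth investigating before you go offline.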

Read offline documentation with Python’s web server

Some websites do not work when opened as a .html file in the web browser. This is because they use web features that browsers block on file:// URLs, for security. To make such offline copies work, you need to open them over http:// URLs, via a local web server, and luckily there’s one built in to Python.

For example, take the Django Girls Tutorial. After downloading the site with Wget, you can open its pages in the browser, but navigation doesn’t work. If you open the browser’s developer console, you’ll see errors from clicking links, such as:

Security Error: Content at file:///.../ may not load data from file:///.../

Uncaught DOMException: The operation is insecure.

These messages are the browser reporting that it is blocking the website’s use of JavaScript for navigation.

You can fix these errors by loading the site through Python’s built-in web server. (This server is only suitable for local development, like Django’s runserver.)

To do so, navigate to the site folder:

$ cd
$ ls
en gitbook

…then, start the web server:

$ python -m http.server 8001
Serving HTTP on :: port 8001 (http://[::]:8001/) ...

Note this command explicitly uses port 8001, to avoid colliding with Django’s runserver, which you probably have running. Both http.server and runserver default to port 8000.

With the server running, open http://localhost:8001 in the browser, and you’ll find the documentation loads with working navigation. Huzzah!
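If you would rather not cd into the site folder each time, http.server also accepts a --directory option (Python 3.7+) and a --bind option. A possible wrapper is sketched below. The function name serve_docs and the PYTHON override variable are assumptions for illustration, not part of Python.

```shell
# Hypothetical helper: serve a mirrored site without changing directory.
# PYTHON can be set to another interpreter; it defaults to "python".
serve_docs() {
  dir=$1
  port=${2:-8001}   # default to 8001, leaving 8000 free for runserver
  # --bind 127.0.0.1 keeps the server reachable from this machine only.
  "${PYTHON:-python}" -m http.server "$port" --bind 127.0.0.1 --directory "$dir"
}

# Usage (runs until interrupted with Ctrl+C):
#   serve_docs <mirror-directory> 8001
```

Because this sketch binds to 127.0.0.1, open http://127.0.0.1:8001 if http://localhost:8001 does not respond.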

An explanation of all the flags

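Wget has a great many options. Here’s a brief explanation of the flags we’re using:

--mirror: turns on recursive downloading with unlimited depth, plus timestamp checking so that re-runs only fetch changed files.
--convert-links: after downloading, rewrites links in the pages to point at the local copies.
--adjust-extension: adds suitable file extensions, such as .html, to files saved without them.
--page-requisites: also downloads the files needed to display each page, such as images, stylesheets, and scripts.
--no-parent: stops Wget from following links above the starting path.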

Another flag that you may find useful is --wait <n>, which limits bandwidth consumption by adding a delay of <n> seconds between requests. This can lighten the load both for others on your internet connection and for the web server you’re downloading from.
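If you mirror sites regularly, you might wrap the full command, including --wait, in a small shell function. The sketch below is one way to do it. The name mirror_site, the WGET override variable, and the one-second default delay are all assumptions for illustration.

```shell
# Hypothetical wrapper around the mirror command from this post.
# WGET can be set to another binary; it defaults to "wget".
mirror_site() {
  url=$1
  wait_seconds=${2:-1}   # assumed default: pause one second between requests
  "${WGET:-wget}" --mirror --convert-links --adjust-extension \
    --page-requisites --no-parent --wait "$wait_seconds" "$url"
}

# Usage:
#   mirror_site <website> 2
```

A longer delay makes the download slower but gentler on the server.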

For more info see the Wget documentation.


Enjoy your time offline,

