Docker on BitFolk worked Django example

From BitFolk

This is a worked example of building a simple web app setup, which you can use to put the concepts in Containers 101 into practice. As it is fully containerised, you can set it up and play with it anywhere.

Description

This is a web application with three microservices. Each is a separate Docker container. We will use docker-compose to orchestrate them.

  • The front door (TCP port 80, visible to the outside world) is provided by nginx.
    • nginx serves the static resources from the web app itself, while operating as a reverse proxy to the dynamic part of the web app.
  • The web application itself is based on the Docker Django project example. It is truly trivial, containing no real application content, but the Django built-in admin interface gives us a way to demonstrate that the database is working.
  • The persistent data store is postgresql.

Django containerised web app example.png

If you want to play along but not create the files yourself, look at the corresponding example repo on GitHub.

Health Warning

This is not a fully production-ready example. It has security holes, some of which are documented in Exercises for the reader.

Prerequisites

  • Docker and Docker Compose. I am using my Ubuntu desktop ( `apt install docker.io` ) but you could develop this on Windows or Mac using Docker Desktop, then deploy to your VPS at leisure.
  • Some text editor that you're comfortable with. (TODO: recommendations for Visual Studio Code extensions?)

Worked Example

Basic web app setup

Follow the steps given in the Docker Django example project.

When you get to the point where you browse to http://localhost:8000 and see the Congratulations image (as pictured here), stop following the guide.

Docker-django-rocket.png

Some points to note:

  • In compose.yaml (or docker-compose.yml), we told Docker Compose to include a Postgres image named db in the stack. We specified version 14 of the image as that was current at the time.
  • We configured our Django project to use the Postgres back-end, but told it to connect to a database on host db. This hostname is created for us in the virtual container network by naming our database service db.
  • This is a default Django project with DEBUG=True, so it is not production-ready.

Proving the database connection works

If you try to do anything with the app right now, such as browsing to the admin site, you will likely get a bunch of errors in the Docker Compose console. If you're sharp-eyed, you will spot this message among them:

You have 18 unapplied migration(s). Your project may not work properly until you apply the migrations for app(s): admin, auth, contenttypes, sessions.
Run 'python manage.py migrate' to apply them.

So let's run the initial Django db migrations now.

If the containers are running, stop them (either press Ctrl-C, or run `docker-compose down` in another terminal). Now run:

docker-compose run web ./manage.py migrate

(This is, generally, how you do things in the container: you ask Docker Compose to run a command in a one-off container for the service. That command can also be sh, which gives you an interactive shell.)
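
If you find yourself typing this a lot, the pattern can be wrapped in a small shell function. This is only a sketch: the function name manage is made up for illustration, and it assumes the Compose service is called web, as in this guide. The `--rm` flag asks Compose to remove the one-off container once the command exits, so stopped containers don't pile up.

```shell
# Hypothetical helper: run a Django management command in a one-off "web" container.
# --rm removes the one-off container when the command exits.
manage() {
    docker-compose run --rm web ./manage.py "$@"
}

# Usage (needs the compose project from this guide):
#   manage migrate
#   manage createsuperuser
#   docker-compose run --rm web sh    # an interactive shell instead of a manage.py command
```

Defining the function does nothing by itself; the commented lines show how you might call it.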

Now you can create yourself a superuser account on your database:

docker-compose run web ./manage.py createsuperuser

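Let's prove that the Django admin site and db are working. Start the containers again with docker-compose up, browse to http://localhost:8000/admin/ and log in with the superuser account you just created.

You should see the Django admin site (pictured) offering to let you manage users and groups. It's not much, but it proves that your database setup is working.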

Docker-django-admin.png

If you want to prove that the postgres data is being stored on your host filesystem and not in the container, have a look at the contents of `data/db/`. If you set an email address for the superuser, you should be able to grep for it in there.

Shrinking the images

This article is focussing on the VPS environment where space is often at a premium, so let's have a look at resource usage.

$ docker images
REPOSITORY              TAG           IMAGE ID       CREATED          SIZE
container-web-app_web   latest        2cd26d4ee97e   37 minutes ago   958MB
postgres                14            b2261d3c6ce0   9 days ago       376MB
python                  3             0f95b1e38607   9 days ago       920MB

The container-web-app_web image is built as layers on top of the python:3 image, so that 920MB is shared between the two, but that's still pretty chunky.

Let's go through some steps to reduce our needs.

A smaller Postgres image

First up, we don't need to use full-blown postgres; most of the time, the Alpine build will do.

In your docker-compose file, change the db image line to read `image: postgres:14-alpine`. (By the way, be sure to preserve the indentation; white space is significant.)

Run docker-compose up, go back to http://localhost:8000 and satisfy yourself that things are still working.

You can demonstrate that the running container is using the Alpine image (note the 'image' column):

$ docker ps
CONTAINER ID   IMAGE                   COMMAND                  CREATED         STATUS         PORTS                                       NAMES
b70698807a5d   container-web-app_web   "python manage.py ru…"   6 minutes ago   Up 6 minutes   0.0.0.0:8000->8000/tcp, :::8000->8000/tcp   container-web-app_web_1
26750557f65a   postgres:14-alpine      "docker-entrypoint.s…"   6 minutes ago   Up 6 minutes   5432/tcp                                    container-web-app_db_1
$ docker images postgres\*
REPOSITORY   TAG         IMAGE ID       CREATED        SIZE
postgres     latest      b2261d3c6ce0   9 days ago     376MB
postgres     14-alpine   07c710d28b91   11 days ago    216MB

A 42% saving on that one, not too shabby. But postgres wasn't the big one here; our web app was nearly 1GB. Onwards ...

Squeezing the web app

In the Dockerfile, edit the FROM line to read `FROM python:3.10-alpine`. (Note: 3.10 was current at the time this was written. You probably want to use a later version now.)

As it happens the Python Postgres connector (psycopg2) doesn't work out of the box in Alpine(*), but there is a binary package that does. So in requirements.txt, edit that line to read: `psycopg2-binary>=2.8`

(*) This is common with Python modules that need to be compiled. This is because Alpine doesn't use glibc, preferring the smaller musl.

Having done this, rebuild the app. I found that I needed to fix permissions on the database directory to make this work (chown -R myuserid data/db).

$ docker-compose up --build web
Building web
Sending build context to Docker daemon  44.79MB
Step 1/7 : FROM python:3.10-alpine
 ---> 27edb73bd1fc
[ ... much output skipped ... ]
web_1  | Django version 3.2.13, using settings 'composeexample.settings'
web_1  | Starting development server at http://0.0.0.0:8000/
web_1  | Quit the server with CONTROL-C.

Well, it claims to be up and running. Go back to http://localhost:8000 and prove to yourself that it is indeed working.

$ docker images
REPOSITORY              TAG           IMAGE ID       CREATED              SIZE
container-web-app_web   latest        05f6d0fec756   About a minute ago   134MB
<none>                  <none>        2cd26d4ee97e   53 minutes ago       958MB
postgres                latest        b2261d3c6ce0   9 days ago           376MB
python                  3             0f95b1e38607   9 days ago           920MB
postgres                14-alpine     07c710d28b91   11 days ago          216MB
python                  3.10-alpine   27edb73bd1fc   3 weeks ago          47.6MB

Our app is now 134MB, down from 958MB. Now we're getting somewhere!
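
If you want to check the savings quoted so far, a quick bit of shell arithmetic will do. The function name saving is made up for this sketch. It uses integer maths, so the results are rounded down.

```shell
# Percentage saved going from an old image size to a new one (sizes in MB).
# Integer arithmetic only, so the result is rounded down.
saving() {
    old=$1
    new=$2
    echo $(( (old - new) * 100 / old ))
}

saving 376 216   # postgres -> postgres:14-alpine, prints 42
saving 958 134   # web app on python:3 -> python:3.10-alpine, prints 86
```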

Taking out the trash

You'll have noticed that the older container images are still present. The line tagged <none>/<none> is the previous image of our web app (compare its ID with the output above; that's a shortened sha256 hash).

We don't need to keep them around. You can remove images by their symbolic name (note the repo:tag syntax) or by their ID:

$ docker rmi postgres:latest python:3 2cd26d4ee97e
Untagged: postgres:latest
Untagged: postgres@sha256:4ba3b78788bb284687376b9c1e0565b245375ddee0fe14cef25e315b6bd88b1a
Deleted: sha256:b2261d3c6ce0b23bed32e7567a92646b880de73d802550b1275baa0997aa34d0
[...]
Untagged: python:3
Untagged: python@sha256:eeed7cac682f9274d183f8a7533ee1360a26acb3616aa712b2be7896f80d8c5f
Error response from daemon: conflict: unable to delete 2cd26d4ee97e (must be forced) - image is being used by stopped container 7f22a78f3dea

Horrors! What does this mean?

Long story short, I had pressed Ctrl-C to stop my containers earlier, so the older instances had not been reaped. (If you used docker-compose down, go to the top of the class.)

$ docker ps -a
CONTAINER ID   IMAGE                   COMMAND                  CREATED          STATUS                      PORTS      NAMES
0d017beff44d   container-web-app_web   "python manage.py ru…"   10 minutes ago   Exited (0) 28 seconds ago              container-web-app_web_1
26750557f65a   postgres:14-alpine      "docker-entrypoint.s…"   23 minutes ago   Up 10 minutes               5432/tcp   container-web-app_db_1
77c28372ec3e   2cd26d4ee97e            "./manage.py creates…"   43 minutes ago   Exited (0) 43 minutes ago              container-web-app_web_run_a4f2ec75a142
eecb98c070c8   2cd26d4ee97e            "./manage.py migrate"    44 minutes ago   Exited (0) 44 minutes ago              container-web-app_web_run_2ad73f6f659c
277d06e9ad08   2cd26d4ee97e            "./manage.py creates…"   44 minutes ago   Exited (1) 44 minutes ago              container-web-app_web_run_f0a410da1bc3
7f22a78f3dea   2cd26d4ee97e            "manage.py createsup…"   44 minutes ago   Created                                container-web-app_web_run_bc39f7d97031

Notice the errant container id 7f22a78f3dea (from the error message) appears in the list. That's OK, we created our containers with docker-compose, so we just need to ask it to tidy up. This is a safe operation because we have been careful to ensure that our application data lives outside of the containers.

$ docker-compose down
Stopping container-web-app_db_1 ... done
Removing container-web-app_web_1                ... done
Removing container-web-app_db_1                 ... done
Removing container-web-app_web_run_a4f2ec75a142 ... done
Removing container-web-app_web_run_2ad73f6f659c ... done
Removing container-web-app_web_run_f0a410da1bc3 ... done
Removing container-web-app_web_run_bc39f7d97031 ... done
Removing network container-web-app_default
$ docker rmi 2cd26d4ee97e
Deleted: sha256:2cd26d4ee97e59c8d0a4c824100096ec93af832c34c8a8781c739b6b37d945bb
[...]
$ docker images
REPOSITORY              TAG           IMAGE ID       CREATED          SIZE
container-web-app_web   latest        05f6d0fec756   13 minutes ago   134MB
postgres                14-alpine     07c710d28b91   11 days ago      216MB
python                  3.10-alpine   27edb73bd1fc   3 weeks ago      47.6MB

Much better. You can restart the containers and prove to yourself that they're still working, and rerun docker images to show that the larger versions have not been re-downloaded.

Adding nginx to Django

So far so good. Let's set up a front door so we can switch off Django's debug mode and serve the static parts of the site more efficiently. (This is one of the steps you have to take in moving a Django site from development to production.)

I'm going to use nginx here, and again the Alpine-based image at that (24MB vs 142MB).

First, we need to put the static parts of our web app somewhere we can serve them from.

Django static files

If you've used Django before, you'll be ahead of me here. We are going to ask Django to put all the static parts of its app in a single directory that they can be served from, and configure nginx to serve them up.

In composeexample/settings.py add a line declaring a STATIC_ROOT. The trick is to remember that it is a filesystem path inside the container, and we want the files to be visible outside the container so this has to be in a directory that is shared with the host filesystem. How to find one? Well, our Dockerfile sets the webapp's work directory to be /code, and that directory is shared, so this is suitable:

STATIC_ROOT = '/code/static/'

Now we can have Django do its stuff:

$ docker-compose run web ./manage.py collectstatic
Creating container-web-app_web_run ... done

128 static files copied to '/code/static'.

And sure enough, the static directory has appeared in our working directory. (If you needed to fix up the permissions of the generated files in the Django tutorial, consider repeating it for the static data.)

N.B. collectstatic is a manual step that needs to be repeated any time the static assets in your app change. You might consider scripting it to happen as part of app startup, but I'll leave that as an exercise for another day.

nginx config

If you're up for the challenge, you might try to configure this yourself. I'll describe what I did. Or if you prefer, check out the final docker-compose.yml.

  • I used the nginx:1.23-alpine image.
  • This container depends on the web container. (Or in other words, don't start nginx until the Django container has been started. You may decide that you would prefer the server to start faster; not having this dependency allows that, but if a user hits the site before Django's server is ready they will get a 502 Bad Gateway.)
  • I read the instructions for the official nginx image. They speak of letting you deploy configuration templates which apply environment variables and overwrite the contents of /etc/nginx/conf.d .
  • I created a basic nginx config file (see below) and put it in nginx-templates/default.conf.template
  • I gave the image some read-only volumes:
    • ./static is mounted into /usr/share/nginx/html/static
    • ./nginx-templates is mounted into /etc/nginx/templates
  • The container has the environment variable NGINX_PORT set to 80 (to prove that the environment variable mechanism works)
  • Port 80 inside this container is bound to port 80 on the host.
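
Put together, the points above might translate into a Compose service along these lines. Treat this as a sketch rather than the definitive file: the service name nginx and the scratch file name nginx-service.yml are my assumptions. The fragment is meant to be pasted under the services: key of your docker-compose file, keeping the indentation.

```shell
# Write a sketch of the nginx service to a scratch file (hypothetical name).
# Paste its contents under "services:" in your docker-compose file.
cat > nginx-service.yml <<'EOF'
  nginx:
    image: nginx:1.23-alpine
    depends_on:
      - web
    volumes:
      # read-only mounts, as described above
      - ./static:/usr/share/nginx/html/static:ro
      - ./nginx-templates:/etc/nginx/templates:ro
    environment:
      - NGINX_PORT=80
    ports:
      - "80:80"
EOF
```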


My nginx default.conf.template is pretty short and to the point:

  • It declares an upstream named webapp pointing to server web:8000 (this is the virtual host named in our docker compose file, on its given port)
  • There is a server block:
    • listening on port ${NGINX_PORT}
    • serving location /static/ from root /usr/share/nginx/html
    • serving location / by proxying to http://webapp
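
For reference, a template matching that description might look like the following sketch. It is written as a shell here-document so the file can be created in one step. The quoted 'EOF' matters: it stops your shell from expanding ${NGINX_PORT} before the nginx image gets to substitute it. The sketch also assumes Django's default STATIC_URL of /static/.

```shell
# Create the template directory and a minimal config matching the description above.
mkdir -p nginx-templates
# 'EOF' is quoted so the shell leaves ${NGINX_PORT} alone; the nginx image fills it in at startup.
cat > nginx-templates/default.conf.template <<'EOF'
upstream webapp {
    server web:8000;
}

server {
    listen ${NGINX_PORT};

    location /static/ {
        root /usr/share/nginx/html;
    }

    location / {
        proxy_pass http://webapp;
    }
}
EOF
```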

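In composeexample/settings.py it is necessary to add the host 'webapp' to the ALLOWED_HOSTS setting. This is because, by default, nginx passes the upstream name as the Host header when it proxies a request, and Django rejects requests for hosts it has not been told about.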

Now you can fire up the containers. This time, browse to http://localhost/ (port 80, the nginx front door) rather than to port 8000.

You can prove that the environment variable substitution is working by editing NGINX_PORT in the docker-compose file, changing the port forward next to it, and browsing to the new port number.

Last few touches

Now that nginx is running, tell compose to no longer expose the web app's port 8000 on the host. Delete that line (and the ports: section header above it) and fire it up again; trying to connect to http://localhost:8000 will demonstrate that it is no longer listening.

On a live site you would always turn off debug mode before deployment. In composeexample/settings.py, set `DEBUG = False` . Restart the containers and this time, the rocket logo on http://localhost/ should have been replaced with a Not Found error. This is good! You haven't set up any real pages in your Django app, so this is the expected behaviour. But http://localhost/admin/ should look like it did before.

Resource budgeting

Storage

The biggest storage consumer is the container images your stack uses.

$ docker images
REPOSITORY              TAG           IMAGE ID       CREATED         SIZE
container-web-app_web   latest        05f6d0fec756   2 hours ago     134MB
nginx                   1.23-alpine   f246e6f9d0b2   10 days ago     23.5MB
postgres                14-alpine     07c710d28b91   11 days ago     216MB
python                  3.10-alpine   27edb73bd1fc   3 weeks ago     47.6MB

You will need to allow for:

  • Enough space for all the built container images and layers your stack uses
    • Note that the total space is not necessarily the same as the sum of the SIZE column. This is because of the way Docker images are constructed, from a bunch of overlaid filesystem layers. Our web app image is built on python:3.10-alpine, so those 47.6MB are shared. Conversely, if you 'docker rmi python:3.10-alpine' you don't save any disk space!
  • Your app's data (e.g. an empty postgres DB runs to about 40MB these days)
  • Your application code itself (this is unlikely to be the limiting factor!)

Running `docker system prune` from time to time is a good way to tidy up. It also purges the accumulated cache of image layers that builds up whenever you build an image.

RAM

At runtime each container process appears as a process on the host machine. You can use the usual tools (`ps`, `htop`, etc) to monitor RAM usage. In this example, immediately after starting I see:

  • 80M of python processes belonging to the containers
  • 51M of postgres processes
  • 10M of nginx

My VPS running my "real" web app is only using about 300M, which includes the kernel and the other non-containerised services I'm running; perfectly acceptable in a VPS with a base package of 1.5GB.

Nevertheless you should carefully consider how you configure your nginx and web app processes. It may be prudent to set up docker resource constraints, especially if you run other services on your VPS.

Exercises for the reader

  • Set the postgres password to something more secure. This is essential.
  • Use gunicorn instead of Django's built-in development server. This is strongly recommended.
  • Rearrange the directories so the Django app does not have read-write access to the parent directory of the postgres data directory. Strongly recommended.
  • Configure nginx's access_log and error_log usefully (side note: if you use systemd to invoke your docker compose file, the stdout will be captured and visible via journalctl, which may be enough for you)
  • Create a more functional demo app
    • At the time of writing the sample Docker Django project uses Django 3, so here is the Django 3.2 tutorial project. We've already created a project so you probably want to start where it says "Write your first view".
    • You probably want to switch debug mode back on before you do this.
    • Don't forget to rerun the collectstatic step when you've made a change that affects static resources.
    • You don't need to run `python manage.py runserver`; the container is doing this for you.
    • Whenever the tutorial tells you to run `python manage.py <somethingelse>` you will need to stop the containers and run it through `docker-compose`, like we did earlier to create the superuser account.
  • Have the app automatically run manage.py collectstatic on startup.
  • Add a TLS key to nginx and have it listen on port 443.
    • Add support for the ACME challenge (Let's Encrypt).
  • Set up systemd to launch your container composition