Monday, November 25, 2024

Making a dev container with multiple data services

A dev container is a specification that describes how to open up a project in VS Code, GitHub Codespaces, or any other IDE that supports dev containers, in a consistent and repeatable manner. It builds on Docker and docker-compose, and also allows for IDE customizations like extensions and settings. These days, I always try to add a .devcontainer/ folder to my GitHub templates, so that developers can open them up quickly and get the full environment set up for them.

In the past, I've made dev containers to bring in PostgreSQL, pgvector, and Redis, but I'd never made a dev container that could bring in multiple data services at the same time. I finally made a multi-service dev container today, as part of a pull request to flask-admin, so I'm sharing my approach here.
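
The whole setup lives in the repository's .devcontainer/ folder, which for this approach contains three files, each covered in the sections below:

.devcontainer/
├── devcontainer.json
├── docker-compose.yaml
└── Dockerfile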

devcontainer.json

The entry point for a dev container is devcontainer.json, which tells the IDE to use a particular Dockerfile, docker-compose file, or public image. Here's what it looks like for the multi-service container:

{
  "name": "Multi-service dev container",
  "dockerComposeFile": "docker-compose.yaml",
  "service": "app",
  "workspaceFolder": "/workspace"
}

That file tells the IDE to bring up the containers defined in docker-compose.yaml and to treat the "app" service as the main container for the editor to open.
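
Since dev containers can also carry editor customizations, the same file is where you would add extensions, settings, and forwarded ports. As a sketch of what that could look like (the extension ID, setting, and ports here are illustrative, not what flask-admin uses):

{
  "name": "Multi-service dev container",
  "dockerComposeFile": "docker-compose.yaml",
  "service": "app",
  "workspaceFolder": "/workspace",
  // Forward the data service ports to the local machine (illustrative values)
  "forwardPorts": [5432, 10000, 27017],
  "customizations": {
    "vscode": {
      "extensions": ["ms-python.python"],
      "settings": {
        "python.defaultInterpreterPath": "/usr/local/bin/python"
      }
    }
  }
}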

docker-compose.yaml

The docker-compose.yaml file needs to first describe the "app" service that will be used for the IDE's editing environment, and then describe any additional data services. Here's what one looks like for a Python app bringing in PostgreSQL, Azurite, and MongoDB:

version: '3'

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        IMAGE: python:3.12

    volumes:
      - ..:/workspace:cached

    # Overrides default command so things don't shut down after the process ends.
    command: sleep infinity
    environment:
      AZURE_STORAGE_CONNECTION_STRING: DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;
      POSTGRES_HOST: localhost
      POSTGRES_PASSWORD: postgres
      MONGODB_HOST: localhost

  postgres:
    image: postgis/postgis:16-3.4
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: flask_admin_test
    volumes:
      - postgres-data:/var/lib/postgresql/data
    network_mode: service:app

  azurite:
    container_name: azurite
    image: mcr.microsoft.com/azure-storage/azurite:latest
    restart: unless-stopped
    volumes:
      - azurite-data:/data
    network_mode: service:app

  mongo:
    image: mongo:5.0.14-focal
    restart: unless-stopped
    network_mode: service:app

volumes:
  postgres-data:
  azurite-data:

A few things to point out:

  • The "app" service is based on a local Dockerfile with a base Python image. It also sets environment variables for connecting to the subsequent services.
  • The "postgres" service is based off the official postgis image. The postgres or pgvector image would also work there. It specifies environment variables matching those used by the "app" service. It sets up a volume so that the data can persist inside the container.
  • The "azurite" service is based off the official azurite image, and also uses a volume for data persistance.
  • The "mongo service" is based off the official mongo image, and in this case, I did not set up a volume for it.
  • Each of the data services uses network_mode: service:app so that they share the "app" service's network. This means the app can access them on localhost, at their usual ports. The other approach is to leave off network_mode and let each service run on the default Compose network, which would mean the services are only reachable by their service names, like "postgres:5432" or "http://azurite:10000". Either approach works, as long as your app code knows how to find the service hosts and ports; the sketch after this list shows what the localhost approach looks like from the app's side.
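
Here's a rough sketch of app-side code connecting to all three services through the environment variables set on the "app" service. This isn't code from the flask-admin pull request; it assumes psycopg2, pymongo, and azure-storage-blob are installed, and that the database name matches the POSTGRES_DB value above:

import os

import psycopg2
import pymongo
from azure.storage.blob import BlobServiceClient

# PostgreSQL: the postgres service shares the app's network namespace,
# so POSTGRES_HOST is just "localhost".
postgres_conn = psycopg2.connect(
    host=os.environ["POSTGRES_HOST"],
    user="postgres",
    password=os.environ["POSTGRES_PASSWORD"],
    dbname="flask_admin_test",
)

# MongoDB: reachable on the default port, 27017.
mongo_client = pymongo.MongoClient(os.environ["MONGODB_HOST"], 27017)

# Azurite: the connection string already points at http://127.0.0.1:10000.
blob_client = BlobServiceClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"]
)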

Dockerfile

Any of the services can be defined with a Dockerfile, but the example above only uses a Dockerfile for the default "app" service, shown below:

ARG IMAGE=bullseye
FROM mcr.microsoft.com/devcontainers/${IMAGE}

RUN apt-get update && export DEBIAN_FRONTEND=noninteractive \
    && apt-get -y install --no-install-recommends postgresql-client \
    && apt-get clean -y && rm -rf /var/lib/apt/lists/*

That file brings in a devcontainer-optimized Python image, and then installs the psql client for interacting with the PostgreSQL database. You can also install other tools here, as well as the Python requirements; it just depends on what you want to be available in the environment versus what commands you want developers to run themselves.
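
For example, if you wanted the image build itself to install the project's Python requirements, you could extend the Dockerfile with something like the lines below. This is just a sketch: the requirements.txt filename is a placeholder, and since the compose file above uses the .devcontainer/ folder as the build context, the file would need to live there (or the context would need to be widened to the repository root):

# Hypothetical addition: bake the Python dependencies into the image
# so developers don't have to install them after opening the container.
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt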
